Tracksolid deployment with timescale & grafana with backup
Background
----------
A field audit of liveposition.rahamafresh.com on 2026-05-21 surfaced two
freshness gaps that share a single root cause: tracksolid.live_positions
was being written by only one path (the 60s polled sweep), and that path
silently omits devices that don't have a "current" fix in Jimi's
location.list response. Effect on the dashboard:
* 18 vehicles show OFFLINE for days-to-months — last fix is whatever
the sweep wrote before Jimi dropped them.
* 3 vehicles (KDK 780K, KCQ 618K, KCZ 476E) depend on dashcam fallback
because their dedicated tracker has been silent; the camera's lat/lng
arrives via /pushalarm webhooks (5,287/day, 100% lat/lng fill) but
we discard it after writing to tracksolid.alarms.
Verified upstream subscription state: only /pushalarm is registered with
Jimi; the n8n forwarders for /pushgps, /pushtripreport, /pushobd are
inactive. This change uses only data that already arrives.
What's in this PR
-----------------
ts_shared_rev.py
* upsert_live_position(cur, imei, lat, lng, gps_time, ..., extras=None)
— single time-guarded upsert all three writers will share. Guards on
is_valid_fix() (filters Zero-Island and out-of-range) and
EXCLUDED.gps_time > stored.gps_time so late-arriving alarms or
webhook retries can't rewind a fresher marker. COALESCE on optional
columns so sparse callers don't blank dense ones' values.
* get_stale_imeis(stale_minutes=30) — SELECT enabled_flag=1 devices
whose live_positions.gps_time is NULL or older than the threshold,
ordered NULLS FIRST so worst-offenders are in batch #1.
* ensure_device(cur, imei, device_name=None) — relocated from
webhook_receiver_rev so every live_positions writer can satisfy the
FK without re-defining the helper. The original underscore-prefixed
name in webhook_receiver_rev becomes a backwards-compat alias.
webhook_receiver_rev.py
* /pushalarm — after the alarm row insert, call upsert_live_position
with the alarm's lat/lng and alarmTime. Sits inside the existing
per-item SAVEPOINT, so a cross-feed failure rolls back only that
one alarm's cross-feed, not the alarm row.
ingest_movement_rev.py
* poll_live_positions — inline INSERT replaced with upsert_live_position
(extras dict carries the sweep-only columns). Same data, time-guarded.
* get_device_locations — inline INSERT replaced; also gains an
ensure_device call so it can be safely fed arbitrary IMEIs.
* poll_stale_locations() — new wrapper. Pulls get_stale_imeis() and
hands it to get_device_locations. Scheduled every 10 minutes plus a
startup catch-up call. Uses jimi.device.location.get which returns
*last-known* fix, so devices the 60s sweep drops can be re-warmed.
Expected post-deploy effect (estimates, see
260521_timescale_location_upgrade_major.md §4)
* ~1,100-1,600 additional live_positions upserts/day from the alarm
cross-feed, after the time-guard rejects ~70-80% of races vs the
fresher 60s sweep.
* The 3 camera-fallback plates flip to "seconds-after-alarm" cadence
(JC400P emits ~107 alarms/day per device).
* 8-14 of the 24 OFFLINE plates expected to recover via location.get's
last-known-fix path within the first 30 minutes.
* Dashboard's "Offline 24h+" KPI: 24 → 10-14 within the first hour.
* No 06_live_location code changes required — reads through
reporting.v_live_positions transparently.
Tests
-----
12 webhook integration tests pass (3 new: cross-feed fires on valid fix;
skips without lat/lng; skips Zero-Island). 8 new unit tests in
test_stale_imeis.py cover the stale selector, the poll wrapper, and the
time-guard contract on upsert_live_position. Full suite: 77 passed.
Deployment
----------
No schema migration. Both webhook_receiver and ingest_movement
containers must be rebuilt — source is image-baked, not bind-mounted.
Rollback is git revert + rebuild.
Plan & monitoring SQL: 06_live_location/260521_timescale_location_upgrade_major.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|---|---|---|
| .forgejo/workflows | ||
| administration | ||
| backup | ||
| db_audit | ||
| docs | ||
| grafana | ||
| n8n-workflows | ||
| tests | ||
| .env | ||
| .gitignore | ||
| .python-version | ||
| 01_BusinessAnalytics.md | ||
| 02_tracksolid_docker_commands.md | ||
| 02_tracksolid_full_schema_rev.sql | ||
| 03_webhook_schema_migration.sql | ||
| 04_bug_fix_migration.sql | ||
| 05_enhancement_migration.sql | ||
| 06_business_analytics_migration.sql | ||
| 07_analytics_views.sql | ||
| 260410_baseline_report.md | ||
| 260412_baseline_report.md | ||
| 20260414_FS__Logistics - final_fixed.csv | ||
| CLAUDE.md | ||
| docker-compose.yaml | ||
| Dockerfile | ||
| grafanaDeployment.md | ||
| grafanaOperationalManual.md | ||
| import_drivers_csv.py | ||
| ingest_events_rev.py | ||
| ingest_movement_rev.py | ||
| OPERATIONS_MANUAL.md | ||
| pyproject.toml | ||
| README.md | ||
| run_migrations.py | ||
| run_migrations.sh | ||
| sync_driver_audit.py | ||
| tracksolid_DB_manual.md | ||
| tracksolidApiDocumentation.md | ||
| ts_shared_rev.py | ||
| webhook_receiver_rev.py | ||