Tracksolid deployment with TimescaleDB and Grafana, with backup
david kiania 2309464ab8
Some checks failed
Static Analysis / static (push) Has been cancelled
Tests / test (push) Has been cancelled
FIX-M21: alarm cross-feed + stale-IMEI recovery for live_positions
Cherry-pick of c8f5907 (originally FIX-M20 on main) onto
quality-program-2026-04-12 — renamed to FIX-M21 here to avoid clashing
with this branch's existing [FIX-M20] (trip enrichment, commit 144dede).
Behaviour and code are unchanged from the main-branch original; the
annotation tag is the only difference.

Background
----------
A field audit of liveposition.rahamafresh.com on 2026-05-21 surfaced two
freshness gaps that share a single root cause: tracksolid.live_positions
was being written by only one path (the 60s polled sweep), and that path
silently omits devices that don't have a "current" fix in Jimi's
location.list response. Effect on the dashboard:

  * 18 vehicles show OFFLINE for days-to-months — last fix is whatever
    the sweep wrote before Jimi dropped them.
  * 3 vehicles (KDK 780K, KCQ 618K, KCZ 476E) depend on dashcam fallback
    because their dedicated tracker has been silent; the camera's lat/lng
    arrives via /pushalarm webhooks (5,287/day, 100% lat/lng fill) but
    we discard it after writing to tracksolid.alarms.

Verified upstream subscription state: only /pushalarm is registered with
Jimi; the n8n forwarders for /pushgps, /pushtripreport, /pushobd are
inactive. This change uses only data that already arrives.

What's in this commit
---------------------
ts_shared_rev.py
  * upsert_live_position(cur, imei, lat, lng, gps_time, ..., extras=None)
    — single time-guarded upsert all three writers will share. Guards on
    is_valid_fix() (filters Zero-Island and out-of-range) and
    EXCLUDED.gps_time > stored.gps_time so late-arriving alarms or
    webhook retries can't rewind a fresher marker. Optional columns are
    wrapped in COALESCE so a sparse caller can't blank values written by
    a denser one.
  * get_stale_imeis(stale_minutes=30) — SELECT enabled_flag=1 devices
    whose live_positions.gps_time is NULL or older than the threshold,
    ordered NULLS FIRST so worst-offenders are in batch #1.
  * ensure_device(cur, imei, device_name=None) — relocated from
    webhook_receiver_rev so every live_positions writer can satisfy the
    FK without re-defining the helper. The original underscore-prefixed
    name in webhook_receiver_rev becomes a backwards-compat alias.
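
A minimal sketch of the time-guarded upsert described above, for
illustration only. It runs against an in-memory SQLite table so it is
self-contained; the real helper targets PostgreSQL/TimescaleDB, and the
table layout, the single optional "speed" column, and the body of
is_valid_fix() are assumptions rather than the code in ts_shared_rev.py.

```python
import sqlite3


def is_valid_fix(lat, lng):
    """Hypothetical stand-in: reject missing, out-of-range and Zero-Island fixes."""
    if lat is None or lng is None:
        return False
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        return False
    return not (lat == 0.0 and lng == 0.0)


# The WHERE clause is the time guard: an existing row is only overwritten
# when the incoming gps_time is strictly newer. COALESCE keeps the stored
# value of an optional column when the caller did not supply one.
UPSERT_SQL = """
INSERT INTO live_positions (imei, lat, lng, gps_time, speed)
VALUES (:imei, :lat, :lng, :gps_time, :speed)
ON CONFLICT (imei) DO UPDATE SET
    lat      = excluded.lat,
    lng      = excluded.lng,
    gps_time = excluded.gps_time,
    speed    = COALESCE(excluded.speed, live_positions.speed)
WHERE excluded.gps_time > live_positions.gps_time
"""


def upsert_live_position(cur, imei, lat, lng, gps_time, extras=None):
    """Return False if the fix was rejected before reaching the database."""
    if not is_valid_fix(lat, lng):
        return False
    # gps_time is an ISO-8601 string here, so string comparison orders
    # correctly as long as every writer uses the same format.
    cur.execute(UPSERT_SQL, {
        "imei": imei, "lat": lat, "lng": lng, "gps_time": gps_time,
        "speed": (extras or {}).get("speed"),
    })
    return True
```

With this shape, an alarm that arrives late with an older timestamp is
accepted by the INSERT but changes nothing, which is the behaviour the
time guard is meant to provide.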

webhook_receiver_rev.py
  * /pushalarm — after the alarm row insert, call upsert_live_position
    with the alarm's lat/lng and alarmTime. Sits inside the existing
    per-item SAVEPOINT, so a cross-feed failure rolls back only that
    one alarm's cross-feed, not the alarm row.
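
One way the isolation described above could be arranged is an inner
savepoint around the cross-feed, nested in the per-item savepoint. The
sketch below assumes that arrangement; the commit does not show how
webhook_receiver_rev.py actually does it, and the function, savepoint,
and table names are hypothetical. SQLite is used only to keep the
example self-contained.

```python
import sqlite3


def store_alarm(conn, alarm, cross_feed):
    """Insert one alarm, then try the cross-feed without risking the alarm row.

    The caller owns the surrounding transaction (BEGIN/COMMIT).
    """
    conn.execute("SAVEPOINT alarm_item")  # stands in for the existing per-item savepoint
    try:
        conn.execute(
            "INSERT INTO alarms (imei, alarm_time) VALUES (?, ?)",
            (alarm["imei"], alarm["alarmTime"]),
        )
        conn.execute("SAVEPOINT cross_feed")  # assumed inner savepoint
        try:
            cross_feed(conn, alarm)
        except Exception:
            # Undo only what the cross-feed wrote; the alarm row survives.
            conn.execute("ROLLBACK TO SAVEPOINT cross_feed")
        finally:
            conn.execute("RELEASE SAVEPOINT cross_feed")
    except Exception:
        # The alarm insert itself failed: drop this item and move on.
        conn.execute("ROLLBACK TO SAVEPOINT alarm_item")
    finally:
        conn.execute("RELEASE SAVEPOINT alarm_item")
```

To try it, open the connection with
sqlite3.connect(":memory:", isolation_level=None), issue BEGIN, call
store_alarm with a cross_feed callable that raises, then COMMIT: the
alarms table keeps its row and live_positions stays empty.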

ingest_movement_rev.py
  * poll_live_positions — inline INSERT replaced with upsert_live_position
    (extras dict carries the sweep-only columns). Same data, time-guarded.
  * get_device_locations — inline INSERT replaced; also gains an
    ensure_device call so it can be safely fed arbitrary IMEIs.
  * poll_stale_locations() — new wrapper. Pulls get_stale_imeis() and
    hands it to get_device_locations. Scheduled every 10 minutes plus a
    startup catch-up call. Uses jimi.device.location.get, which returns
    the *last-known* fix, so devices the 60s sweep drops can be re-warmed.
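
The stale selector and its wrapper can be sketched roughly as follows.
This is an illustration under stated assumptions, not the code in this
commit: the devices table name, the explicit conn and now parameters,
the batch_size default, and the fetch_locations callable (standing in
for get_device_locations) are all hypothetical, and SQLite replaces
PostgreSQL so the example runs on its own.

```python
import sqlite3
from datetime import datetime, timedelta

# In PostgreSQL the ordering would be "ORDER BY lp.gps_time ASC NULLS FIRST";
# the portable form below gives the same result: never-seen devices first,
# then the oldest fixes.
STALE_SQL = """
SELECT d.imei
FROM devices AS d
LEFT JOIN live_positions AS lp ON lp.imei = d.imei
WHERE d.enabled_flag = 1
  AND (lp.gps_time IS NULL OR lp.gps_time < :cutoff)
ORDER BY (lp.gps_time IS NULL) DESC, lp.gps_time ASC
"""


def get_stale_imeis(conn, now, stale_minutes=30):
    """Enabled devices with no fix, or a fix older than the threshold."""
    cutoff = (now - timedelta(minutes=stale_minutes)).isoformat()
    return [row[0] for row in conn.execute(STALE_SQL, {"cutoff": cutoff})]


def poll_stale_locations(conn, fetch_locations, now, batch_size=50):
    """Hand stale IMEIs to the location fetcher in order, batch by batch."""
    imeis = get_stale_imeis(conn, now)
    for start in range(0, len(imeis), batch_size):
        fetch_locations(imeis[start:start + batch_size])
    return len(imeis)
```

Because the list is already ordered worst-first, slicing it into
batches keeps the never-seen and longest-silent devices in the first
request.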

Expected post-deploy effect (estimates, see
06_live_location/260521_timescale_location_upgrade_major.md §4)
  * ~1,100-1,600 additional live_positions upserts/day from the alarm
    cross-feed, after the time-guard rejects ~70-80% of races vs the
    fresher 60s sweep.
  * The 3 camera-fallback plates flip to "seconds-after-alarm" cadence
    (JC400P emits ~107 alarms/day per device).
  * 8-14 of the 24 OFFLINE plates expected to recover via location.get's
    last-known-fix path within the first 30 minutes.
  * Dashboard's "Offline 24h+" KPI: 24 → 10-14 within the first hour.
  * No 06_live_location code changes required — reads through
    reporting.v_live_positions transparently.
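
The first estimate above appears consistent with applying the stated
70-80% rejection rate to the 5,287 alarms/day figure from the
Background section. That derivation is an assumption; the commit does
not spell it out. A quick arithmetic check:

```python
ALARMS_PER_DAY = 5287          # /pushalarm volume quoted in Background
REJECT_PCT_LOW, REJECT_PCT_HIGH = 70, 80   # share rejected by the time guard
JC400P_ALARMS_PER_DAY = 107    # per-device alarm rate quoted above
SECONDS_PER_DAY = 24 * 60 * 60


def accepted_upserts(alarms_per_day, reject_pct):
    """Alarms that survive the time guard, using integer arithmetic."""
    return alarms_per_day * (100 - reject_pct) // 100


low = accepted_upserts(ALARMS_PER_DAY, REJECT_PCT_HIGH)
high = accepted_upserts(ALARMS_PER_DAY, REJECT_PCT_LOW)

# Average spacing between alarms for one camera, if alarms were spread
# evenly over the day (they will not be in practice).
mean_gap_seconds = SECONDS_PER_DAY // JC400P_ALARMS_PER_DAY

print(low, high, mean_gap_seconds)  # 1057 1586 807
```

The 1,057 to 1,586 range rounds to the "~1,100-1,600" quoted above.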

Tests
-----
12 webhook integration tests pass (3 new: cross-feed fires on valid fix;
skips without lat/lng; skips Zero-Island). 8 new unit tests in
test_stale_imeis.py cover the stale selector, the poll wrapper, and the
time-guard contract on upsert_live_position. Full suite: 77 passed.

Deployment
----------
No schema migration. Both webhook_receiver and ingest_movement
containers must be rebuilt — source is image-baked, not bind-mounted.
Rollback is git revert + rebuild.

Plan & monitoring SQL: 06_live_location/260521_timescale_location_upgrade_major.md
Verification playbook:  06_live_location/260521_timescale_location_upgrade_verification.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 22:33:21 +03:00
Name | Last commit | Date
.claude | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
.forgejo/workflows | feat: add db_audit health checks, runner, and scheduled Forgejo workflow | 2026-04-12 21:40:29 +03:00
administration | Add DB connection string to ops manual, add administration notes, remove stale deploy guide | 2026-04-10 22:34:56 +03:00
backup | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
db_audit | feat: add db_audit health checks, runner, and scheduled Forgejo workflow | 2026-04-12 21:40:29 +03:00
docs | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
dwh | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
grafana | fix(grafana): raise geomap maxZoom from 12 to 22 for full-resolution drill-in | 2026-04-27 18:32:14 +03:00
n8n-workflows | Add n8n workflow templates and change webhook port to 8888 | 2026-04-08 18:54:42 +03:00
tests | FIX-M21: alarm cross-feed + stale-IMEI recovery for live_positions | 2026-05-21 22:33:21 +03:00
***tracksolid_DB_manual.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
**01_BusinessAnalytics.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
**02_tracksolid_docker_commands.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
**260410_baseline_report.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
**OPERATIONS_MANUAL.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
.env | fix: point DATABASE_URL at timescale_db container (not legacy 31.97.44.246:5888) | 2026-04-18 15:43:49 +03:00
.gitignore | Add webhook receiver, consolidate shared utilities, expand telemetry coverage | 2026-04-08 16:31:17 +03:00
.python-version | chore: align .python-version to 3.12.0 (matches Docker image and pyproject.toml) | 2026-04-12 21:41:43 +03:00
02_tracksolid_full_schema_rev.sql | Add webhook receiver, consolidate shared utilities, expand telemetry coverage | 2026-04-08 16:31:17 +03:00
03_webhook_schema_migration.sql | Add webhook receiver, consolidate shared utilities, expand telemetry coverage | 2026-04-08 16:31:17 +03:00
04_bug_fix_migration.sql | Fix alarm field mapping, distance unit bug, parking params; add schema migrations | 2026-04-10 22:18:30 +03:00
05_enhancement_migration.sql | Fix alarm field mapping, distance unit bug, parking params; add schema migrations | 2026-04-10 22:18:30 +03:00
06_business_analytics_migration.sql | feat: business analytics expansion + driver CSV import | 2026-04-18 08:30:34 +03:00
07_analytics_views.sql | feat: Daily Operations dashboard + tracksolid analytics views | 2026-04-19 13:44:18 +03:00
08_analytics_config.sql | feat(analytics): Phase 0 — analytics-config migration and CSV importer rewrite | 2026-04-27 23:42:37 +03:00
09_trips_enrichment.sql | feat(trips): [FIX-M20] enrich tracksolid.trips with coords, route polyline, addresses, plate | 2026-05-01 21:30:20 +03:00
10_driver_clock_views.sql | feat(analytics): add v_driver_clock_daily/today views for tardiness monitoring | 2026-05-04 14:03:40 +03:00
10_pgbouncer_auth.sql | feat(infra): add pgbouncer sidecar to cap tracksolid_db connections | 2026-05-07 13:21:35 +03:00
55_ts_coolify_gemini_prod.code-workspace | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
260412_baseline_report.md | Add 260412 baseline report — first trip data, FIX-M16 confirmed | 2026-04-12 00:14:27 +03:00
260427_audit_output.txt | feat(analytics): Phase 0 — analytics-config migration and CSV importer rewrite | 2026-04-27 23:42:37 +03:00
260427_device_reconciliation.md | feat(analytics): Phase 0 — analytics-config migration and CSV importer rewrite | 2026-04-27 23:42:37 +03:00
260507_pgbouncer_deployment.md | revert(infra): remove pgAdmin4 sidecar and configs | 2026-05-08 00:34:10 +03:00
20260414_FS__Logistics - final_fixed.csv | fix: auto-register devices on push + allow CSV import to insert new rows | 2026-04-21 12:29:32 +03:00
20260427_FSG_Vehicles_mitieng.csv | feat(analytics): Phase 0 — analytics-config migration and CSV importer rewrite | 2026-04-27 23:42:37 +03:00
audit_device_reconciliation.py | feat(analytics): Phase 0 — analytics-config migration and CSV importer rewrite | 2026-04-27 23:42:37 +03:00
backfill_trips_enrichment.py | feat(trips): add --skip-geocode flag to backfill script | 2026-05-01 22:12:07 +03:00
CLAUDE.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
connecting_python_tracksolid.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
docker-compose.yaml | revert(infra): remove pgAdmin4 sidecar and configs | 2026-05-08 00:34:10 +03:00
Dockerfile | Fix migration failures: switch to full TimescaleDB + use psql runner | 2026-04-08 17:17:58 +03:00
documents.txt | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
DWH_Execution_Manual.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
fireside_logistics_cleaned_v2.csv | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
grafanaDeployment.md | Add Grafana NOC fleet dashboard with provisioning | 2026-04-09 00:01:52 +03:00
grafanaOperationalManual.md | Add Grafana NOC operational manual | 2026-04-09 00:12:48 +03:00
import_drivers_csv.py | feat(analytics): Phase 0 — analytics-config migration and CSV importer rewrite | 2026-04-27 23:42:37 +03:00
ingest_events_rev.py | perf+fix: SAVEPOINT-per-item pollers, batched GPS inserts, parallel detail fetch | 2026-04-18 00:33:55 +03:00
ingest_movement_rev.py | FIX-M21: alarm cross-feed + stale-IMEI recovery for live_positions | 2026-05-21 22:33:21 +03:00
new_feature.txt | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
push_webhook.md | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
pyproject.toml | ci: add ruff + mypy static analysis config and Forgejo workflow | 2026-04-12 21:32:33 +03:00
README.md | first commit | 2026-04-07 20:41:16 +03:00
run_migrations.py | feat(infra): add pgbouncer sidecar to cap tracksolid_db connections | 2026-05-07 13:21:35 +03:00
run_migrations.sh | Add idempotent migration runner script | 2026-04-10 23:31:57 +03:00
sync_driver_audit.py | perf+fix: SAVEPOINT-per-item pollers, batched GPS inserts, parallel detail fetch | 2026-04-18 00:33:55 +03:00
tracksolid_analytics_pipeline.txt | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
tracksolid_extract.py | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
tracksolid_ingestion_pipeline.txt | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
tracksolid_update_v2.py | fix(api): map new Mitieng CSV columns in tracksolid_update_v2 | 2026-04-27 23:42:20 +03:00
tracksolid_vehicle_update.py | feat(dwh): bronze pipeline migrations, runbook, and execution manual | 2026-04-25 01:07:53 +03:00
tracksolidApiDocumentation.md | Update tracksolidApiDocumentation.md with live implementation findings | 2026-04-11 07:52:28 +03:00
ts_shared_rev.py | FIX-M21: alarm cross-feed + stale-IMEI recovery for live_positions | 2026-05-21 22:33:21 +03:00
webhook_receiver_rev.py | FIX-M21: alarm cross-feed + stale-IMEI recovery for live_positions | 2026-05-21 22:33:21 +03:00