Migration 11 was applied out of order on 2026-06-10 (it had never been recorded as
applied on prod: the reporting objects were hand-created, then migrations 14/15/16
modified them). Re-running 11 recreated reporting.v_live_positions and
reporting.fn_live_positions at their BASE definitions; 15/16 were skipped as
already-applied, so the live map silently lost migration 15's cost-centre exclusion
(personal/management/mtn) and migration 16's vehicle_type/fleet_segment GeoJSON
fields — the live-map vehicle count jumped 74 -> 80.
Migration 20 re-asserts both objects' intended final definitions (verbatim union of
15 + 16). As the highest-numbered migration it always runs last, so the correct
state wins regardless of apply order. Already hot-fixed on prod by re-running 15+16;
this captures it durably. Idempotent.
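The ordering argument can be checked with a small model. This is a sketch, not run_migrations.py: apply_pending, the state dict and the string "definitions" are illustrative stand-ins, and it assumes the runner applies unrecorded migrations in ascending numeric order and skips recorded ones.

```python
def apply_pending(migrations, applied):
    """Apply every unrecorded migration in ascending order; return what ran."""
    ran = []
    for number in sorted(migrations):
        if number in applied:
            continue  # recorded as applied, so it is skipped
        migrations[number]()
        applied.add(number)
        ran.append(number)
    return ran


# Hypothetical stand-in for the definition of reporting.v_live_positions.
# Prod starts in the hand-built, fully modified state.
state = {"v_live_positions": "base + exclusion + vehicle_type"}

migrations = {
    11: lambda: state.update(v_live_positions="base"),
    15: lambda: state.update(v_live_positions=state["v_live_positions"] + " + exclusion"),
    16: lambda: state.update(v_live_positions=state["v_live_positions"] + " + vehicle_type"),
    20: lambda: state.update(v_live_positions="base + exclusion + vehicle_type"),
}

# 11 was never recorded on prod, while 15 and 16 were.
applied = {15, 16}
ran = apply_pending(migrations, applied)
print(ran, "->", state["v_live_positions"])
```

In this model 11 resets the view to its base definition, 15 and 16 are skipped, and 20 restores the final definition because it sorts last. Dropping entry 20 from the dict leaves the view at "base", which is the regression described above.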
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
These subsystems are retired and replaced by better alternatives (FleetNow /
FleetOps SPAs via dashboard_api; in-process pooling; reporting.v_ingest_health).
Remove them so the repo reflects the live stack only. Nothing running depends
on the deleted artifacts.
Deleted (dead artifacts):
- n8n-workflows/ (retired webhook exports), grafana/ (provisioning for the
removed service), dwh/ (migrations for the decommissioned external warehouse)
- runbooks: DWH_PIPELINE.md, DWH_Execution_Manual.md, grafanaDeployment.md,
grafanaOperationalManual.md
Code/config:
- run_migrations.py: drop sync_role_passwords() (its only entries were the
now-dead grafana_ro + pgbouncer syncs; the guard already made it inert)
- .env: remove the two unused GRAFANA_* vars
- ingest_movement_rev.py / db_audit / deploy_dashboard_api_staging.sh: reword
stale Grafana/grafana_ro comments
Docs: scrub n8n/Grafana/DWH from CLAUDE.md, CONNECTIONS, DATA_FLOW,
OPERATIONS_MANUAL, docker_commands, KPI_FRAMEWORK, PLATFORM_OVERVIEW,
STAGING_FLEETOPS, and deprecation-banner the two large SQL libraries
(dwh_gold was already dropped 2026-06-05).
Kept deliberately: the grafana_ro DB role (now an unused read-only login),
applied migration history, dated docs/reports/*, and docs/superpowers/* specs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collapse the backend from 7 Coolify services to 4 app services + the DB.
- Merge ingest_movement + ingest_events into a single ingest_worker:
split each poller's main() into reusable startup_catchup()/register_jobs()
and drive both from one schedule loop in new ingest_worker_rev.py
(standalone entrypoints retained for local debug).
- docker-compose.yaml: replace the two poller services with ingest_worker;
remove the pgbouncer service (dormant; transaction-mode pooling is unsafe
for the advisory-lock'd v_trips refresher) and the grafana service +
grafana-data volume (redundant with the FleetOps SPA).
- Add reporting.v_ingest_health (migration 19) + dashboard_api GET
/health/ingest as the pipeline-freshness surface that replaces Grafana's
health panels.
webhook_receiver stays isolated so a poller fault can't drop inbound pushes.
timescale_db and db_backup are unchanged.
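The merged worker shape can be sketched as below. This is a minimal model, not ingest_worker_rev.py: JobRegistry, Poller, run_worker and the 60 s / 120 s intervals are hypothetical, and only the startup_catchup()/register_jobs() hook names come from the commit.

```python
class JobRegistry:
    """Tiny interval scheduler; a stand-in for whatever scheduler the worker uses."""

    def __init__(self):
        self._jobs = []  # each entry: [interval_s, next_due, callback]

    def every(self, interval_s, callback):
        self._jobs.append([interval_s, 0.0, callback])

    def run_pending(self, now):
        for job in self._jobs:
            interval_s, next_due, callback = job
            if now >= next_due:
                callback()
                job[1] = now + interval_s


class Poller:
    """Hypothetical poller exposing the two hooks named in the commit."""

    def __init__(self, name, interval_s, log):
        self.name, self.interval_s, self.log = name, interval_s, log

    def startup_catchup(self):
        self.log.append(f"{self.name}:catchup")

    def register_jobs(self, registry):
        registry.every(self.interval_s, lambda: self.log.append(f"{self.name}:poll"))


def run_worker(pollers, ticks):
    """Catch up once per poller, register all jobs, then drive one loop."""
    registry = JobRegistry()
    for poller in pollers:
        poller.startup_catchup()
        poller.register_jobs(registry)
    for now in ticks:  # a real worker would loop on a clock and sleep
        registry.run_pending(now)


log = []
run_worker([Poller("movement", 60, log), Poller("events", 120, log)], ticks=[0, 60, 120])
print(log)
```

Keeping the two hooks on each poller is what lets the standalone entrypoints survive: a poller's own main() can call the same pair with its own loop.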
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Prod bridge now reads via dashboard_ro with the refresher on REFRESH_DATABASE_URL
(privileged). Both webhooks registered. Updated the as-built banner + §6 safety
note accordingly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Enables stage-2: the prod dashboard_api request pool connects as the READ-ONLY
dashboard_ro role (DATABASE_URL) while the v_trips refresher keeps a privileged
connection via REFRESH_DATABASE_URL (falls back to DATABASE_URL when unset, so
single-role/staging deploys are unchanged). Avoids the FIX-D02 trap (a read-only
role cannot REFRESH).
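The fallback rule reads roughly as follows. This is a sketch under assumptions: resolve_db_urls is a hypothetical helper, the URLs are placeholders, and an empty REFRESH_DATABASE_URL is treated the same as an unset one.

```python
def resolve_db_urls(env):
    """Return (request_pool_url, refresher_url) from an environment mapping."""
    request_url = env["DATABASE_URL"]
    # Fall back to DATABASE_URL when REFRESH_DATABASE_URL is unset (or empty).
    refresh_url = env.get("REFRESH_DATABASE_URL") or request_url
    return request_url, refresh_url


# Placeholder URLs for illustration only.
two_role = resolve_db_urls({
    "DATABASE_URL": "postgresql://dashboard_ro@db/example",
    "REFRESH_DATABASE_URL": "postgresql://app_role@db/example",
})
single_role = resolve_db_urls({"DATABASE_URL": "postgresql://app_role@db/example"})
print(two_role)
print(single_role)
```

With both variables set the request pool and the refresher use different roles; with only DATABASE_URL set both resolve to the same URL, so single-role deploys behave as before.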
Adds deploy_dashboard_api.sh (the prod bridge deploy, now version-controlled):
strips inherited DATABASE_URL, sets REFRESH_DATABASE_URL=<app role> +
DATABASE_URL=dashboard_ro, CORS incl. fleetops.rahamafresh.com.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- STAGING_FLEETOPS_ARCHITECTURE.md: as-built status banner + phase table marked
delivered; remaining op follow-up = 2 Forgejo webhooks (FleetOps-prod,
FleetNow-staging)
- CLAUDE.md §3: document FleetOps (Caddy SPA) + /analytics/*, the fivetitude.com
staging umbrella, the 8891 staging bridge + dashboard_ro role, /env.js per-env
injection; refresh migration range to 02-18 + new infra files in the map
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Updates STAGING_FLEETOPS_ARCHITECTURE.md to reflect the dedicated read-only
dashboard_ro role (replacing the grafana_ro reuse), the explicit v_trips matview
grant, DEFAULT PRIVILEGES, host-only password, and the two-stage plan (staging
now, live prod connection later). Notes migrations 17+18 applied; Phase 0
read-only role complete, webhook deploys still pending.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaces the grafana_ro reuse with a purpose-built least-privilege login role
that can serve the FULL dashboard_api read surface — so it backs the staging
instance now and can take over the live prod connection later (stage 2).
scripts/dashboard_ro_role.sql (run as postgres, password-free in repo):
- CREATE ROLE dashboard_ro LOGIN, read-only
- SELECT on reporting.* + tracksolid.*; explicit SELECT on the
reporting.v_trips MATERIALIZED VIEW (not covered by GRANT ON ALL TABLES)
- EXECUTE on reporting.fn_* map functions
- ALTER DEFAULT PRIVILEGES so future objects are auto-readable ("dynamic")
scripts/bootstrap_dashboard_ro.sh:
- generates the password into ~/.dashboard_ro.pw (0600), never printed
- applies the DDL via docker exec psql -U postgres -v ro_pw=...
deploy_dashboard_api_staging.sh: build DATABASE_URL from dashboard_ro +
~/.dashboard_ro.pw instead of grafana_ro.
Migrations 17/18 (already applied) are left intact. Not yet executed on host.
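The "0600, never printed" handling is done in a shell script; the same idea in Python would look roughly like this. write_password_file and the temporary path are hypothetical, and the permission check assumes a POSIX filesystem.

```python
import os
import secrets
import stat
import tempfile


def write_password_file(path, nbytes=32):
    """Generate a random password straight into a 0600 file; return nothing."""
    password = secrets.token_urlsafe(nbytes)
    # Create the file with owner-only permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(password)
    os.chmod(path, 0o600)  # also tighten a file that already existed
    # The password is deliberately neither returned nor printed.


demo_path = os.path.join(tempfile.mkdtemp(), "dashboard_ro.pw")
write_password_file(demo_path)
mode = stat.S_IMODE(os.stat(demo_path).st_mode)
print(oct(mode))
```

Opening with an explicit mode avoids the short window in which a file created with default permissions would be readable by other users.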
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
migrations/18_grant_reporting_ro.sql — grants USAGE + SELECT on the reporting.*
layer to grafana_ro, with DEFAULT PRIVILEGES so future reporting views are
auto-readable. grafana_ro is the read-only role the staging dashboard_api
connects as; it could read tracksolid.* but had never been granted reporting.*
(the prod dashboard_api uses the app role), which surfaced as "permission denied
for view v_filter_drivers / v_daily_summary" on the staging /analytics/*
endpoints. Read-only grants only: no write/REFRESH. Registered 18 in run_migrations.py.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
deploy_dashboard_api_staging.sh — standalone bridge twin of the prod
~/deploy_dashboard_api.sh for the fivetitude.com staging umbrella:
- container dashboard_api_staging on port 8891
- Traefik Host(fleetapi.fivetitude.com), router/service names suffixed -staging
- CORS = fleetnow.fivetitude.com, fleetops.fivetitude.com
- DATABASE_URL derived on-host as a READ-ONLY grafana_ro URL (never printed)
- VTRIPS_REFRESH_INTERVAL_S=0 so the read-only instance never REFRESHes
(prod owns the v_trips materialized-view refresh)
Reuses the webhook_receiver image + app network + WIP bind-mount, exactly like
prod. Not yet executed on the host — awaiting go-ahead.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds the read-only /analytics/* surface the FleetOps SPA will consume, plus
the migration that backs the fuel roll-up. All endpoints SELECT the indexed
reporting.* / tracksolid.v_* views and never write, so the forthcoming staging
instance can serve them against the prod DB as grafana_ro.
dashboard_api_rev.py:
- GET /analytics/fleet-summary     per-vehicle + per-cost-centre roll-up
- GET /analytics/utilisation       per-vehicle utilisation + daily fleet trend
- GET /analytics/driver-behaviour  per-driver speeding / harsh index
- GET /analytics/fuel              actual vs estimated litres (data-gated flags)
- GET /analytics/filters           dropdown options (alias of GET /webhook/fleet-dashboard)
- responses run through jsonable_encoder (Decimal->float, date->ISO)
- VTRIPS_REFRESH_INTERVAL_S<=0 now DISABLES the v_trips refresher, so a
read-only staging instance never attempts REFRESH (prod still owns it).
migrations/17_fleetops_fuel_view.sql:
- reporting.v_fuel_daily encapsulates the v_trips->devices join (so the
read-only role needs SELECT only on the view) and grants it to grafana_ro.
Registered 17 in run_migrations.py. Note: live migration head is 16, not 13
as CLAUDE.md implies. Endpoints are unit-compilable but untested live until
the staging bridge (Phase 1) exists.
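The refresher switch can be pictured as a small gate on the environment variable. A sketch only: refresh_interval_s is a hypothetical helper, and the 300 s default is assumed to match the refresher's documented default.

```python
def refresh_interval_s(env, default=300):
    """Return the refresh interval in seconds, or None when the refresher is off."""
    raw = env.get("VTRIPS_REFRESH_INTERVAL_S", "")
    interval = int(raw) if raw.strip() else default
    # Zero or a negative value disables the refresher entirely.
    return interval if interval > 0 else None


print(refresh_interval_s({"VTRIPS_REFRESH_INTERVAL_S": "0"}))   # staging: disabled
print(refresh_interval_s({}))                                   # unset: default
```

Returning None rather than 0 lets the caller skip starting the background loop instead of spinning on a zero-second sleep.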
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds docs/STAGING_FLEETOPS_ARCHITECTURE.md — the project doc for splitting
fleet tracking (FleetNow, frozen prod) from fleet operations (FleetOps, new
SPA) and standing up a staging environment under the fivetitude.com wildcard.
Covers: target topology + env matrix, the two dashboard_api instances
(prod 8890 / staging 8891, read-only role, refresher off), Forgejo->Coolify
webhook deploy + branch promotion model, FleetOps SPA on Caddy (Traefik still
terminates TLS), shared prod read-layer safety model, 6-phase rollout, and a
verification checklist.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- scripts/export_osm_pois.py: reproducible OSM .pbf -> GeoJSON+CSV exporter
(amenity/brand filter; pyosmium via uv, no system deps).
- docs/OSM_POI_EXPORT.md: runbook (extract -> export -> FleetNow layer) with
reference counts (1,794 fuel stations; Shell=232).
- shell_stations.geojson/.csv: the Shell export of record (232 pts, kenya-260605).
- docs/reports/260608_fleet_registry_data_quality.*: rewritten as a graded
(Red/Amber/Yellow) action plan with owners.
- .gitignore: ignore *.osm.pbf (331MB, reproducible). CLAUDE.md: index the new docs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Integrates main's PR #15 (FIX-M20 alarm cross-feed + stale-IMEI recovery,
commit c8f5907). The branch already carries that exact commit cherry-picked
as FIX-M21, so the 6 file conflicts were resolved to the branch version
(functional superset; only the FIX-M20->M21 comment tag differs).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two audience-tailored one-pagers (md + matching PDF) making the case for the
merged live+historical platform, plus a shared demo/talking-points script.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fn_live_positions now emits 'vehicle_type' (devices.vehicle_models) and
'fleet_segment' (reporting.fn_fleet_segment) in each GeoJSON feature so FleetNow
can give specialist vehicles (Crane/Motorbike/Pick-Up) their own marker icons.
Additive only — no signature change, STABLE function read immediately by
dashboard_api (no redeploy). Function body reproduced verbatim from prod via
pg_get_functiondef plus the two new properties.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Hide personal + management + mtn (Uganda/Kampala) vehicles from the live tracking
map (FleetNow + liveposition SPA). Adds an ops-editable config table
reporting.map_excluded_cost_centres and filters reporting.v_live_positions to drop
any plate whose device(s) carry an excluded cost centre (robust to the tracker/cam
cost_centre inconsistency).
Scope is live-map only; reporting.v_trips (trip history) is intentionally untouched.
The base view feeds reporting.fn_live_positions, so the change propagates to every
live consumer with no dashboard_api redeploy or frontend change. Verified live:
80 -> 74 vehicles, all 6 targets gone (KDU 613A, KDW 781E, UMA 011EK/382EK/418EK/826AB).
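The plate-level rule ("drop any plate whose device(s) carry an excluded cost centre") can be modelled in a few lines. This is an illustration of the filter logic, not the view's SQL; visible_plates, the sample plates and the "field_ops" cost centre are made up.

```python
from collections import defaultdict


def visible_plates(devices, excluded_cost_centres):
    """devices: iterable of (plate, cost_centre) pairs, one pair per device."""
    centres_by_plate = defaultdict(set)
    for plate, cost_centre in devices:
        centres_by_plate[plate].add(cost_centre)
    # A plate is hidden if ANY of its devices carries an excluded cost centre.
    return sorted(
        plate
        for plate, centres in centres_by_plate.items()
        if not (centres & excluded_cost_centres)
    )


devices = [
    ("KAA 001A", "field_ops"),   # tracker
    ("KAA 001A", "field_ops"),   # dashcam
    ("KAA 002B", "personal"),    # tracker tagged with an excluded centre
    ("KAA 002B", None),          # dashcam with a missing cost centre
    ("KAA 003C", "management"),
]
excluded = {"personal", "management", "mtn"}
print(visible_plates(devices, excluded))
```

Evaluating the exclusion per plate rather than per device is what makes it robust to a tracker and its dashcam disagreeing: KAA 002B stays hidden even though one of its two devices has no cost centre.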
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add reporting.fn_fleet_segment() and reporting.v_vehicles, splitting the fleet
into ticket-closing field_service vs specialist plant (crane/pick-up/motorbike)
that does not close immediate customer tickets.
The segment is DERIVED from tracksolid.devices.vehicle_models — itself an
authoritative Tracksolid API field (sync_devices maps jimi.user.device.list ->
vehicleModels) — so it stays API-current with no re-seeding; the manual
vehicle_category column is intentionally unused. v_vehicles collapses the
tracker+dashcam device pairs to one row per vehicle by reusing
reporting.normalize_plate() and the same primary-device precedence as
reporting.v_trips / v_live_positions (auto-merges 'KDS 453Y'/'KDS 453 Y',
resolves within-plate model conflicts via the primary tracker).
Verified live: 80 vehicles (61 field_service / 16 specialist / 3 unassigned),
grafana_ro granted. Includes the supporting data-quality report.
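A rough Python picture of the two derivations follows. Both functions are guesses at the behaviour described, not the SQL in reporting.normalize_plate() or reporting.fn_fleet_segment(): the whitespace rule, the model spellings and the handling of empty models are assumptions.

```python
SPECIALIST_MODELS = {"crane", "pick-up", "motorbike"}  # assumed spellings


def normalize_plate(plate):
    """Uppercase and drop all whitespace so spacing variants collapse."""
    return "".join(plate.upper().split())


def fleet_segment(vehicle_model):
    """Classify a vehicle from its model string (assumed rule)."""
    if not vehicle_model or not vehicle_model.strip():
        return "unassigned"
    if vehicle_model.strip().lower() in SPECIALIST_MODELS:
        return "specialist"
    return "field_service"


print(normalize_plate("KDS 453Y") == normalize_plate("KDS 453 Y"))
print(fleet_segment("Crane"), fleet_segment("Van"), fleet_segment(None))
```

Because the segment is computed from the model string on every read, a change synced from the API shows up without any re-seeding step.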
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- New '3. Map dashboards & read-API' subsection: the three SPAs (liveposition,
fleetintelligence, fleetnow), how dashboard_api is deployed (standalone bridge
container, not Coolify), and that FleetNow lives in its own repo.
- FIX-D03: fleetnow CORS origin + the deploy-script strip/guard fixes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The merged FleetNow dashboard (separate repo, Coolify) reads this read-API, so
its origin must be in DASHBOARD_CORS_ORIGINS. Added to the code default; live
config is set via the env in ~/deploy_dashboard_api.sh on the host.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the dormant ops (workshop / tickets / dispatch / SLA / odometer)
and dwh_gold (nightly ETL aggregates) schemas plus their dependents —
features never implemented, no live writer or scheduled refresh.
- Prod DB (already applied): DROP SCHEMA ops/dwh_gold CASCADE, plus
tracksolid.dispatch_log, v_sla_inflight, v_utilisation_daily.
- migrations/12_drop_ops.sql + 13_drop_dwh_gold.sql (forward, all
IF EXISTS) registered in run_migrations.py for rebuild durability.
- grafana: removed 8 now-broken panels (In-flight SLA, Idle Cost,
Utilisation Heatmap, Row 7 Field-Service SLAs) from daily_operations;
panel count 21 -> 13.
- docs: scrubbed CLAUDE.md, PLATFORM_OVERVIEW.html (-19KB), DATA_FLOW.md;
pre-drop seed snapshot in docs/reports/260605_ops_purge_backup.md.
The separate tracksolid_dwh server (31.97.44.246:5888) is unrelated
and untouched.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Update PLATFORM_OVERVIEW.html (§2 migration, §4 read-API, §5 refresh_log,
§7 ops notes) and CLAUDE.md §7 fix history (FIX-D01, FIX-D02) to reflect
the two 2026-06-05 fixes that closed out the n8n→fleetapi cutover:
form-urlencoded POST body parsing, and moving the reporting.v_trips
matview refresh from the retired n8n job into dashboard_api.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Fleet Trips dashboard reads reporting.v_trips (a materialized view).
Its refresh was a scheduled n8n workflow; when n8n was retired the matview
froze (last refresh 2026-06-01) so the dashboard showed no recent trips
even though tracksolid.trips kept ingesting live.
Move the refresh into the owned stack: a background loop in dashboard_api
runs REFRESH MATERIALIZED VIEW CONCURRENTLY reporting.v_trips every
VTRIPS_REFRESH_INTERVAL_S (default 300s). Safe across uvicorn --workers
via a pg advisory lock (one worker refreshes per tick); runs in a thread
so the ~9s refresh never blocks the event loop; logs to
reporting.refresh_log (source='dashboard_api') for continuity. Uses a
dedicated autocommit connection because REFRESH ... CONCURRENTLY cannot
run inside a transaction block.
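One refresh tick could look like the sketch below. It is not the dashboard_api code: refresh_v_trips_once and the lock key are hypothetical, the connection is any DB-API connection assumed to be in autocommit mode, and the fake classes exist only so the example runs without a database. pg_try_advisory_lock and pg_advisory_unlock are the standard PostgreSQL advisory-lock functions.

```python
REFRESH_LOCK_KEY = 424242  # hypothetical key; the real one is not shown here
REFRESH_SQL = "REFRESH MATERIALIZED VIEW CONCURRENTLY reporting.v_trips"


def refresh_v_trips_once(conn):
    """Refresh the matview if this worker wins the advisory lock.

    Returns True when this call performed the refresh.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT pg_try_advisory_lock(%s)", (REFRESH_LOCK_KEY,))
        if not cur.fetchone()[0]:
            return False  # another worker holds the lock this tick
        try:
            cur.execute(REFRESH_SQL)
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (REFRESH_LOCK_KEY,))
        return True
    finally:
        cur.close()


class FakeCursor:
    """Records statements; stands in for a real database cursor."""

    def __init__(self, got_lock, statements):
        self._got_lock, self._statements = got_lock, statements

    def execute(self, sql, params=None):
        self._statements.append(sql)

    def fetchone(self):
        return (self._got_lock,)

    def close(self):
        pass


class FakeConnection:
    def __init__(self, got_lock):
        self.got_lock, self.statements = got_lock, []

    def cursor(self):
        return FakeCursor(self.got_lock, self.statements)


print(refresh_v_trips_once(FakeConnection(True)))
print(refresh_v_trips_once(FakeConnection(False)))
```

In an asyncio service a blocking call like this would typically be handed to a worker thread (for example with asyncio.to_thread) so the refresh does not stall the event loop.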
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Fleet Trips SPA posts application/x-www-form-urlencoded, but the
POST /webhook/fleet-dashboard handler read the body with request.json().
That threw on every request, the except clause swallowed the error and set body={}, and all
filters (vehicle_numbers, cost_centre, assigned_city) plus period/dates
were dropped — so every query returned the full unfiltered fleet (1,266
trips) regardless of the dropdowns. The map/KPIs/trips never changed,
which read as "the dropdowns don't work."
Parse by Content-Type: urllib.parse.parse_qs for form bodies (no new
dependency — avoids python-multipart), JSON still accepted defensively
for n8n-compat callers.
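Content-Type dispatch with the standard library can be sketched as follows. parse_filters is a hypothetical name and the single-value unwrapping is an assumption about how the handler wants its fields; urllib.parse.parse_qs and json.loads are the real stdlib calls.

```python
import json
from urllib.parse import parse_qs


def parse_filters(content_type, raw_body):
    """Decode a request body into a dict based on its Content-Type header."""
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        fields = parse_qs(raw_body.decode("utf-8"), keep_blank_values=True)
        # parse_qs returns lists; unwrap single values, keep repeats as lists.
        return {key: vals[0] if len(vals) == 1 else vals for key, vals in fields.items()}
    try:
        parsed = json.loads(raw_body or b"{}")
    except ValueError:  # covers invalid JSON and undecodable bytes
        return {}
    return parsed if isinstance(parsed, dict) else {}


form = b"cost_centre=osp&period=7d&vehicle_numbers=A&vehicle_numbers=B"
print(parse_filters("application/x-www-form-urlencoded; charset=UTF-8", form))
print(parse_filters("application/json", form))
```

The second call reproduces the original failure mode: treating a form body as JSON yields an empty dict, so every filter silently disappears.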
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds section 6 (Grafana dashboards) to PLATFORM_OVERVIEW.html, generated from
the provisioned dashboard JSON: every panel in the NOC Fleet (9 panels) and
Daily Operations (23 panels) dashboards with type and source view/table.
Renumbers Operational notes to section 7. Links the doc from the CLAUDE.md
codebase map.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Self-contained HTML reference generated from the live DB, documenting the
platform after the maps moved off n8n onto dashboard_api (fleetapi). Covers
architecture/data flow, the n8n→fleetapi migration, deployment topology,
the read-API endpoint reference, and the full database schema — every table
(with columns + row estimates), view, and function across tracksolid /
reporting / ops / dwh_gold / public — plus operational notes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The reporting schema (fn_live_positions/fn_vehicle_track/fn_trips_for_map,
the v_trips materialized view + indexes, filter/summary views, refresh_log)
backs the dashboard_api map endpoints but existed only on the prod DB, in no
migration — a rebuild would have lost it. Captured the live DDL into
migrations/11_reporting_schema.sql (idempotent: IF NOT EXISTS / CREATE OR
REPLACE, search_path set for unqualified base-table refs, guarded grants) and
registered it in run_migrations.py. Verified it applies cleanly against prod
inside a rolled-back transaction.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Live Positions SPA calls GET /webhook/live-positions/track, but the
read-API only exposed /webhook/vehicle-track. Clicking a vehicle to view its
1-hour trail therefore 404'd even after repointing N8N_BASE. Register the SPA's
actual path as a route alias to the same handler (vehicle-track kept as alias),
so the only frontend change remains the base URL. Docstring updated to match.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
n8n was a thin HTTP->SQL proxy for the Live Position and Fleet Trips maps and
proved fragile (credential reloads, :latest drift, shared connection limits).
This service calls the same proven reporting.* functions directly, reusing the
existing psycopg2 pool / Docker image / Coolify deploy.
Endpoints mirror the n8n webhook paths so the only frontend change is N8N_BASE:
GET /webhook/live-positions -> {summary, geojson} (fn_live_positions)
GET /webhook/vehicle-track -> GeoJSON Feature (fn_vehicle_track)
GET /webhook/fleet-dashboard -> filter options
POST /webhook/fleet-dashboard -> trips payload (fn_trips_for_map)
Response shapes replicate the n8n "Build response JSON" nodes exactly; empty
filters/sentinels ('', null, undefined) normalize to SQL wildcards. CORS limited
to the dashboard origins. Added dashboard_api service to docker-compose (port
8890, Coolify-routed). SQL contracts validated against prod.
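The sentinel handling can be sketched like this. normalize_filter is a hypothetical helper, and using '%' as the wildcard is an assumption: the actual value the reporting.* functions expect is not stated here.

```python
SENTINELS = {"", "null", "undefined"}


def normalize_filter(value, wildcard="%"):
    """Map empty or sentinel filter values to a wildcard; pass real values through."""
    if value is None:
        return wildcard
    text = str(value).strip()
    return wildcard if text.lower() in SENTINELS else text


for raw in ("", None, "undefined", "null", "Nairobi"):
    print(repr(raw), "->", normalize_filter(raw))
```

Normalising at the API boundary keeps the SQL functions free of frontend quirks such as a JavaScript "undefined" arriving as a literal string.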
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cherry-pick of c8f5907 (originally FIX-M20 on main) onto
quality-program-2026-04-12 — renamed to FIX-M21 here to avoid clashing
with this branch's existing [FIX-M20] (trip enrichment, commit 144dede).
Behaviour and code are unchanged from the main-branch original; the
annotation tag is the only difference.
Background
----------
A field audit of liveposition.rahamafresh.com on 2026-05-21 surfaced two
freshness gaps that share a single root cause: tracksolid.live_positions
was being written by only one path (the 60s polled sweep), and that path
silently omits devices that don't have a "current" fix in Jimi's
location.list response. Effect on the dashboard:
* 18 vehicles show OFFLINE for days-to-months — last fix is whatever
the sweep wrote before Jimi dropped them.
* 3 vehicles (KDK 780K, KCQ 618K, KCZ 476E) depend on dashcam fallback
because their dedicated tracker has been silent; the camera's lat/lng
arrives via /pushalarm webhooks (5,287/day, 100% lat/lng fill) but
we discard it after writing to tracksolid.alarms.
Verified upstream subscription state: only /pushalarm is registered with
Jimi; the n8n forwarders for /pushgps, /pushtripreport, /pushobd are
inactive. This change uses only data that already arrives.
What's in this commit
---------------------
ts_shared_rev.py
* upsert_live_position(cur, imei, lat, lng, gps_time, ..., extras=None)
— single time-guarded upsert all three writers will share. Guards on
is_valid_fix() (filters Zero-Island and out-of-range) and
EXCLUDED.gps_time > stored.gps_time so late-arriving alarms or
webhook retries can't rewind a fresher marker. COALESCE on optional
columns so sparse callers don't blank dense ones' values.
* get_stale_imeis(stale_minutes=30) — SELECT enabled_flag=1 devices
whose live_positions.gps_time is NULL or older than the threshold,
ordered NULLS FIRST so worst-offenders are in batch #1.
* ensure_device(cur, imei, device_name=None) — relocated from
webhook_receiver_rev so every live_positions writer can satisfy the
FK without re-defining the helper. The original underscore-prefixed
name in webhook_receiver_rev becomes a backwards-compat alias.
webhook_receiver_rev.py
* /pushalarm — after the alarm row insert, call upsert_live_position
with the alarm's lat/lng and alarmTime. Sits inside the existing
per-item SAVEPOINT, so a cross-feed failure rolls back only that
one alarm's cross-feed, not the alarm row.
ingest_movement_rev.py
* poll_live_positions — inline INSERT replaced with upsert_live_position
(extras dict carries the sweep-only columns). Same data, time-guarded.
* get_device_locations — inline INSERT replaced; also gains an
ensure_device call so it can be safely fed arbitrary IMEIs.
* poll_stale_locations() — new wrapper. Pulls get_stale_imeis() and
hands it to get_device_locations. Scheduled every 10 minutes plus a
startup catch-up call. Uses jimi.device.location.get which returns
*last-known* fix, so devices the 60s sweep drops can be re-warmed.
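The time-guard contract can be modelled in memory. This is a sketch of the semantics described above, not the SQL in ts_shared_rev.py: the dict-backed store, the keyword-argument form of the optional columns and the exact range checks in is_valid_fix are assumptions.

```python
from datetime import datetime, timezone


def is_valid_fix(lat, lng):
    """Reject missing, Zero-Island (0, 0) and out-of-range coordinates."""
    if lat is None or lng is None:
        return False
    if lat == 0 and lng == 0:
        return False
    return -90 <= lat <= 90 and -180 <= lng <= 180


def upsert_live_position(store, imei, lat, lng, gps_time, **optional):
    """In-memory model of the guarded upsert; returns True when a row is written."""
    if not is_valid_fix(lat, lng):
        return False
    current = store.get(imei)
    # Time guard: only a strictly newer fix may replace the stored one.
    if current is not None and gps_time <= current["gps_time"]:
        return False
    row = dict(current or {})
    row.update(lat=lat, lng=lng, gps_time=gps_time)
    for column, value in optional.items():
        if value is not None:  # COALESCE-style: None never blanks a stored value
            row[column] = value
    store[imei] = row
    return True


def at(minute):
    return datetime(2026, 5, 21, 12, minute, tzinfo=timezone.utc)


store = {}
results = [
    upsert_live_position(store, "imei-1", -1.29, 36.82, at(5), speed=40),    # sweep
    upsert_live_position(store, "imei-1", -1.30, 36.83, at(4)),              # late alarm
    upsert_live_position(store, "imei-1", -1.31, 36.84, at(6), speed=None),  # fresh alarm
    upsert_live_position(store, "imei-1", 0.0, 0.0, at(7)),                  # Zero-Island
]
print(results, store["imei-1"]["speed"])
```

The late alarm is rejected, the fresh alarm moves the marker, and the sweep's speed value survives because the sparse alarm write carries no speed of its own.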
Expected post-deploy effect (estimates, see
06_live_location/260521_timescale_location_upgrade_major.md §4)
* ~1,100-1,600 additional live_positions upserts/day from the alarm
cross-feed, after the time-guard rejects ~70-80% of races vs the
fresher 60s sweep.
* The 3 camera-fallback plates flip to "seconds-after-alarm" cadence
(JC400P emits ~107 alarms/day per device).
* 8-14 of the 24 OFFLINE plates expected to recover via location.get's
last-known-fix path within the first 30 minutes.
* Dashboard's "Offline 24h+" KPI: 24 → 10-14 within the first hour.
* No 06_live_location code changes required — reads through
reporting.v_live_positions transparently.
Tests
-----
12 webhook integration tests pass (3 new: cross-feed fires on valid fix;
skips without lat/lng; skips Zero-Island). 8 new unit tests in
test_stale_imeis.py cover the stale selector, the poll wrapper, and the
time-guard contract on upsert_live_position. Full suite: 77 passed.
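The stale selector those tests cover can be pictured as a pure function. This mirrors the described query in Python rather than reproducing it: the dict inputs are stand-ins for the two tables, and sorting the non-NULL rows oldest-first is an assumption beyond the stated NULLS FIRST ordering.

```python
from datetime import datetime, timedelta, timezone


def get_stale_imeis(devices, live_positions, now, stale_minutes=30):
    """devices: {imei: enabled_flag}; live_positions: {imei: gps_time or None}."""
    cutoff = now - timedelta(minutes=stale_minutes)
    stale = []
    for imei, enabled_flag in devices.items():
        if enabled_flag != 1:
            continue
        gps_time = live_positions.get(imei)
        if gps_time is None or gps_time < cutoff:
            stale.append((imei, gps_time))
    # NULLS FIRST, then oldest fix first (the second part is an assumption).
    stale.sort(key=lambda item: (item[1] is not None, item[1] or now))
    return [imei for imei, _ in stale]


now = datetime(2026, 5, 21, 12, 0, tzinfo=timezone.utc)
devices = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 1}
live_positions = {
    "a": now - timedelta(minutes=5),    # fresh
    "b": now - timedelta(days=2),       # stale
    "d": now - timedelta(days=10),      # stale but disabled
    "e": now - timedelta(minutes=45),   # stale
}                                       # "c" has no row at all
print(get_stale_imeis(devices, live_positions, now))
```

Devices with no fix at all come first, so they land in the first batch handed to the location lookup.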
Deployment
----------
No schema migration. Both webhook_receiver and ingest_movement
containers must be rebuilt — source is image-baked, not bind-mounted.
Rollback is git revert + rebuild.
Plan & monitoring SQL: 06_live_location/260521_timescale_location_upgrade_major.md
Verification playbook: 06_live_location/260521_timescale_location_upgrade_verification.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Cross-feed alarm lat/lng into live_positions; schedule stale-IMEI rescue every 10 min.
See 06_live_location/260521_timescale_location_upgrade_major.md for the plan and 260521_timescale_location_upgrade_verification.md for post-deploy checks.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reverts the Phase 2 pgAdmin web sidecar from bc020cb. pgbouncer (Phase 1)
stays in place. On the instance the pgadmin container has been stopped
and removed and the pgadmin-data volume dropped; Coolify subdomain and
PGADMIN_DEFAULT_* env vars to be removed in the UI separately.
Files:
- docker-compose.yaml: drop pgadmin service block + pgadmin-data volume
- pgadmin/servers.json: delete (directory removed)
- 260507_pgbouncer_deployment.md: strip Phase 2, runbook is pgbouncer-only
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 2 of the pgbouncer + pgAdmin rollout. pgAdmin4 runs as a Coolify-
managed container on the same Docker network as pgbouncer, with a
pre-registered server entry so the tracksolid_db (via pgbouncer) tree
appears immediately on first login.
Net effect: admin tooling moves on-VM (low latency, persistent workspace
in pgadmin-data volume) and connects through pgbouncer:6432 in transaction
mode, so opening many Query Tool tabs no longer exhausts max_connections.
The desktop pgAdmin can be retired once this is verified live, after
which host port 5433 can also be closed.
Requires PGADMIN_DEFAULT_EMAIL and PGADMIN_DEFAULT_PASSWORD in the
Coolify env, plus a subdomain mapping to this service on port 80.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The pinned tag failed to pull on Coolify deploy. Switching to the
untagged edoburu/pgbouncer (rolling latest) so the sidecar can come up.
Will revisit pinning to a known-good tag once verified live.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 1 of the pgbouncer + pgAdmin rollout (runbook:
260507_pgbouncer_deployment.md). pgAdmin4 on the maintainer's laptop has
been exhausting tracksolid_db's max_connections, cascading to pgcli and
operations. Adds an internal-only pgbouncer service in transaction mode
with a small backend pool (default 15) so admin-tool sprawl can no
longer starve the ingest pipeline.
No client cutover this round - ingest, Grafana, webhook, and backup all
keep talking to timescale_db:5432 directly. SCRAM passthrough is wired
via a new pgbouncer role + public.user_lookup() function (migration 10).
The role is created with a placeholder password; sync_role_passwords()
in run_migrations.py replaces it from PGBOUNCER_AUTH_PASSWORD on every
container startup, mirroring the existing grafana_ro convention.
Requires PGBOUNCER_AUTH_PASSWORD to be set in .env before deploy.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two read-only views in the tracksolid schema feeding n8n's working-hours
checks: per-IMEI per-Nairobi-day reporting/closing times, start/end
locations + Nominatim addresses, and trip-count/km/drive-hours context.
No policy embedded; cost-centre filtering and tardiness thresholds live
in n8n.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The historical trips table is much larger than the spec assumed (7,634
rows on prod, not the 8 the CLAUDE.md snapshot suggested). Reverse-geocoding
all of them via Nominatim's 1 req/sec TOS throttle would take ~4¼ hours
end-to-end.
--skip-geocode bypasses the Nominatim calls entirely. Geometry, plate, and
idle backfills run in minutes; addresses stay NULL on historical rows and
will only be populated for future trips by poll_trips.
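The ~4¼-hour figure follows from simple arithmetic, assuming two lookups per trip (one start address, one end address) at one request per second:

```python
TRIPS = 7_634            # historical rows on prod
LOOKUPS_PER_TRIP = 2     # assumption: start address + end address
SECONDS_PER_LOOKUP = 1   # Nominatim's 1 req/sec limit

total_seconds = TRIPS * LOOKUPS_PER_TRIP * SECONDS_PER_LOOKUP
hours = total_seconds / 3600
print(f"{total_seconds} s = {hours:.2f} h")  # 15268 s = 4.24 h
```

This ignores cache hits on repeated locations and any network latency above the throttle, so it is an order-of-magnitude check rather than a measured runtime.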
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Polling jimi.device.track.mileage does not return start/end coordinates,
fuel, idle, or trip sequence — leaving most trip columns NULL. This change
closes those gaps using data we already have in position_history plus a
best-effort Nominatim lookup.
Migration 09_trips_enrichment.sql adds:
• route_geom (LineString), start_address, end_address, vehicle_plate,
waypoints_count on tracksolid.trips
• GIST indexes on the three geometry columns
• view tracksolid.v_trips_enriched exposing daily_seq + trip_date_eat
(replaces reliance on the device-supplied trip_seq, which is only
populated when /pushtripreport fires)
ingest_movement_rev.py::poll_trips now:
• extracts idleSecond from the poll response (was previously dropped)
• per-trip: SELECTs start fix, end fix, ST_MakeLine route, and waypoint
count from position_history within (start_time, end_time)
• reverse-geocodes start/end via the new ts_shared_rev.reverse_geocode
helper (Nominatim, LRU-cached at ~11m precision, 1 req/sec, never raises)
• caches vehicle_plate from a per-cycle plates dict
• ON CONFLICT preserves webhook-supplied data when /pushtripreport later
delivers native coords/fuel/trip_seq
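The cached, throttled, never-raising lookup can be sketched as below. It is not ts_shared_rev.reverse_geocode: make_reverse_geocoder and the injected fetch function are hypothetical, and rounding to 4 decimal places is an assumption chosen because 0.0001 degrees is roughly 11 m.

```python
import time
from functools import lru_cache


def make_reverse_geocoder(fetch, min_interval_s=1.0, clock=time.monotonic, sleep=time.sleep):
    """Wrap fetch(lat, lng) -> address with rounding, caching and a rate limit."""
    last_call = [None]

    @lru_cache(maxsize=4096)
    def _cached(lat4, lng4):
        # Throttle only real upstream calls; cache hits skip this entirely.
        if last_call[0] is not None:
            wait = min_interval_s - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        last_call[0] = clock()
        return fetch(lat4, lng4)

    def reverse_geocode(lat, lng):
        try:
            return _cached(round(lat, 4), round(lng, 4))
        except Exception:
            return None  # best effort: a failed lookup never breaks the poll

    return reverse_geocode


calls = []


def fake_fetch(lat, lng):
    """Stand-in for the HTTP request to the geocoding service."""
    calls.append((lat, lng))
    return "Example Road"


def failing_fetch(lat, lng):
    raise RuntimeError("service unavailable")


geocode = make_reverse_geocoder(fake_fetch, sleep=lambda seconds: None)
first = geocode(-1.28641, 36.81721)
second = geocode(-1.28643, 36.81724)  # same ~11 m cell, served from cache
broken = make_reverse_geocoder(failing_fetch)(-1.0, 36.0)
print(first, second, len(calls), broken)
```

Rounding before the cache lookup is what turns nearby fixes, such as repeated stops at one depot, into a single upstream request.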
backfill_trips_enrichment.py is a one-shot script (dry-run by default,
--apply to commit, --imei / --since flags) that runs the same enrichment
against historical NULL rows and COALESCEs only — never overwrites.
DWH bronze mirrors and Grafana panels intentionally not touched (frozen
on this branch until the schema work lands).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 0 of the three-stakeholder analytics redesign:
- 08_analytics_config.sql: ops.cost_rates + ops.kpi_targets with seed
fuel rates (KES 195/L NBO+MBA, UGX 5200/L KLA) and 6 seed KPI
targets (utilisation_pct, idle_pct global+osp-patrol,
fuel_kes_per_100km, mttr_hours, alarms_per_100km). Granted SELECT to
grafana_ro. Wired into run_migrations.py MIGRATIONS.
- import_drivers_csv.py: full rewrite for the new Mitieng CSV
(20260427_FSG_Vehicles_mitieng.csv). Snake_case columns, drops
_infer_city() plate-prefix logic in favour of reading assigned_city
directly. Adds cost_centre, assigned_route, vehicle_category,
vehicle_brand, fuel_100km, depot_address. Treats the literal "NULL"
string as missing. Reuses clean(), clean_num(), clean_ts(),
get_conn(), get_logger() from ts_shared_rev. Special-cases numeric
and timestamptz columns in the UPDATE clause.
- audit_device_reconciliation.py: read-only audit comparing the CSV
against tracksolid.devices. Reports per-account row counts, IMEIs
on one side only, and devices on both sides whose metadata is still
NULL.
- 260427_device_reconciliation.md + 260427_audit_output.txt: Phase 0.2
reconciliation record. First run: DB has 172 devices, CSV has 162,
delta +10 (10 IMEIs in DB-only, mostly fireside-account auto-syncs).
Importer run with --only-null --apply filled 154 rows; coverage now
assigned_city 152/172, cost_centre 150/172.
Applied to stage on 2026-04-27 23:35 UTC.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Switches column references from the old title-case logistics CSV
(IMEI, Device Name, License Plate No., Telephone, Fuel/100km, ...) to
the snake_case Mitieng export shipped on 2026-04-27 (imei, device_name,
vehicle_number, driver_phone, fuel_100km, ...). Without this, the bulk
device-update API tool fails with KeyError: 'IMEI' on the new CSV.
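The failure and the header change can be reproduced with csv.DictReader. The commit itself simply switches the column references; the tolerant reader below is only an illustration, and pairing the old and new names in listed order is an assumption. The sample values are made up.

```python
import csv
import io

# Assumed pairing of legacy title-case headers with the snake_case export.
LEGACY_TO_MITIENG = {
    "IMEI": "imei",
    "Device Name": "device_name",
    "License Plate No.": "vehicle_number",
    "Telephone": "driver_phone",
    "Fuel/100km": "fuel_100km",
}


def read_devices(csv_text):
    """Read rows keyed by snake_case names, accepting either header style."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {LEGACY_TO_MITIENG.get(key, key): value for key, value in row.items()}
        for row in reader
    ]


new_csv = "imei,device_name,vehicle_number\n123,Van 1,KAA 001A\n"
old_csv = "IMEI,Device Name,License Plate No.\n123,Van 1,KAA 001A\n"

raw_row = next(csv.DictReader(io.StringIO(new_csv)))
try:
    raw_row["IMEI"]
    error = None
except KeyError as exc:
    error = str(exc)

print(error)
print(read_devices(new_csv) == read_devices(old_csv))
```

Reading the new export with the old key raises KeyError: 'IMEI', which is the failure described above.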
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Carto basemap tiles render up to ~19-20; OpenLayers caps at 28. A maxZoom of 22
leaves no practical ceiling for street-level inspection while keeping the
EAC-bounded minZoom in place.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Recentred geomap view from lat -2.0/lon 35.5/zoom 5 to lat -3.0/lon 34.5/
zoom 5.5 (Lake Victoria area, the geographic intersection of the three
countries) and raised minZoom to 5.5 so the dashboard can't be panned out
to show neighbouring countries.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Grafana's lengthkm and h units auto-scale with SI prefixes — fleet km
totals rendered as "Mm" (megametres) and drive-hour totals as days/weeks,
which read as "millions" and "weeks" on the Daily Ops dashboard. Switched
the affected panels (Fleet km today, Drive/Idle hours today, the per-vehicle
roll-up table, the driver leaderboard, and the 7-day distance trend) to
unit "none" with decimals: 1 so values stay in plain km and hours, with units
carried by panel titles and column displayNames.
Geomap view recentred to lat -2.0, lon 35.5, zoom 5 with minZoom 5 /
maxZoom 12 so the Active Vehicles map opens on the East African Community
region and cannot zoom out past it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>