Phase 1 of the pgbouncer + pgAdmin rollout (runbook:
260507_pgbouncer_deployment.md). pgAdmin4 on the maintainer's laptop has
been exhausting tracksolid_db's max_connections, cascading to pgcli and
operations. Adds an internal-only pgbouncer service in transaction mode
with a small backend pool (default 15) so admin-tool sprawl can no
longer starve the ingest pipeline.
No client cutover this round - ingest, Grafana, webhook, and backup all
keep talking to timescale_db:5432 directly. SCRAM passthrough is wired
via a new pgbouncer role + public.user_lookup() function (migration 10).
The role is created with a placeholder password; sync_role_passwords()
in run_migrations.py replaces it from PGBOUNCER_AUTH_PASSWORD on every
container startup, mirroring the existing grafana_ro convention.
Requires PGBOUNCER_AUTH_PASSWORD to be set in .env before deploy.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two read-only views in the tracksolid schema feeding n8n's working-hours
checks: per-IMEI per-Nairobi-day reporting/closing times, start/end
locations + Nominatim addresses, and trip-count/km/drive-hours context.
No policy embedded; cost-centre filtering and tardiness thresholds live
in n8n.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The historical trips table is much larger than the spec assumed (7,634
rows on prod, not the 8 the CLAUDE.md snapshot suggested). Reverse-geocoding
all of them via Nominatim's 1 req/sec TOS throttle would take ~4¼ hours
end-to-end.
--skip-geocode bypasses the Nominatim calls entirely. Geometry, plate, and
idle backfills run in minutes; addresses stay NULL on historical rows and
will only be populated for future trips by poll_trips.
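The estimate above can be reproduced as a rough calculation. This is a sketch that assumes two lookups per trip (start and end) and no cache hits; both figures are illustrative assumptions, not measurements.

```python
# Back-of-envelope geocoding time. lookups_per_trip = 2 (start + end) and
# zero cache hits are assumptions made for illustration only.
trips = 7_634
lookups_per_trip = 2
seconds_per_request = 1.0  # the 1 req/sec throttle mentioned above

total_hours = trips * lookups_per_trip * seconds_per_request / 3600
print(f"{total_hours:.2f} h")
```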
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Polling jimi.device.track.mileage does not return start/end coordinates,
fuel, idle, or trip sequence — leaving most trip columns NULL. This change
closes those gaps using data we already have in position_history plus a
best-effort Nominatim lookup.
Migration 09_trips_enrichment.sql adds:
• route_geom (LineString), start_address, end_address, vehicle_plate,
waypoints_count on tracksolid.trips
• GIST indexes on the three geometry columns
• view tracksolid.v_trips_enriched exposing daily_seq + trip_date_eat
(replaces reliance on the device-supplied trip_seq, which is only
populated when /pushtripreport fires)
ingest_movement_rev.py::poll_trips now:
• extracts idleSecond from the poll response (was previously dropped)
• per-trip: SELECTs start fix, end fix, ST_MakeLine route, and waypoint
count from position_history within (start_time, end_time)
• reverse-geocodes start/end via the new ts_shared_rev.reverse_geocode
helper (Nominatim, LRU-cached at ~11m precision, 1 req/sec, never raises)
• caches vehicle_plate from a per-cycle plates dict
• ON CONFLICT preserves webhook-supplied data when /pushtripreport later
delivers native coords/fuel/trip_seq
backfill_trips_enrichment.py is a one-shot script (dry-run by default,
--apply to commit, --imei / --since flags) that runs the same enrichment
against historical NULL rows and COALESCEs only — never overwrites.
DWH bronze mirrors and Grafana panels intentionally not touched (frozen
on this branch until the schema work lands).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 0 of the three-stakeholder analytics redesign:
- 08_analytics_config.sql: ops.cost_rates + ops.kpi_targets with seed
fuel rates (KES 195/L NBO+MBA, UGX 5200/L KLA) and 6 seed KPI
targets (utilisation_pct, idle_pct global+osp-patrol,
fuel_kes_per_100km, mttr_hours, alarms_per_100km). Granted SELECT to
grafana_ro. Wired into run_migrations.py MIGRATIONS.
- import_drivers_csv.py: full rewrite for the new Mitieng CSV
(20260427_FSG_Vehicles_mitieng.csv). Snake_case columns, drops
_infer_city() plate-prefix logic in favour of reading assigned_city
directly. Adds cost_centre, assigned_route, vehicle_category,
vehicle_brand, fuel_100km, depot_address. Treats the literal "NULL"
string as missing. Reuses clean(), clean_num(), clean_ts(),
get_conn(), get_logger() from ts_shared_rev. Special-cases numeric
and timestamptz columns in the UPDATE clause.
- audit_device_reconciliation.py: read-only audit comparing the CSV
against tracksolid.devices. Reports per-account row counts, IMEIs
on one side only, and devices on both sides whose metadata is still
NULL.
- 260427_device_reconciliation.md + 260427_audit_output.txt: Phase 0.2
reconciliation record. First run: DB has 172 devices, CSV has 162,
delta +10 (10 IMEIs in DB-only, mostly fireside-account auto-syncs).
Importer run with --only-null --apply filled 154 rows; coverage now
assigned_city 152/172, cost_centre 150/172.
Applied to stage on 2026-04-27 23:35 UTC.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Switches column references from the old title-case logistics CSV
(IMEI, Device Name, License Plate No., Telephone, Fuel/100km, ...) to
the snake_case Mitieng export shipped on 2026-04-27 (imei, device_name,
vehicle_number, driver_phone, fuel_100km, ...). Without this, the bulk
device-update API tool fails with KeyError: 'IMEI' on the new CSV.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Carto basemap tiles render up to ~19-20; OpenLayers caps at 28. A maxZoom
of 22 leaves no practical ceiling for street-level inspection while
keeping the EAC-bounded minZoom in place.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Recentred geomap view from lat -2.0/lon 35.5/zoom 5 to lat -3.0/lon 34.5/
zoom 5.5 (Lake Victoria area, the geographic intersection of the three
countries) and raised minZoom to 5.5 so the dashboard can't be panned out
to show neighbouring countries.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Grafana's lengthkm and h units auto-scale with SI prefixes — fleet km
totals rendered as "Mm" (megametres) and drive-hour totals as days/weeks,
which read as "millions" and "weeks" on the Daily Ops dashboard. Switched
the affected panels (Fleet km today, Drive/Idle hours today, the per-vehicle
roll-up table, the driver leaderboard, and the 7-day distance trend) to
unit "none" with decimals: 1 so values stay in plain km and hours, with
units carried by panel titles and column displayNames.
Geomap view recentred to lat -2.0, lon 35.5, zoom 5 with minZoom 5 /
maxZoom 12 so the Active Vehicles map opens on the East African Community
region and cannot zoom out past it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Default TZ=Africa/Nairobi baked into the sidecar image; override via
compose TZ env var if another region is ever needed.
- Rename BACKUP_TIMES_UTC → BACKUP_TIMES (legacy var still honored for
back-compat). Times are now interpreted in the container's local TZ,
so "02:30" means 02:30 EAT, not UTC.
- Log timestamps and dump filenames use %FT%T%z / %Y%m%d_%H%M%S_%Z
(e.g. tracksolid_db_20260424_115729_EAT.sql.gz) so the TZ is visible
on every artifact.
- Prune cutoff computed in local time; YYYYMMDD regex unchanged so it
still matches legacy UTC filenames during the transition.
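The naming and prune-matching rules can be illustrated in Python. This is a sketch only: the sidecar's real implementation is not shown here, the fixed-offset EAT zone stands in for TZ=Africa/Nairobi, and the legacy filename used below is a hypothetical example of the older UTC naming.

```python
import re
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Africa/Nairobi (UTC+3, no DST). The sidecar
# itself takes its zone from the TZ environment variable.
EAT = timezone(timedelta(hours=3), "EAT")


def dump_name(db: str, now: datetime) -> str:
    """Build a dump filename that carries the zone abbreviation."""
    return f"{db}_{now.strftime('%Y%m%d_%H%M%S_%Z')}.sql.gz"


# Match only the YYYYMMDD part, so names with and without a TZ suffix
# are both recognised during the transition.
DATE_RE = re.compile(r"_(\d{8})_")


def is_expired(name: str, cutoff_yyyymmdd: str) -> bool:
    """True when the date embedded in the name is older than the cutoff."""
    m = DATE_RE.search(name)
    return bool(m) and m.group(1) < cutoff_yyyymmdd


print(dump_name("tracksolid_db", datetime(2026, 4, 24, 11, 57, 29, tzinfo=EAT)))
```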
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the single BACKUP_HOUR/BACKUP_MINUTE slot with a comma-separated
list of UTC times. Scheduler walks all slots and sleeps until the soonest
future one, so four daily backups become a one-line env change:
BACKUP_TIMES_UTC=02:30,08:30,14:30,20:30 (default)
Legacy BACKUP_HOUR/BACKUP_MINUTE still honored as a single slot for
backwards compatibility with existing .env files.
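The slot-walking logic can be sketched as follows. It is written in Python for illustration and is not the sidecar's actual code; parse_slots and next_run are hypothetical names.

```python
from datetime import datetime, time, timedelta, timezone


def parse_slots(spec: str) -> list[time]:
    """Parse a comma-separated list such as '02:30,08:30' into times."""
    slots = []
    for part in spec.split(","):
        hh, mm = part.strip().split(":")
        slots.append(time(int(hh), int(mm)))
    return sorted(slots)


def next_run(now: datetime, slots: list[time]) -> datetime:
    """Return the soonest slot strictly after now (today or tomorrow)."""
    candidates = []
    for slot in slots:
        run = now.replace(hour=slot.hour, minute=slot.minute,
                          second=0, microsecond=0)
        if run <= now:
            run += timedelta(days=1)  # slot already passed today
        candidates.append(run)
    return min(candidates)


slots = parse_slots("02:30,08:30,14:30,20:30")
now = datetime(2026, 4, 23, 9, 0, tzinfo=timezone.utc)
sleep_seconds = (next_run(now, slots) - now).total_seconds()
```

The scheduler would sleep for sleep_seconds, run the backup, and repeat.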
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The timescale/timescaledb-ha image uses /home/postgres/pgdata/data as
PGDATA, not /var/lib/postgresql/data. The previous mount pointed at an
empty directory that postgres never wrote to, so Coolify redeploys
destroyed all data with the container's overlay filesystem.
Pin PGDATA explicitly and move the named timescale-data volume to
/home/postgres/pgdata so the real data dir is persisted.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fleet lives across three Tracksolid sub-accounts:
fireside — 63 devices
Fireside@HQ — 52 devices
Fireside_MSA — 41 devices
Previously sync_devices / poll_live_positions / poll_parking only
queried a single TARGET_ACCOUNT, so ~64% of the fleet was invisible to
the pipeline.
Changes:
- ts_shared_rev.py: new TARGETS list (env TRACKSOLID_TARGETS,
comma-separated; falls back to the single TARGET_ACCOUNT).
- ts_shared_rev.py: new get_active_imeis_by_target() helper that
groups active IMEIs by their stored account so parking calls can
pass the right account param per batch.
- ingest_movement_rev.py: sync_devices and poll_live_positions loop
over every target and dedupe by IMEI before upserting. poll_parking
loops over imeis_by_target so each batch carries the matching
account.
- CLAUDE.md: FIX-M19 entry.
Requires new env var TRACKSOLID_TARGETS="fireside,Fireside@HQ,Fireside_MSA"
on the ingest services in Coolify.
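A minimal sketch of the three pieces described above: target parsing with fallback, dedupe by IMEI, and grouping by account. The function names and row shapes are assumptions for illustration and do not reproduce ts_shared_rev.py.

```python
import os
from collections import defaultdict


def load_targets(env=os.environ) -> list[str]:
    """Read TRACKSOLID_TARGETS; fall back to the single TARGET_ACCOUNT."""
    raw = env.get("TRACKSOLID_TARGETS", "")
    targets = [t.strip() for t in raw.split(",") if t.strip()]
    if not targets and env.get("TARGET_ACCOUNT"):
        targets = [env["TARGET_ACCOUNT"].strip()]
    return targets


def dedupe_by_imei(devices: list[dict]) -> list[dict]:
    """Keep the first record seen for each IMEI across all targets."""
    seen: dict[str, dict] = {}
    for dev in devices:
        seen.setdefault(dev["imei"], dev)
    return list(seen.values())


def group_imeis_by_target(rows) -> dict[str, list[str]]:
    """Group (imei, account) pairs so each batch carries its own account."""
    grouped = defaultdict(list)
    for imei, account in rows:
        grouped[account].append(imei)
    return dict(grouped)
```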
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jimi.user.device.list returns null for vehicleName, vehicleNumber,
driverName, driverPhone, and sim even after those fields are set via
jimi.open.device.update — the values only surface through
jimi.track.device.detail. sync_devices() now reads from dtl first with
d as fallback, which unblocks backfill of the 144 CSV-driven updates
pushed on 2026-04-22.
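The detail-first fallback might look like the sketch below. merge_device is a hypothetical name, and treating an empty string as missing is an assumption.

```python
FIELDS = ("vehicleName", "vehicleNumber", "driverName", "driverPhone", "sim")


def merge_device(d: dict, dtl: dict) -> dict:
    """Prefer the detail response (dtl); fall back to the list row (d)."""
    merged = {}
    for field in FIELDS:
        val = dtl.get(field)
        # Assumption: both None and "" count as "not set".
        merged[field] = val if val not in (None, "") else d.get(field)
    return merged
```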
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a `db_backup` sidecar that dumps tracksolid_db every night at
02:30 UTC (configurable via BACKUP_HOUR/BACKUP_MINUTE), gzips the
output, and uploads to s3://fleet-db/daily/<dbname>_<ts>.sql.gz on
the rustfs S3-compatible instance (s3.rahamafresh.com). Prunes
objects older than BACKUP_KEEP_DAYS (default 30).
Required .env additions (Coolify UI):
RUSTFS_ENDPOINT=https://s3.rahamafresh.com
RUSTFS_ACCESS_KEY=...
RUSTFS_SECRET_KEY=...
RUSTFS_BUCKET=fleet-db
Mitigates data loss when Coolify service recreation wipes the
service-ID-scoped timescale-data volume.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three changes that together close the FK-violation loop on /pushalarm:
1. import_drivers_csv.py: when an IMEI is in the CSV but not in
tracksolid.devices, INSERT a new row instead of skipping. Unblocks
the 140 X3/JC400P devices listed as a HIGH open item in CLAUDE.md §10.
2. webhook_receiver_rev.py: new _ensure_device() helper upserts a stub
devices row (status='unknown') before inserting an alarm. Handles the
third class of devices — not in API sync, not in CSV (e.g. the
X3-63282 Kampala device flagged in CLAUDE.md §10).
3. CSV refreshed from Downloads (Apr 21 version, 140 active rows).
Also fixes the alarm error log, which previously showed "None" because it
read deviceImei instead of the integration push's imei field.
CSV import already applied live on the instance (63 → 201 devices).
Webhook patch requires a Coolify redeploy to pick up _ensure_device().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Diagnostic logging revealed the real Jimi integration push format:
Content-Type: application/x-www-form-urlencoded
Body: msgType=jimi.push.device.alarm&data=<URL-encoded JSON>
Differences from docs:
- data is one JSON object per POST (not a data_list array)
- alarm uses imei+alarmTime, NOT deviceImei+gateTime
_parse_request now reads form field `data` (falls back to `data_list`) and
JSON-decodes a single object or array. push_alarm handler accepts either
field naming for forward-compat.
Removes diagnostic INFO log now that format is confirmed.
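The confirmed format can be parsed with the standard library alone. The sketch below is not the _parse_request implementation; parse_push_body and alarm_keys are hypothetical names, and the sample payload values are made up.

```python
import json
from urllib.parse import parse_qs, urlencode


def parse_push_body(body: bytes) -> list[dict]:
    """Decode a form-encoded push into a list of JSON objects."""
    form = parse_qs(body.decode("utf-8"))
    raw = (form.get("data") or form.get("data_list") or [""])[0]
    if not raw:
        return []
    try:
        decoded = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if isinstance(decoded, dict):  # one object per POST
        return [decoded]
    if isinstance(decoded, list):  # array form, kept for compatibility
        return [x for x in decoded if isinstance(x, dict)]
    return []


def alarm_keys(item: dict) -> tuple:
    """Accept either field naming: imei/alarmTime or deviceImei/gateTime."""
    imei = item.get("imei") or item.get("deviceImei")
    when = item.get("alarmTime") or item.get("gateTime")
    return imei, when


sample = urlencode({
    "msgType": "jimi.push.device.alarm",
    "data": json.dumps({"imei": "860000000000001",
                        "alarmTime": "2026-04-21 10:00:00"}),
}).encode()
items = parse_push_body(sample)
```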
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Temporary diagnostic to see what format Jimi actually sends on /pushalarm.
New container is parsing to empty items (pushes arrive but no DB insert),
so we need to see the real body shape. Remove once format is confirmed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Jimi's integration push API (tracksolidprodocs.jimicloud.com) sends
Content-Type: application/json with body {"token":"...","data_list":[...]},
not form-encoded. FastAPI Form() silently defaulted to "" so all pushes
were discarded with "Failed to parse data_list:" warnings.
Replaces per-endpoint Form() params with a shared _parse_request() helper
that tries JSON body first, falls back to form-encoded. All seven push
endpoints (pushobd, pushfaultinfo, pushalarm, pushgps, pushhb,
pushtripreport, pushevent) updated.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a second Grafana dashboard focused on daily operational KPIs and live
dispatch, keeping the NOC Live dashboard untouched.
- grafana/provisioning/dashboards-json/daily_operations_dashboard.json
New dashboard covering §7 Blueprint Panels 3-8 and the §4 dispatch lens:
freshness banner, today-at-a-glance stat row, active vehicles map,
currently-idle table, vehicles-not-moved-today, per-vehicle daily KPI
roll-up, driver behaviour leaderboard, distance trend, alarm frequency,
idle cost MTD, utilisation heatmap, SLA row (collapsed, data-gated).
- 07_analytics_views.sql
Nine views in tracksolid.* wrapping the BA-file [DASHBOARD]-tagged
queries. Each view carries COMMENT ON VIEW with its spec section.
SELECT granted to grafana_ro. Smoke-tested against live DB.
- run_migrations.py
Register 06 and 07 in MIGRATIONS list with idempotent seed checks so
future fresh deploys apply them correctly.
- CLAUDE.md
Retire the tracksolid_2 schema references (schema no longer exists);
§9 Fleet State dated 2026-04-19 with correct pipeline status (running,
875 runs/24h, 0 failures) and accurate position_history row counts
(hypertable stats don't show in pg_stat_user_tables).
- docs/superpowers/specs/2026-04-19-daily-operations-dashboard-design.md
Design spec covering architecture, views, panel layout, deployment,
rollback, and known data gaps.
Full slide-by-slide copy for elicitation pitch: 6 pain questions, feature
reveal, business case, optional add-ons (RustFS + DuckDB), and one-pager.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
RustFS (S3-compatible blob) and DuckDB (historical analytics) added as
optional add-on tiers with elicitation pain questions and tier model.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ingest scripts were connecting to the old tracksolid_2 database instead of
the timescale_db container in this stack. Grafana was already correct
(uses service name timescale_db:5432). Also strip leading space and quotes
from DATABASE_URL and API_BASE_URL so os.getenv() returns clean values.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Maps host port 5888 → container port 5432 so the DB can be reached
directly from the MacBook (requires UFW allow 5888/tcp on the server).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- §3: note tracksolid_2 as live schema, tracksolid as empty target;
add DB direct access tip (31.97.44.246:5888, leading space in .env)
- §4: add import_drivers_csv.py and migration 06 to codebase map
- §5: document tracksolid_2 live tables with column differences
(assigned_team vs cost_centre, city vs assigned_city); add ops.*
- §8: add rule 9 (Forgejo API auth via keychain) and rule 10
(always check active schema before querying)
- §9: update fleet state — pipeline stopped Apr 6, CSV fleet pending,
0 driver names, 19 stale positions
- §10: replace driver-name manual item with deploy + CSV import tasks
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Audit fixes across the ingestion stack:
Observability
- Move log_ingestion out of batch loops in poll_alarms and poll_parking
(was emitting N cumulative log rows per run instead of one).
- Add missing log_ingestion + t0 to poll_trips.
- Count inserted via cur.rowcount instead of naive +=1 so ON CONFLICT
DO NOTHING no longer inflates the metric.
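The difference between the two counters is easy to demonstrate. The sketch below swaps in SQLite from the standard library instead of psycopg2 so it runs without a server; INSERT OR IGNORE plays the role of ON CONFLICT DO NOTHING, and the table and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alarms (imei TEXT, alarm_time TEXT, "
    "PRIMARY KEY (imei, alarm_time))"
)

rows = [
    ("860000000000001", "2026-04-19 10:00:00"),
    ("860000000000001", "2026-04-19 10:00:00"),  # duplicate, will be skipped
    ("860000000000002", "2026-04-19 10:05:00"),
]

naive = inserted = 0
cur = conn.cursor()
for row in rows:
    cur.execute("INSERT OR IGNORE INTO alarms VALUES (?, ?)", row)
    naive += 1                # counts attempts, including skipped rows
    inserted += cur.rowcount  # 0 when the conflict suppressed the insert

print(naive, inserted)
```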
Resilience
- SAVEPOINT-per-item added to poll_alarms, poll_live_positions,
poll_trips, poll_parking so one bad row no longer aborts the batch
(webhook handlers already had this; pollers were inconsistent).
Performance
- /pushgps and poll_track_list now use psycopg2.extras.execute_values
with ON CONFLICT DO NOTHING — 10-50x write throughput on larger
batches.
- sync_devices and sync_driver_audit fetch jimi.track.device.detail
concurrently via ThreadPoolExecutor(max_workers=8), cutting the
daily registry sync from ~24s to ~3s for an 80-device fleet.
- poll_track_list split into two phases: parallel API fetch (4 workers,
no DB connection held) then one batched write. Previously the DB
connection was held across every per-IMEI HTTP call, risking pool
starvation.
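The two-phase shape (parallel fetch, then one batched write) can be sketched as below. fetch_track is a hypothetical stand-in for the per-IMEI HTTP call, and the write is passed in as a callable so the example runs without a database; in the real code that step would be a single psycopg2.extras.execute_values call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_track(imei: str) -> list[tuple]:
    """Hypothetical stand-in for the per-IMEI HTTP call."""
    return [(imei, i) for i in range(3)]


def collect_tracks(imeis, workers: int = 4) -> list[tuple]:
    # Phase 1: parallel API fetch; no DB connection is held here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(fetch_track, imeis))
    # Flatten so phase 2 can issue a single batched write.
    return [row for batch in batches for row in batch]


def write_batch(rows, execute_batch) -> int:
    # Phase 2: one call, made only after all HTTP work has finished.
    execute_batch(rows)
    return len(rows)
```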
Security
- _validate_token uses hmac.compare_digest for constant-time token
comparison (closes timing side-channel).
- _parse_data_list caps incoming items at WEBHOOK_MAX_ITEMS (default
5000) so a pathological push cannot blow memory.
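Both security measures fit in a few lines. This sketch uses hypothetical helper names, and truncating the list (rather than rejecting the request) is an assumption about how the cap is applied.

```python
import hmac

WEBHOOK_MAX_ITEMS = 5000  # default named above


def token_ok(supplied: str, expected: str) -> bool:
    """Constant-time comparison; avoids leaking the match length via timing."""
    return hmac.compare_digest(supplied.encode(), expected.encode())


def cap_items(items: list, limit: int = WEBHOOK_MAX_ITEMS) -> list:
    """Assumption: oversized pushes are truncated to the first `limit` items."""
    return items[:limit]
```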
Tests
- Fix test_null_alarm_type_skipped: its INSERT-count assertion was
catching the ingestion_log insert written by log_ingestion. Filter
that out so the test checks only data-table inserts.
- Full suite: 66 passed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
57 unit tests covering clean helpers, API signing, and field mapping fixes
(FIX-E06, FIX-M16, BUG-01, BUG-03); integration tests for webhook endpoints
with mocked DB; Forgejo CI workflow with TimescaleDB service container.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>