tracksolid_timescale_grafan.../CLAUDE.md
david kiania 9986d3b411
docs(claude): add pending Grafana redeploy to open items (post ops/dwh_gold purge)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 20:39:56 +03:00


CLAUDE.md — Fireside Communications · Tracksolid Fleet Intelligence

0. Commands

Package manager: uv (not pip/poetry)

uv sync                    # Install / sync dependencies
pytest tests/              # Run test suite
ruff check .               # Lint
mypy .                     # Type check

Run a query via Docker (DATABASE_URL is internal-only):

DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1)
docker exec $DB psql -U postgres -d tracksolid_db -c "SELECT COUNT(*) FROM tracksolid.devices;"

Run a migration file:

docker exec -i $DB psql -U postgres -d tracksolid_db < migrations/07_your_migration.sql

1. What This Project Is

Fleet telematics ingestion and analytics stack for a telco first-line support client operating in Nairobi, Mombasa, and Kampala. The client dispatches field technicians to install, repair, and maintain home and business broadband, handle LOS signal faults, service migrations, and maintain outside plant infrastructure. The fleet is ~80 vehicles across three cities, all tracked via Tracksolid Pro (Jimi IoT API).

This repository ingests the Tracksolid Pro API into a TimescaleDB/PostGIS database and visualises fleet and operational KPIs in Grafana. The pipeline is deployed on Coolify at stage.rahamafresh.com.

Repository: https://repo.rahamafresh.com/kianiadee/tracksolid_timescale_grafana_prod.git


2. Tech Stack

| Layer | Technology |
| --- | --- |
| Ingestion | Python 3.12: ingest_movement_rev.py, ingest_events_rev.py, webhook_receiver_rev.py |
| Shared utils | ts_shared_rev.py: token cache, DB pool, API signing, clean helpers |
| Database | PostgreSQL 16 + TimescaleDB 2.15 + PostGIS 3 (tracksolid_db) |
| Orchestration | Docker Compose on Coolify |
| Visualisation | Grafana (provisioned via custom image) |
| Workflow automation | n8n |
| API source | Tracksolid Pro / Jimi IoT Open Platform (eu-open.tracksolidpro.com/route/rest) |
| Backup | pg_dump sidecar → rustfs S3 (fleet-db bucket), nightly |
| Version control | Forgejo at repo.rahamafresh.com |

3. Instance & Connection Parameters

See docs/CONNECTIONS.md for the full shape. Summary:

  • SSH: ssh -i ~/.ssh/id_ed25519 kianiadee@stage.rahamafresh.com
  • DB name: tracksolid_db · DB user: postgres (internal) · tracksolid_owner (app) · grafana_ro (read-only)
  • DB schemas: tracksolid (live, single source of truth) · reporting (map-dashboard read layer) · infrastructure. The legacy tracksolid_2 schema no longer exists (migrations 02-06, 2026-04-18); the ops and dwh_gold schemas were purged 2026-06-05 (migrations 12/13) as unused.
  • DB access: DATABASE_URL points to timescale_db:5432 (internal Docker network — not reachable locally). Use docker exec pattern above. See docs/CONNECTIONS.md for full reference.
  • DWH target DB: tracksolid_dwh at 31.97.44.246:5888 (separate PostGIS instance, public IP). Users: dwh_owner (bronze writes + dwh_control), grafana_ro (reads bronze/silver/gold/dwh_control). Always connect with sslmode=require. Fed by the n8n dwh_extract + dwh_load_bronze workflows — see docs/DWH_PIPELINE.md.
  • Container naming: Coolify appends a random suffix. Always resolve with:
    docker ps --filter name=<service_name> --format "{{.Names}}" | head -1
    
    e.g. docker ps --filter name=timescale_db --format "{{.Names}}" | head -1
  • Env vars: loaded from .env via env_file in docker-compose.yaml. See docs/CONNECTIONS.md for variable names. Never hardcode secrets.
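
For scripts that need the container name programmatically, the docker ps lookup above can be wrapped in a small helper. This is a minimal sketch, not code from this repository: the function names first_container_name and resolve_container are hypothetical, and it assumes the docker CLI is on PATH on the host where it runs.

```python
import subprocess
from typing import Optional


def first_container_name(docker_ps_output: str) -> Optional[str]:
    """Return the first non-empty line of `docker ps --format "{{.Names}}"` output."""
    for line in docker_ps_output.splitlines():
        name = line.strip()
        if name:
            return name
    return None


def resolve_container(service: str) -> str:
    """Resolve the full container name (with Coolify suffix) for a service prefix.

    Hypothetical helper: mirrors `docker ps --filter name=<service> | head -1`.
    """
    result = subprocess.run(
        ["docker", "ps", "--filter", f"name={service}", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    name = first_container_name(result.stdout)
    if name is None:
        raise RuntimeError(f"no running container matches {service!r}")
    return name
```

Calling resolve_container("timescale_db") on the instance would return whichever suffixed name Coolify assigned, so nothing downstream needs to hardcode it.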

4. Codebase Map

ts_shared_rev.py            # Shared: config, signing, DB pool, token cache, clean helpers
ingest_movement_rev.py      # GPS positions, trips, parking, track-list (high-res trail), device sync
ingest_events_rev.py        # Alarm events polling (fallback for webhook push)
webhook_receiver_rev.py     # FastAPI push receiver: /pushobd /pushevent /pushtripreport etc.
sync_driver_audit.py        # One-shot: API↔DB driver/IMEI gap report + full upsert
import_drivers_csv.py       # One-shot: populate 144 X3/JC400P devices from CSV (--apply to commit)
run_migrations.py           # Applies SQL migrations in order at container startup
docker-compose.yaml         # Services: timescale_db, ingest_movement, ingest_events,
                            #           webhook_receiver, grafana
grafana/                    # Grafana provisioning (baked into image)
n8n-workflows/              # n8n workflow exports (incl. dwh_extract, dwh_load_bronze)
docs/                       # Reference docs (connections, API, KPIs, project context)
docs/PLATFORM_OVERVIEW.html # Current-state platform reference (architecture, deploy, read-API,
                            #   full DB schema, Grafana panels) — open in a browser. Post n8n→fleetapi.
docs/DWH_PIPELINE.md        # DWH pipeline operations runbook (setup, troubleshooting)
docs/superpowers/          # Pitch specs and implementation plans (not deployed code)
dwh/                        # DWH migrations for tracksolid_dwh@31.97.44.246:5888
                            #   260423_dwh_ddl_v1.sql — bronze/silver/gold schemas + roles
                            #   261001_dwh_control.sql — watermarks + run log
                            #   261002_bronze_constraints_audit.sql — ON CONFLICT key assertion
                            #   261003_dwh_roles.sql — role contract assertion
                            #   261004_dwh_observability_views.sql — freshness/failure views
migrations/                 # Numbered SQL migrations 02-13, applied in order by run_migrations.py
                            #   02 full schema · 03 webhook · 04 distance fix · 05 enhancements
                            #   06 ops/analytics · 07 views · 08 config · 09 trips enrichment
                            #   10_driver_clock_views.sql · 10_pgbouncer_auth.sql · 11 reporting
                            #   12 drop ops schema · 13 drop dwh_gold schema (both 2026-06-05)
Dockerfile                  # Custom image for ingest/webhook containers
pyproject.toml              # Python project + uv dependency spec
backup/                     # pg_dump sidecar scripts and config
data/                       # Source CSVs (FS Logistics 144-device list, FSG vehicles)
legacy/                     # Superseded pre-_rev scripts + old pipeline notes (NOT deployed)
docs/manuals/               # OPERATIONS_MANUAL, grafana + DWH manuals, docker commands, DB manual
docs/reference/             # 01_BusinessAnalytics.md (SQL library — read before writing queries),
                            #   tracksolidApiDocumentation.md, 260507_pgbouncer_deployment.md
docs/reports/               # Baseline reports, audit output, improvement reviews

5. Database Schema — Key Tables

tracksolid.devices           -- Device / driver / vehicle registry (63 rows; 0 driver_name populated)
                             -- IMEI mix: 353549* AT4 (23), 862798* X3/JC400P (23), 865135* X3/JC400P (10), 359857* (7)
                             -- Full CSV (144 devices) not yet imported — run import_drivers_csv.py --apply
tracksolid.live_positions    -- Current fix per IMEI (19 rows; refreshed every 60s by ingest_movement)
tracksolid.position_history  -- All GPS fixes (hypertable, partitioned by gps_time). ~519 rows (308 track_list + 211 poll).
                             -- pg_stat_user_tables shows 0 for hypertables — always COUNT(*) directly.
                             -- source: 'poll' (60s sweep) | 'track_list' (30m high-res)
tracksolid.trips             -- Trip summaries: distance_km, driving_time_s, avg/max speed
tracksolid.parking_events    -- Stop events with duration and address (0 rows — endpoint returning empty)
tracksolid.alarms            -- Alarm events (alarm_type, alarm_name, alarm_time) — 10 rows, polling healthy
tracksolid.obd_readings      -- OBD diagnostics (push only, awaiting webhook registration)
tracksolid.device_events     -- Power on/off tamper events (push only)
tracksolid.ingestion_log     -- API call audit trail — 875 runs / 24h, 0 failures at last check (2026-04-19)
tracksolid.schema_migrations -- Applied migrations 02-13
-- PURGED 2026-06-05 (migrations 12 + 13): the dormant `ops` schema (tickets, service_log,
-- odometer_readings, cost_rates, kpi_targets, vw_service_forecast), tracksolid.dispatch_log,
-- and the `dwh_gold` schema (dim_vehicles, fact_daily_fleet_metrics, refresh_daily_metrics).
-- Those workshop/dispatch/SLA/utilisation features were never implemented. Do NOT reintroduce
-- references to ops.* or dwh_gold.* — they no longer exist. (The separate tracksolid_dwh DB
-- at 31.97.44.246:5888 is unrelated and untouched.)

Full DDL: 02_tracksolid_full_schema_rev.sql + migrations 03-06.

Analytics views (migration 07_analytics_views.sql) — one view per BA-file query block, readable by grafana_ro:

tracksolid.v_fleet_today              -- §9 per-vehicle today roll-up
tracksolid.v_vehicles_not_moved_today -- §2.3 alert source
tracksolid.v_active_dispatch_map      -- §4.3 geomap source
tracksolid.v_currently_idle           -- §2.2 idle lens
tracksolid.v_driver_aggregates_daily  -- §3.1 + §3.2 aggression index source
tracksolid.v_fleet_km_daily           -- §7 Panel 5 distance trend
tracksolid.v_alarms_daily             -- §7 Panel 7 alarm frequency
-- v_utilisation_daily (dwh_gold) and v_sla_inflight (ops) were DROPPED 2026-06-05 with
-- their schemas (migrations 12/13); their Grafana panels were removed from the dashboard.

All views carry a COMMENT ON VIEW referencing their spec — \d+ tracksolid.v_* shows the provenance.

DWH bronze layer (separate DB tracksolid_dwh) — populated by the n8n dwh_extract + dwh_load_bronze workflows. Operational details in docs/DWH_PIPELINE.md.

-- bronze schema mirrors tracksolid.* (16 tables, DDL in dwh/260423_dwh_ddl_v1.sql)
bronze.devices, bronze.live_positions           -- snapshot tables (TRUNCATE + reload)
bronze.position_history, bronze.trips,
bronze.alarms, bronze.parking_events,
bronze.device_events, bronze.ingestion_log     -- incremental (watermark + ON CONFLICT DO NOTHING)
-- Schema drift: bronze.trips.distance_km vs source tracksolid.trips.distance_m
-- Extract SQL divides by 1000. Cross-ref FIX-M16.

-- dwh_control schema tracks pipeline state + observability
dwh_control.extract_watermarks   -- one row per incremental table
dwh_control.extract_runs         -- per-run audit log (status lifecycle)
dwh_control.v_table_freshness    -- Grafana: load lag per table
dwh_control.v_recent_failures    -- Grafana: failures in last 24h
dwh_control.v_watermark_lag      -- Grafana: extract vs. load lag per table

6. API Critical Facts

Always read docs/reference/tracksolidApiDocumentation.md before adding a new endpoint call.

| Fact | Detail |
| --- | --- |
| Auth | OAuth2; token cached in tracksolid.api_token_cache, refreshed via jimi.oauth.token.refresh |
| Signing | MD5: secret + sorted(k+v pairs) + secret (see build_sign() in ts_shared_rev.py) |
| Batch limit | Max 50 IMEIs per call for most endpoints |
| distance field | Returns METRES, not km despite docs. Always divide by 1000. (FIX-M16) |
| driverName/driverPhone | From jimi.user.device.list; will be NULL if not set in Tracksolid Pro UI |
| alarm_type field | API polling returns alertTypeId/alarmTypeName, NOT alarmType/alarmName (FIX-E06) |
| durSecond | Parking endpoint returns durSecond, not seconds (FIX-M13) |
| jimi.device.track.mileage | startMileage/endMileage are cumulative odometer in metres |
| Rate limit | Code 1006: back off and retry with re-sign (handled in api_post()) |
| OBD data | Push only via /pushobd webhook; no polling endpoint exists |
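
The signing, batch-limit, and distance facts above can be sketched as follows. This is an illustration under stated assumptions, not a copy of build_sign() in ts_shared_rev.py: the function names are hypothetical, pairs are assumed to be sorted by key, and the uppercase hex digest is an assumption. Check the real helper before relying on any of this.

```python
import hashlib
from typing import Iterator, List, Mapping, Sequence

MAX_IMEIS_PER_CALL = 50  # batch limit stated for most endpoints


def build_sign_sketch(params: Mapping[str, str], secret: str) -> str:
    """MD5 over secret + key/value pairs (sorted by key) + secret.

    Assumption: the digest is sent as uppercase hex.
    """
    joined = "".join(f"{key}{params[key]}" for key in sorted(params))
    payload = f"{secret}{joined}{secret}".encode("utf-8")
    return hashlib.md5(payload).hexdigest().upper()


def chunk_imeis(imeis: Sequence[str], size: int = MAX_IMEIS_PER_CALL) -> Iterator[List[str]]:
    """Yield successive IMEI batches no larger than the per-call limit."""
    for start in range(0, len(imeis), size):
        yield list(imeis[start:start + size])


def metres_to_km(distance_m: float) -> float:
    """The API reports distance in metres; convert before storing as km."""
    return distance_m / 1000.0
```

In the codebase, calls go through api_post(), which already handles signing and the 1006 retry. The sketch only shows the shape of the computation for readers new to the API.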

7. Fix History (do not regress)

| Fix ID | File | What it fixed |
| --- | --- | --- |
| FIX-M11 | ingest_movement_rev.py | Removed erroneous ×1000 on distance (was storing km as mm) |
| FIX-M13 | ingest_movement_rev.py | Parking: added acc_type=0, account; mapped durSecond |
| FIX-M14 | ingest_movement_rev.py | poll_track_list(): high-res GPS trail every 30m |
| FIX-M15 | ingest_movement_rev.py | get_device_locations(): on-demand precision refresh |
| FIX-M16 | ingest_movement_rev.py | distance from API is metres → divide by 1000 before storing |
| FIX-M17 | ingest_movement_rev.py | sync_devices() ON CONFLICT now updates all 26 fields (was 5) |
| FIX-M18 | ingest_movement_rev.py | sync_devices() pulls vehicleName/vehicleNumber/driverName/driverPhone/sim from jimi.track.device.detail, because the list endpoint returns null for these even when set |
| FIX-M19 | ts_shared_rev.py, ingest_movement_rev.py | Multi-account support: fleet spans fireside, Fireside@HQ, Fireside_MSA (156 devices total). sync_devices, poll_live_positions, poll_parking iterate TRACKSOLID_TARGETS (comma-separated env var). New helper get_active_imeis_by_target() scopes parking calls to the right account |
| FIX-E06 | ingest_events_rev.py | Alarm field mapping: alertTypeId/alarmTypeName/alertTime |
| BUG-02 | Migration 04 | Historical distance_m rows ÷1,000,000 → renamed to distance_km |
| FIX-D01 | dashboard_api_rev.py | POST /webhook/fleet-dashboard read the body as JSON, but the SPA posts x-www-form-urlencoded → request.json() threw, filters were silently dropped, and the map always returned the whole fleet. Now parsed by Content-Type (parse_qs for form, JSON still accepted). Commit f1387d1 |
| FIX-D02 | dashboard_api_rev.py | reporting.v_trips matview froze on 2026-06-01 when n8n (which ran the scheduled refresh) was retired → dashboard showed "no trips". Added an in-process background refresher (REFRESH MATERIALIZED VIEW CONCURRENTLY every VTRIPS_REFRESH_INTERVAL_S, default 300s; pg advisory-lock guarded for --workers; logs to reporting.refresh_log source=dashboard_api). Commit 30b3515 |
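
To make the FIX-D01 behaviour concrete, here is a minimal sketch of parsing a request body by Content-Type. It is not the code in dashboard_api_rev.py: parse_filters is a hypothetical name, the filter keys used in the usage note are invented, and collapsing single-value form fields to plain strings is a design choice of the sketch.

```python
import json
from typing import Any, Dict
from urllib.parse import parse_qs


def parse_filters(content_type: str, body: bytes) -> Dict[str, Any]:
    """Decode dashboard filters from either a form-encoded or a JSON body."""
    # Ignore parameters such as "; charset=UTF-8" when matching the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        parsed = parse_qs(body.decode("utf-8"), keep_blank_values=True)
        # parse_qs always returns lists; unwrap fields that carry one value.
        return {
            key: (values[0] if len(values) == 1 else values)
            for key, values in parsed.items()
        }
    if not body:
        return {}
    data = json.loads(body)
    # Only a JSON object can carry named filters; anything else means "no filters".
    return data if isinstance(data, dict) else {}
```

With this shape, a form post such as city=Nairobi and a JSON post of {"city": "Nairobi"} produce the same filter dictionary, which is the property the fix restores.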

8. Working Rules

  1. No prod push without explicit user confirmation. Always state what you are about to push and wait.
  2. Never rewrite a migration that is already applied. Check tracksolid.schema_migrations first. Add a new numbered migration file for any schema change.
  3. Read before writing. Before suggesting any code change, read the relevant source file. Before writing a query, check docs/reference/01_BusinessAnalytics.md for an existing pattern.
  4. Reuse shared utilities. All DB access via get_conn(), all API calls via api_post(), all cleaning via clean() / clean_num() / clean_int() / clean_ts() in ts_shared_rev.py. Do not reinvent these.
  5. Resolve container names dynamically. Never hardcode the Coolify suffix. Use docker ps --filter name=<service>.
  6. SSH only when asked. Default workflow is local code → commit → push. SSH into the instance only when explicitly asked to test or run something live.
  7. Secrets from env only. Connection strings, API keys, and passwords live in .env. Reference variable names from docs/CONNECTIONS.md, never values.
  8. Two developers, one incoming. Write code and docs that a second developer (mixed technical/operations background) can follow without prior context.
  9. Forgejo API auth: credentials stored in macOS keychain. Retrieve with git credential fill (host=repo.rahamafresh.com). Use basic auth against https://repo.rahamafresh.com/api/v1 directly — no tea or gh needed.
  10. Single live schema. All live data lives in tracksolid; the map-dashboard read layer lives in reporting. Do not reintroduce references to the retired tracksolid_2, ops, or dwh_gold schemas (the latter two purged 2026-06-05, migrations 12/13).
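
Rule 9 can be sketched in Python as below. The helper names are hypothetical, and the sketch assumes git is installed and that the credential helper returns username and password fields for the host. The secret is only held in memory and never written to disk or logs.

```python
import base64
import subprocess
from typing import Dict


def parse_credential_output(output: str) -> Dict[str, str]:
    """Parse the key=value lines printed by `git credential fill`."""
    creds: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            creds[key] = value
    return creds


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def forgejo_auth_header(host: str = "repo.rahamafresh.com") -> Dict[str, str]:
    """Ask the git credential helper for the host's credentials and wrap them."""
    result = subprocess.run(
        ["git", "credential", "fill"],
        input=f"protocol=https\nhost={host}\n\n",
        capture_output=True,
        text=True,
        check=True,
    )
    creds = parse_credential_output(result.stdout)
    return basic_auth_header(creds["username"], creds["password"])
```

The returned header can be passed to any HTTP client when calling https://repo.rahamafresh.com/api/v1.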

9. Fleet State (as of 2026-04-19)

| Metric | Value |
| --- | --- |
| Registered devices (tracksolid.devices) | 63 total: 23 × 353549* (AT4), 23 × 862798* + 10 × 865135* (X3/JC400P), 7 × 359857* |
| Devices in CSV not yet imported | 144 (X3/JC400P); import_drivers_csv.py --apply will upsert names + plates |
| Driver names populated | 0 / 63 (pending CSV import) |
| Live positions | 19 (latest fix 2026-04-19 10:25 UTC) |
| Trips recorded | 8 (latest 2026-04-19 08:34 UTC) |
| Alarms recorded | 10 (latest 2026-04-19 09:15 UTC) |
| position_history rows | 519 (308 track_list + 211 poll); hypertable stats don't update in pg_stat_user_tables, so query directly |
| Pipeline status | Running healthy: 875 runs / 24h, 0 failures across all six endpoints |
| Cities active | Nairobi (primary), Mombasa (deploying), Kampala (4 devices in CSV) |
| Service flags | KDK 829A GP (239,264 km), Belta KCU-647D (235,000 km) |

Latest full snapshot: docs/reports/260412_baseline_report.md


10. Open Items (update as resolved)

| Priority | Item |
| --- | --- |
| HIGH | Redeploy the Grafana service in Coolify to apply daily_operations_dashboard.json: 5 panel areas (In-flight SLA, Idle Cost, Utilisation Heatmap, Row 7 Field-Service SLAs) that queried the now-dropped v_sla_inflight/v_utilisation_daily were removed. The DB views are already gone, so live Grafana shows errors on those panels until the redeploy (purge commit 8c5a43f, 2026-06-05). |
| HIGH | Run import_drivers_csv.py --apply: 144 X3/JC400P devices with names + plates waiting |
| HIGH | Register webhooks: /pushoil /pushtem /pushlbs (auto-register on push now done, commit 257643c) |
| HIGH | Investigate X3-63282 in Kampala: legitimate or unauthorised? |
| MEDIUM | Set fuel_100km per vehicle type to activate fuel cost calculations |
| MEDIUM | Investigate 44 silent devices (only 19 of 63 reporting): SIM installed? Activated? |
| MEDIUM | Co-develop client KPI framework (see docs/KPI_FRAMEWORK.md) |
| LOW | Populate geofences: depot boundaries, city zones |
| HIGH | Deploy DWH bronze pipeline: apply dwh/26100{1,2,3,4}.sql to tracksolid_dwh, import + wire the two n8n workflows, verify first run via dwh_control.v_table_freshness. Runbook: docs/DWH_PIPELINE.md |
| MEDIUM | Rotate dwh_owner / grafana_ro passwords on tracksolid_dwh; the plaintext in dwh/260423_dwh_ddl_v1.sql is a pre-existing flaw to clean up separately |