tracksolid_timescale_grafan.../CLAUDE.md
david kiania d5093f0a1c
chore(cleanup): purge n8n, Grafana, and DWH references + dead artifacts
These subsystems are retired and replaced by better alternatives (FleetNow /
FleetOps SPAs via dashboard_api; in-process pooling; reporting.v_ingest_health).
Remove them so the repo reflects the live stack only. Nothing running depends
on the deleted artifacts.

Deleted (dead artifacts):
- n8n-workflows/ (retired webhook exports), grafana/ (provisioning for the
  removed service), dwh/ (migrations for the decommissioned external warehouse)
- runbooks: DWH_PIPELINE.md, DWH_Execution_Manual.md, grafanaDeployment.md,
  grafanaOperationalManual.md

Code/config:
- run_migrations.py: drop sync_role_passwords() (its only entries were the
  now-dead grafana_ro + pgbouncer syncs; the guard already made it inert)
- .env: remove the two unused GRAFANA_* vars
- ingest_movement_rev.py / db_audit / deploy_dashboard_api_staging.sh: reword
  stale Grafana/grafana_ro comments

Docs: scrub n8n/Grafana/DWH from CLAUDE.md, CONNECTIONS, DATA_FLOW,
OPERATIONS_MANUAL, docker_commands, KPI_FRAMEWORK, PLATFORM_OVERVIEW,
STAGING_FLEETOPS, and deprecation-banner the two large SQL libraries
(dwh_gold was already dropped 2026-06-05).

Kept deliberately: the grafana_ro DB role (now an unused read-only login),
applied migration history, dated docs/reports/*, and docs/superpowers/* specs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:41:27 +03:00


CLAUDE.md — Fireside Communications · Tracksolid Fleet Intelligence

0. Commands

Package manager: uv (not pip/poetry)

uv sync                    # Install / sync dependencies
pytest tests/              # Run test suite
ruff check .               # Lint
mypy .                     # Type check

Run a query via Docker (DATABASE_URL is internal-only):

DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1)
docker exec $DB psql -U postgres -d tracksolid_db -c "SELECT COUNT(*) FROM tracksolid.devices;"

Run a migration file:

docker exec -i $DB psql -U postgres -d tracksolid_db < migrations/07_your_migration.sql
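
The same two steps (resolve the suffixed container name, then run psql inside it) can be scripted from Python. This is a minimal sketch, not code from this repo: the function names `resolve_container`, `psql_argv`, and `run_query` are hypothetical, and it assumes the `docker` CLI is on the PATH of the machine where it runs.

```python
import subprocess


def resolve_container(service: str) -> str:
    """Return the first running container whose name matches `service`.

    Mirrors: docker ps --filter name=<service> --format "{{.Names}}" | head -1
    """
    out = subprocess.run(
        ["docker", "ps", "--filter", f"name={service}", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    names = out.split()
    if not names:
        raise RuntimeError(f"no running container matches {service!r}")
    return names[0]


def psql_argv(container: str, sql: str) -> list[str]:
    """Build the docker exec + psql command line as an argv list (no shell quoting needed)."""
    return [
        "docker", "exec", container,
        "psql", "-U", "postgres", "-d", "tracksolid_db", "-c", sql,
    ]


def run_query(sql: str, service: str = "timescale_db") -> str:
    """Resolve the container, run one SQL statement, and return psql's stdout."""
    container = resolve_container(service)
    result = subprocess.run(
        psql_argv(container, sql), capture_output=True, text=True, check=True
    )
    return result.stdout


# Example (requires Docker and a running timescale_db container):
#   print(run_query("SELECT COUNT(*) FROM tracksolid.devices;"))
```

Passing the command as an argv list avoids shell interpolation of the SQL string, which is the main reason to prefer this over building a single shell command.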

1. What This Project Is

Fleet telematics ingestion and analytics stack for a telco first-line support client operating in Nairobi, Mombasa, and Kampala. The client dispatches field technicians to install, repair, and maintain home and business broadband, handle LOS signal faults, service migrations, and maintain outside plant infrastructure. The fleet is ~80 vehicles across three cities, all tracked via Tracksolid Pro (Jimi IoT API).

This repository ingests the Tracksolid Pro API into a TimescaleDB/PostGIS database and serves it to the FleetNow / FleetOps SPAs (own repos) via the dashboard_api read layer. The pipeline is deployed on Coolify at stage.rahamafresh.com. (Grafana was retired 2026-06-10 — FleetOps now owns KPI visualisation.)

Repository: https://repo.rahamafresh.com/kianiadee/tracksolid_timescale_grafana_prod.git


2. Tech Stack

| Layer | Technology |
|---|---|
| Ingestion | Python 3.12: ingest_worker_rev.py (merged movement + events poller), webhook_receiver_rev.py |
| Shared utils | ts_shared_rev.py: token cache, DB pool, API signing, clean helpers |
| Database | PostgreSQL 16 + TimescaleDB 2.15 + PostGIS 3 (tracksolid_db) |
| Orchestration | Docker Compose on Coolify |
| Visualisation | FleetOps / FleetNow SPAs (own repos) via dashboard_api. Grafana removed 2026-06-10 (was redundant; service, provisioning, and runbooks all deleted) |
| API source | Tracksolid Pro / Jimi IoT Open Platform (eu-open.tracksolidpro.com/route/rest) |
| Backup | pg_dump sidecar → rustfs S3 (fleet-db bucket), nightly |
| Version control | Forgejo at repo.rahamafresh.com |

3. Instance & Connection Parameters

See docs/CONNECTIONS.md for the full shape. Summary:

  • SSH: ssh -i ~/.ssh/id_ed25519 kianiadee@stage.rahamafresh.com
  • DB name: tracksolid_db · DB user: postgres (internal) · tracksolid_owner (app) · dashboard_ro (read-only, used by the staging bridge) · grafana_ro (legacy read-only role, retained but no longer used by any active service)
  • DB schemas: tracksolid (live, single source of truth) · reporting (map-dashboard read layer) · infrastructure. The legacy tracksolid_2 schema no longer exists (migrations 02-06, 2026-04-18); the ops and dwh_gold schemas were purged 2026-06-05 (migrations 12/13) as unused.
  • DB access: DATABASE_URL points to timescale_db:5432 (internal Docker network — not reachable locally). Use docker exec pattern above. See docs/CONNECTIONS.md for full reference.
  • Container naming: Coolify appends a random suffix. Always resolve with:
    docker ps --filter name=<service_name> --format "{{.Names}}" | head -1
    
    e.g. docker ps --filter name=timescale_db --format "{{.Names}}" | head -1
  • Env vars: loaded from .env via env_file in docker-compose.yaml. See docs/CONNECTIONS.md for variable names. Never hardcode secrets.

Map dashboards & read-API

The UIs read the dashboard_api service (FastAPI, dashboard_api_rev.py) at https://fleetapi.rahamafresh.com, the stable read-API for the map dashboards. It serves GeoJSON from the reporting.* functions (fn_live_positions, fn_vehicle_track, fn_trips_for_map) and filter options, plus the /analytics/* read endpoints (fleet-summary, utilisation, driver-behaviour, fuel, filters) that power FleetOps.

dashboard_api is a STANDALONE Traefik-labelled bridge container, NOT Coolify-managed: it bind-mounts the host file ~/dashboard_api/dashboard_api_rev.py and is (re)deployed by ~/deploy_dashboard_api.sh on the host (an env/CORS change needs a recreate, not a restart). The SPAs that consume it:

| Dashboard | What | Hosting |
|---|---|---|
| liveposition.rahamafresh.com | live positions only | index.html in rustfs bucket liveposition behind an nginx proxy |
| fleetintelligence.rahamafresh.com | historical trips only | index.html in rustfs bucket fleetintelligence behind an nginx proxy |
| fleetnow.rahamafresh.com | merged live + trips map (fleet tracking) | own repo repo.rahamafresh.com/kianiadee/fleetnow.git, Coolify (Dockerfile → nginx) |
| fleetops.rahamafresh.com | fuel / analytics / KPIs (fleet operations) | own repo repo.rahamafresh.com/kianiadee/fleetops.git (local ~/Downloads/projects/15_fleetops), Coolify (Dockerfile → Caddy) |

All prod origins must be in the API's DASHBOARD_CORS_ORIGINS (see FIX-D03). FleetNow and FleetOps each live in their own repos — edit them there, not here.

Staging (the fivetitude.com wildcard umbrella). A parallel staging stack mirrors the above so the frozen prod apps are never edited directly: fleetnow.fivetitude.com, fleetops.fivetitude.com, and a second dashboard_api bridge at fleetapi.fivetitude.com (port 8891, deploy_dashboard_api_staging.sh in this repo). The staging bridge reads the same prod DB as the dedicated read-only dashboard_ro role (scripts/dashboard_ro_role.sql + bootstrap_dashboard_ro.sh), with the v_trips refresher disabled (VTRIPS_REFRESH_INTERVAL_S=0) — prod owns the refresh. Each SPA's API base is injected per-environment at container start (FleetOps via Caddy templates/env.js; FleetNow via an nginx envsubst entrypoint → /env.js), falling back to the prod API if unset. Deploys are Forgejo → Coolify webhooks. Full topology + runbooks: docs/STAGING_FLEETOPS_ARCHITECTURE.md and the fleetops repo's docs/webhook-auto-deploy.html.


4. Codebase Map

ts_shared_rev.py            # Shared: config, signing, DB pool, token cache, clean helpers
ingest_worker_rev.py        # Merged poller entrypoint — runs movement + events in one process (the deployed `ingest_worker` service)
ingest_movement_rev.py      # GPS positions, trips, parking, track-list (high-res trail), device sync. main() split into startup_catchup()/register_jobs() for reuse; standalone entrypoint still works
ingest_events_rev.py        # Alarm events polling (fallback for webhook push). Same startup_catchup()/register_jobs() split
webhook_receiver_rev.py     # FastAPI push receiver: /pushobd /pushevent /pushtripreport etc.
sync_driver_audit.py        # One-shot: API↔DB driver/IMEI gap report + full upsert
import_drivers_csv.py       # One-shot: populate 144 X3/JC400P devices from CSV (--apply to commit)
run_migrations.py           # Applies SQL migrations in order at container startup
docker-compose.yaml         # Services (4 app + db, was 7): timescale_db, ingest_worker,
                            #           webhook_receiver, dashboard_api, db_backup.
                            #           pgbouncer + grafana REMOVED 2026-06-10.
docs/                       # Reference docs (connections, API, KPIs, project context)
docs/PLATFORM_OVERVIEW.html # Current-state platform reference (architecture, deploy, read-API,
                            #   full DB schema) — open in a browser.
docs/OSM_POI_EXPORT.md     # Runbook: OSM .pbf → POI GeoJSON → FleetNow map layer (Shell stations)
docs/superpowers/          # Pitch specs and implementation plans (not deployed code)
scripts/export_osm_pois.py # OSM .pbf → GeoJSON+CSV POI exporter (amenity/brand filter); see OSM_POI_EXPORT.md
migrations/                 # Numbered SQL migrations 02-19, applied in order by run_migrations.py
                            #   02 full schema · 03 webhook · 04 distance fix · 05 enhancements
                            #   06 ops/analytics · 07 views · 08 config · 09 trips enrichment
                            #   10_driver_clock_views.sql · 10_pgbouncer_auth.sql · 11 reporting
                            #   12 drop ops schema · 13 drop dwh_gold schema (both 2026-06-05)
                            #   14 fleet segment · 15 map exclude cc · 16 live feed vehicle_type
                            #   17 reporting.v_fuel_daily (FleetOps) · 18 grant reporting.* to grafana_ro
                            #   19 reporting.v_ingest_health (pipeline freshness; replaces Grafana panels)
deploy_dashboard_api_staging.sh  # Staging dashboard_api bridge (8891, fleetapi.fivetitude.com); see STAGING_FLEETOPS_ARCHITECTURE.md
scripts/dashboard_ro_role.sql + bootstrap_dashboard_ro.sh  # Dedicated read-only DB role for the staging bridge
Dockerfile                  # Custom image for ingest/webhook containers
pyproject.toml              # Python project + uv dependency spec
backup/                     # pg_dump sidecar scripts and config
data/                       # Source CSVs (FS Logistics 144-device list, FSG vehicles)
legacy/                     # Superseded pre-_rev scripts + old pipeline notes (NOT deployed)
docs/manuals/               # OPERATIONS_MANUAL, docker commands, DB manual
docs/reference/             # 01_BusinessAnalytics.md (SQL library — read before writing queries),
                            #   tracksolidApiDocumentation.md, 260507_pgbouncer_deployment.md
docs/reports/               # Baseline reports, audit output, improvement reviews

5. Database Schema — Key Tables

tracksolid.devices           -- Device / driver / vehicle registry (63 rows; 0 driver_name populated)
                             -- IMEI mix: 353549* AT4 (23), 862798* X3/JC400P (23), 865135* X3/JC400P (10), 359857* (7)
                             -- Full CSV (144 devices) not yet imported — run import_drivers_csv.py --apply
tracksolid.live_positions    -- Current fix per IMEI (19 rows; refreshed every 60s by ingest_worker / movement pipeline)
tracksolid.position_history  -- All GPS fixes (hypertable, partitioned by gps_time). ~519 rows (308 track_list + 211 poll).
                             -- pg_stat_user_tables shows 0 for hypertables — always COUNT(*) directly.
                             -- source: 'poll' (60s sweep) | 'track_list' (30m high-res)
tracksolid.trips             -- Trip summaries: distance_km, driving_time_s, avg/max speed
tracksolid.parking_events    -- Stop events with duration and address (0 rows — endpoint returning empty)
tracksolid.alarms            -- Alarm events (alarm_type, alarm_name, alarm_time) — 10 rows, polling healthy
tracksolid.obd_readings      -- OBD diagnostics (push only, awaiting webhook registration)
tracksolid.device_events     -- Power on/off tamper events (push only)
tracksolid.ingestion_log     -- API call audit trail — 875 runs / 24h, 0 failures at last check (2026-04-19)
tracksolid.schema_migrations -- Applied migrations 02-19 (19 reporting.v_ingest_health applied 2026-06-10)
-- PURGED 2026-06-05 (migrations 12 + 13): the dormant `ops` schema (tickets, service_log,
-- odometer_readings, cost_rates, kpi_targets, vw_service_forecast), tracksolid.dispatch_log,
-- and the `dwh_gold` schema (dim_vehicles, fact_daily_fleet_metrics, refresh_daily_metrics).
-- Those workshop/dispatch/SLA/utilisation features were never implemented. Do NOT reintroduce
-- references to ops.* or dwh_gold.* — they no longer exist.

Full DDL: 02_tracksolid_full_schema_rev.sql + migrations 03-06.

Analytics views (migration 07_analytics_views.sql) — one view per BA-file query block, readable by grafana_ro:

tracksolid.v_fleet_today              -- §9 per-vehicle today roll-up
tracksolid.v_vehicles_not_moved_today -- §2.3 alert source
tracksolid.v_active_dispatch_map      -- §4.3 geomap source
tracksolid.v_currently_idle           -- §2.2 idle lens
tracksolid.v_driver_aggregates_daily  -- §3.1 + §3.2 aggression index source
tracksolid.v_fleet_km_daily           -- §7 Panel 5 distance trend
tracksolid.v_alarms_daily             -- §7 Panel 7 alarm frequency
-- v_utilisation_daily (dwh_gold) and v_sla_inflight (ops) were DROPPED 2026-06-05 with
-- their schemas (migrations 12/13); their Grafana panels were removed from the dashboard.

All views carry a COMMENT ON VIEW referencing their spec — \d+ tracksolid.v_* shows the provenance.


6. API Critical Facts

Always read docs/reference/tracksolidApiDocumentation.md before adding a new endpoint call.

| Fact | Detail |
|---|---|
| Auth | OAuth2; token cached in tracksolid.api_token_cache, refreshed via jimi.oauth.token.refresh |
| Signing | MD5: secret + sorted(k+v pairs) + secret; see build_sign() in ts_shared_rev.py |
| Batch limit | Max 50 IMEIs per call for most endpoints |
| distance field | Returns METRES, not km despite docs. Always divide by 1000. (FIX-M16) |
| driverName/driverPhone | From jimi.user.device.list; will be NULL if not set in Tracksolid Pro UI |
| alarm_type field | API polling returns alertTypeId/alarmTypeName, NOT alarmType/alarmName (FIX-E06) |
| durSecond | Parking endpoint returns durSecond, not seconds (FIX-M13) |
| jimi.device.track.mileage | startMileage/endMileage are cumulative odometer in metres |
| Rate limit | Code 1006: back off and retry with re-sign (handled in api_post()) |
| OBD data | Push only via /pushobd webhook; no polling endpoint exists |
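
The signing, batch-limit, and distance facts above can be sketched in a few lines. This is an illustration only and not the repo's build_sign(): the names `sign_params`, `chunk_imeis`, and `metres_to_km` are hypothetical, and the sketch assumes pairs are sorted by key, concatenated as key+value with no separator, UTF-8 encoded, and that the digest is upper-case hex. Check ts_shared_rev.py and the API documentation before relying on any of those details.

```python
import hashlib
from typing import Iterator, Sequence

MAX_IMEIS_PER_CALL = 50  # batch limit stated for most endpoints


def sign_params(params: dict[str, str], secret: str) -> str:
    """MD5 over secret + sorted key/value pairs + secret.

    Assumptions (not confirmed by this document): sort by key, join key+value
    with no separator, UTF-8 encoding, upper-case hex digest.
    """
    body = "".join(f"{key}{params[key]}" for key in sorted(params))
    return hashlib.md5(f"{secret}{body}{secret}".encode("utf-8")).hexdigest().upper()


def chunk_imeis(
    imeis: Sequence[str], size: int = MAX_IMEIS_PER_CALL
) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` IMEIs, one slice per API call."""
    for start in range(0, len(imeis), size):
        yield imeis[start:start + size]


def metres_to_km(distance_m: float) -> float:
    """The API's distance field is in metres; convert before storing as km."""
    return distance_m / 1000.0
```

For example, a fleet of 120 IMEIs would be split into three calls of 50, 50, and 20 devices, and a reported distance of 12345 would be stored as 12.345 km.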

7. Fix History (do not regress)

| Fix ID | File | What it fixed |
|---|---|---|
| FIX-M11 | ingest_movement_rev.py | Removed erroneous ×1000 on distance (was storing km as mm) |
| FIX-M13 | ingest_movement_rev.py | Parking: added acc_type=0, account; mapped durSecond |
| FIX-M14 | ingest_movement_rev.py | poll_track_list(): high-res GPS trail every 30m |
| FIX-M15 | ingest_movement_rev.py | get_device_locations(): on-demand precision refresh |
| FIX-M16 | ingest_movement_rev.py | distance from API is metres → divide by 1000 before storing |
| FIX-M17 | ingest_movement_rev.py | sync_devices() ON CONFLICT now updates all 26 fields (was 5) |
| FIX-M18 | ingest_movement_rev.py | sync_devices() pulls vehicleName/vehicleNumber/driverName/driverPhone/sim from jimi.track.device.detail; the list endpoint returns null for these even when set |
| FIX-M19 | ts_shared_rev.py, ingest_movement_rev.py | Multi-account support: fleet spans fireside, Fireside@HQ, Fireside_MSA (156 devices total). sync_devices, poll_live_positions, poll_parking iterate TRACKSOLID_TARGETS (comma-separated env var). New helper get_active_imeis_by_target() scopes parking calls to the right account |
| FIX-E06 | ingest_events_rev.py | Alarm field mapping: alertTypeId/alarmTypeName/alertTime |
| BUG-02 | Migration 04 | Historical distance_m rows ÷1,000,000 → renamed to distance_km |
| FIX-D01 | dashboard_api_rev.py | POST /webhook/fleet-dashboard read body as JSON, but the SPA posts x-www-form-urlencoded, so request.json() threw, filters were silently dropped, and the map always returned the whole fleet. Now parsed by Content-Type (parse_qs for form, JSON still accepted). Commit f1387d1 |
| FIX-D02 | dashboard_api_rev.py | reporting.v_trips matview froze on 2026-06-01 when n8n (which ran the scheduled refresh) was retired → dashboard showed "no trips". Added an in-process background refresher (REFRESH MATERIALIZED VIEW CONCURRENTLY every VTRIPS_REFRESH_INTERVAL_S, default 300s; pg advisory-lock guarded for --workers; logs to reporting.refresh_log source=dashboard_api). Commit 30b3515 |
| FIX-D03 | dashboard_api_rev.py, ~/deploy_dashboard_api.sh (host) | Added https://fleetnow.rahamafresh.com to DASHBOARD_CORS_ORIGINS default for the merged FleetNow dashboard. The standalone bridge container inherits its env from webhook_receiver, which already carries the old two-origin value, so the deploy script's conditional append never fired. The script now strips any inherited DASHBOARD_CORS_ORIGINS and sets all three origins unconditionally, and guards the mv so a missing staged dashboard_api_rev.py doesn't abort the run under set -e (env changes need a container recreate, not a restart). Commit d95e5c2 |

8. Working Rules

  1. No prod push without explicit user confirmation. Always state what you are about to push and wait.
  2. Never rewrite a migration that is already applied. Check tracksolid.schema_migrations first. Add a new numbered migration file for any schema change.
  3. Read before writing. Before suggesting any code change, read the relevant source file. Before writing a query, check docs/reference/01_BusinessAnalytics.md for an existing pattern.
  4. Reuse shared utilities. All DB access via get_conn(), all API calls via api_post(), all cleaning via clean() / clean_num() / clean_int() / clean_ts() in ts_shared_rev.py. Do not reinvent these.
  5. Resolve container names dynamically. Never hardcode the Coolify suffix. Use docker ps --filter name=<service>.
  6. SSH only when asked. Default workflow is local code → commit → push. SSH into the instance only when explicitly asked to test or run something live.
  7. Secrets from env only. Connection strings, API keys, and passwords live in .env. Reference variable names from docs/CONNECTIONS.md, never values.
  8. Two developers, one incoming. Write code and docs that a second developer (mixed technical/operations background) can follow without prior context.
  9. Forgejo API auth: credentials stored in macOS keychain. Retrieve with git credential fill (host=repo.rahamafresh.com). Use basic auth against https://repo.rahamafresh.com/api/v1 directly — no tea or gh needed.
  10. Single live schema. All live data lives in tracksolid; the map-dashboard read layer lives in reporting. Do not reintroduce references to the retired tracksolid_2, ops, or dwh_gold schemas (the latter two purged 2026-06-05, migrations 12/13).

9. Fleet State (as of 2026-04-19)

| Metric | Value |
|---|---|
| Registered devices (tracksolid.devices) | 63 total: 23 × 353549* (AT4), 23 × 862798* + 10 × 865135* (X3/JC400P), 7 × 359857* |
| Devices in CSV not yet imported | 144 (X3/JC400P); import_drivers_csv.py --apply will upsert names + plates |
| Driver names populated | 0 / 63 (pending CSV import) |
| Live positions | 19 (latest fix 2026-04-19 10:25 UTC) |
| Trips recorded | 8 (latest 2026-04-19 08:34 UTC) |
| Alarms recorded | 10 (latest 2026-04-19 09:15 UTC) |
| position_history rows | 519 (308 track_list + 211 poll); hypertable stats don't update in pg_stat_user_tables, so query directly |
| Pipeline status | Running healthy: 875 runs / 24h, 0 failures across all six endpoints |
| Cities active | Nairobi (primary), Mombasa (deploying), Kampala (4 devices in CSV) |
| Service flags | KDK 829A GP (239,264 km), Belta KCU-647D (235,000 km) |

Latest full snapshot: docs/reports/260412_baseline_report.md


10. Open Items (update as resolved)

| Priority | Item |
|---|---|
| LOW | Pipeline freshness is surfaced by reporting.v_ingest_health (migration 19) via dashboard_api GET /health/ingest; wire it into a FleetOps panel. |
| HIGH | Run import_drivers_csv.py --apply: 144 X3/JC400P devices with names + plates waiting |
| HIGH | Register webhooks: /pushoil /pushtem /pushlbs (auto-register on push now done, commit 257643c) |
| HIGH | Investigate X3-63282 in Kampala: legitimate or unauthorised? |
| MEDIUM | Set fuel_100km per vehicle type to activate fuel cost calculations |
| MEDIUM | Investigate 44 silent devices (only 19 of 63 reporting): SIM installed? Activated? |
| MEDIUM | Co-develop client KPI framework (see docs/KPI_FRAMEWORK.md) |
| LOW | Populate geofences: depot boundaries, city zones |