# CLAUDE.md — Fireside Communications · Tracksolid Fleet Intelligence
## 0. Commands
**Package manager:** `uv` (not pip/poetry)
```bash
uv sync # Install / sync dependencies
pytest tests/ # Run test suite
ruff check . # Lint
mypy . # Type check
```
**Run a query via Docker (DATABASE_URL is internal-only):**
```bash
DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1)
docker exec "$DB" psql -U postgres -d tracksolid_db -c "SELECT COUNT(*) FROM tracksolid.devices;"
```
**Run a migration file:**
```bash
# $DB is the container name resolved in the query example above
docker exec -i "$DB" psql -U postgres -d tracksolid_db < migrations/07_your_migration.sql
```
---
## 1. What This Project Is
Fleet telematics ingestion and analytics stack for a **telco first-line support client** operating in Nairobi, Mombasa, and Kampala. The client dispatches field technicians to install, repair, and maintain home and business broadband, handle LOS signal faults, service migrations, and maintain outside plant infrastructure. The fleet is ~80 vehicles across three cities, all tracked via Tracksolid Pro (Jimi IoT API).
This repository ingests the Tracksolid Pro API into a TimescaleDB/PostGIS database and serves it to the FleetNow / FleetOps SPAs (own repos) via the `dashboard_api` read layer. The pipeline is deployed on Coolify at `stage.rahamafresh.com`. (Grafana was retired 2026-06-10 — FleetOps now owns KPI visualisation.)
**Repository:** `https://repo.rahamafresh.com/kianiadee/tracksolid_timescale_grafana_prod.git`
---
## 2. Tech Stack
| Layer | Technology |
|---|---|
| Ingestion | Python 3.12 — `ingest_worker_rev.py` (merged movement + events poller), `webhook_receiver_rev.py` |
| Shared utils | `ts_shared_rev.py` — token cache, DB pool, API signing, clean helpers |
| Database | PostgreSQL 16 + TimescaleDB 2.15 + PostGIS 3 (`tracksolid_db`) |
| Orchestration | Docker Compose on Coolify |
| Visualisation | FleetOps / FleetNow SPAs (own repos) via `dashboard_api`. Grafana **removed** 2026-06-10 (was redundant — service, provisioning, and runbooks all deleted) |
| API source | Tracksolid Pro / Jimi IoT Open Platform (`eu-open.tracksolidpro.com/route/rest`) |
| Backup | pg_dump sidecar → rustfs S3 (`fleet-db` bucket), nightly |
| Version control | Forgejo at `repo.rahamafresh.com` |
---
## 3. Instance & Connection Parameters
See `docs/CONNECTIONS.md` for the full shape. Summary:
- **SSH:** `ssh -i ~/.ssh/id_ed25519 kianiadee@stage.rahamafresh.com`
- **DB name:** `tracksolid_db` · **DB user:** `postgres` (internal) · `tracksolid_owner` (app) · `dashboard_ro` (read-only, used by the staging bridge) · `grafana_ro` (legacy read-only role, retained but no longer used by any active service)
- **DB schemas:** `tracksolid` (live, single source of truth) · `reporting` (map-dashboard read layer) · `infrastructure`. The legacy `tracksolid_2` schema no longer exists (migrations 02-06, 2026-04-18); the `ops` and `dwh_gold` schemas were purged 2026-06-05 (migrations 12/13) as unused.
- **DB access:** `DATABASE_URL` points to `timescale_db:5432` (internal Docker network — not reachable locally). Use `docker exec` pattern above. See `docs/CONNECTIONS.md` for full reference.
- **Container naming:** Coolify appends a random suffix. Always resolve with:
```bash
docker ps --filter name=<service_name> --format "{{.Names}}" | head -1
```
e.g. `docker ps --filter name=timescale_db --format "{{.Names}}" | head -1`
- **Env vars:** loaded from `.env` via `env_file` in `docker-compose.yaml`. See `docs/CONNECTIONS.md` for variable names. Never hardcode secrets.
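
When the container lookup above has to happen from a Python script rather than a shell, it could be wrapped roughly as follows. This is a minimal sketch: `first_container_name` and `resolve_container` are hypothetical helpers, not functions that exist in this repository, and the sketch assumes the `docker` CLI is on the `PATH` of the machine running it.

```python
import subprocess
from typing import Optional


def first_container_name(ps_output: str) -> Optional[str]:
    """Return the first non-empty line of `docker ps --format "{{.Names}}"` output."""
    for line in ps_output.splitlines():
        name = line.strip()
        if name:
            return name
    return None


def resolve_container(service: str) -> str:
    """Resolve the Coolify-suffixed container name for a service (hypothetical helper)."""
    result = subprocess.run(
        ["docker", "ps", "--filter", f"name={service}", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,  # raise if the docker CLI itself fails
    )
    name = first_container_name(result.stdout)
    if name is None:
        raise RuntimeError(f"no running container matches {service!r}")
    return name
```

Calling `resolve_container("timescale_db")` on the host would then play the same role as the `head -1` pipeline, with an explicit error when nothing matches instead of an empty string.
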
### Map dashboards & read-API
The UIs read the **`dashboard_api`** service (FastAPI, `dashboard_api_rev.py`) at
`https://fleetapi.rahamafresh.com` — the stable read-API for the map dashboards. It serves
GeoJSON from the `reporting.*` functions (`fn_live_positions`, `fn_vehicle_track`, `fn_trips_for_map`)
and filter options, **plus the `/analytics/*` read endpoints** (fleet-summary, utilisation,
driver-behaviour, fuel, filters) that power FleetOps. **`dashboard_api` is a STANDALONE
Traefik-labelled bridge container, NOT Coolify-managed** — it bind-mounts the host file
`~/dashboard_api/dashboard_api_rev.py` and is (re)deployed by `~/deploy_dashboard_api.sh` on the host
(an env/CORS change needs a *recreate*, not a restart). The SPAs that consume it:
| Dashboard | What | Hosting |
|---|---|---|
| `liveposition.rahamafresh.com` | live positions only | `index.html` in rustfs bucket `liveposition` behind an nginx proxy |
| `fleetintelligence.rahamafresh.com` | historical trips only | `index.html` in rustfs bucket `fleetintelligence` behind an nginx proxy |
| `fleetnow.rahamafresh.com` | **merged** live + trips map (fleet *tracking*) | **own repo** `repo.rahamafresh.com/kianiadee/fleetnow.git`, **Coolify (Dockerfile → nginx)** |
| `fleetops.rahamafresh.com` | fuel / analytics / KPIs (fleet *operations*) | **own repo** `repo.rahamafresh.com/kianiadee/fleetops.git` (local `~/Downloads/projects/15_fleetops`), **Coolify (Dockerfile → Caddy)** |
All prod origins must be in the API's `DASHBOARD_CORS_ORIGINS` (see FIX-D03). **FleetNow and FleetOps
each live in their own repos — edit them there, not here.**
**Staging (the `fivetitude.com` wildcard umbrella).** A parallel staging stack mirrors the above so the
frozen prod apps are never edited directly: `fleetnow.fivetitude.com`, `fleetops.fivetitude.com`, and a
**second `dashboard_api` bridge** at `fleetapi.fivetitude.com` (port **8891**,
`deploy_dashboard_api_staging.sh` in this repo). The staging bridge reads the **same prod DB** as the
dedicated **read-only `dashboard_ro`** role (`scripts/dashboard_ro_role.sql` + `bootstrap_dashboard_ro.sh`),
with the `v_trips` refresher disabled (`VTRIPS_REFRESH_INTERVAL_S=0`) — prod owns the refresh. Each SPA's
**API base is injected per-environment at container start** (FleetOps via Caddy `templates` → `/env.js`;
FleetNow via an nginx `envsubst` entrypoint → `/env.js`), falling back to the prod API if unset. Deploys
are **Forgejo → Coolify webhooks**. Full topology + runbooks: **`docs/STAGING_FLEETOPS_ARCHITECTURE.md`**
and the fleetops repo's `docs/webhook-auto-deploy.html`.
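
The split described above (prod owns the `v_trips` refresh, staging sets `VTRIPS_REFRESH_INTERVAL_S=0`) might be sketched like this. It is an illustration, not the code in `dashboard_api_rev.py`: the function names and the lock key `VTRIPS_LOCK_KEY` are hypothetical, the connection is assumed to be a DB-API style connection in autocommit mode, and treating any non-positive interval as "disabled" is an assumption.

```python
from typing import Any, Mapping, Optional

# ASSUMPTION: an arbitrary advisory-lock key; the real service may use another.
VTRIPS_LOCK_KEY = 874201


def refresh_interval(env: Mapping[str, str]) -> Optional[int]:
    """Return the refresh interval in seconds, or None when the refresher is disabled."""
    seconds = int(env.get("VTRIPS_REFRESH_INTERVAL_S", "300"))
    return seconds if seconds > 0 else None


def refresh_v_trips_once(conn: Any) -> bool:
    """Refresh reporting.v_trips unless another worker already holds the lock.

    ASSUMPTION: `conn` is a DB-API connection in autocommit mode, so the
    unlock statement still runs cleanly if the refresh itself fails.
    """
    cur = conn.cursor()
    cur.execute("SELECT pg_try_advisory_lock(%s)", (VTRIPS_LOCK_KEY,))
    if not cur.fetchone()[0]:
        return False  # another worker is refreshing; skip this round
    try:
        cur.execute("REFRESH MATERIALIZED VIEW CONCURRENTLY reporting.v_trips")
        return True
    finally:
        # Session-level advisory locks must be released explicitly.
        cur.execute("SELECT pg_advisory_unlock(%s)", (VTRIPS_LOCK_KEY,))
```

A background loop would call `refresh_interval(os.environ)` once at startup and not schedule `refresh_v_trips_once` at all when it returns `None`, which is the staging case.
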
---
## 4. Codebase Map
```
ts_shared_rev.py # Shared: config, signing, DB pool, token cache, clean helpers
ingest_worker_rev.py # Merged poller entrypoint — runs movement + events in one process (the deployed `ingest_worker` service)
ingest_movement_rev.py # GPS positions, trips, parking, track-list (high-res trail), device sync. main() split into startup_catchup()/register_jobs() for reuse; standalone entrypoint still works
ingest_events_rev.py # Alarm events polling (fallback for webhook push). Same startup_catchup()/register_jobs() split
webhook_receiver_rev.py # FastAPI push receiver: /pushobd /pushevent /pushtripreport etc.
sync_driver_audit.py # One-shot: API↔DB driver/IMEI gap report + full upsert
import_drivers_csv.py # One-shot: populate 144 X3/JC400P devices from CSV (--apply to commit)
run_migrations.py # Applies SQL migrations in order at container startup
docker-compose.yaml # Services (4 app + db, was 7): timescale_db, ingest_worker,
# webhook_receiver, dashboard_api, db_backup.
# pgbouncer + grafana REMOVED 2026-06-10.
docs/ # Reference docs (connections, API, KPIs, project context)
docs/PLATFORM_OVERVIEW.html # Current-state platform reference (architecture, deploy, read-API,
# full DB schema) — open in a browser.
docs/OSM_POI_EXPORT.md # Runbook: OSM .pbf → POI GeoJSON → FleetNow map layer (Shell stations)
docs/superpowers/ # Pitch specs and implementation plans (not deployed code)
scripts/export_osm_pois.py # OSM .pbf → GeoJSON+CSV POI exporter (amenity/brand filter); see OSM_POI_EXPORT.md
migrations/ # Numbered SQL migrations 02-19, applied in order by run_migrations.py
# 02 full schema · 03 webhook · 04 distance fix · 05 enhancements
# 06 ops/analytics · 07 views · 08 config · 09 trips enrichment
# 10_driver_clock_views.sql · 10_pgbouncer_auth.sql · 11 reporting
# 12 drop ops schema · 13 drop dwh_gold schema (both 2026-06-05)
# 14 fleet segment · 15 map exclude cc · 16 live feed vehicle_type
# 17 reporting.v_fuel_daily (FleetOps) · 18 grant reporting.* to grafana_ro
# 19 reporting.v_ingest_health (pipeline freshness; replaces Grafana panels)
deploy_dashboard_api_staging.sh # Staging dashboard_api bridge (8891, fleetapi.fivetitude.com); see STAGING_FLEETOPS_ARCHITECTURE.md
scripts/dashboard_ro_role.sql + bootstrap_dashboard_ro.sh # Dedicated read-only DB role for the staging bridge
Dockerfile # Custom image for ingest/webhook containers
pyproject.toml # Python project + uv dependency spec
backup/ # pg_dump sidecar scripts and config
data/ # Source CSVs (FS Logistics 144-device list, FSG vehicles)
legacy/ # Superseded pre-_rev scripts + old pipeline notes (NOT deployed)
docs/manuals/ # OPERATIONS_MANUAL, docker commands, DB manual
docs/reference/ # 01_BusinessAnalytics.md (SQL library — read before writing queries),
# tracksolidApiDocumentation.md, 260507_pgbouncer_deployment.md
docs/reports/ # Baseline reports, audit output, improvement reviews
```
---
## 5. Database Schema — Key Tables
```sql
tracksolid.devices -- Device / driver / vehicle registry (63 rows; 0 driver_name populated)
-- IMEI mix: 353549* AT4 (23), 862798* X3/JC400P (23), 865135* X3/JC400P (10), 359857* (7)
-- Full CSV (144 devices) not yet imported — run import_drivers_csv.py --apply
tracksolid.live_positions -- Current fix per IMEI (19 rows; refreshed every 60s by ingest_worker / movement pipeline)
tracksolid.position_history -- All GPS fixes (hypertable, partitioned by gps_time). ~519 rows (308 track_list + 211 poll).
-- pg_stat_user_tables shows 0 for hypertables — always COUNT(*) directly.
-- source: 'poll' (60s sweep) | 'track_list' (30m high-res)
tracksolid.trips -- Trip summaries: distance_km, driving_time_s, avg/max speed
tracksolid.parking_events -- Stop events with duration and address (0 rows — endpoint returning empty)
tracksolid.alarms -- Alarm events (alarm_type, alarm_name, alarm_time) — 10 rows, polling healthy
tracksolid.obd_readings -- OBD diagnostics (push only, awaiting webhook registration)
tracksolid.device_events -- Power on/off tamper events (push only)
tracksolid.ingestion_log -- API call audit trail — 875 runs / 24h, 0 failures at last check (2026-04-19)
tracksolid.schema_migrations -- Applied migrations 02-19 (19 reporting.v_ingest_health applied 2026-06-10)
-- PURGED 2026-06-05 (migrations 12 + 13): the dormant `ops` schema (tickets, service_log,
-- odometer_readings, cost_rates, kpi_targets, vw_service_forecast), tracksolid.dispatch_log,
-- and the `dwh_gold` schema (dim_vehicles, fact_daily_fleet_metrics, refresh_daily_metrics).
-- Those workshop/dispatch/SLA/utilisation features were never implemented. Do NOT reintroduce
-- references to ops.* or dwh_gold.* — they no longer exist.
```
Full DDL: `02_tracksolid_full_schema_rev.sql` + migrations `03`-`06`.
**Analytics views (migration `07_analytics_views.sql`)** — one view per BA-file query block, readable by `grafana_ro`:
```sql
tracksolid.v_fleet_today -- §9 per-vehicle today roll-up
tracksolid.v_vehicles_not_moved_today -- §2.3 alert source
tracksolid.v_active_dispatch_map -- §4.3 geomap source
tracksolid.v_currently_idle -- §2.2 idle lens
tracksolid.v_driver_aggregates_daily -- §3.1 + §3.2 aggression index source
tracksolid.v_fleet_km_daily -- §7 Panel 5 distance trend
tracksolid.v_alarms_daily -- §7 Panel 7 alarm frequency
-- v_utilisation_daily (dwh_gold) and v_sla_inflight (ops) were DROPPED 2026-06-05 with
-- their schemas (migrations 12/13); their Grafana panels were removed from the dashboard.
```
All views carry a `COMMENT ON VIEW` referencing their spec — `\d+ tracksolid.v_*` shows the provenance.
---
## 6. API Critical Facts
**Always read `docs/reference/tracksolidApiDocumentation.md` before adding a new endpoint call.**
| Fact | Detail |
|---|---|
| Auth | OAuth2 — token cached in `tracksolid.api_token_cache`, refreshed via `jimi.oauth.token.refresh` |
| Signing | MD5: `secret + sorted(k+v pairs) + secret` — see `build_sign()` in `ts_shared_rev.py` |
| Batch limit | Max 50 IMEIs per call for most endpoints |
| `distance` field | **Returns METRES, not km** despite docs. Always divide by 1000. (FIX-M16) |
| `driverName`/`driverPhone` | From `jimi.user.device.list` — will be NULL if not set in Tracksolid Pro UI |
| `alarm_type` field | API polling returns `alertTypeId`/`alarmTypeName` — NOT `alarmType`/`alarmName` (FIX-E06) |
| `durSecond` | Parking endpoint returns `durSecond`, not `seconds` (FIX-M13) |
| `jimi.device.track.mileage` | `startMileage`/`endMileage` are cumulative odometer in **metres** |
| Rate limit | Code 1006 — back off and retry with re-sign (handled in `api_post()`) |
| OBD data | Push only via `/pushobd` webhook — no polling endpoint exists |
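
The signing and batch-limit rows above can be illustrated with a minimal sketch. It is not the project's `build_sign()`: the names `build_sign_sketch` and `imei_batches` are hypothetical, and the upper-case hex digest and the treatment of every parameter value as a string are assumptions to check against `ts_shared_rev.py` and the API documentation.

```python
import hashlib
from typing import Dict, Iterator, List, Sequence


def build_sign_sketch(params: Dict[str, str], secret: str) -> str:
    """MD5 over secret + key/value pairs sorted by key + secret."""
    # Sort by parameter name so the signature does not depend on dict order.
    joined = "".join(f"{key}{params[key]}" for key in sorted(params))
    digest = hashlib.md5(f"{secret}{joined}{secret}".encode("utf-8")).hexdigest()
    # ASSUMPTION: upper-case hex; confirm against build_sign() before relying on it.
    return digest.upper()


def imei_batches(imeis: Sequence[str], size: int = 50) -> Iterator[List[str]]:
    """Yield successive batches of at most `size` IMEIs (the documented cap is 50)."""
    for start in range(0, len(imeis), size):
        yield list(imeis[start:start + size])
```

Each batch from `imei_batches(all_imeis)` would go into one API call, and because a retry after code 1006 is re-signed, the signature would be rebuilt on every attempt rather than reused.
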
---
## 7. Fix History (do not regress)
| Fix ID | File | What it fixed |
|---|---|---|
| FIX-M11 | `ingest_movement_rev.py` | Removed erroneous ×1000 on distance (was storing km as mm) |
| FIX-M13 | `ingest_movement_rev.py` | Parking: added `acc_type=0`, `account`; mapped `durSecond` |
| FIX-M14 | `ingest_movement_rev.py` | `poll_track_list()` — high-res GPS trail every 30m |
| FIX-M15 | `ingest_movement_rev.py` | `get_device_locations()` — on-demand precision refresh |
| FIX-M16 | `ingest_movement_rev.py` | `distance` from API is metres → divide by 1000 before storing |
| FIX-M17 | `ingest_movement_rev.py` | `sync_devices()` ON CONFLICT now updates all 26 fields (was 5) |
| FIX-M18 | `ingest_movement_rev.py` | `sync_devices()` pulls `vehicleName`/`vehicleNumber`/`driverName`/`driverPhone`/`sim` from `jimi.track.device.detail` — list endpoint returns null for these even when set |
| FIX-M19 | `ts_shared_rev.py`, `ingest_movement_rev.py` | Multi-account support: fleet spans `fireside`, `Fireside@HQ`, `Fireside_MSA` (156 devices total). `sync_devices`, `poll_live_positions`, `poll_parking` iterate `TRACKSOLID_TARGETS` (comma-separated env var). New helper `get_active_imeis_by_target()` scopes parking calls to the right account |
| FIX-E06 | `ingest_events_rev.py` | Alarm field mapping: `alertTypeId`/`alarmTypeName`/`alertTime` |
| BUG-02 | Migration 04 | Historical `distance_m` rows ÷1,000,000 → renamed to `distance_km` |
| FIX-D01 | `dashboard_api_rev.py` | `POST /webhook/fleet-dashboard` read the body as JSON, but the SPA posts `x-www-form-urlencoded`, so `request.json()` threw, filters were silently dropped, and the map always returned the whole fleet. Now parsed by Content-Type (`parse_qs` for form, JSON still accepted). Commit `f1387d1` |
| FIX-D02 | `dashboard_api_rev.py` | `reporting.v_trips` matview froze on 2026-06-01 when n8n (which ran the scheduled refresh) was retired → dashboard showed "no trips". Added an in-process background refresher (`REFRESH MATERIALIZED VIEW CONCURRENTLY` every `VTRIPS_REFRESH_INTERVAL_S`, default 300s; pg advisory-lock guarded for `--workers`; logs to `reporting.refresh_log` source=`dashboard_api`). Commit `30b3515` |
| FIX-D03 | `dashboard_api_rev.py`, `~/deploy_dashboard_api.sh` (host) | Added `https://fleetnow.rahamafresh.com` to `DASHBOARD_CORS_ORIGINS` default for the merged **FleetNow** dashboard. The standalone bridge container inherits its env from `webhook_receiver`, which already carries the old two-origin value — so the deploy script's *conditional* append never fired. The script now **strips any inherited `DASHBOARD_CORS_ORIGINS` and sets all three origins unconditionally**, and **guards the `mv`** so a missing staged `dashboard_api_rev.py` doesn't abort the run under `set -e` (env changes need a container *recreate*, not a restart). Commit `d95e5c2` |
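
Three of the fixes above (FIX-M19, FIX-M16, FIX-D01) reduce to small pure functions, which makes them easy to guard with unit tests. The sketch below is illustrative only: the function names are hypothetical and do not come from the repository, and the `city` / `vehicle` filter fields used in the usage note are made up for the example.

```python
import json
from typing import Any, Dict, List, Optional
from urllib.parse import parse_qs


def parse_targets(raw: str) -> List[str]:
    """Split a comma-separated TRACKSOLID_TARGETS value, dropping blanks (FIX-M19)."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def distance_km(distance_m: Optional[float]) -> Optional[float]:
    """Convert the API's metre value to kilometres before storing (FIX-M16)."""
    return None if distance_m is None else distance_m / 1000.0


def parse_filter_body(content_type: str, body: bytes) -> Dict[str, Any]:
    """Parse a request body by Content-Type: form-encoded or JSON (FIX-D01)."""
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        # parse_qs returns a list per field; keep the first value for simplicity.
        fields = parse_qs(body.decode("utf-8"))
        return {key: values[0] for key, values in fields.items()}
    # Anything else is treated as JSON; an empty body becomes an empty filter set.
    return json.loads(body or b"{}")
```

For example, `parse_filter_body("application/x-www-form-urlencoded", b"city=Nairobi&vehicle=KDK")` yields a plain dict of filters, while a JSON body with the same fields yields the same dict, so downstream filter handling does not need to know which encoding the SPA used.
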
---
## 8. Working Rules
1. **No prod push without explicit user confirmation.** Always state what you are about to push and wait.
2. **Never rewrite a migration that is already applied.** Check `tracksolid.schema_migrations` first. Add a new numbered migration file for any schema change.
3. **Read before writing.** Before suggesting any code change, read the relevant source file. Before writing a query, check `docs/reference/01_BusinessAnalytics.md` for an existing pattern.
4. **Reuse shared utilities.** All DB access via `get_conn()`, all API calls via `api_post()`, all cleaning via `clean()` / `clean_num()` / `clean_int()` / `clean_ts()` in `ts_shared_rev.py`. Do not reinvent these.
5. **Resolve container names dynamically.** Never hardcode the Coolify suffix. Use `docker ps --filter name=<service>`.
6. **SSH only when asked.** Default workflow is local code → commit → push. SSH into the instance only when explicitly asked to test or run something live.
7. **Secrets from env only.** Connection strings, API keys, and passwords live in `.env`. Reference variable names from `docs/CONNECTIONS.md`, never values.
8. **Two developers, one incoming.** Write code and docs that a second developer (mixed technical/operations background) can follow without prior context.
9. **Forgejo API auth:** credentials stored in macOS keychain. Retrieve with `git credential fill` (host=repo.rahamafresh.com). Use basic auth against `https://repo.rahamafresh.com/api/v1` directly — no `tea` or `gh` needed.
10. **Single live schema.** All live data lives in `tracksolid`; the map-dashboard read layer lives in `reporting`. Do not reintroduce references to the retired `tracksolid_2`, `ops`, or `dwh_gold` schemas (the latter two purged 2026-06-05, migrations 12/13).
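
For rule 9, the output of `git credential fill` can be turned into a Basic auth header with a few lines of Python. This is a sketch under assumptions: `parse_credential_output` and `basic_auth_header` are hypothetical names, and the sketch only relies on the `key=value` line format that `git credential fill` prints (including `username` and `password`).

```python
import base64
from typing import Dict


def parse_credential_output(text: str) -> Dict[str, str]:
    """Parse the key=value lines printed by `git credential fill`."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first "=" only, since passwords may contain "=".
        key, sep, value = line.partition("=")
        if sep:
            fields[key] = value
    return fields


def basic_auth_header(fields: Dict[str, str]) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from username/password fields."""
    raw = f"{fields['username']}:{fields['password']}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

The resulting header can be passed to any HTTP client for requests under `https://repo.rahamafresh.com/api/v1`; the credential values themselves should stay in memory and never be written to the repo or logs.
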
---
## 9. Fleet State (as of 2026-04-19)
| Metric | Value |
|---|---|
| Registered devices (`tracksolid.devices`) | 63 total — 23 × `353549*` (AT4), 23 × `862798*` + 10 × `865135*` (X3/JC400P), 7 × `359857*` |
| Devices in CSV not yet imported | 144 (X3/JC400P); `import_drivers_csv.py --apply` will upsert names + plates |
| Driver names populated | 0 / 63 — pending CSV import |
| Live positions | 19 (latest fix 2026-04-19 10:25 UTC) |
| Trips recorded | 8 (latest 2026-04-19 08:34 UTC) |
| Alarms recorded | 10 (latest 2026-04-19 09:15 UTC) |
| `position_history` rows | 519 (308 `track_list` + 211 `poll`); hypertable stats don't update in `pg_stat_user_tables` — query directly |
| Pipeline status | Running healthy: 875 runs / 24h, 0 failures across all six endpoints |
| Cities active | Nairobi (primary), Mombasa (deploying), Kampala (4 devices in CSV) |
| Service flags | KDK 829A GP (239,264 km), Belta KCU-647D (235,000 km) |
Latest full snapshot: `docs/reports/260412_baseline_report.md`
---
## 10. Open Items (update as resolved)
| Priority | Item |
|---|---|
| LOW | Pipeline freshness is surfaced by `reporting.v_ingest_health` (migration 19) via dashboard_api `GET /health/ingest` — wire it into a FleetOps panel. |
| HIGH | Run `import_drivers_csv.py --apply` — 144 X3/JC400P devices with names + plates waiting |
| HIGH | Register webhooks: `/pushoil` `/pushtem` `/pushlbs` (auto-register on push now done — commit 257643c) |
| HIGH | Investigate X3-63282 in Kampala — legitimate or unauthorised? |
| MEDIUM | Set `fuel_100km` per vehicle type to activate fuel cost calculations |
| MEDIUM | Investigate 44 silent devices (only 19 of 63 reporting) — SIM installed? Activated? |
| MEDIUM | Co-develop client KPI framework (see `docs/KPI_FRAMEWORK.md`) |
| LOW | Populate geofences — depot boundaries, city zones |