From 274473c5442a8bfffc3f606e18abcb29b77863f6 Mon Sep 17 00:00:00 2001 From: David Kiania Date: Sat, 18 Apr 2026 08:39:58 +0300 Subject: [PATCH] docs: update analytics report with live DB state (18 Apr 2026) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - §1: add current deployment state table — 63 devices, 0 driver names, 5 trips, pipeline stopped 6 Apr (401 token expiry); note tracksolid_2 vs tracksolid schema split - §6: status column per question (Ready/Needs data/Blocked) reflecting actual DB state; add cost-per-ticket, city drift, odometer rows - §8: add Step 0 full deployment sequence (git pull → migrations 01-06 → container rebuild → sync_driver_audit → import_drivers_csv); Step 3 updated to reference import script; Step 5 collapsed to pointer - Footer: db-state stamp and update date Co-Authored-By: Claude Opus 4.7 --- 01_BusinessAnalytics.md | 163 +++++++++++++++++++++++++++------------- 1 file changed, 112 insertions(+), 51 deletions(-) diff --git a/01_BusinessAnalytics.md b/01_BusinessAnalytics.md index 28138be..3a1b7a5 100644 --- a/01_BusinessAnalytics.md +++ b/01_BusinessAnalytics.md @@ -41,7 +41,32 @@ Every query in this document is tagged by intended consumption cadence. Build Gr ## 1. Data Foundation Summary -The ingestion stack currently populates the following data sources, each feeding the analytics layer: +### 1.1 Current Deployment State *(as of 18 Apr 2026)* + +> **⚠ New stack not yet live.** The refactored ingestion pipeline (`ingest_movement_rev.py` v2.2) targets the `tracksolid` schema, which is currently empty. All live data sits in the legacy `tracksolid_2` schema populated by the prior codebase. The queries in this document are written for the target schema (`tracksolid`) and will produce results once the new stack is deployed and the device sync has run. 
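The `tracksolid_2` vs `tracksolid` split is easy to confirm from `psql` before deploying. A minimal statistics-based check (counts from `pg_stat_user_tables` are approximate; use `count(*)` per table for exact figures — the `tracksolid` rows only appear once migrations 01–06 have created its tables):

```sql
-- Approximate live row counts per schema for the key ingestion tables.
-- Before deployment: tracksolid_2 holds all live data; tracksolid returns
-- no rows at all until the migrations have created its tables.
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
WHERE schemaname IN ('tracksolid', 'tracksolid_2')
  AND relname IN ('devices', 'live_positions', 'position_history', 'trips')
ORDER BY schemaname, relname;
```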
+ +| Metric | Observed value | Source | +|---|---|---| +| Devices registered | **63** (AT4-series, `353549*` IMEIs) | `tracksolid_2.devices` | +| Driver names populated | **0 / 63** | `tracksolid_2.devices` | +| Vehicle numbers populated | **0 / 63** | `tracksolid_2.devices` | +| SIM numbers populated | **14 / 63** | `tracksolid_2.devices` | +| Live positions (stale) | **19** | `tracksolid_2.live_positions` | +| Position history rows | **208** | `tracksolid_2.position_history` | +| Trips recorded | **5** (12.8 km total) | `tracksolid_2.trips` | +| Parking / alarms / OBD | **0** each | `tracksolid_2.*` | +| Last pipeline run | **6 Apr 2026 13:20 EAT** | `tracksolid_2.ingestion_log` | +| Pipeline failure rate | **41%** (277/668 runs, all 401 auth errors) | `tracksolid_2.ingestion_log` | + +**Why the pipeline stopped (6 Apr):** 276 consecutive `401 Unauthorized` errors against `eu-open.tracksolidpro.com`. The API token expired and was not refreshed — the prior codebase lacked the auto-refresh logic that `ts_shared_rev.py` now includes. Deploying the new stack resolves this permanently. + +**CSV fleet (144 devices, X3/JC400P series):** The `20260414_FS__Logistics - final_fixed.csv` file contains a separate, newer batch of devices (`865135*`, `862798*` IMEIs) with full driver names and plates. **These 144 devices are not yet registered in the DB at all** — they will be synced by `sync_driver_audit.py` after the new stack is deployed, then enriched by `import_drivers_csv.py`. 
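The fix the new stack ships is token auto-refresh in the shared client. The pattern can be sketched as follows — illustrative only; `TokenManager`, `fetch_token`, and the `request` signature are assumptions for this sketch, not the actual `ts_shared_rev.py` API. A cached token is reused until it expires or a call returns 401, either of which forces exactly one re-fetch before retrying:

```python
import time


class TokenManager:
    """Sketch of 401-driven token refresh (not the real ts_shared_rev.py API)."""

    def __init__(self, fetch_token, ttl_s=3600):
        self._fetch_token = fetch_token  # callable returning a fresh token string
        self._ttl_s = ttl_s
        self._token = None
        self._expires_at = 0.0

    def token(self):
        # Refresh proactively once the cached token is past its TTL,
        # instead of waiting for the API to start rejecting it.
        if self._token is None or time.time() >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = time.time() + self._ttl_s
        return self._token

    def call(self, request):
        """Run request(token); on a 401 status, refresh once and retry."""
        status, body = request(self.token())
        if status == 401:
            self._expires_at = 0.0  # invalidate cache, force re-fetch
            status, body = request(self.token())
        return status, body
```

With this pattern the 6 Apr failure mode — a stale token reused run after run — degrades to a single retried request instead of a hard stop.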
+ +--- + +### 1.2 Target Data Architecture + +Once deployed, the ingestion stack populates the following data sources: | Table | Content | Frequency | |---|---|---| @@ -54,7 +79,7 @@ The ingestion stack currently populates the following data sources, each feeding | `tracksolid.devices` | Vehicle and driver registry | Daily at 02:00 | | `dwh_gold.fact_daily_fleet_metrics` | Daily KPI aggregates per vehicle | Nightly ETL | -**Position history density** increased significantly with the addition of `poll_track_list` (POLL-01): +**Position history density** improvement with `poll_track_list` (POLL-01): | Before | After | |---|---| @@ -981,28 +1006,32 @@ ORDER BY k.total_km DESC; ## 6. Business Questions Now Answerable -| Business Question | Primary Data Source | Confidence | +Status key: **✅ Ready** = answerable once new stack deployed | **⚙ Needs data** = additional setup required | **🔴 Blocked** = pending action before any data + +| Business Question | Primary Data Source | Status | |---|---|---| -| Which vehicles are moving right now? | `live_positions` | High | -| Who started work latest today? | `fact_daily_fleet_metrics.day_start_time` | High | -| Who drove the most km this week? | `trips` + `devices` | High | -| Which vehicle spent the most time idling? | `trips.idle_time_s` | High | -| How much fuel was wasted on idle today? | `trips.idle_time_s` × est. rate | Medium (needs `fuel_100km` set) | -| Which driver triggered the most alarms this month? | `alarms` + `devices` | High | -| What is total fleet distance this month? | `trips` | High | -| Which vehicles did not move at all today? | `trips` LEFT JOIN `devices` | High | -| Who is nearest to a new job right now? | `live_positions` + PostGIS | High | -| Did any vehicle leave depot after hours? | `trips` time filter | High | -| What is the speeding rate per driver per week? | `position_history` speed filter | High | -| Which driver has the harshest driving style? 
| `position_history` delta query | High (needs 1–2 weeks of `track_list` data to accumulate) | -| Are vehicles on approved routes? | `position_history` + `geofences` | Low (pending geofence population) | -| Is cold chain in temperature range? | `temperature_readings` | Low (pending webhook registration) | -| How much fuel is consumed per route? | `fuel_readings` + `trips` | Low (pending fuel sensor webhook) | -| What is the real odometer per vehicle? | `live_positions.current_mileage` | Medium (depends on tracker calibration) | -| How many km to next service interval? | `live_positions.current_mileage` - last service | Open (requires service log) | -| Did any vehicle enter a restricted zone? | `alarms` (geofence type) + `geofences` | Low (pending geofence setup) | -| Which drivers are consistently late on Mondays? | `fact_daily_fleet_metrics` day-of-week filter | High | -| What percentage of the fleet was utilised today? | `trips` + `devices` count | High | +| Which vehicles are moving right now? | `live_positions` | ✅ Ready (deploy stack) | +| Who started work latest today? | `fact_daily_fleet_metrics.day_start_time` | ✅ Ready (deploy stack) | +| Who drove the most km this week? | `trips` + `devices` | ✅ Ready (deploy + CSV import) | +| Which vehicle spent the most time idling? | `trips.idle_time_s` | ✅ Ready (deploy stack) | +| How much fuel was wasted on idle today? | `trips.idle_time_s` × rate | ⚙ Needs `fuel_100km` set per vehicle | +| Which driver triggered the most alarms this month? | `alarms` + `devices` | ✅ Ready (deploy stack) | +| What is total fleet distance this month? | `trips` | ✅ Ready (deploy stack) | +| Which vehicles did not move at all today? | `trips` LEFT JOIN `devices` | ✅ Ready (deploy stack) | +| Who is nearest to a new job right now? | `live_positions` + PostGIS | ✅ Ready (deploy + CSV import for names) | +| Did any vehicle leave depot after hours? 
| `trips` time filter | ✅ Ready (deploy stack) | +| What is the speeding rate per driver per week? | `position_history` speed filter | ✅ Ready (needs 1 week data) | +| Which driver has the harshest driving style? | `position_history` delta query | ✅ Ready (needs 2 weeks `track_list`) | +| What does one field ticket cost in fuel? | `trips` + `ops.tickets` + `fuel_100km` | ⚙ Needs `fuel_100km` + ticket feed wired | +| Which vehicles are running outside assigned city? | `position_history` + `assigned_city` | ⚙ Needs `assigned_city` set (CSV import) | +| How many km to next service interval? | `devices.current_mileage` + `ops.service_log` | ⚙ Needs first service-log entry per vehicle | +| Are vehicles on approved routes? | `position_history` + `geofences` | ⚙ Pending geofence population (Step 4) | +| Is cold chain in temperature range? | `temperature_readings` | 🔴 Pending webhook registration (Step 1) | +| How much fuel is consumed per route? | `fuel_readings` + `trips` | 🔴 Pending fuel sensor webhook (Step 1) | +| Did any vehicle enter a restricted zone? | `alarms` + `geofences` | 🔴 Pending geofence setup (Step 4) | +| What percentage of the fleet was utilised today? | `trips` + `devices` count | ✅ Ready (deploy stack) | +| Alarm while parked — tamper / theft signal | `alarms` + `parking_events` | ✅ Ready (deploy stack) | +| Odometer divergence — tracker vs physical | `trips` + `ops.odometer_readings` | ⚙ Needs first odometer reading entry | --- @@ -1054,7 +1083,48 @@ Ranked by aggression index (harsh events per 100 km), speeding events, and late ## 8. What Unlocks the Remaining 30% -The data foundation is in place. The following five steps activate the remaining analytics capabilities: +The data foundation is in place. The following steps activate the remaining analytics capabilities, in priority order. + +### Step 0 — Deploy New Ingestion Stack *(Current Blocker — do first)* + +All analytics in this document are blocked until the new stack is live. 
The legacy pipeline stopped on **6 Apr 2026** due to 401 token expiry errors. The refactored code fixes this permanently. + +```bash +# On the Coolify server / inside the repo directory: + +# 1. Pull latest code (includes all revisions through cebcf74) +git pull + +# 2. Apply schema migrations (01 through 06 in order) +TS_DB=$(docker ps --filter "name=timescale_db" --format "{{.Names}}" | head -1) +for f in 01_tracksolid_base.sql 02_tracksolid_full_schema_rev.sql \ + 03_webhook_schema_migration.sql 04_bug_fix_migration.sql \ + 05_enhancement_migration.sql 06_business_analytics_migration.sql; do + echo "Applying $f..." + docker exec -i "$TS_DB" psql -U postgres -d tracksolid_db < "$f" +done + +# 3. Rebuild and start new ingestion containers +docker compose up -d --build ingest_movement ingest_events webhook_receiver + +# 4. Run initial device sync (populates tracksolid.devices from API) +docker exec -it ingest_movement python sync_driver_audit.py + +# 5. Import driver/vehicle details from CSV +docker exec -it ingest_movement python import_drivers_csv.py # dry-run +docker exec -it ingest_movement python import_drivers_csv.py --apply # commit + +# 6. Schedule nightly ETL +# Add to cron or n8n: SELECT dwh_gold.refresh_daily_metrics(CURRENT_DATE - 1); +``` + +**Expected state after Step 0:** +- `tracksolid.devices`: 144+ rows with driver names, plates, departments, assigned_city +- `tracksolid.live_positions`: positions refreshing every 60 seconds +- `tracksolid.trips` / `position_history`: accumulating from first pipeline run +- All analytics in this document begin producing results within 15 minutes of container start + +--- ### Step 1 — Register Webhooks in Tracksolid Pro Account *(Blocker)* Without registration, the following tables remain empty regardless of code: @@ -1088,16 +1158,24 @@ UPDATE tracksolid.devices SET fuel_100km = 9.0 WHERE vehicle_category = 'car'; ### Step 3 — Populate Vehicle Names and Driver Names -Currently all 63 devices show blank fields. 
Reports display IMEI numbers instead of human-readable identities. +**Automated:** `import_drivers_csv.py` (committed to the repo) reads `20260414_FS__Logistics - final_fixed.csv` (144 devices) and sets `driver_name`, `vehicle_number`, `vehicle_models`, `cost_centre`, `assigned_city`, `sim`, `iccid`, `imsi` in a single pass. Run after Step 0 device sync. + +```bash +docker exec -it ingest_movement python import_drivers_csv.py --apply +``` + +CSV coverage after import: 140 vehicles with plates, 144 with driver names, 138 with SIM, `assigned_city` inferred (NBO=136, KLA=4). The 4 "Identification" spare units are skipped automatically. + +**Manual top-up** for any device not in the CSV: ```sql --- Update individually or import from CSV via COPY UPDATE tracksolid.devices -SET vehicle_name = 'KBZ 123A', - vehicle_number = 'KBZ 123A', - driver_name = 'John Kamau', - driver_phone = '+254700000001', - vehicle_category = 'van' +SET vehicle_name = 'KBZ 123A', + vehicle_number = 'KBZ 123A', + driver_name = 'John Kamau', + driver_phone = '+254700000001', + vehicle_category = 'van', + assigned_city = 'NBO' WHERE imei = '352093080000001'; ``` @@ -1126,25 +1204,7 @@ VALUES ( ### Step 5 — Run Migrations and Deploy Updated Containers -```bash -# Resolve container name dynamically (survives Coolify redeployments) -TS_DB=$(docker ps --filter "name=timescale_db" --format "{{.Names}}" | head -1) - -# 1. Run distance correction migration (fixes historical data) -docker exec -i "$TS_DB" psql -U postgres -d tracksolid_db \ - < /migrations/04_bug_fix_migration.sql - -# 2. Run schema enhancement migration (new tables + columns) -docker exec -i "$TS_DB" psql -U postgres -d tracksolid_db \ - < /migrations/05_enhancement_migration.sql - -# 3. Rebuild and restart ingestion containers with updated code -docker compose up -d --build ingest_movement ingest_events webhook_receiver - -# 4. 
Schedule nightly ETL -# Add to cron or n8n: -# SELECT dwh_gold.refresh_daily_metrics(CURRENT_DATE - 1); -``` +See **Step 0** above for the full deployment sequence. All six migrations (01–06) must be applied in order before starting the new containers. Step 0 includes the complete command block. --- @@ -1433,5 +1493,6 @@ GROUP BY d.assigned_city; --- -*Document generated: 2026-04-18 · Stack: TimescaleDB 2.15 + PostGIS + Tracksolid Pro Open Platform API* +*Document updated: 2026-04-18 · Stack: TimescaleDB 2.15 + PostGIS + Tracksolid Pro Open Platform API* *Ingestion pipeline: `ingest_movement_rev.py` v2.2 · `ingest_events_rev.py` · `webhook_receiver_rev.py`* +*DB state verified: 18 Apr 2026 — live data in `tracksolid_2` (63 devices, pipeline stopped 6 Apr). New stack targets `tracksolid` schema — pending deployment.*