docs: update analytics report with live DB state (18 Apr 2026)
- §1: add current deployment state table — 63 devices, 0 driver names, 5 trips, pipeline stopped 6 Apr (401 token expiry); note `tracksolid_2` vs `tracksolid` schema split
- §6: status column per question (Ready / Needs data / Blocked) reflecting actual DB state; add cost-per-ticket, city-drift, and odometer rows
- §8: add Step 0 full deployment sequence (git pull → migrations 01-06 → container rebuild → sync_driver_audit → import_drivers_csv); Step 3 updated to reference import script; Step 5 collapsed to pointer
- Footer: db-state stamp and update date

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Parent: `cebcf74ba2` · Commit: `274473c544` · 1 changed file, +112 −51
## 1. Data Foundation Summary

### 1.1 Current Deployment State *(as of 18 Apr 2026)*

> **⚠ New stack not yet live.** The refactored ingestion pipeline (`ingest_movement_rev.py` v2.2) targets the `tracksolid` schema, which is currently empty. All live data sits in the legacy `tracksolid_2` schema populated by the prior codebase. The queries in this document are written for the target schema (`tracksolid`) and will produce results once the new stack is deployed and the device sync has run.

| Metric | Observed value | Source |
|---|---|---|
| Devices registered | **63** (AT4-series, `353549*` IMEIs) | `tracksolid_2.devices` |
| Driver names populated | **0 / 63** | `tracksolid_2.devices` |
| Vehicle numbers populated | **0 / 63** | `tracksolid_2.devices` |
| SIM numbers populated | **14 / 63** | `tracksolid_2.devices` |
| Live positions (stale) | **19** | `tracksolid_2.live_positions` |
| Position history rows | **208** | `tracksolid_2.position_history` |
| Trips recorded | **5** (12.8 km total) | `tracksolid_2.trips` |
| Parking / alarms / OBD | **0** each | `tracksolid_2.*` |
| Last pipeline run | **6 Apr 2026 13:20 EAT** | `tracksolid_2.ingestion_log` |
| Pipeline failure rate | **41%** (277/668 runs, all 401 auth errors) | `tracksolid_2.ingestion_log` |

**Why the pipeline stopped (6 Apr):** 276 consecutive `401 Unauthorized` errors against `eu-open.tracksolidpro.com`. The API token expired and was not refreshed — the prior codebase lacked the auto-refresh logic that `ts_shared_rev.py` now includes. Deploying the new stack resolves this permanently.
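
The refresh behaviour can be illustrated with a minimal sketch. All names here are illustrative, not the actual `ts_shared_rev.py` implementation: on a 401 the client fetches a fresh token once and retries, instead of failing the run as the prior codebase did.

```python
class TokenExpired(Exception):
    """Raised by the transport layer on HTTP 401."""

class ApiClient:
    """Sketch of 401-aware retry with token refresh.

    `fetch_token` and `send` are injected so the sketch stays
    self-contained; the real client talks to eu-open.tracksolidpro.com.
    """

    def __init__(self, fetch_token, send):
        self._fetch_token = fetch_token   # () -> fresh token string
        self._send = send                 # (token, endpoint) -> response dict
        self._token = fetch_token()

    def request(self, endpoint):
        try:
            return self._send(self._token, endpoint)
        except TokenExpired:
            # Prior codebase failed here (the 276 consecutive 401s);
            # the fix is to refresh once and retry.
            self._token = self._fetch_token()
            return self._send(self._token, endpoint)

# Demo with a fake transport: the first token is already expired.
if __name__ == "__main__":
    tokens = iter(["tok-1", "tok-2"])

    def fetch_token():
        return next(tokens)

    def send(token, endpoint):
        if token != "tok-2":
            raise TokenExpired()
        return {"endpoint": endpoint, "token": token}

    client = ApiClient(fetch_token, send)
    print(client.request("/route/rest"))  # succeeds after one refresh
```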
**CSV fleet (144 devices, X3/JC400P series):** The `20260414_FS__Logistics - final_fixed.csv` file contains a separate, newer batch of devices (`865135*`, `862798*` IMEIs) with full driver names and plates. **These 144 devices are not yet registered in the DB at all** — they will be synced by `sync_driver_audit.py` after the new stack is deployed, then enriched by `import_drivers_csv.py`.
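
The two-phase flow, register devices from the API first, then enrich by IMEI from the CSV, can be sketched as follows. Data shapes and field names here are assumptions for illustration; the authoritative logic lives in `sync_driver_audit.py` and `import_drivers_csv.py`.

```python
def enrich_devices(registry, csv_rows):
    """Phase-2 sketch: merge CSV driver details into registered devices.

    `registry` maps IMEI -> device record (from the API sync);
    `csv_rows` are dicts parsed from the fleet CSV. Rows whose IMEI is
    not yet registered are returned rather than silently dropped,
    because the 144 CSV devices only exist in the DB after the sync
    step has run.
    """
    unmatched = []
    for row in csv_rows:
        device = registry.get(row["imei"])
        if device is None:
            unmatched.append(row["imei"])
            continue
        device.update(
            driver_name=row["driver_name"],
            vehicle_number=row["vehicle_number"],
        )
    return unmatched

# Hypothetical example: one registered device, one not yet synced.
registry = {"865135000000001": {"imei": "865135000000001"}}
rows = [
    {"imei": "865135000000001", "driver_name": "John Kamau", "vehicle_number": "KBZ 123A"},
    {"imei": "862798000000002", "driver_name": "Jane N.", "vehicle_number": "KCA 456B"},
]
print(enrich_devices(registry, rows))  # unmatched until sync_driver_audit runs
```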
---
### 1.2 Target Data Architecture
Once deployed, the ingestion stack populates the following data sources:
| Table | Content | Frequency |
|---|---|---|
| `tracksolid.devices` | Vehicle and driver registry | Daily at 02:00 |
| `dwh_gold.fact_daily_fleet_metrics` | Daily KPI aggregates per vehicle | Nightly ETL |
**Position history density** improves once `poll_track_list` (POLL-01) is live:
| Before | After |
|---|---|
## 6. Business Questions Now Answerable

Status key: **✅ Ready** = answerable once the new stack is deployed | **⚙ Needs data** = additional setup required | **🔴 Blocked** = a prerequisite action must complete before any data arrives

| Business Question | Primary Data Source | Status |
|---|---|---|
| Which vehicles are moving right now? | `live_positions` | ✅ Ready (deploy stack) |
| Who started work latest today? | `fact_daily_fleet_metrics.day_start_time` | ✅ Ready (deploy stack) |
| Who drove the most km this week? | `trips` + `devices` | ✅ Ready (deploy + CSV import) |
| Which vehicle spent the most time idling? | `trips.idle_time_s` | ✅ Ready (deploy stack) |
| How much fuel was wasted on idle today? | `trips.idle_time_s` × rate | ⚙ Needs `fuel_100km` set per vehicle |
| Which driver triggered the most alarms this month? | `alarms` + `devices` | ✅ Ready (deploy stack) |
| What is total fleet distance this month? | `trips` | ✅ Ready (deploy stack) |
| Which vehicles did not move at all today? | `trips` LEFT JOIN `devices` | ✅ Ready (deploy stack) |
| Who is nearest to a new job right now? | `live_positions` + PostGIS | ✅ Ready (deploy + CSV import for names) |
| Did any vehicle leave depot after hours? | `trips` time filter | ✅ Ready (deploy stack) |
| What is the speeding rate per driver per week? | `position_history` speed filter | ✅ Ready (needs 1 week of data) |
| Which driver has the harshest driving style? | `position_history` delta query | ✅ Ready (needs 2 weeks of `track_list` data) |
| What does one field ticket cost in fuel? | `trips` + `ops.tickets` + `fuel_100km` | ⚙ Needs `fuel_100km` + ticket feed wired |
| Which vehicles are running outside assigned city? | `position_history` + `assigned_city` | ⚙ Needs `assigned_city` set (CSV import) |
| How many km to next service interval? | `devices.current_mileage` + `ops.service_log` | ⚙ Needs first service-log entry per vehicle |
| Are vehicles on approved routes? | `position_history` + `geofences` | ⚙ Pending geofence population (Step 4) |
| Is cold chain in temperature range? | `temperature_readings` | 🔴 Pending webhook registration (Step 1) |
| How much fuel is consumed per route? | `fuel_readings` + `trips` | 🔴 Pending fuel sensor webhook (Step 1) |
| Did any vehicle enter a restricted zone? | `alarms` + `geofences` | 🔴 Pending geofence setup (Step 4) |
| What percentage of the fleet was utilised today? | `trips` + `devices` count | ✅ Ready (deploy stack) |
| Alarm while parked — tamper / theft signal | `alarms` + `parking_events` | ✅ Ready (deploy stack) |
| Odometer divergence — tracker vs physical | `trips` + `ops.odometer_readings` | ⚙ Needs first odometer reading entry |
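
As a worked example of the idle-fuel estimate above (`idle_time_s` × rate): once a burn rate is available, the conversion is simple arithmetic. The per-hour idle burn figure below is an illustrative assumption, not a measured fleet value.

```python
def idle_fuel_litres(idle_time_s, idle_lph):
    """Estimate litres burned while idling.

    idle_lph: assumed idle burn in litres per hour (e.g. ~0.8 for a
    van). This constant is illustrative; derive a real one per
    vehicle class once fuel_100km and sensor data are in place.
    """
    return idle_time_s / 3600 * idle_lph

# 45 minutes of idling at an assumed 0.8 L/h:
print(round(idle_fuel_litres(45 * 60, 0.8), 2))  # 0.6
```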
---
## 8. What Unlocks the Remaining 30%
The data foundation is in place. The following steps activate the remaining analytics capabilities, in priority order.
### Step 0 — Deploy New Ingestion Stack *(Current Blocker — do first)*
All analytics in this document are blocked until the new stack is live. The legacy pipeline stopped on **6 Apr 2026** due to 401 token expiry errors. The refactored code fixes this permanently.

```bash
# On the Coolify server / inside the repo directory:

# 1. Pull latest code (includes all revisions through cebcf74)
git pull

# 2. Apply schema migrations (01 through 06 in order)
TS_DB=$(docker ps --filter "name=timescale_db" --format "{{.Names}}" | head -1)
for f in 01_tracksolid_base.sql 02_tracksolid_full_schema_rev.sql \
         03_webhook_schema_migration.sql 04_bug_fix_migration.sql \
         05_enhancement_migration.sql 06_business_analytics_migration.sql; do
  echo "Applying $f..."
  docker exec -i "$TS_DB" psql -U postgres -d tracksolid_db < "$f"
done

# 3. Rebuild and start new ingestion containers
docker compose up -d --build ingest_movement ingest_events webhook_receiver

# 4. Run initial device sync (populates tracksolid.devices from API)
docker exec -it ingest_movement python sync_driver_audit.py

# 5. Import driver/vehicle details from CSV
docker exec -it ingest_movement python import_drivers_csv.py           # dry-run
docker exec -it ingest_movement python import_drivers_csv.py --apply   # commit

# 6. Schedule nightly ETL
# Add to cron or n8n: SELECT dwh_gold.refresh_daily_metrics(CURRENT_DATE - 1);
```

**Expected state after Step 0:**

- `tracksolid.devices`: 144+ rows with driver names, plates, departments, `assigned_city`
- `tracksolid.live_positions`: positions refreshing every 60 seconds
- `tracksolid.trips` / `position_history`: accumulating from the first pipeline run
- All analytics in this document begin producing results within 15 minutes of container start

---

### Step 1 — Register Webhooks in Tracksolid Pro Account *(Blocker)*

Without registration, the following tables remain empty regardless of code:
### Step 3 — Populate Vehicle Names and Driver Names

**Automated:** `import_drivers_csv.py` (committed to the repo) reads `20260414_FS__Logistics - final_fixed.csv` (144 devices) and sets `driver_name`, `vehicle_number`, `vehicle_models`, `cost_centre`, `assigned_city`, `sim`, `iccid`, `imsi` in a single pass. Run it after the Step 0 device sync.

```bash
docker exec -it ingest_movement python import_drivers_csv.py --apply
```
CSV coverage after import: 140 vehicles with plates, 144 with driver names, 138 with SIM, `assigned_city` inferred (NBO=136, KLA=4). The 4 "Identification" spare units are skipped automatically.
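
The import's pre-filtering could be sketched like this. Column names (`unit_type`, `city`) are guesses for illustration; the authoritative logic is in `import_drivers_csv.py`. Spare "Identification" rows are skipped and `assigned_city` is carried through per row.

```python
def plan_import(rows):
    """Sketch of the CSV pre-filter: skip spare units, tag city.

    `rows` are dicts parsed from the fleet CSV. The field names used
    here are hypothetical stand-ins for the real CSV columns.
    """
    planned, skipped = [], []
    for row in rows:
        if row.get("unit_type") == "Identification":   # spare unit, no vehicle
            skipped.append(row["imei"])
            continue
        planned.append({**row, "assigned_city": row.get("city", "NBO")})
    return planned, skipped

# Hypothetical sample: one vehicle row, one spare row.
rows = [
    {"imei": "865135000000001", "unit_type": "Vehicle", "city": "NBO"},
    {"imei": "865135000000002", "unit_type": "Identification"},
]
planned, skipped = plan_import(rows)
print(len(planned), skipped)  # 1 ['865135000000002']
```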
**Manual top-up** for any device not in the CSV:

```sql
UPDATE tracksolid.devices
SET vehicle_name = 'KBZ 123A',
    vehicle_number = 'KBZ 123A',
    driver_name = 'John Kamau',
    driver_phone = '+254700000001',
    vehicle_category = 'van',
    assigned_city = 'NBO'
WHERE imei = '352093080000001';
```
### Step 5 — Run Migrations and Deploy Updated Containers
See **Step 0** above for the full deployment sequence. All six migrations (01–06) must be applied in order before starting the new containers. Step 0 includes the complete command block.
---
*Document updated: 2026-04-18 · Stack: TimescaleDB 2.15 + PostGIS + Tracksolid Pro Open Platform API*
*Ingestion pipeline: `ingest_movement_rev.py` v2.2 · `ingest_events_rev.py` · `webhook_receiver_rev.py`*
*DB state verified: 18 Apr 2026 — live data in `tracksolid_2` (63 devices, pipeline stopped 6 Apr). New stack targets `tracksolid` schema — pending deployment.*