feat(reporting): fleet segmentation + deduped vehicle roster (migration 14)

Add reporting.fn_fleet_segment() and reporting.v_vehicles, splitting the fleet
into ticket-closing field_service vs specialist plant
(crane/pick-up/truck/motorbike) that does not close immediate customer tickets.

The segment is DERIVED from tracksolid.devices.vehicle_models — itself an
authoritative Tracksolid API field (sync_devices maps jimi.user.device.list ->
vehicleModels) — so it stays API-current with no re-seeding; the manual
vehicle_category column is intentionally unused. v_vehicles collapses the
tracker+dashcam device pairs to one row per vehicle by reusing
reporting.normalize_plate() and the same primary-device precedence as
reporting.v_trips / v_live_positions (auto-merges 'KDS 453Y'/'KDS 453 Y',
resolves within-plate model conflicts via the primary tracker).

Verified live: 80 vehicles (61 field_service / 16 specialist / 3 unassigned),
grafana_ro granted. Includes the supporting data-quality report.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
david kiania 2026-06-08 13:54:47 +03:00
parent 94cbd2a85e
commit 347c97ec4c
4 changed files with 6366 additions and 0 deletions


@@ -0,0 +1,117 @@
# Fleet Registry — Data Quality Report
**Date:** 2026-06-08
**Source:** `tracksolid.devices` (the single registry of record)
**Prepared for:** Fireside leadership / fleet operations
**Scope:** 181 tracking devices → **80 physical vehicles** after de-duplicating the GPS-tracker + dashcam pairs that share a number plate.

---

## 1. Executive summary
The fleet registry is **operationally usable but materially incomplete**, and the gaps concentrate in exactly the fields the business needs to run field-service KPIs: driver identity, driver contact, vehicle type, and device pairing.
| Theme | Headline | Business impact |
|---|---|---|
| **Driver contact** | **97%** of records have **no driver phone** (175 / 181) | Dispatch and escalation can't reach the driver from our own systems |
| **Driver identity** | **23%** have **no driver name** (41 / 181) | Trips, speeding and idle time can't be attributed to a person — no accountability |
| **Vehicle type** | **22%** have **no model** (40 / 181) | We can't cleanly separate ticket-closing service cars from cranes / bikes |
| **Device pairing** | **29%** of vehicles (23 / 80) are **missing a tracker or a camera** | Blind spots: 6 vehicles have **no GPS at all**; 16 have **no dashcam evidence** |
| **Unidentified hardware** | **41** device rows carry **no plate**; 19 of those are fully blank | Hardware we are paying to track but cannot tie to a vehicle, driver or city |
**Root cause is upstream, not in our database.** Almost every gap is a field that was never entered in the **Tracksolid Pro portal** at provisioning time; our pipeline faithfully stores whatever the portal holds. This is a **data-entry discipline** problem at vehicle onboarding, plus an **incomplete driver/plate import** that is still pending (the 144-device CSV).

---

## 2. The fleet, de-duplicated
After collapsing tracker+camera pairs to one row per number plate:
| Segment | Vehicles | Detail |
|---|---|---|
| **Field service** (closes customer tickets) | ~62 | Probox ×57, Van, Vezel, Mazda, + a few UG/other cars |
| **Specialist** (cranes, bikes, pick-ups — *not* immediate tickets) | ~17 | Crane ×3, Motorbike ×8, Pick-Up ×~6 |
| **Ambiguous** (type conflicts — see §3) | 2 | KCY 080X, KCZ 223P |
| **Total physical vehicles** | **80** | from 181 devices |
| **Unassigned spare devices** (no plate) | 41 rows | cannot be counted as vehicles |
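
The split above can be reproduced with a small lookup. The following Python sketch mirrors the rule that migration 14 implements in `reporting.fn_fleet_segment()`; the names `fleet_segment` and `SPECIALIST_MODELS` are illustrative and not part of the repository. It has no "ambiguous" bucket, because the view resolves a within-plate conflict by taking the primary device's model.

```python
# Illustrative mirror of reporting.fn_fleet_segment(); not part of the migration.
SPECIALIST_MODELS = {"crane", "pick-up", "pickup", "truck", "motorbike"}


def fleet_segment(model):
    """Classify a vehicle_models value as field_service, specialist or unassigned."""
    # Note: str.strip() removes all surrounding whitespace, which is slightly
    # broader than SQL trim(), which removes spaces only.
    key = (model or "").strip().lower()
    if not key:
        return "unassigned"      # no model on record
    if key in SPECIALIST_MODELS:
        return "specialist"      # does not close immediate customer tickets
    return "field_service"       # Probox, Van, Vezel, Mazda and any other named model


for sample in ("Probox", "Pick-Up", "Motorbike", None):
    print(sample, "->", fleet_segment(sample))
```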
---
## 3. Findings by severity
### 🔴 Critical — operational blind spots
**C1. 6 vehicles have a dashcam but NO GPS tracker** — they are invisible on the live fleet map and contribute no trips/mileage.
`KCN 496A · KCQ 215F · KCU 237Z · KDM 306S · KDN 759G · KCZ 199P`
**C2. 16 vehicles have a GPS tracker but NO dashcam** — no video evidence for incidents, disputes, or safety review. Concentrated in the **specialist fleet** (all 8 motorbikes, the Uganda vehicles) plus several Proboxes.
**C3. 41 device rows (23%) carry no number plate** — they cannot be tied to a vehicle, driver, or city. **19 of these are entirely blank** (no plate, no model, no driver, status "unknown") — hardware we track but cannot identify at all.
### 🟠 High — accountability & classification gaps
**H1. 97% of records have no driver phone** (175 / 181). Only 6 drivers are contactable from our data.
**H2. 23% have no driver name** (41 / 181). Behaviour analytics (speeding, idle, harsh events) cannot be attributed.
**H3. 22% have no vehicle model** (40 / 181). This is the field that drives the field-service vs specialist split; until it is populated, those device rows can only be classified by guessing from the device name.
**H4. Plate data-entry inconsistencies** corrupting the vehicle count:
- **`KDS 453Y` is entered twice** — `KDS 453Y` (tracker) and `KDS 453 Y` (camera, stray space). One vehicle, counted as two.
- **`KCC 199P` vs `KCZ 199P`** — both pick-ups, both driver *Mbuvi Kioko*, one tracker + one camera. Almost certainly **one vehicle mis-keyed under two plates**.
**H5. Two plates disagree with themselves** — the tracker and dashcam on the *same plate* report different vehicle types:
| Plate | Tracker says | Camera says | Driver |
|---|---|---|---|
| KCY 080X | Pick-Up | Probox | Lawrence Kijogi |
| KCZ 223P | Pick-Up | Probox | Felix Muema |
### 🟡 Medium — analytics reliability
**M1. `assigned_city` is unreliable.** 4 vehicles have their two devices assigned to *different cities* (e.g. KDC 490Q: Mombasa vs Nairobi; KCY 838X: Mombasa vs Voi). The field appears **derived from the Tracksolid account** the device sits in, not the vehicle's actual base — so **regional (Nairobi / Mombasa / Kampala) reporting is suspect**.
**M2. Placeholder / non-person driver names** pollute driver-level analytics: `Garage` (×4), `UG` (×2), `Management_Mazda` (×2), `Parked` (×1). These are slots, not people.
**M3. One driver, multiple plates** — 5 cases (`Garage`, `Gideon Kiprono`, `Kelvin Wambugu`, `Mbuvi Kioko`, `UG`). Some are the duplicate-plate issues above; the rest need confirmation of whether a driver genuinely rotates vehicles.
### ⚪ Low — asset-register completeness (low operational impact today)
**L1. `vehicle_brand` is 99% empty** (179 / 181) and **`vin` is 100% empty** (181 / 181). Not currently used in any KPI, but would be needed for a formal asset/insurance register.

---

## 4. Completeness scorecard (device-level, 181 rows)
| Field | Populated | Missing | % missing |
|---|---|---|---|
| Number plate | 140 | 41 | 23% |
| Vehicle model | 141 | 40 | 22% |
| Driver name | 140 | 41 | 23% |
| Driver phone | 6 | 175 | **97%** |
| SIM | 155 | 26 | 14% |
| Assigned city | 152 | 29 | 16% |
| Vehicle brand | 2 | 179 | **99%** |
| VIN | 0 | 181 | **100%** |
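
Each percentage in the scorecard is the field's missing count divided by the 181 device rows, rounded to the nearest whole percent. A minimal Python check of that arithmetic, using only the counts from the table (the helper name `pct_missing` is illustrative):

```python
TOTAL_DEVICES = 181

# Missing counts copied from the scorecard above.
MISSING = {
    "Number plate": 41,
    "Vehicle model": 40,
    "Driver name": 41,
    "Driver phone": 175,
    "SIM": 26,
    "Assigned city": 29,
    "Vehicle brand": 179,
    "VIN": 181,
}


def pct_missing(missing_count, total=TOTAL_DEVICES):
    """Share of rows missing a field, as a whole-number percentage."""
    return round(100 * missing_count / total)


for field, missing in MISSING.items():
    populated = TOTAL_DEVICES - missing
    print(f"{field}: {populated} populated, {missing} missing, {pct_missing(missing)}% missing")
```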
---
## 5. Recommended actions
| # | Action | Owner | Effort | Fixes |
|---|---|---|---|---|
| 1 | **Run the pending driver/plate CSV import** (144 devices with names + plates already prepared) | Engineering | Low — one command | H1, H2, H3 (large chunk) |
| 2 | **Mandate model + driver + phone at vehicle onboarding** in the Tracksolid Pro portal; make them required fields in the process | Operations | Process change | Root cause of most gaps |
| 3 | **Resolve the 4 specific record issues** in §3 (KDS 453Y dup, KCC/KCZ 199P, KCY 080X, KCZ 223P) | Operations to confirm, Engineering to correct | Low | H4, H5 |
| 4 | **Audit the 22 vehicles missing a tracker or camera** (§3 C1/C2) — install missing hardware or confirm intentional | Field ops | Medium | C1, C2 |
| 5 | **Identify the 19 fully-blank devices** — match serial → vehicle, or decommission | Field ops | Medium | C3 |
| 6 | **Stop trusting `assigned_city` for regional reporting** until it's set per-vehicle rather than inherited from the account | Engineering + Ops | Medium | M1 |
| 7 | Replace placeholder driver names with real assignments or a clear "unassigned" convention | Operations | Low | M2, M3 |
**Quick wins:** Actions 1 and 3 are low-effort and would clear the majority of the High-severity gaps immediately.

---

## 6. Method & reproducibility
All figures are derived from a single read-only scan of `tracksolid.devices`. De-duplication keys on a normalised plate (`UPPER(REPLACE(vehicle_number,' ',''))`) so spacing variants collapse. Tracker = device types `GT06E / X3 / AT4`; camera = `JC400P`. The scan script and the proposed `reporting.v_vehicles` de-duplicated view are tracked in the repository; re-running the scan after each remediation step will show progress against this baseline.
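
As a rough illustration of the de-duplication described above, the Python sketch below normalises plates the same way and derives tracker/camera flags per vehicle. The function names and the sample rows are assumptions for illustration: the plates come from §3, but which tracker model each vehicle carries is made up here.

```python
from collections import defaultdict

TRACKER_TYPES = {"GT06E", "X3", "AT4"}
CAMERA_TYPES = {"JC400P"}


def normalize_plate(vehicle_number):
    """Upper-case and drop spaces, like UPPER(REPLACE(vehicle_number, ' ', ''))."""
    return vehicle_number.replace(" ", "").upper()


def pair_devices(devices):
    """Collapse (vehicle_number, device_type) rows to one entry per plate.

    Rows without a plate are skipped: they cannot be counted as vehicles.
    """
    vehicles = defaultdict(
        lambda: {"has_tracker": False, "has_camera": False, "device_count": 0}
    )
    for vehicle_number, device_type in devices:
        if not vehicle_number:
            continue
        vehicle = vehicles[normalize_plate(vehicle_number)]
        vehicle["has_tracker"] |= device_type in TRACKER_TYPES
        vehicle["has_camera"] |= device_type in CAMERA_TYPES
        vehicle["device_count"] += 1
    return dict(vehicles)


# Hypothetical rows: the tracker model for KDS 453Y is an assumption.
sample = [
    ("KDS 453Y", "GT06E"),    # tracker
    ("KDS 453 Y", "JC400P"),  # camera, stray space in the plate
    ("KCN 496A", "JC400P"),   # dashcam only (finding C1)
    (None, "X3"),             # spare device with no plate
]
vehicles = pair_devices(sample)
for plate, flags in vehicles.items():
    print(plate, flags)
```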

File diff suppressed because it is too large


@@ -0,0 +1,104 @@
-- 14_fleet_segment_and_vehicles_view.sql
-- Fleet segmentation + de-duplicated vehicle roster.
--
-- Splits the fleet into ticket-closing FIELD SERVICE vehicles vs SPECIALIST plant
-- (cranes / pick-ups / trucks / motorbikes) that do NOT close immediate customer tickets.
--
-- The segment is DERIVED, not stored: it is computed from tracksolid.devices.vehicle_models,
-- which is itself an authoritative Tracksolid API field (sync_devices() maps
-- jimi.user.device.list -> vehicleModels, refreshed daily). Keeping it derived means it
-- always tracks the API and needs no re-seeding. The manual tracksolid.vehicle_category
-- column is intentionally NOT used here.
--
-- reporting.v_vehicles collapses the GPS-tracker + dashcam device pairs into one row per
-- vehicle, reusing reporting.normalize_plate() and the same "primary device per normalized
-- plate" precedence as reporting.v_trips / reporting.v_live_positions (migration 11). This
-- auto-merges plate-spacing duplicates (e.g. 'KDS 453Y' vs 'KDS 453 Y') and resolves any
-- within-plate model disagreement by letting the primary tracker's value win.
--
-- Every object uses CREATE OR REPLACE / guarded grants so the file is safe to re-apply.
-- Provenance: docs/reports/260608_fleet_registry_data_quality.md + ~/.claude plan binary-singing-wave.
SET search_path = tracksolid, reporting, public;
-- ── classification rule (single source of truth) ─────────────────────────────
CREATE OR REPLACE FUNCTION reporting.fn_fleet_segment(model text)
RETURNS text
LANGUAGE sql
IMMUTABLE PARALLEL SAFE
AS $function$
    SELECT CASE lower(coalesce(trim(model), ''))
        WHEN ''          THEN 'unassigned'   -- no model on record -> triage
        WHEN 'crane'     THEN 'specialist'
        WHEN 'pick-up'   THEN 'specialist'
        WHEN 'pickup'    THEN 'specialist'
        WHEN 'truck'     THEN 'specialist'
        WHEN 'motorbike' THEN 'specialist'
        ELSE 'field_service'                 -- Probox, Mazda, Van, Station Wagon, Vezel + any other named model
    END
$function$;
COMMENT ON FUNCTION reporting.fn_fleet_segment(text) IS
'Maps tracksolid.devices.vehicle_models -> field_service | specialist | unassigned. '
'Specialist = crane/pick-up/motorbike/truck (do not close immediate customer tickets).';
-- ── de-duplicated vehicle roster (one row per physical vehicle) ───────────────
CREATE OR REPLACE VIEW reporting.v_vehicles AS
WITH device_trip_counts AS (
    SELECT trips.imei, count(*) AS trip_count
    FROM trips
    GROUP BY trips.imei
), primary_device AS (
    -- One row per normalized plate: the first device under the ORDER BY below wins.
    SELECT DISTINCT ON ((reporting.normalize_plate(d.vehicle_number)))
           reporting.normalize_plate(d.vehicle_number) AS plate,
           d.imei AS primary_imei,
           d.vehicle_models,
           d.driver_name,
           d.driver_phone,
           d.account,
           d.assigned_city
    FROM devices d
    LEFT JOIN device_trip_counts c USING (imei)
    WHERE d.vehicle_number IS NOT NULL AND d.enabled_flag = 1
    ORDER BY (reporting.normalize_plate(d.vehicle_number)),
             (CASE WHEN d.mc_type = ANY (ARRAY['GT06E','X3','AT4']) THEN 0 ELSE 1 END),  -- trackers before cameras
             (COALESCE(c.trip_count, 0::bigint)) DESC,                                   -- then most trips
             d.activation_time,                                                          -- then earliest activation
             d.imei                                                                      -- stable tie-break
), plate_agg AS (
    -- Per-plate pairing flags across all enabled devices.
    SELECT reporting.normalize_plate(d.vehicle_number) AS plate,
           bool_or(d.mc_type = ANY (ARRAY['GT06E','X3','AT4'])) AS has_tracker,
           bool_or(d.mc_type = 'JC400P') AS has_camera,
           count(*) AS device_count
    FROM devices d
    WHERE d.vehicle_number IS NOT NULL AND d.enabled_flag = 1
    GROUP BY reporting.normalize_plate(d.vehicle_number)
)
SELECT pd.plate,
       pd.vehicle_models AS vehicle_type,
       reporting.fn_fleet_segment(pd.vehicle_models) AS fleet_segment,
       pd.driver_name AS driver,
       pd.driver_phone,
       pd.account,
       pd.assigned_city,
       pa.has_tracker,
       pa.has_camera,
       pa.device_count,
       pd.primary_imei
FROM primary_device pd
JOIN plate_agg pa USING (plate);
COMMENT ON VIEW reporting.v_vehicles IS
'One row per physical vehicle (tracker+dashcam pairs collapsed by normalize_plate, primary '
'device = tracker-first then trip-count). fleet_segment derived from API-authoritative '
'vehicle_models. Source: docs/reports/260608_fleet_registry_data_quality.md.';
-- ── grants (guarded: roles may not exist on a fresh DB) ───────────────────────
DO $grants$
BEGIN
    IF EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'grafana_ro') THEN
        GRANT USAGE ON SCHEMA reporting TO grafana_ro;
        GRANT EXECUTE ON FUNCTION reporting.fn_fleet_segment(text) TO grafana_ro;
        GRANT SELECT ON reporting.v_vehicles TO grafana_ro;
    END IF;
END $grants$;
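
The "primary device" precedence encoded in the `ORDER BY` of `primary_device` can be hard to read in SQL. A minimal Python sketch of the same ordering follows, assuming each device is a dict keyed by the column names used in the view. The IMEIs, timestamps, trip counts and the tracker's `mc_type` in the example are invented, and NULL activation times are not handled.

```python
TRACKER_TYPES = {"GT06E", "X3", "AT4"}


def primary_device(devices):
    """Pick the primary device among devices that share one normalized plate.

    Precedence: tracker before camera, then most trips, then earliest
    activation, then lowest IMEI as a stable tie-break.
    """
    return min(
        devices,
        key=lambda d: (
            0 if d["mc_type"] in TRACKER_TYPES else 1,
            -d.get("trip_count", 0),
            d["activation_time"],
            d["imei"],
        ),
    )


# Hypothetical device pair for plate KCY 080X, whose tracker and camera
# disagree on vehicle type (finding H5 in the data-quality report).
kcy_080x = [
    {"imei": "000000000000002", "mc_type": "JC400P", "vehicle_models": "Probox",
     "trip_count": 120, "activation_time": "2024-01-10"},
    {"imei": "000000000000001", "mc_type": "GT06E", "vehicle_models": "Pick-Up",
     "trip_count": 95, "activation_time": "2024-02-01"},
]

chosen = primary_device(kcy_080x)
# The tracker wins even though the camera has more trips.
print(chosen["imei"], chosen["vehicle_models"])
```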


@@ -37,6 +37,7 @@ MIGRATIONS = [
    "11_reporting_schema.sql", # reporting.* map-dashboard read layer (dashboard_api)
    "12_drop_ops.sql", # purge dormant ops schema + dispatch_log + v_sla_inflight
    "13_drop_dwh_gold.sql", # purge dormant dwh_gold schema + v_utilisation_daily
    "14_fleet_segment_and_vehicles_view.sql", # reporting.fn_fleet_segment + reporting.v_vehicles roster
]
# ── Tables that must exist before the service is allowed to start ─────────────