# Data Flow — Ingestion → Aggregation → Views → Functions → Consumers
**Scope:** the *live* fleet pipeline only (`tracksolid` + `reporting`). The `ops` and
`dwh_gold` schemas were **purged on 2026-06-05** (migrations 12/13) — those workshop /
dispatch / SLA / utilisation features were never implemented. They are shown below only as
a struck-out "removed" footnote for historical context. (The *separate* `tracksolid_dwh`
server at 31.97.44.246:5888 is unrelated and was not touched.)
**Verified against prod 2026-06-05** (TimescaleDB hypertable + continuous-aggregate
catalog, `pg_depend` view graph, ingestion `INSERT` targets, `dashboard_api` queries,
Grafana panel SQL). Findings that contradict the earlier docs:
- Only `position_history` (and the empty/planned `heartbeats`, `fuel_readings`,
`temperature_readings`) are **hypertables**. `trips` and `alarms` are **plain tables**.
- `tracksolid.v_mileage_daily_cagg` is a **real TimescaleDB continuous aggregate**, not a
plain view — and it currently has **no downstream consumer**.
- `reporting.v_trips` is a **matview**, refreshed every 5 min by the in-process
`dashboard_api` background loop (FIX-D02) under pg advisory lock `920145`.
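
The catalog checks behind these findings can be reproduced roughly as follows. This is a minimal sketch, not the script used for the 2026-06-05 verification. The SQL relies on the standard `timescaledb_information.hypertables`, `timescaledb_information.continuous_aggregates` and `pg_matviews` catalog views; `classify()` and the sample rows are hypothetical and only restate the facts listed above.

```python
"""Classify relations as hypertable / continuous aggregate / matview / plain.

Assumption: the three row sets come from the catalog queries below, run
against the database. The sample rows further down are illustrative and
simply mirror the findings stated in this document.
"""

HYPERTABLES_SQL = """
SELECT hypertable_schema, hypertable_name
FROM timescaledb_information.hypertables
"""

CAGGS_SQL = """
SELECT view_schema, view_name
FROM timescaledb_information.continuous_aggregates
"""

MATVIEWS_SQL = """
SELECT schemaname, matviewname
FROM pg_matviews
"""


def classify(relation, hypertables, caggs, matviews):
    """Return the storage kind of a (schema, name) pair."""
    if relation in caggs:
        return "continuous aggregate"
    if relation in hypertables:
        return "hypertable"
    if relation in matviews:
        return "matview"
    return "plain"  # ordinary table or ordinary view


# Illustrative rows, shaped like the results of the queries above.
hypertables = {
    ("tracksolid", "position_history"),
    ("tracksolid", "heartbeats"),
    ("tracksolid", "fuel_readings"),
    ("tracksolid", "temperature_readings"),
}
caggs = {("tracksolid", "v_mileage_daily_cagg")}
matviews = {("reporting", "v_trips")}

for rel in [
    ("tracksolid", "position_history"),
    ("tracksolid", "trips"),
    ("tracksolid", "alarms"),
    ("tracksolid", "v_mileage_daily_cagg"),
    ("reporting", "v_trips"),
]:
    print(".".join(rel), "->", classify(rel, hypertables, caggs, matviews))
```

Checking the continuous-aggregate set first keeps `v_mileage_daily_cagg` from being reported as an ordinary view.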
---
## Mermaid
```mermaid
flowchart TD
API["Tracksolid / Jimi API<br/>poll + push webhooks · OAuth2"]
subgraph ING["Ingestion — ts_shared_rev.py: get_conn() · api_post() · clean*()"]
IM["ingest_movement_rev.py"]
IE["ingest_events_rev.py"]
WR["webhook_receiver_rev.py"]
end
API --> IM & IE & WR
subgraph L1["L1 · Base tables + hypertables — schema: tracksolid (single source of truth)"]
LP["live_positions<br/>current fix / IMEI"]
PH[("position_history<br/>HYPERTABLE · high-res GPS trail")]
TR["trips<br/>plain table — NOT a hypertable"]
DV["devices<br/>vehicle / driver registry"]
AL["alarms<br/>plain table"]
ILOG["ingestion_log · api_token_cache"]
EMPTY["heartbeats* · fuel_readings* · temperature_readings* (HYPERTABLES, empty)<br/>device_events · fault_codes · obd_readings · parking_events · lbs_readings · geofences (empty)"]
end
IM --> LP & PH & TR & DV
IM --> ILOG
IE --> AL
WR --> EMPTY
subgraph L2["L2 · Aggregation"]
CAGG[("v_mileage_daily_cagg<br/>CONTINUOUS AGGREGATE")]
VT[("reporting.v_trips<br/>MATVIEW · ix_v_trips_trip_id")]
RLOG["reporting.refresh_log"]
end
PH -->|Timescale cont-agg policy| CAGG
TR --> VT
DV --> VT
VT -->|"REFRESH CONCURRENTLY every 300s<br/>(dashboard_api loop, adv-lock 920145)"| RLOG
subgraph L3L["L3 · Reporting views — fed by v_trips matview"]
FILT["v_filter_vehicles · v_filter_drivers<br/>v_filter_cost_centres · v_filter_cities"]
SUMM["v_daily/weekly/monthly_summary<br/>v_*_cost_centre · v_trips_today"]
VLP["v_live_positions"]
end
VT --> FILT & SUMM
LP --> VLP
subgraph L3R["L3 · Grafana views — tracksolid.* (read base tables directly)"]
GV["v_fleet_today · v_fleet_status · v_active_dispatch_map<br/>v_currently_idle · v_alarms_daily · v_fleet_km_daily<br/>v_ingestion_health · v_vehicles_not_moved_today<br/>v_driver_aggregates_daily · v_fleet_trace · v_driver_clock_*"]
end
LP --> GV
TR --> GV
DV --> GV
AL --> GV
PH --> GV
subgraph L4["L4 · Functions — reporting.* (the only API entrypoints)"]
FLP["fn_live_positions(cost_centre, acc_status)"]
FVT["fn_vehicle_track(vehicle_number, hours)"]
FTM["fn_trips_for_map(veh[], driver, cc, city, start, end)"]
NP["normalize_plate(p) · helper"]
end
LP --> FLP
DV --> FLP
PH --> FVT
VT --> FTM
NP -.-> FTM
subgraph L5["L5 · Consumers"]
DAPI["dashboard_api_rev.py<br/>FastAPI :8890 · 2 workers"]
SPA["SPAs: liveposition.* · fleetintelligence.*<br/>(rustfs / S3 single-file maps)"]
GRAF["Grafana"]
end
FLP --> DAPI
FVT --> DAPI
FTM --> DAPI
FILT --> DAPI
DAPI -->|"HTTPS · fleetapi.rahamafresh.com"| SPA
GV --> GRAF
CAGG -.->|no consumer yet| NONE(["⚠ unconsumed — no panel / API reads it"])
subgraph PARK["REMOVED 2026-06-05 — purged via migrations 12 / 13 (never implemented)"]
GONE["ops.* (tickets · dispatch_log · service_log · odometer_readings · cost_rates · kpi_targets · vw_service_forecast)<br/>dwh_gold.* (dim_vehicles · fact_daily_fleet_metrics · refresh_daily_metrics)<br/>tracksolid.v_sla_inflight · tracksolid.v_utilisation_daily + their Grafana panels"]
end
classDef hyper fill:#e1f0ff,stroke:#3b82f6,color:#0b3d91;
classDef mat fill:#fff3cd,stroke:#d4a017,color:#664d03;
classDef cagg fill:#e7f7e7,stroke:#2e9e2e,color:#1e5e1e;
classDef parked fill:#f0f0f0,stroke:#999,stroke-dasharray:5 5,color:#555;
classDef warn fill:#fdecea,stroke:#d93025,color:#a52714;
class PH hyper;
class VT mat;
class CAGG cagg;
class PARK,GONE parked;
class NONE warn;
```
---
## ASCII
```
              ╔══════════════════════════════════════════════╗
              ║            TRACKSOLID / JIMI API             ║
              ║        (poll + push webhooks, OAuth2)        ║
              ╚══════════════════════╤═══════════════════════╝
            ┌────────────────────────┼────────────────────────┐
            ▼                        ▼                        ▼
  ┌────────────────────┐   ┌────────────────────┐   ┌────────────────────┐
  │ ingest_movement    │   │ ingest_events      │   │ webhook_receiver   │  (all via ts_shared_rev.py:
  │ _rev.py            │   │ _rev.py            │   │ _rev.py            │   get_conn() pool, api_post(),
  └─────────┬──────────┘   └─────────┬──────────┘   └─────────┬──────────┘   clean*(), token cache)
            │                        │                        │
            │ writes                 │ writes                 │ writes (push)
            ▼                        ▼                        ▼
  ╔════════════════════════════════════════════════════════════════════════════════════════════════╗
  ║ L1 · BASE TABLES + HYPERTABLES                     schema: tracksolid (single source of truth) ║
  ║                                                                                                ║
  ║ live_positions    ── current fix / IMEI (plain)                          api_token_cache (auth) ║
  ║ position_history  ◀═ HYPERTABLE (high-res GPS trail, partitioned by gps_time)                   ║
  ║ trips             ── trip summaries (plain table, NOT a hypertable)                             ║
  ║ devices           ── vehicle/driver registry (plain)                                            ║
  ║ alarms            ── alarm events (plain table)                                                 ║
  ║ ingestion_log     ── API audit trail (plain)                                                    ║
  ║ heartbeats*, fuel_readings*, temperature_readings* ◀═ HYPERTABLES (empty / planned push)       ║
  ║ device_events, fault_codes, obd_readings, parking_events, lbs_readings, geofences (empty)      ║
  ╚═════════╤═══════════════════════════╤══════════════════════════════════════╤═══════════════════╝
            │                           │                                      │
            │ TimescaleDB               │ matview refresh                      │ direct reads
            │ cont-agg policy           │ (in-app loop)                        │ (Grafana SQL)
            ▼                           ▼                                      ▼
  ╔══════════════════╗  ╔════════════════════════════════╗  ╔══════════════════════════════════════╗
  ║ L2 · AGGREGATION ║  ║ L2 · AGGREGATION (matview)     ║  ║ L3 · GRAFANA VIEWS (tracksolid.*)    ║
  ║                  ║  ║                                ║  ║   read base tables directly          ║
  ║ v_mileage_daily  ║  ║ reporting.v_trips [MATVIEW]    ║  ║                                      ║
  ║   _cagg          ║  ║   ◀── trips + devices          ║  ║ v_fleet_today     v_fleet_status     ║
  ║ CONTINUOUS AGG   ║  ║   unique ix_v_trips_trip_id    ║  ║ v_active_dispatch_map                ║
  ║ ◀═ position_     ║  ║                                ║  ║ v_currently_idle  v_alarms_daily     ║
  ║    history       ║  ║ REFRESH … CONCURRENTLY every   ║  ║ v_fleet_km_daily  v_ingestion_health ║
  ║ (auto refresh    ║  ║ VTRIPS_REFRESH_INTERVAL_S=300  ║  ║ v_vehicles_not_moved_today           ║
  ║  policy)         ║  ║ by dashboard_api bg asyncio    ║  ║ v_driver_aggregates_daily            ║
  ╚═════════╤════════╝  ║ loop, pg advisory-lock 920145  ║  ║ v_fleet_trace                        ║
            │           ║   │                            ║  ║ v_driver_clock_daily/_today/_attnd   ║
            │           ║   └──▶ reporting.refresh_log   ║  ╚══════════════════╤═══════════════════╝
            │           ╚═══════════════╤════════════════╝                     │
            │                           │                                      │
            │                           ▼                                      │
            │   ╔════════════════════════════════════════════╗                 │
            │   ║ L3 · REPORTING VIEWS (← v_trips matview)   ║                 │
            │   ║ v_filter_vehicles      v_filter_drivers    ║                 │
            │   ║ v_filter_cost_centres  v_filter_cities     ║                 │
            │   ║ v_daily/weekly/monthly_summary             ║                 │
            │   ║ v_*_cost_centre        v_trips_today       ║                 │
            │   ║ v_live_positions (view, ← live_positions)  ║                 │
            │   ╚═══════════════════════╤════════════════════╝                 │
            │                           │                                      │
            │                           ▼                                      │
            │  ╔══════════════════════════════════════════════════════╗        │
            │  ║ L4 · FUNCTIONS (schema reporting, API entrypoints)   ║        │
            │  ║                                                      ║        │
            │  ║ fn_live_positions(cost_centre, acc_status)           ║        │
            │  ║     ◀── live_positions + devices (L1)                ║        │
            │  ║ fn_vehicle_track(vehicle_number, hours)              ║        │
            │  ║     ◀── position_history (L1 hypertable trail)       ║        │
            │  ║ fn_trips_for_map(veh[],driver,cc,city,start,end)     ║        │
            │  ║     ◀── reporting.v_trips (matview)                  ║        │
            │  ║ normalize_plate(p) ── helper used by the above       ║        │
            │  ╚════════════════════════╤═════════════════════════════╝        │
            │                           │                                      │
            ▼                           ▼                                      ▼
  (cagg currently         ╔════════════════════════════════════════════════════════════════════════╗
   unconsumed:            ║ L5 · CONSUMERS                                                         ║
   no panel/API yet)      ║                                                                        ║
                          ║ dashboard_api_rev.py (FastAPI :8890, ×2 workers)                       ║
                          ║   POST /webhook/fleet-dashboard      → fn_trips_for_map + v_filter_*   ║
                          ║   GET  /webhook/live-positions       → fn_live_positions               ║
                          ║   GET  /webhook/live-positions/track → fn_vehicle_track                ║
                          ║      │                                                                 ║
                          ║      ▼ HTTPS (fleetapi.rahamafresh.com)                                ║
                          ║ SPAs: liveposition.* · fleetintelligence.*                             ║
                          ║       (rustfs/S3 single-file maps)                                     ║
                          ║                                                                        ║
                          ║ Grafana ──SQL──▶ tracksolid.v_* (L3 right)                             ║
                          ╚════════════════════════════════════════════════════════════════════════╝

  ┌─ REMOVED 2026-06-05 · purged via migrations 12 / 13 (never implemented) ───────────────────────┐
  │ ops.*      : tickets, dispatch_log, service_log, odometer_readings,                            │
  │              cost_rates, kpi_targets, vw_service_forecast                                      │
  │ dwh_gold.* : dim_vehicles, fact_daily_fleet_metrics, refresh_daily_metrics()                   │
  │ tracksolid : v_sla_inflight, v_utilisation_daily (+ their Grafana panels removed)              │
  │ → Schemas dropped from prod. The separate tracksolid_dwh server is unrelated/untouched.        │
  └────────────────────────────────────────────────────────────────────────────────────────────────┘
```
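
The route-to-function mapping in the L5 box can be written down as data. The sketch below is illustrative only: the route paths and parameter names are copied from the diagram, while `build_call()`, the `$n` placeholder style and the sample plate are assumptions rather than the real `dashboard_api_rev.py` code. It also leaves out the `v_filter_*` reads that the fleet-dashboard route performs.

```python
"""Map the three dashboard_api routes to the reporting.* calls behind them.

Assumptions: asyncpg-style `$n` placeholders and the `build_call()` helper
are hypothetical; only paths and signatures come from the diagram.
"""

# function name -> ordered parameter names, as listed under L4
FUNCTIONS = {
    "fn_live_positions": ("cost_centre", "acc_status"),
    "fn_vehicle_track": ("vehicle_number", "hours"),
    "fn_trips_for_map": ("veh", "driver", "cc", "city", "start", "end"),
}

# (method, path) -> function, as listed under L5
ROUTES = {
    ("POST", "/webhook/fleet-dashboard"): "fn_trips_for_map",
    ("GET", "/webhook/live-positions"): "fn_live_positions",
    ("GET", "/webhook/live-positions/track"): "fn_vehicle_track",
}


def build_call(method, path, params):
    """Return (sql, args) for the function behind a route.

    Missing parameters are passed as None. Whether the real functions
    treat NULL as "no filter" is an assumption, not a documented fact.
    """
    fn = ROUTES[(method, path)]
    names = FUNCTIONS[fn]
    placeholders = ", ".join(f"${i}" for i in range(1, len(names) + 1))
    sql = f"SELECT * FROM reporting.{fn}({placeholders})"
    args = [params.get(name) for name in names]
    return sql, args


# "KAA 123A" is a made-up plate used only for illustration.
sql, args = build_call(
    "GET", "/webhook/live-positions/track",
    {"vehicle_number": "KAA 123A", "hours": 6},
)
print(sql)   # SELECT * FROM reporting.fn_vehicle_track($1, $2)
print(args)  # ['KAA 123A', 6]
```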
---
## How to read it
- **L1** is where ingestion lands. Only `position_history` (and three empty/planned
hypertables) are TimescaleDB hypertables; `trips`/`alarms` are ordinary tables.
- **Two aggregation mechanisms** run in parallel: the TimescaleDB **continuous aggregate**
`v_mileage_daily_cagg` (refreshed by a Timescale policy off `position_history`), and the
**`reporting.v_trips` matview** (refreshed every 5 min by the in-process `dashboard_api`
loop — the FIX-D02 self-refresher that replaced the dead n8n job).
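- **L4 functions are the API's entrypoints.** SPAs never touch tables directly:
SPA → `dashboard_api` → `fn_*` → (matview / tables). The one direct view read in the
diagrams is `v_filter_*`, which the fleet-dashboard route queries alongside `fn_trips_for_map`.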
- **Grafana** takes the right-hand path, reading the `tracksolid.v_*` analytics views
straight off L1.
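
The matview self-refresher described above might look roughly like this. It is a sketch under stated assumptions, not the FIX-D02 code: `conn` stands for any object with asyncpg-style async `fetchval()` / `execute()` methods, and `refresh_once()`, `refresh_loop()` and `FakeConn` are hypothetical names. Only the lock key, the 300 s interval and the matview name come from this document. Writing to `reporting.refresh_log` is left out because its columns are not described here.

```python
"""Sketch of a self-refreshing matview loop guarded by an advisory lock."""
import asyncio

LOCK_KEY = 920145                 # pg advisory-lock key from this document
VTRIPS_REFRESH_INTERVAL_S = 300   # 5 minutes


async def refresh_once(conn):
    """Refresh reporting.v_trips unless another session holds the lock."""
    got_lock = await conn.fetchval("SELECT pg_try_advisory_lock($1)", LOCK_KEY)
    if not got_lock:
        return False  # someone else is refreshing; skip this cycle
    try:
        # CONCURRENTLY needs a unique index on the matview; the diagram
        # lists ix_v_trips_trip_id as that index.
        await conn.execute(
            "REFRESH MATERIALIZED VIEW CONCURRENTLY reporting.v_trips"
        )
        return True
    finally:
        # Session-level locks must be released on the same connection.
        await conn.fetchval("SELECT pg_advisory_unlock($1)", LOCK_KEY)


async def refresh_loop(conn, interval_s=VTRIPS_REFRESH_INTERVAL_S):
    """Run refresh_once() forever, sleeping interval_s between attempts."""
    while True:
        await refresh_once(conn)
        await asyncio.sleep(interval_s)


class FakeConn:
    """In-memory stand-in so the sketch runs without a database."""

    def __init__(self, lock_free=True):
        self.lock_free = lock_free
        self.statements = []

    async def fetchval(self, sql, *args):
        self.statements.append(sql)
        return self.lock_free if "try_advisory_lock" in sql else True

    async def execute(self, sql, *args):
        self.statements.append(sql)


free, busy = FakeConn(lock_free=True), FakeConn(lock_free=False)
print(asyncio.run(refresh_once(free)))  # True: lock taken, matview refreshed
print(asyncio.run(refresh_once(busy)))  # False: lock held elsewhere, skipped
```

Using `pg_try_advisory_lock` rather than the blocking variant means a worker that loses the race simply skips the cycle, which presumably matters because the API runs two workers.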
## Caveats / housekeeping notes
1. **`v_mileage_daily_cagg` has no consumer** — nothing in the API or Grafana reads it. It
is a live continuous aggregate maintaining itself for nobody. Candidate for removal or
wiring into a panel.
2. **`ops` and `dwh_gold` were purged** (2026-06-05, migrations 12/13) along with
`v_utilisation_daily`, `v_sla_inflight`, and their Grafana panels — the features were
never implemented.
3. See the companion housekeeping audit (2026-06-05) for the full unused-object list; the
only clean ad-hoc drop is `public.trips_viz_v1`.
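
Caveat 1 can be checked mechanically against the diagram. The sketch below restates the Mermaid edges from L1 downward (node ids as used in the flowchart) and asks whether each of the two aggregation nodes reaches a consumer; `reaches_consumer()` is a hypothetical helper, and the result reflects the diagram only, not a live dependency scan.

```python
"""Check which aggregation nodes in the diagram reach a consumer."""

# Solid edges from the Mermaid flowchart, L1 downward.
EDGES = {
    "LP": ["VLP", "GV", "FLP"],
    "PH": ["CAGG", "GV", "FVT"],
    "TR": ["VT", "GV"],
    "DV": ["VT", "GV", "FLP"],
    "AL": ["GV"],
    "VT": ["RLOG", "FILT", "SUMM", "FTM"],
    "FLP": ["DAPI"],
    "FVT": ["DAPI"],
    "FTM": ["DAPI"],
    "FILT": ["DAPI"],
    "DAPI": ["SPA"],
    "GV": ["GRAF"],
}
CONSUMERS = {"DAPI", "SPA", "GRAF"}  # the L5 nodes


def reaches_consumer(node, seen=None):
    """Depth-first search: True if any path from node ends in L5."""
    seen = set() if seen is None else seen
    if node in CONSUMERS:
        return True
    if node in seen:
        return False
    seen.add(node)
    return any(reaches_consumer(nxt, seen) for nxt in EDGES.get(node, []))


for node in ("CAGG", "VT"):
    print(node, reaches_consumer(node))
# CAGG False
# VT True
```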