diff --git a/web/status.html b/web/status.html
new file mode 100644
index 0000000..d6b4c76
--- /dev/null
+++ b/web/status.html
@@ -0,0 +1,562 @@
Fleet-platform · status · 2026-05-28
+ + + +
+ + +

Fleet-platform status — week 1 of 5

+

Greenfield rebuild of the Fireside telematics stack · Solo engineer · Started 2026-05-22

+ +
+
Where we are
All Week-1 and Week-2 work is shipped except the Coolify rollback smoke test (item 05, still pending). Week-3 is 80% done: only ntfy alerts, the parity check, and the 7-day soak remain.
Two large scope additions (trip detection backend + UI, multi-vehicle overlay) have landed on top of the original plan,
and the platform was migrated to a fully Coolify-managed deployment on 2026-05-27.
+ + +

Headline metrics

+
+
+
Vehicles live
+
~144
+
across 4 Tracksolid subaccounts
+
+
+
Polling cadence
+
30 s
+
main poll · 10 min stale sweep
+
+
+
SQL migrations
+
20
+
forward-only, all applied
+
+
+
Live URL
+ +
JWT login at /login.html
+
+
+ + +

Timeline & pacing

+ + + + + + + + +
Phase | Window | Status | Notes
P1: Foundation + live tracking | weeks 1-5 | In progress | Day 6 of 35. Well ahead of plan.
P2: Trips + history + geocoding | weeks 6-8 | Partially front-loaded | Trip detection + reverse-geocoder already shipped in P1.
P3: Operations tooling + cutover | weeks 7-9 | Not started | Push cut-over (see the Push-receiver cut-over plan section of this doc) is the entry point.
P4: Driver KPIs + cost allocation | weeks 10-12 | Not started | Depends on P3 driver roster.
+ + +

Capabilities live today

+ +

Live map

+ + +

Filters (multi-select dropdowns)

+ + +

Trip dock (click any vehicle)

+ + +

Auth + API

+ + + +

Architecture

+ +

One FastAPI image, three container roles selected by APP_ROLE. Fate isolation is the point — a heavy report in the worker doesn't stall the gateway.

+ +
Tracksolid Pro API               Browser
        │                           │
        │ poll every 30s            │ HTTPS · JWT
        ▼                           ▼
   ┌─────────┐                 ┌─────────┐
   │  cron   │                 │ gateway │
   │ (poll)  │                 │ (HTTP)  │
   └────┬────┘                 └────┬────┘
        │                           │
        └──────────┐ ┌──────────────┘
                   ▼ ▼
             ┌──────────────┐  LISTEN events_raw_new
             │  events.raw  │ ──────────────────────┐
             └──────┬───────┘                       │
                    │                               ▼
                    ▼                          ┌────────┐
             ┌──────────────┐                  │ worker │
             │ events.parsed│ ◀───────parse────┤(parser)│
             └──────┬───────┘                  └────────┘
                    │
                    ▼ project (single writer)
             ┌────────────────────────┐
             │ state.live_positions   │
             │ state.position_history │
             └────────────────────────┘
                    │
                    ▼ read
             serve.fn_live_view
             serve.fn_vehicle_trips
                    │
                    ▼
             Dashboard (browser)
+ + + + + + + + +
Role | Container | Workload
gateway | fleet-platform-gateway | HTTP: push receivers, dashboard read API, JWT issuance, static UI
worker | fleet-platform-worker | LISTEN events_raw_new → parser → projectors (single-writer)
cron | fleet-platform-cron | APScheduler: polling (30s/10m), reverse geocoder (30s), SLO worker (60s), contract checker (daily 02:00 UTC)
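A minimal sketch of how one image could pick its role from APP_ROLE. The three entrypoint functions and their return strings are hypothetical stand-ins; the real module layout is not shown in this document.

```python
import os

# Hypothetical entrypoints, one per container role. The real ones would start
# the HTTP server, the LISTEN loop, and the scheduler respectively.
def run_gateway() -> str:
    return "serving HTTP"

def run_worker() -> str:
    return "listening on events_raw_new"

def run_cron() -> str:
    return "running scheduled jobs"

ROLES = {"gateway": run_gateway, "worker": run_worker, "cron": run_cron}

def select_role(env=None):
    """Return the entrypoint for APP_ROLE, failing fast on a missing or unknown value."""
    env = os.environ if env is None else env
    role = env.get("APP_ROLE", "").strip().lower()
    if role not in ROLES:
        raise SystemExit(f"APP_ROLE must be one of {sorted(ROLES)}, got {role!r}")
    return ROLES[role]
```

Failing fast on an unknown role keeps a misconfigured container from silently starting as the wrong process.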
+ + +

Deployment

+ + + +
+
Migration gotcha
When piping SQL migrations to psql, strip the -- migrate:down section first. psql treats the marker as an ordinary comment, so it would otherwise run the down statements as well. Use:
awk "/^-- migrate:down/{exit} {print}" db/migrations/NNNN.sql \
    | docker exec -i "$PG" psql -U postgres -d fleet_platform -v ON_ERROR_STOP=1
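The same filter can be sketched in Python for use inside a deploy script. The function name is an assumption; the behaviour mirrors the awk one-liner, keeping every line before the first line that starts with -- migrate:down.

```python
def strip_down_section(sql: str) -> str:
    """Return only the lines that precede the first '-- migrate:down' marker."""
    kept = []
    for line in sql.splitlines(keepends=True):
        if line.startswith("-- migrate:down"):
            break  # same as awk's {exit}: stop at the marker
        kept.append(line)
    return "".join(kept)
```

A file without the marker passes through unchanged, so the helper is safe to apply to every migration.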
+
+ + +

Data model

+ +

Layered by purpose, not by feature. Read top-down: events are truth, state is derived.

+ + + + + + + + + + + + +
Schema | Tables | Purpose
events | raw · parsed · parser_errors | Immutable log (hypertable). Every push and every poll lands here verbatim before any interpretation.
state | live_positions · position_history · geocoded_positions | Derived projections. Single-writer (the projector). Rebuildable from events.
domain | accounts · vehicles · devices | Business entities. Auto-provisioned by the projector on first-sight; CSV/admin edits later.
serve | fn_live_view · fn_vehicle_trips · helper fns | Read-side SQL functions. Dashboard payloads are computed here, not in Python.
slo | targets · measurements · v_current_status | SLO-as-data. Worker writes measurements every 60s. UI surface removed at user request; data still populates.
ops | contract_check_log | Daily Tracksolid contract probe log; drives the contract_drift_days SLO.
auth | accounts · tokens | JWT issuance + scope.
+ + +

API endpoints

+ + + + + + + + + + + +
Method | Path | Auth | What it returns
POST | /api/auth/token | Form login | JWT access + refresh
GET | /api/views/live | read:fleet | FleetNow counters + GeoJSON of all active vehicles + SLO snapshot
GET | /api/views/vehicle/{id}/trips?date=YYYY-MM-DD | read:fleet | Per-day trip breakdown (totals + trips[] with paths + stops)
GET | /api/views/vehicle/{id}/trips.csv?date=YYYY-MM-DD | read:fleet | One row per trip, downloadable
POST | /push/jimi/{pushgps,pushalarm,pushhb,pushobd,…} | Shared token (form body) | Verbatim INSERT into events.raw + NOTIFY. Receivers built; Tracksolid still pushes to the legacy URL. See push cut-over plan.
GET | /health/{gateway,worker,cron} | Open | Container + DB liveness
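A small client-side sketch of the two trip endpoints. Two things here are assumptions rather than facts from this page: that the read API is served from the same host as the push URLs (api.rahamafresh.com), and that the JWT is sent as a standard Bearer token.

```python
from datetime import date
from urllib.parse import quote, urlencode

# Assumption: the gateway host matches the one used in the push URLs.
BASE_URL = "https://api.rahamafresh.com"

def trips_url(vehicle_id, day: date, csv: bool = False) -> str:
    """Build the per-day trips URL (JSON by default, CSV on request)."""
    suffix = "trips.csv" if csv else "trips"
    path = f"/api/views/vehicle/{quote(str(vehicle_id), safe='')}/{suffix}"
    return f"{BASE_URL}{path}?{urlencode({'date': day.isoformat()})}"

def auth_headers(access_token: str) -> dict:
    """Assumption: the access token is presented as 'Authorization: Bearer <jwt>'."""
    return {"Authorization": f"Bearer {access_token}"}
```

For example, trips_url(42, date(2026, 5, 27), csv=True) points at the downloadable per-trip CSV for that day.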
+ + +

Migrations history

+ + + + + + + + + + + + + + + + + + + + + + +
# | File | What it adds
01 | init_schemas | Schemas + Postgres extensions (Timescale, PostGIS)
02 | events | events.raw / parsed / parser_errors hypertables + NOTIFY triggers
03 | domain | accounts, vehicles, devices
04 | state_live | live_positions + position_history hypertable
05 | slo | slo.targets / measurements / v_current_status
06 | auth | auth.accounts (bcrypt) + tokens
07, 09, 11 | serve_fn_live_view v1→v3 | Dashboard read function, evolved with each UI iteration
08 | live_positions_richer | Added mc_type, mileage, gps_signal, satellites, device_name, pos_type
10 | geocoded_positions | Nominatim cache table
12 | label_short_from_plate | serve._label_short: plate-tail extraction
13 | driver_from_device_name | serve._driver_name: heuristic driver-name parse
14 | real_plates_consolidate | One-shot dedup of plate-equivalent vehicle rows
15-16 | CSV import | Removed; rolled back by mig 17
17 | rollback_csv_import | Full CSV revert (re-split vehicles, drop CSV cols, restore fn_live_view v3)
18 | ops_contract_check_log | Daily Tracksolid endpoint probe log
19 | fn_vehicle_trips | PL/pgSQL state machine for trip detection
20 | normalize_assigned_city | Data hygiene: collapsed Nairobi/nairobi
+ + +

Trip detection algorithm

+ +

Trip detection is one server-side function, serve.fn_vehicle_trips(vehicle_id, date_eat), which makes a single forward pass over the day's rows in state.position_history.

+ +

Rules

+ + +

Calibration vs legacy DB

+ + + + + + + + + + +
Vehicle | Pattern | Legacy trips | New algo | Verdict
KDE 638J | full day, clean reporting | 15 | 15 | Perfect alignment
KDK 728K | half day, noisy stop-and-go | 33 | 9 | Cleaner: legacy over-segments traffic stops
KMGW 538W | half day | 20 | 8 | Legacy splits on sub-minute gaps
KDB 585E | busy day, many short trips | 21 | 18 | Close: most boundaries match
KDV 683Z | moderate | 13 | 7 | Same pattern as 538W
+ +

Both 5-minute thresholds (work stop and no-fix stop) are locked in. The simulation tool at scripts/simulate_trips_from_legacy.py replays any legacy JSON dump through the algorithm offline.
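To make the two thresholds concrete, here is a rough Python sketch of a single forward pass that closes a trip on either condition. It is not the PL/pgSQL in serve.fn_vehicle_trips: the Fix type, the 5 km/h moving cut-off, and the reason labels are all assumptions made for illustration, and ACC data is ignored entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STOP_THRESHOLD = timedelta(minutes=5)  # work stop and no-fix stop, per this document
MOVING_KMH = 5.0                       # hypothetical speed cut-off, not from the source

@dataclass
class Fix:
    ts: datetime
    speed_kmh: float

def detect_trips(fixes):
    """Single forward pass over time-sorted fixes; returns (start, end, reason) tuples."""
    trips = []
    start = last_moving = prev_ts = None
    for fix in fixes:
        if start is not None:
            gap = fix.ts - prev_ts        # silence since the previous fix
            idle = fix.ts - last_moving   # time since the vehicle last moved
            if gap >= STOP_THRESHOLD:
                trips.append((start, last_moving, "no_fix"))
                start = None
            elif idle >= STOP_THRESHOLD:
                trips.append((start, last_moving, "work_stop"))
                start = None
        if fix.speed_kmh >= MOVING_KMH:
            if start is None:
                start = fix.ts            # first moving fix opens a new trip
            last_moving = fix.ts
        prev_ts = fix.ts
    if start is not None:
        trips.append((start, last_moving, "end_of_data"))
    return trips
```

In this sketch a two-minute traffic stop stays inside the trip, which is the behaviour the calibration table describes as "cleaner" than the legacy segmentation.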

+ + +

Known issues & follow-ups

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Issue | Severity | Status | Notes
Polyline straight-line artifacts between fixes | Visual | Mitigated | Dropped polling 60s→30s. Permanent fix is push cut-over (denser stream) or map-matching (OSRM/Valhalla); both deferred
APP_GIT_SHA shows unknown in containers | Cosmetic | Open | Coolify isn't injecting SOURCE_COMMIT; need compose tweak
Some vehicles report has_acc_data=false | Data quality | Accepted | Algorithm falls back to speed-only detection; flagged in response
state.position_history has no (imei, occurred_at) unique constraint | Latent | Address before push cut-over | Bites only if push + polling overlap; needed for ingest idempotency. See push plan
Auto-deploy webhook not wired Forgejo → Coolify | DX | Open | Manual "Redeploy" click required after each push
+ + +

Roadmap

+ +

P1 — remaining

+ + + + + + + + +
# | Item | Status | Effort
05 | Coolify rollback smoke test | Pending | ~1 h
14 | ntfy.sh container + SLO breach alerts | Pending | ~half day
16 | parity_check.py vs legacy DB | Pending | ~half day
17 | 7-day soak + dispatcher sign-off | Pending | 7 days calendar
+ +

P2 — next

+ + +

P3 — operations + cut-over

+ + +

P4 — driver KPIs + cost allocation

+ + + +

Push-receiver cut-over plan (P3)

+ +

Currently Tracksolid posts to a legacy endpoint at https://tshook.rahamafresh.com/pushalarm (a separate project that no longer benefits us). We poll every 30 s as a workaround. The cut-over moves us off polling and onto real-time push.

+ +

Why it's worth doing

+ + + + + + + + + +
Today (polling) | After (push)
Up to ~30 s of polling lag from event to dashboard | ~1-5 s (push is event-driven)
~1 fix/min/vehicle when stationary, ~2/min when moving | ~5-15 s between fixes on motion; immediate for ACC/alarm events
Polyline cuts straight across roads (low-density fixes) | Polyline traces actual movement (high-density fixes)
Alarms (ACC ON/OFF, SOS, geofence) buried in the polled snapshot | Alarms arrive as their own typed events instantly
~4 Tracksolid API calls every 30 s = 11,520/day | Zero outbound API calls for the main fix stream
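The 11,520/day figure is four calls per poll times 2,880 polls per day. A short check, which also shows what the 10-minute safety-net cadence from the cut-over steps would cost:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def api_calls_per_day(interval_s: int, calls_per_poll: int = 4) -> int:
    """Outbound Tracksolid API calls per day at a fixed polling interval."""
    return calls_per_poll * SECONDS_PER_DAY // interval_s

print(api_calls_per_day(30))   # current 30 s cadence: 11520
print(api_calls_per_day(600))  # 10 min safety-net cadence: 576
```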
+ +

What's already in place

+ + +

What needs to happen (in order)

+ +
    +
  1. Add dedup to state.position_history (migration 21). Add a unique index on (imei, occurred_at), the key named under Known issues, and have the projector insert with ON CONFLICT DO NOTHING. Assuming source records the ingest path (push or poll), leave it out of the key: a fix that arrives through both paths would carry two different source values, so a key that includes source would keep both copies. Without dedup, push + polling overlap will duplicate fixes during the mirror window and inflate trip distances.
  2. Synthetic-payload smoke test: curl a realistic Tracksolid push body at each receiver, then confirm that an events.raw row appears, the parser produces an events.parsed row, and the projector updates state.live_positions. This validates the path end-to-end before depending on real traffic.
  3. Tracksolid console: add the new URL alongside the legacy URL. This is a vendor-portal step, done by whoever manages the Fireside Tracksolid account. The exact URL list to paste:
     https://api.rahamafresh.com/push/jimi/pushgps
     https://api.rahamafresh.com/push/jimi/pushalarm
     https://api.rahamafresh.com/push/jimi/pushhb
     https://api.rahamafresh.com/push/jimi/pushevent
     https://api.rahamafresh.com/push/jimi/pushobd
     https://api.rahamafresh.com/push/jimi/pushfaultinfo
     https://api.rahamafresh.com/push/jimi/pushtripreport
     Token: the value of TRACKSOLID_PUSH_TOKEN (set in Coolify env).
  4. Mirror window (≥3 days): both push and polling run. Compare daily counts per IMEI between push-derived and poll-derived fixes. Watch for: parser errors, auth failures, payload-shape surprises, dedup hit rate.
  5. Cut polling cadence: once mirror data shows push is delivering >95% of fixes, drop main polling from 30 s → 10 min as a sparse safety net (or disable entirely). Keep the stale-IMEI sweep for offline recovery.
  6. Tracksolid console: remove legacy URL. Once dispatchers confirm the new dashboard is showing identical or better real-time data, drop the legacy URL from Tracksolid. Keep a hot standby on our side for 48 h as fallback.
  7. Decommission legacy receiver project: final step; the old project at tshook.rahamafresh.com can be shut down.
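The dedup step can be illustrated with an in-memory SQLite table standing in for state.position_history. The column list is an assumption, and the real migration would target Postgres/Timescale. The sketch keys the unique index on (imei, occurred_at), as in the Known issues entry, so that the push copy and the poll copy of one fix collapse into a single row.

```python
import sqlite3

# SQLite stand-in for state.position_history; columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE position_history "
    "(imei TEXT, occurred_at TEXT, source TEXT, lat REAL, lon REAL)"
)
conn.execute("CREATE UNIQUE INDEX ux_fix ON position_history (imei, occurred_at)")

def insert_fix(imei, occurred_at, source, lat, lon):
    """Insert a fix; a second arrival of the same (imei, occurred_at) is skipped."""
    conn.execute(
        "INSERT INTO position_history VALUES (?, ?, ?, ?, ?) "
        "ON CONFLICT (imei, occurred_at) DO NOTHING",
        (imei, occurred_at, source, lat, lon),
    )

insert_fix("860000000000001", "2026-05-27T08:00:00Z", "poll", -1.29, 36.82)
insert_fix("860000000000001", "2026-05-27T08:00:00Z", "push", -1.29, 36.82)  # duplicate, skipped
insert_fix("860000000000001", "2026-05-27T08:00:10Z", "push", -1.30, 36.83)
```

Whichever path delivers a fix first wins, which makes the insert idempotent during the mirror window.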
+ +
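One possible reading of the ">95% of fixes" gate in the cut-over steps is the share of poll-observed fixes that push also delivered. The sketch below computes that per IMEI and overall from two lists of (imei, occurred_at) pairs. The function names and the input shape are assumptions, and the pairs would need to come from a source that still distinguishes the ingest path, such as events.parsed.

```python
from collections import defaultdict

def push_coverage(poll_fixes, push_fixes):
    """Share of polled fixes also delivered by push: (per-IMEI dict, overall ratio)."""
    push_seen = set(push_fixes)
    total = defaultdict(int)
    matched = defaultdict(int)
    for imei, occurred_at in set(poll_fixes):
        total[imei] += 1
        if (imei, occurred_at) in push_seen:
            matched[imei] += 1
    per_imei = {imei: matched[imei] / total[imei] for imei in total}
    overall = sum(matched.values()) / sum(total.values()) if total else 0.0
    return per_imei, overall

def ready_to_cut(overall: float, threshold: float = 0.95) -> bool:
    """True once push coverage exceeds the threshold named in the plan."""
    return overall > threshold
```

Looking at the per-IMEI ratios as well as the overall number helps catch a single device whose pushes are failing while the fleet average still looks healthy.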

What needs decisions before starting

+ + +

Expected outcomes

+ + + +

Decisions log (significant ones)

+ + + + + + + + + + + + + + + +
Date | Decision | Rationale
2026-05-22 | Greenfield rebuild, no legacy reuse | Branch divergence + race conditions in legacy made incremental patching unviable
2026-05-23 | Three container roles from one image | Fate isolation without microservices overhead
2026-05-24 | CSV roster import | To enrich devices with real plates/drivers/cost-centres
2026-05-25 | CSV import fully rolled back | Suffix-merge regression dropped vehicle count 144 → 124; underlying merge problem must be solved before any retry
2026-05-26 | 5-min thresholds (work stop, no-fix stop) | Calibrated against 5 legacy report dumps; matches dispatcher mental model on clean data, cleaner on noisy data
2026-05-27 | Migrate from manual docker run → Coolify Compose | Ad-hoc deploys were brittle; needed permanent infrastructure
2026-05-27 | Polling 60 s → 30 s | Mitigation for sparse polyline artifacts pending push cut-over
2026-05-27 | Remove SLO panel from top bar | User preference: backend still computes, UI just hides
2026-05-27 | Light Carto Positron basemap + HQ POI | Higher contrast for cost-centre marker tints; reference landmark
2026-05-27 | Per-trip colour coding in single-vehicle mode | Trip cards ↔ map polylines pair visually at a glance
+ +

— end —

+ +
+
+ +