fleettickets/docs/260702_audit_report.md

84 lines
4.6 KiB
Markdown
Raw Permalink Normal View History

# fleettickets — Platform Audit Report (2026-07-02)
Part of the 2026-07-02 cross-repo audit (see the tracksolid repo's
`docs/reports/260702_platform_audit_report.md` for the DB/host-wide findings).
Scope here: this repo's code, its migrations, the live `tickets` schema contents,
and the deployed Coolify container. Companion docs: `260702_fix_plan.md`,
`260702_work_done.md`.
---
## Critical
### FT-BUG-01 — Precise location geocodes are never applied at ingest time (and get clobbered)
Verified live: `tickets.geo_locations` holds **808 geocoded locations**, yet only
**51 of 114k geocoded INC tickets** have `geo_source='location'` — and **0 of 42k CRQ**.
Two compounding defects:
1. The `tg_ticket_geom` trigger (migration 01) resolves feed coords → **cluster
centroid** → none. It **never consults `tickets.geo_locations`**, so a freshly
upserted ticket can only ever get cluster precision.
2. `tickets.resolve_ticket_geoms()` (which does honour geo_locations) only runs at
the end of the manual `--geocode-*` commands. Every 20-minute delta ingest
re-upserts changed rows, the trigger recomputes geom, and any previously
resolved `'location'` geom is silently downgraded back to `'cluster'`.
Net effect: the LocationIQ budget spent geocoding 808 locations produces almost no
map precision. The "cluster-gazetteer geocoding is the critical path" goal is
effectively defeated by the write path.
### FT-SEC-01 — Live container connects as the postgres superuser over the PUBLIC port
The deployed container's `DATABASE_URL` is
`postgres://postgres:…@twala.rahamafresh.com:5433/tracksolid_db` — the superuser,
over the internet-exposed host port, without TLS. The repo's own `.env.example`
prescribes `tracksolid_owner@timescale_db:5432` (internal Docker network). This
must be fixed in the Coolify app env **before** the tracksolid stack's new
loopback-only port binding deploys, or ingestion breaks. (Operational — see plan
Phase B.)
## High
### FT-SEC-02 — Unpinned image builds, root runtime
`Dockerfile` does `pip install .` from `pyproject.toml` version ranges; `uv.lock`
exists but is **excluded by `.dockerignore`**, so builds are unpinned and
unreproducible. The container also runs as root (no `USER`).
### FT-BUG-02 — New clusters/locations only geocode when someone remembers to run the command
`--geocode-clusters` / `--geocode-locations` are manual. Live data shows the lag:
156 INC + 71 CRQ tickets sit at `geo_source='none'`, and location coverage is 808
keys vs thousands of distinct location_names. Both functions are already
incremental (NOT EXISTS guards), so chaining a cluster-geocode pass into the
scheduled ingest is nearly free (0 API calls on a quiet run).
## Medium / notes
- **FT-PERF-01** — `tickets.inc` = 766 MB / `tickets.crq` = 486 MB, dominated by the
`raw` jsonb (avg 754 B/row) and **134 MB of never-scanned geo indexes**
(`ix_inc_geog` 83 MB, `ix_inc_geom` 51 MB, `ix_crq_geom` 49 MB — 0 scans; the map
path reads `fn_tickets_for_map`, which planner-serves off the raw/actionable
indexes). A separate spawned task is already preparing the index-drop migration —
not duplicated here. **Numbering note:** that task landed `17_drop_unused_geo_indexes.sql`; this
audit's trigger fix is therefore numbered `18_trigger_location_geom.sql`.
- **FT-OPS-01** — `20260618_bug.txt` (untracked, repo root) reviews a file
(`import_tickets.py`) that no longer exists; its live findings (ETag skip,
meta-outside-txn) were fixed in the pipeline rewrite. Stale — remove.
- **FT-OPS-02** — migration numbering gap (no `11_*.sql`) — harmless (lexical
ordering), just don't reuse 11.
- **FT-OPS-03** — `run_migrations.py` is manual (not run on container start, unlike
the tracksolid stack). Acceptable for a batch worker, but document it in deploys.
- **FT-NOTE-01** — straggler archiving: when nothing is pending, `ingest()` archives
*every* file still in `changes/` (including pre-watermark leftovers) — files are
moved to `processed/`, not destroyed, so this is acceptable; noted for awareness.
- **FT-NOTE-02** — the geocode loops open a fresh psycopg2 connection per written
row (`shared.get_conn` has no pool). Fine at batch cadence; revisit only if
geocode volumes grow 10×.
## In good shape
- The CDC drain design is solid: per-file watermark + archive in one txn each,
crash-safe resume, `--reseed` for bucket cutovers, raw-first schema resilient to
source drift.
- `capture_history()` gives durable closure/backlog history despite the
current-state upsert model.
- Sentinel/alarm-row filtering, place extraction, and the viewbox+distance-guarded
two-pass geocoder are careful, well-commented work.