Live-DB reconciliation before seeding CRQ revealed two divergences: - tickets.crq did NOT exist: 01_tickets_schema.sql was applied 2026-06-15 from a version predating its crq section, so the IF-NOT-EXISTS ledger guard has blocked it ever since (fn_tickets_for_map + resolve_ticket_geoms already reference crq, so they errored if called — masked because the live INC view uses fn_inc_dashboard). - The live ledger carries un-versioned 13_inc_search_fn.sql / 14_inc_filter_options.sql (applied 2026-06-19, absent from this repo). So 13_crq_columns.sql (ALTER-only, number 13) is replaced by 15_crq_table.sql, which CREATEs tickets.crq self-containedly (table + geom trigger + raw/typed indexes) and adds the typed STORED generated columns. Deterministic + idempotent on both the live DB (crq missing) and a fresh DB (crq minimal from 01). Numbered 15 to sit after the live ledger's max. Docs/CLI references updated 13->15. Applied + seeded on the live DB out-of-band (running container, INC image untouched): 39,240 crq rows, 99.99% geocoded (cluster + shared location cache), watermark current, crq now renders on fn_tickets_for_map. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
118 lines
7.9 KiB
Markdown
118 lines
7.9 KiB
Markdown
# Implementation record — fleettickets (as built)
|
||
|
||
What is actually built and deployed, as of the Phase-1 completion. Companion to
|
||
`docs/phase-1-ingestion.md` (plan) and `docs/phase-2-dashboard.md` (next).
|
||
|
||
## Pipeline (`pipeline.py` engine + `inc/`,`crq/` entrypoints)
|
||
|
||
The dataset-agnostic CDC engine lives in **`pipeline.py`**, parameterized by a small
|
||
`Dataset` config (name, table, `automations/<type>/changes|processed/` prefixes, key
|
||
regex, optional `post_apply` hook). Two thin entrypoints supply that config and the CLI:
|
||
**`inc/import_inc.py`** (`python -m inc.import_inc`, `post_apply=capture_history`) and
|
||
**`crq/import_crq.py`** (`python -m crq.import_crq`, no history hook). INC and CRQ share an
|
||
**identical 32-column source schema**, so the engine is fully shared; geocoding is
|
||
**cross-dataset** (one gazetteer/budget, unions `tickets.inc` + `tickets.crq`) and is run
|
||
from the INC entrypoint.
|
||
|
||
- **Source:** the incremental CDC stream `automations/<inc|crq>/changes/<EAT-timestamp>.csv`
|
||
in the **`isptickets`** S3 bucket (endpoint `https://s3.rahamafresh.com`, path-style,
|
||
region `us-east-1`; was the `tickets` bucket before the 2026-06-25 cutover).
|
||
- **S3 access via boto3** (no aws-CLI dependency): `list_objects_v2` (paginator),
|
||
`get_object`, `copy_object` + `delete_object` for archiving.
|
||
- **Watermark:** drains every `changes/` file newer than
|
||
`tickets.import_meta.metadata.source_max_key`, oldest→newest; reruns with no new file
|
||
are a cheap no-op. `--reseed` ignores the watermark for a one-time bucket cutover.
|
||
- **Cleaning:** drop `is_alarm=true` rows + the `EXPORT STOPPED…` sentinel; drop
|
||
`week_start`/`week_end`, `source_s3_bucket`/`source_s3_key`/`source_snapshot_id`,
|
||
`department`, `source_type`; normalize `region`→lowercase, `raw_status`→UPPERCASE.
|
||
- **Upsert** on `ticket_id` (`ON CONFLICT DO UPDATE`); never delete. On success,
|
||
**move** processed file(s) → `automations/inc/processed/`.
|
||
- **Geocoding** (keyed LocationIQ): `--geocode-clusters` (coarse, per cluster) and
|
||
`--geocode-locations` (precise, actionable INC; strips network codes; 25 km
|
||
wrong-city guard). Results cache in `tickets.geo_clusters` / `tickets.geo_locations`.
|
||
- **History capture:** after each `--apply` run (ingest or skip), calls
|
||
`tickets.capture_history()` → appends new closures + upserts today's backlog
|
||
snapshot.
|
||
- CLI (`inc`): `--from-bucket` (drain the INC change stream), `--reseed` (ignore the
|
||
watermark; one-time bucket cutover), `--inc-csv <file>` (local dev), `--apply` (else
|
||
dry-run), `--geocode-clusters`, `--geocode-locations`, `--capture-history`.
|
||
- CLI (`crq`): `--from-bucket`, `--reseed`, `--crq-csv <file>`, `--apply` (ingest only;
|
||
geocoding + history are not on the CRQ entrypoint).
|
||
|
||
## Schema / migrations (`tracksolid_db`, applied via `run_migrations.py`)
|
||
|
||
| Migration | What |
|
||
|---|---|
|
||
| 01_tickets_schema | `tickets.inc`/`crq` (raw-jsonb-first), `geo_clusters`/`geo_locations` gazetteers, geom-resolution trigger, `reporting.fn_tickets_for_map` |
|
||
| 02_import_meta | `tickets.import_meta` (snapshot freshness) + `fn_tickets_for_map` `summary.freshness` |
|
||
| 03_inc_columns | Unpack `raw` → typed STORED generated columns (text/numeric/bool + EAT→timestamptz via `tickets.eat_ts()`) |
|
||
| 04_inc_latlng | `latitude`/`longitude` = `COALESCE(feed, ST_Y/ST_X(geom))` (populated from geocode) |
|
||
| 05_inc_geography | `geog geography(Point,4326)` (= `geom::geography`) + GiST index for routing |
|
||
| 06_inc_mttr_minutes | `mttr` → integer **minutes**; drop constant `is_alarm`/`is_auto_created`/`is_auto_closed` |
|
||
| 07_inc_drop_service_type | drop constant `service_type` |
|
||
| 08_inc_open_sla_view | `tickets.inc_open_sla` view (open tickets + derived SLA) |
|
||
| 09_inc_dashboard_fn | **built** — `reporting.fn_inc_dashboard(cluster, status, window, from, to)`: one JSON payload (open GeoJSON + windowed closed GeoJSON + metrics + freshness) for the FleetOps live INC map. See `docs/phase-2-dashboard.md` |
|
||
| 10_inc_history_capture | **built** — `tickets.closure_events` (append-only observed closures) + `tickets.inc_daily_snapshot` (per-EAT-day open backlog + flow) + `tickets.capture_history()`; the ingest calls it each `--apply` run. Unlocks backlog-over-time |
|
||
| 12_inc_dashboard_by_owner | **built** — owner/team breakdown extension to `fn_inc_dashboard` |
|
||
| 15_crq_table | **built** — materializes `tickets.crq` (table + geom trigger + indexes; `01`'s crq section never ran on the live DB) + the typed STORED generated columns from `03` (reuses `tickets.eat_ts()`). Data-layer parity for the CRQ tab |
|
||
|
||
`tickets.inc` columns: `ticket_id` (PK), `raw` (jsonb, source of truth),
|
||
`normalized_status`/`raw_status`, `bucket`, `is_actionable`, `cluster`/`region`/
|
||
`location_name`, `assigned_team`/`owner`, `sla_status`, `mttr` (min),
|
||
`created_at_service`/`scheduled_at`/`closed_at`/`first_seen_at`/`last_seen_at`/
|
||
`source_created_at`/`source_updated_at` (timestamptz), `latitude`/`longitude`,
|
||
`geom`/`geog`/`geo_source`, `ingested_at`. Dropped-but-in-`raw`: `service_type`,
|
||
`is_alarm`, `is_auto_created`, `is_auto_closed`, and the ingest-time drops.
|
||
|
||
## Deployment
|
||
|
||
- **Coolify** app built from this repo's `Dockerfile` (`python:3.12-slim`,
|
||
`TZ=Africa/Nairobi`, keep-alive `tail -f /dev/null`). Separate from the FleetOps
|
||
web app (`fleet-ops-staging`).
|
||
- **Scheduled Tasks (two):** `inc_tickets` → `python -m inc.import_inc --from-bucket
|
||
--apply` and `crq_tickets` → `python -m crq.import_crq --from-bucket --apply`, both cron
|
||
`*/20 6-20 * * *` in **EAT** (Coolify runs tasks in EAT — no UTC conversion).
|
||
- **Env vars** (Coolify): `DATABASE_URL` (internal DB host), `RUSTFS_*`
|
||
(`isptickets` bucket — serves both inc + crq), `GEOCODER_*`.
|
||
- For a plain host/VM, `run_ingest.sh` + a crontab line is the alternative.
|
||
|
||
Full ops runbook (env management, the Forgejo → Coolify auto-deploy webhook, manual
|
||
deploys, bucket cutover, verification): **`docs/deployment-and-operations.md`**.
|
||
|
||
## State at hand-off
|
||
|
||
- `tickets.inc` ≈ 21,312 rows (current non-alarm INC + a few aged-out history rows);
|
||
**0 alarm / 0 sentinel** (legacy rows cleaned up one-time).
|
||
- Geocoding ~**99.99%** (`geom` on all but 1 null-cluster ticket); `QOA`/`PTMP`
|
||
cluster codes mapped to Quarry Road / Pipeline.
|
||
- Read path verified: `reporting.fn_tickets_for_map()` + `tickets.inc_open_sla`.
|
||
|
||
## Data-quality caveats (must inform analytics)
|
||
|
||
- Source `sla_status` only meaningful once **closed**; open SLA must be **derived**
|
||
(`now − created_at_service`, `first_seen_at` fallback; ~30% lack
|
||
`created_at_service`).
|
||
- `mttr` is **minutes**, null until closed; not wall-clock and not a 48h threshold.
|
||
- Lifecycle timestamps = `created_at_service`→`closed_at`; the `*_seen_at` / `source_*`
|
||
ones are export bookkeeping (don't use for SLA/closure-time).
|
||
- Content lag ~2 days behind wall-clock.
|
||
- **History:** `tickets.inc` is current-state (upsert). Closure/creation/MTTR event
|
||
series work directly; **backlog-over-time** now accrues via
|
||
`tickets.inc_daily_snapshot` + `tickets.closure_events` (written by
|
||
`tickets.capture_history()` each ingest) — builds forward from the first capture.
|
||
|
||
## Roadmap
|
||
|
||
Phase 2 (built): `fn_inc_dashboard` read-API → FleetOps live map (open + closed
|
||
overlay + metrics); history capture (`closure_events` + `inc_daily_snapshot`) for
|
||
backlog/closure trends. Remaining: `dashboard_api` endpoint + FleetOps SPA (other
|
||
repos; see `docs/dashboard-api-contract.md`), FleetNow **dispatch** off `geog`,
|
||
**team closure attribution**.
|
||
|
||
**CRQ** (this milestone): the shared engine now feeds `tickets.crq` from
|
||
`automations/crq/changes/` (`crq/import_crq.py`), with the `tickets.crq` table + typed columns (migration 15) and
|
||
cross-dataset geocoding — CRQ shows on the Tickets map via `fn_tickets_for_map` (which
|
||
already unions it) and gets its own FleetOps tab. Deferred to a follow-up once
|
||
installation-lifecycle semantics are confirmed: the CRQ analogues of migrations
|
||
08/09/10 — `crq_open_sla`, `fn_crq_dashboard`, and CRQ history capture (`tickets.crq`
|
||
currently has **no** `post_apply` hook).
|