david kiania 5f5d71d500 feat(crq): add CRQ ingestion via shared engine + thin inc/crq entrypoints

Split the INC-only loader into a dataset-agnostic engine (pipeline.py, renamed
from import_tickets.py) parameterized by a Dataset config, with thin per-type
entrypoints inc/import_inc.py and crq/import_crq.py. CRQ shares INC's identical
32-column source schema and CDC change stream, so the engine is fully shared.

- pipeline.py: Dataset config (name/table/prefixes/key_regex/post_apply); INC
  keeps the capture_history post-apply hook, CRQ has none yet. geocode_locations
  now unions tickets.crq (geocoding is cross-dataset: one gazetteer/budget).
- crq/import_crq.py: drains automations/crq/changes/ from isptickets into
  tickets.crq (data layer + map; SLA/dashboard/history deferred).
- migrations/13_crq_columns.sql: CRQ mirror of 03 — typed STORED generated
  columns + indexes on tickets.crq (reuses tickets.eat_ts()).
- Deployment: Dockerfile/run_ingest.sh run both via `python -m`; pyproject
  packages inc/crq. Docs (README, implementation, deployment-and-operations,
  n8n export ref, phase-1) updated for the split + the one-time CRQ seed runbook.

tickets.crq already exists (mig 01, LIKE tickets.inc) and is unioned into
reporting.fn_tickets_for_map + resolve_ticket_geoms, so CRQ appears on the
existing Tickets map once seeded. Verified locally: ruff-clean new files, engine
lists/parses both streams against live S3 (crq=52 files, inc unaffected).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-25 23:16:38 +03:00


Implementation record — fleettickets (as built)

What is actually built and deployed, as of the Phase-1 completion. Companion to docs/phase-1-ingestion.md (plan) and docs/phase-2-dashboard.md (next).

Pipeline (`pipeline.py` engine + `inc/`,`crq/` entrypoints)

The dataset-agnostic CDC engine lives in pipeline.py, parameterized by a small Dataset config (name, table, automations/<type>/changes|processed/ prefixes, key regex, optional post_apply hook). Two thin entrypoints supply that config and the CLI: inc/import_inc.py (python -m inc.import_inc, post_apply=capture_history) and crq/import_crq.py (python -m crq.import_crq, no history hook). INC and CRQ share an identical 32-column source schema, so the engine is fully shared; geocoding is cross-dataset (one gazetteer/budget, unions tickets.inc + tickets.crq) and is run from the INC entrypoint.
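
The config described above can be pictured roughly as follows. This is a minimal sketch, not the contents of pipeline.py: the field names, the make_dataset helper, the exact key regex, and the capture_history_placeholder function are all assumptions made for illustration; only the list of things the config carries (name, table, prefixes, key regex, optional post_apply hook) comes from this document.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Dataset:
    """Per-dataset settings handed to the shared engine (field names are illustrative)."""

    name: str
    table: str
    changes_prefix: str
    processed_prefix: str
    key_regex: re.Pattern[str]
    post_apply: Optional[Callable[[], None]] = None


def make_dataset(name: str, post_apply: Optional[Callable[[], None]] = None) -> Dataset:
    """Hypothetical helper: derive table and prefixes from the dataset name."""
    base = f"automations/{name}/"
    return Dataset(
        name=name,
        table=f"tickets.{name}",
        changes_prefix=base + "changes/",
        processed_prefix=base + "processed/",
        # Assumed shape: one CSV per change file, named by its timestamp.
        key_regex=re.compile(rf"^{re.escape(base)}changes/(?P<stamp>[^/]+)\.csv$"),
        post_apply=post_apply,
    )


def capture_history_placeholder() -> None:
    """Stand-in for the real INC hook, which calls tickets.capture_history()."""


INC = make_dataset("inc", post_apply=capture_history_placeholder)
CRQ = make_dataset("crq")  # no history hook yet
```

With a shape like this, each entrypoint only needs to build its Dataset and hand it to the engine, which is why the two entrypoints stay thin.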

Source: the incremental CDC stream automations/<inc|crq>/changes/<EAT-timestamp>.csv in the isptickets S3 bucket (endpoint https://s3.rahamafresh.com, path-style, region us-east-1; was the tickets bucket before the 2026-06-25 cutover).
S3 access via boto3 (no aws-CLI dependency): list_objects_v2 (paginator), get_object, copy_object + delete_object for archiving.
Watermark: drains every changes/ file newer than tickets.import_meta.metadata.source_max_key, oldest→newest; reruns with no new file are a cheap no-op. --reseed ignores the watermark for a one-time bucket cutover.
Cleaning: drop is_alarm=true rows + the EXPORT STOPPED… sentinel; drop week_start/week_end, source_s3_bucket/source_s3_key/source_snapshot_id, department, source_type; normalize region→lowercase, raw_status→UPPERCASE.
Upsert on ticket_id (ON CONFLICT DO UPDATE); rows are never deleted. On success, the processed file(s) move to automations/<inc|crq>/processed/.
Geocoding (keyed LocationIQ): --geocode-clusters (coarse, per cluster) and --geocode-locations (precise, actionable INC; strips network codes; 25 km wrong-city guard). Results cache in tickets.geo_clusters / tickets.geo_locations.
History capture (INC only, via the post_apply hook): after each --apply run, whether it ingested files or skipped, the entrypoint calls tickets.capture_history() → appends new closures + upserts today's backlog snapshot.
CLI (inc): --from-bucket (drain the INC change stream), --reseed (ignore the watermark; one-time bucket cutover), --inc-csv <file> (local dev), --apply (else dry-run), --geocode-clusters, --geocode-locations, --capture-history.
CLI (crq): --from-bucket, --reseed, --crq-csv <file>, --apply (ingest only; geocoding + history are not on the CRQ entrypoint).
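
The watermark and cleaning rules listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: the function names are hypothetical, keys are assumed to sort chronologically as plain strings because the timestamp is in the file name, and the sentinel is assumed to show up as a cell value starting with "EXPORT STOPPED" (the document does not say which column carries it).

```python
from typing import Iterable, Optional

# Columns the document lists as dropped at ingest time.
DROP_COLUMNS = {
    "week_start", "week_end",
    "source_s3_bucket", "source_s3_key", "source_snapshot_id",
    "department", "source_type",
}
SENTINEL_PREFIX = "EXPORT STOPPED"  # assumed to appear as a cell value


def keys_to_drain(keys: Iterable[str], watermark: Optional[str],
                  reseed: bool = False) -> list[str]:
    """Return change-file keys newer than the watermark, oldest first.

    With reseed=True (or no watermark yet) every key is returned.
    """
    ordered = sorted(keys)
    if reseed or watermark is None:
        return ordered
    return [k for k in ordered if k > watermark]


def clean_row(row: dict[str, str]) -> Optional[dict[str, str]]:
    """Return the cleaned row, or None when the row should be dropped."""
    if str(row.get("is_alarm", "")).strip().lower() == "true":
        return None
    if any(str(v).startswith(SENTINEL_PREFIX) for v in row.values()):
        return None
    cleaned = {k: v for k, v in row.items() if k not in DROP_COLUMNS}
    if cleaned.get("region"):
        cleaned["region"] = cleaned["region"].lower()
    if cleaned.get("raw_status"):
        cleaned["raw_status"] = cleaned["raw_status"].upper()
    return cleaned
```

A rerun with no new file yields an empty list from keys_to_drain, which is what makes it a cheap no-op.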

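One plausible reading of the 25 km wrong-city guard is a great-circle distance check between the geocoder's result and a reference point for the ticket's cluster. The sketch below assumes that reading; the function names, the choice of reference point, and the use of the haversine formula are illustrative and not confirmed by the source.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
MAX_OFFSET_KM = 25.0         # the guard distance named in this document


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accept_geocode(result: tuple[float, float],
                   cluster_centre: tuple[float, float],
                   max_km: float = MAX_OFFSET_KM) -> bool:
    """Accept a (lat, lon) result only if it lies within max_km of the cluster centre."""
    return haversine_km(*result, *cluster_centre) <= max_km
```

Under this reading, a result that lands in a different city from its cluster is rejected rather than cached.
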
Schema / migrations (`tracksolid_db`, applied via `run_migrations.py`)

Migration	What
01_tickets_schema	`tickets.inc`/`crq` (raw-jsonb-first), `geo_clusters`/`geo_locations` gazetteers, geom-resolution trigger, `reporting.fn_tickets_for_map`
02_import_meta	`tickets.import_meta` (snapshot freshness) + `fn_tickets_for_map` `summary.freshness`
03_inc_columns	Unpack `raw` → typed STORED generated columns (text/numeric/bool + EAT→timestamptz via `tickets.eat_ts()`)
04_inc_latlng	`latitude`/`longitude` = `COALESCE(feed, ST_Y/ST_X(geom))` (populated from geocode)
05_inc_geography	`geog geography(Point,4326)` (= `geom::geography`) + GiST index for routing
06_inc_mttr_minutes	`mttr` → integer minutes; drop constant `is_alarm`/`is_auto_created`/`is_auto_closed`
07_inc_drop_service_type	drop constant `service_type`
08_inc_open_sla_view	`tickets.inc_open_sla` view (open tickets + derived SLA)
09_inc_dashboard_fn	built — `reporting.fn_inc_dashboard(cluster, status, window, from, to)`: one JSON payload (open GeoJSON + windowed closed GeoJSON + metrics + freshness) for the FleetOps live INC map. See `docs/phase-2-dashboard.md`
10_inc_history_capture	built — `tickets.closure_events` (append-only observed closures) + `tickets.inc_daily_snapshot` (per-EAT-day open backlog + flow) + `tickets.capture_history()`; the ingest calls it each `--apply` run. Unlocks backlog-over-time
12_inc_dashboard_by_owner	built — owner/team breakdown extension to `fn_inc_dashboard`
13_crq_columns	built — CRQ mirror of `03`: typed STORED generated columns + indexes on `tickets.crq` (reuses `tickets.eat_ts()`). Data-layer parity for the CRQ tab

tickets.inc columns: ticket_id (PK), raw (jsonb, source of truth), normalized_status/raw_status, bucket, is_actionable, cluster/region/location_name, assigned_team/owner, sla_status, mttr (min), created_at_service/scheduled_at/closed_at/first_seen_at/last_seen_at/source_created_at/source_updated_at (timestamptz), latitude/longitude, geom/geog/geo_source, ingested_at. Dropped-but-in-raw: service_type, is_alarm, is_auto_created, is_auto_closed, and the ingest-time drops.

Deployment

Coolify app built from this repo's Dockerfile (python:3.12-slim, TZ=Africa/Nairobi, keep-alive tail -f /dev/null). Separate from the FleetOps web app (fleet-ops-staging).
Scheduled Tasks (two): inc_tickets → python -m inc.import_inc --from-bucket --apply and crq_tickets → python -m crq.import_crq --from-bucket --apply, both cron */20 6-20 * * * in EAT (Coolify runs tasks in EAT — no UTC conversion).
Env vars (Coolify): DATABASE_URL (internal DB host), RUSTFS_* (isptickets bucket — serves both inc + crq), GEOCODER_*.
For a plain host/VM, run_ingest.sh + a crontab line is the alternative.
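
To make the schedule concrete, the cron expression */20 6-20 * * * can be expanded by hand: every 20 minutes within hours 6 through 20 inclusive. The short sketch below enumerates those local (EAT) fire times; the fire_times helper is illustrative and only models this one expression, not cron syntax in general.

```python
from datetime import time


def fire_times(minute_step: int = 20, first_hour: int = 6, last_hour: int = 20) -> list[time]:
    """Expand '*/step first-last * * *' into the daily list of fire times."""
    return [
        time(hour, minute)
        for hour in range(first_hour, last_hour + 1)  # cron hour ranges are inclusive
        for minute in range(0, 60, minute_step)
    ]


runs = fire_times()
print(len(runs), runs[0], runs[-1])  # 45 06:00:00 20:40:00
```

Each task therefore runs 45 times a day, from 06:00 to 20:40 EAT.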

Full ops runbook (env management, the Forgejo → Coolify auto-deploy webhook, manual deploys, bucket cutover, verification): docs/deployment-and-operations.md.

State at hand-off

tickets.inc ≈ 21,312 rows (current non-alarm INC + a few aged-out history rows); 0 alarm / 0 sentinel (legacy rows cleaned up one-time).
Geocoding ~99.99% (geom on all but 1 null-cluster ticket); QOA/PTMP cluster codes mapped to Quarry Road / Pipeline.
Read path verified: reporting.fn_tickets_for_map() + tickets.inc_open_sla.

Data-quality caveats (must inform analytics)

Source sla_status is only meaningful once a ticket is closed; SLA for open tickets must be derived (now − created_at_service, falling back to first_seen_at; ~30% lack created_at_service).
mttr is in minutes and null until the ticket closes; it is not wall-clock elapsed time and not a 48h threshold.
Lifecycle timestamps = created_at_service→closed_at; the *_seen_at / source_* ones are export bookkeeping (don't use for SLA/closure-time).
Content lags wall-clock by about 2 days.
History: tickets.inc is current-state (upsert). Closure/creation/MTTR event series work directly; backlog-over-time now accrues via tickets.inc_daily_snapshot + tickets.closure_events (written by tickets.capture_history() each ingest) — builds forward from the first capture.
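
The open-ticket derivation in the first caveat can be sketched as below. It is an illustration only: the open_age function and its return shape are hypothetical, and the actual logic lives in the tickets.inc_open_sla view, which is not shown here. The sketch deliberately returns an age rather than a breach flag, since this document does not state an SLA threshold.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def open_age(now: datetime,
             created_at_service: Optional[datetime],
             first_seen_at: Optional[datetime]) -> tuple[Optional[timedelta], Optional[str]]:
    """Return (age, source column) for an open ticket.

    Prefers created_at_service and falls back to first_seen_at when it is null.
    All datetimes are expected to be timezone-aware.
    """
    if created_at_service is not None:
        return now - created_at_service, "created_at_service"
    if first_seen_at is not None:
        return now - first_seen_at, "first_seen_at"
    return None, None


# Usage: a ticket with no created_at_service falls back to first_seen_at.
now = datetime(2026, 6, 25, 12, 0, tzinfo=timezone.utc)
age, source = open_age(now, None, datetime(2026, 6, 24, 12, 0, tzinfo=timezone.utc))
```

Returning the source column alongside the age lets analytics separate ages measured from the lifecycle timestamp from those measured from export bookkeeping.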

Roadmap

Phase 2 (built): fn_inc_dashboard read-API → FleetOps live map (open + closed overlay + metrics); history capture (closure_events + inc_daily_snapshot) for backlog/closure trends. Remaining: dashboard_api endpoint + FleetOps SPA (other repos; see docs/dashboard-api-contract.md), FleetNow dispatch off geog, team closure attribution.

CRQ (this milestone): the shared engine now feeds tickets.crq from automations/crq/changes/ (crq/import_crq.py), with typed columns (migration 13) and cross-dataset geocoding — CRQ shows on the Tickets map via fn_tickets_for_map (which already unions it) and gets its own FleetOps tab. Deferred to a follow-up once installation-lifecycle semantics are confirmed: the CRQ analogues of migrations 08/09/10 — crq_open_sla, fn_crq_dashboard, and CRQ history capture (tickets.crq currently has no post_apply hook).
