fleettickets/docs/implementation.md
david kiania 764dee986f feat: history capture — closure_events + daily backlog snapshot (migration 10)
- tickets.closure_events: append-only observed closures (PK ticket_id, closed_at;
  observed_at = first sighting; survives row churn).
- tickets.inc_daily_snapshot: one row per EAT day — open backlog (+ SLA split, by
  cluster/status) and created/closed flow; upserted each run.
- tickets.capture_history(): appends new closures + upserts today's snapshot.
- import_tickets calls it after each --apply run (ingest or skip); add
  --capture-history CLI flag for standalone runs.
Verified: backfilled 21,282 closures; today's snapshot recorded (open_total 30).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:19:23 +03:00


Implementation record — fleettickets (as built)

What is actually built and deployed as of Phase 1 completion. Companion to docs/phase-1-ingestion.md (plan) and docs/phase-2-dashboard.md (next).

Pipeline (import_tickets.py)

  • Source: newest automations/inc/<EAT-timestamp>.csv in the rustfs tickets bucket (endpoint https://s3.rahamafresh.com, path-style, region us-east-1).
  • S3 access via boto3 (no aws-CLI dependency): list_objects_v2 (paginator), get_object, copy_object + delete_object for archiving.
  • Skip-if-unchanged: newest S3 ETag vs tickets.import_meta.metadata.source_etag; equal → skip the DB write (the export re-emits identical content most hours).
  • Cleaning: drop is_alarm=true rows + the EXPORT STOPPED… sentinel; drop week_start/week_end, source_s3_bucket/source_s3_key/source_snapshot_id, department, source_type; normalize region→lowercase, raw_status→UPPERCASE.
  • Upsert on ticket_id (ON CONFLICT DO UPDATE); never delete. On success, move processed file(s) → automations/inc/processed/.
  • Geocoding (keyed LocationIQ): --geocode-clusters (coarse, per cluster) and --geocode-locations (precise, actionable INC; strips network codes; 25 km wrong-city guard). Results cache in tickets.geo_clusters / tickets.geo_locations.
  • History capture: after each --apply run (ingest or skip), calls tickets.capture_history() → appends new closures + upserts today's backlog snapshot.
  • CLI: --from-bucket (newest INC csv), --inc-csv <file> (local dev), --apply (else dry-run), --geocode-clusters, --geocode-locations, --capture-history.
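
The skip-if-unchanged check and the cleaning step above can be sketched in plain Python. This is a minimal illustration, not the code in import_tickets.py: the function names should_skip and clean_rows are hypothetical, the sentinel is matched only by its known prefix, and the assumptions that the sentinel sits in the ticket_id field and that is_alarm arrives as the text "true"/"false" are guesses about the export format.

```python
import csv
import io

# Columns the ingest drops, as listed in the Cleaning bullet.
DROPPED_COLUMNS = {
    "week_start", "week_end",
    "source_s3_bucket", "source_s3_key", "source_snapshot_id",
    "department", "source_type",
}
# Only the prefix of the sentinel is known from this document.
SENTINEL_PREFIX = "EXPORT STOPPED"


def should_skip(newest_etag, stored_etag):
    """Return True when the newest object's ETag equals the stored one."""
    if stored_etag is None:
        return False
    # S3 returns ETags wrapped in double quotes; compare without them.
    return newest_etag.strip('"') == stored_etag.strip('"')


def clean_rows(csv_text):
    """Yield cleaned dict rows from the text of a raw INC export."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: the sentinel appears in the ticket_id field.
        if (row.get("ticket_id") or "").startswith(SENTINEL_PREFIX):
            continue
        # Assumption: is_alarm is serialized as "true" / "false".
        if (row.get("is_alarm") or "").strip().lower() == "true":
            continue
        cleaned = {k: v for k, v in row.items() if k not in DROPPED_COLUMNS}
        if cleaned.get("region"):
            cleaned["region"] = cleaned["region"].lower()
        if cleaned.get("raw_status"):
            cleaned["raw_status"] = cleaned["raw_status"].upper()
        yield cleaned
```

Keeping both steps as pure functions means they can be exercised against a local file (as with --inc-csv) without touching S3 or the database.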

Schema / migrations (tracksolid_db, applied via run_migrations.py)

| Migration | What |
| --- | --- |
| 01_tickets_schema | tickets.inc/crq (raw-jsonb-first), geo_clusters/geo_locations gazetteers, geom-resolution trigger, reporting.fn_tickets_for_map |
| 02_import_meta | tickets.import_meta (snapshot freshness) + fn_tickets_for_map summary.freshness |
| 03_inc_columns | Unpack raw → typed STORED generated columns (text/numeric/bool + EAT→timestamptz via tickets.eat_ts()) |
| 04_inc_latlng | latitude/longitude = COALESCE(feed, ST_Y/ST_X(geom)) (populated from geocode) |
| 05_inc_geography | geog geography(Point,4326) (= geom::geography) + GiST index for routing |
| 06_inc_mttr_minutes | mttr → integer minutes; drop constant is_alarm/is_auto_created/is_auto_closed |
| 07_inc_drop_service_type | Drop constant service_type |
| 08_inc_open_sla_view | tickets.inc_open_sla view (open tickets + derived SLA) |
| 09_inc_dashboard_fn | Built. reporting.fn_inc_dashboard(cluster, status, window, from, to): one JSON payload (open GeoJSON + windowed closed GeoJSON + metrics + freshness) for the FleetOps live INC map. See docs/phase-2-dashboard.md. |
| 10_inc_history_capture | Built. tickets.closure_events (append-only observed closures) + tickets.inc_daily_snapshot (per-EAT-day open backlog + flow) + tickets.capture_history(); the ingest calls it on each --apply run. Unlocks backlog-over-time. |
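
The two write patterns in migration 10 (append-only closures, upserted daily snapshot) can be illustrated with an in-memory SQLite database standing in for Postgres. This is a sketch under stated assumptions, not the body of tickets.capture_history(): the snapshot_date column name, the reduced two-column inc table, and the rule "open means closed_at IS NULL" are all hypothetical simplifications. Only ticket_id, closed_at, observed_at and open_total are names taken from this repo.

```python
import sqlite3


def make_db():
    """Create a tiny stand-in schema (SQLite, not the real tracksolid_db)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE inc (ticket_id TEXT PRIMARY KEY, closed_at TEXT);
        CREATE TABLE closure_events (
            ticket_id TEXT, closed_at TEXT, observed_at TEXT,
            PRIMARY KEY (ticket_id, closed_at));
        CREATE TABLE inc_daily_snapshot (
            snapshot_date TEXT PRIMARY KEY,  -- hypothetical column name
            open_total INTEGER);
    """)
    return db


def capture_history(db, eat_day, observed_at):
    """Append unseen closures, then upsert the snapshot row for eat_day."""
    # Append-only: a closure already recorded keeps its first observed_at.
    db.execute("""
        INSERT INTO closure_events (ticket_id, closed_at, observed_at)
        SELECT ticket_id, closed_at, ? FROM inc WHERE closed_at IS NOT NULL
        ON CONFLICT (ticket_id, closed_at) DO NOTHING
    """, (observed_at,))
    # One row per day: later runs on the same day overwrite the counts.
    db.execute("""
        INSERT INTO inc_daily_snapshot (snapshot_date, open_total)
        SELECT ?, COUNT(*) FROM inc WHERE closed_at IS NULL
        ON CONFLICT (snapshot_date) DO UPDATE
            SET open_total = excluded.open_total
    """, (eat_day,))
    db.commit()
```

Because the first statement never updates existing rows, running the capture every hour is idempotent for closures, while the snapshot row always reflects the latest run of the day.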

tickets.inc columns: ticket_id (PK), raw (jsonb, source of truth), normalized_status/raw_status, bucket, is_actionable, cluster/region/location_name, assigned_team/owner, sla_status, mttr (min), created_at_service/scheduled_at/closed_at/first_seen_at/last_seen_at/source_created_at/source_updated_at (timestamptz), latitude/longitude, geom/geog/geo_source, ingested_at. Dropped-but-in-raw: service_type, is_alarm, is_auto_created, is_auto_closed, and the ingest-time drops.

Deployment

  • Coolify app built from this repo's Dockerfile (python:3.12-slim, TZ=Africa/Nairobi, keep-alive tail -f /dev/null). Separate from the FleetOps web app (fleet-ops-staging).
  • Scheduled Task: python import_tickets.py --from-bucket --apply, cron 15 7-19 * * * in EAT (Coolify runs tasks in EAT — no UTC conversion).
  • Env vars (Coolify): DATABASE_URL (internal DB host), RUSTFS_*, GEOCODER_*.
  • For a plain host/VM, run_ingest.sh + a crontab line is the alternative.
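
For the plain host/VM route, the schedule may need shifting if the host clock is not in EAT. The sketch below enumerates when `15 7-19 * * *` fires in EAT and derives the equivalent hour range for a host running in UTC. The helper names are hypothetical, and a fixed UTC+3 offset is used because EAT has no daylight saving.

```python
from datetime import date, datetime, timedelta, timezone

# East Africa Time is a fixed UTC+3 offset with no DST.
EAT = timezone(timedelta(hours=3), "EAT")


def daily_runs(day):
    """Return the EAT datetimes at which `15 7-19 * * *` fires on `day`."""
    return [
        datetime(day.year, day.month, day.day, hour, 15, tzinfo=EAT)
        for hour in range(7, 20)  # hours 7 through 19 inclusive
    ]


def utc_cron_line(runs):
    """Build the cron expression a UTC host would need for the same runs."""
    hours = sorted({run.astimezone(timezone.utc).hour for run in runs})
    return f"15 {hours[0]}-{hours[-1]} * * *"


if __name__ == "__main__":
    runs = daily_runs(date(2026, 6, 16))
    print(len(runs), "runs per day")
    print("UTC host crontab schedule:", utc_cron_line(runs))
```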

State at hand-off

  • tickets.inc ≈ 21,312 rows (current non-alarm INC + a few aged-out history rows); 0 alarm / 0 sentinel (legacy rows cleaned up one-time).
  • Geocoding ~99.99% (geom on all but 1 null-cluster ticket); QOA/PTMP cluster codes mapped to Quarry Road / Pipeline.
  • Read path verified: reporting.fn_tickets_for_map() + tickets.inc_open_sla.

Data-quality caveats (must inform analytics)

  • Source sla_status is only meaningful once a ticket is closed; open SLA must be derived (age measured from created_at_service to now, with first_seen_at as fallback; ~30% lack created_at_service).
  • mttr is minutes, null until closed; not wall-clock and not a 48h threshold.
  • Lifecycle timestamps = created_at_service and closed_at; the *_seen_at / source_* ones are export bookkeeping (don't use for SLA/closure-time).
  • Content lag ~2 days behind wall-clock.
  • History: tickets.inc is current-state (upsert). Closure/creation/MTTR event series work directly; backlog-over-time now accrues via tickets.inc_daily_snapshot + tickets.closure_events (written by tickets.capture_history() each ingest) — builds forward from the first capture.
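
The open-SLA caveat above can be sketched as a small helper. This assumes the derivation means "time elapsed since created_at_service, falling back to first_seen_at when that is missing". The function names, the status labels and the threshold parameter are all hypothetical; the document does not state the actual SLA threshold, so it is passed in by the caller.

```python
from datetime import datetime, timedelta, timezone


def open_age(ticket, now):
    """Age of an open ticket, or None if no usable start timestamp exists."""
    # Prefer the lifecycle timestamp; fall back to export bookkeeping.
    start = ticket.get("created_at_service") or ticket.get("first_seen_at")
    if start is None:
        return None
    return now - start


def derived_sla_status(ticket, now, threshold):
    """Classify an open ticket against a caller-supplied threshold."""
    age = open_age(ticket, now)
    if age is None:
        return "unknown"
    return "breached" if age > threshold else "within"


if __name__ == "__main__":
    now = datetime(2026, 6, 16, 12, 0, tzinfo=timezone.utc)
    ticket = {
        "created_at_service": None,
        "first_seen_at": datetime(2026, 6, 16, 6, 0, tzinfo=timezone.utc),
    }
    # The 24-hour threshold here is purely illustrative.
    print(derived_sla_status(ticket, now, timedelta(hours=24)))
```

Tickets that rely on the first_seen_at fallback will understate their true age whenever the export first saw them after they were created, so it may be worth flagging which start timestamp was used.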

Roadmap

Phase 2 (built): fn_inc_dashboard read-API → FleetOps live map (open + closed overlay + metrics); history capture (closure_events + inc_daily_snapshot) for backlog/closure trends. Remaining: dashboard_api endpoint + FleetOps SPA (other repos; see docs/dashboard-api-contract.md), FleetNow dispatch off geog, team closure attribution. CRQ = separate future project reusing this machinery against automations/crq/.