david kiania a4b90a33d8 fix(inc): ingest the incremental changes/ stream (baseline + deltas)
The S3 source switched from full hourly snapshots at
automations/inc/<ts>.csv to an incremental CDC stream at
automations/inc/changes/<ts>.csv (first file = full baseline, each later
file = only the rows that changed, keyed by ticket_id; no deletions).

The loader still pointed at the old root path and only ingested the single
newest file, so after the switch it found nothing (no new tickets ingested)
and, even with the path fixed, would silently drop intermediate deltas.

Changes:
- point ingestion at automations/inc/changes/ (_CHANGE_KEY_RE)
- ingest EVERY not-yet-processed file in ascending timestamp order
  (baseline first, then each delta), upserting each
- replace the single-ETag skip with a per-file timestamp watermark
  (import_meta.metadata->>'source_max_key'); rows + watermark commit in one
  txn per file, then archive to processed/ — so a mid-run failure leaves a
  consistent, resumable state
- docs: rename n8n-hourly-s3-full-data-exports.md -> n8n-s3-ticket-exports.md
  and rewrite it for the incremental stream; fix the reference in
  docs/phase-1-ingestion.md

Verified live against prod: re-seeded baseline + 5 deltas (26,529 rows),
files archived to processed/, watermark advanced, re-run is a no-op.
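
A minimal sketch of the per-file watermark described above, not the loader's actual code. It uses sqlite3 as a stand-in for Postgres, and the `inc` / `import_meta` table shapes, the regex, and every function name are hypothetical. It assumes fixed-width timestamps in the key, so string order equals time order.

```python
import re
import sqlite3

# Hypothetical stand-in for _CHANGE_KEY_RE; the real pattern is not shown here.
CHANGE_KEY_RE = re.compile(r"^automations/inc/changes/[^/]+\.csv$")


def pending_keys(all_keys, watermark):
    """Change files newer than the watermark, oldest first."""
    keys = [k for k in all_keys if CHANGE_KEY_RE.match(k)]
    return sorted(k for k in keys if watermark is None or k > watermark)


def apply_file(conn, key, rows):
    """Upsert one file's rows and advance the watermark in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        conn.executemany(
            "INSERT INTO inc (ticket_id, raw) VALUES (?, ?) "
            "ON CONFLICT(ticket_id) DO UPDATE SET raw = excluded.raw",
            rows,
        )
        conn.execute(
            "INSERT INTO import_meta (dataset, source_max_key) "
            "VALUES ('inc', ?) "
            "ON CONFLICT(dataset) DO UPDATE "
            "SET source_max_key = excluded.source_max_key",
            (key,),
        )


def read_watermark(conn):
    """Return the last fully processed key, or None before the first run."""
    row = conn.execute(
        "SELECT source_max_key FROM import_meta WHERE dataset = 'inc'"
    ).fetchone()
    return row[0] if row else None


def make_demo_db():
    """In-memory database with simplified, hypothetical table shapes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inc (ticket_id TEXT PRIMARY KEY, raw TEXT)")
    conn.execute(
        "CREATE TABLE import_meta (dataset TEXT PRIMARY KEY, source_max_key TEXT)"
    )
    return conn
```

Because the rows and the watermark commit together, a failure while applying one file leaves the watermark at the previous file, and the next run picks up from there.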

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 14:37:17 +03:00
docs fix(inc): ingest the incremental changes/ stream (baseline + deltas) 2026-06-23 14:37:17 +03:00
migrations feat(reporting): add closure-by-engineer analytics to fn_inc_dashboard (migration 12) 2026-06-18 17:53:32 +03:00
.dockerignore feat: S3 via boto3 + Dockerfile for Coolify deploy 2026-06-15 20:08:05 +03:00
.env.example feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive) 2026-06-15 19:33:16 +03:00
.gitignore fix: address valid findings from 20260618 bug report 2026-06-18 13:41:38 +03:00
Dockerfile fix: address valid findings from 20260618 bug report 2026-06-18 13:41:38 +03:00
import_tickets.py fix(inc): ingest the incremental changes/ stream (baseline + deltas) 2026-06-23 14:37:17 +03:00
n8n-s3-export-workflows.md feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive) 2026-06-15 19:33:16 +03:00
n8n-s3-ticket-exports.md fix(inc): ingest the incremental changes/ stream (baseline + deltas) 2026-06-23 14:37:17 +03:00
pyproject.toml fix: address valid findings from 20260618 bug report 2026-06-18 13:41:38 +03:00
README.md feat: history capture — closure_events + daily backlog snapshot (migration 10) 2026-06-16 01:19:23 +03:00
run_ingest.sh chore: add hourly INC ingest cron wrapper + schedule docs 2026-06-15 19:40:50 +03:00
run_migrations.py feat: fleettickets — INC/CRQ ticket ingestion, geocoding + read-schema 2026-06-11 20:13:50 +03:00
shared.py feat: fleettickets — INC/CRQ ticket ingestion, geocoding + read-schema 2026-06-11 20:13:50 +03:00

fleettickets

Field-ops INC ticket ingestion, geocoding, and read-schema that powers the Tickets map in FleetOps. Extracted from the tracksolid repo into its own module (it previously lived there as migrations 21–23 + tools/import_tickets.py).

  • INC — incident / customer-fault tickets (this pipeline is strictly INC)
  • CRQ — new-installation requests (schema kept, but out of scope — not ingested here)

What this owns

| Piece | What |
|---|---|
| migrations/01_tickets_schema.sql | The tickets schema: tickets.inc / tickets.crq (raw-jsonb-first), tickets.geo_clusters + tickets.geo_locations gazetteers, geom-resolution trigger, and reporting.fn_tickets_for_map (the GeoJSON read function) |
| migrations/02_import_meta.sql | tickets.import_meta (per-dataset snapshot envelope metadata) + fn_tickets_for_map re-defined to expose it as summary.freshness (same signature, so dashboard_api is unchanged) |
| migrations/03_inc_columns.sql | Unpacks tickets.inc.raw into typed STORED generated columns (status, cluster, region, team, owner, sla_status, mttr, lat/lng, is_* booleans, and EAT→timestamptz timestamps via tickets.eat_ts()). Computed for all rows + auto-populated on every ingest; raw stays the source of truth |
| migrations/04_inc_latlng.sql | Redefines latitude/longitude to COALESCE(feed, ST_Y/ST_X(geom)) so they're populated from the geocoded position (feed is always empty); precision per geo_source (location vs cluster centroid) |
| migrations/05_inc_geography.sql | Adds geog geography(Point,4326) (= geom::geography) + GiST index for routing: ST_Distance/ST_DWithin/KNN in real metres (nearest-vehicle, radius search) |
| migrations/06_inc_mttr_minutes.sql | mttr generated column → integer minutes (source is decimal hours); drops the constant is_alarm/is_auto_created/is_auto_closed columns (kept in raw). is_actionable retained |
| migrations/07_inc_drop_service_type.sql | Drops the constant service_type column (always inc; kept in raw) |
| migrations/08_inc_open_sla_view.sql | tickets.inc_open_sla view: open (is_actionable) tickets with derived SLA (hours_open, sla_state vs 48h; clock = created_at_service, falling back to first_seen_at), plus team/cluster/geog for dispatch |
| migrations/09_inc_dashboard_fn.sql | reporting.fn_inc_dashboard(cluster, status, window, from, to): one JSON payload (window / open GeoJSON / closed GeoJSON / metrics / freshness) powering the FleetOps live INC map. Open=live, closed=windowed (EAT calendar / custom); filters combine with AND |
| migrations/10_inc_history_capture.sql | History for time-series: tickets.closure_events (append-only observed closures) + tickets.inc_daily_snapshot (per-EAT-day open backlog + flow), populated by tickets.capture_history() each ingest. Unlocks backlog-over-time |
| import_tickets.py | Ingests the newest INC CSV from the rustfs tickets bucket (automations/inc/<EAT-timestamp>.csv) and upserts on ticket_id; geocodes clusters + INC locations |
| run_migrations.py | Applies migrations/*.sql in order (ledger: tickets.schema_migrations) |
| shared.py | Minimal DB/logging helpers (self-contained, no tracksolid dependency) |
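
The "apply in order, tracked by a ledger" behaviour of run_migrations.py can be sketched as below. This is an illustration only: the function name is hypothetical, and it assumes the ledger records plain filenames and that zero-padded prefixes (01_, 02_, ...) make filename order the intended order.

```python
from pathlib import Path


def pending_migrations(migrations_dir, applied):
    """Return the .sql files not yet recorded in the ledger, in filename order.

    `applied` is the set of filenames already present in the ledger
    (assumed here to mirror tickets.schema_migrations).
    """
    files = sorted(Path(migrations_dir).glob("*.sql"), key=lambda p: p.name)
    return [p for p in files if p.name not in applied]
```

Re-running with every file already in `applied` returns an empty list, which is what makes the runner safe to call repeatedly.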

What this does NOT own (stays where it is)

  • The DB — the tickets schema lives in the shared tracksolid_db.
  • The read-API: dashboard_api (in the tracksolid stack) serves GET /webhook/tickets, which calls reporting.fn_tickets_for_map (defined here).
  • The frontend — the Tickets map is a tab in the FleetOps SPA (fleetops repo).

Data model (raw-first)

Each row is ticket_id + raw (the full source record as jsonb) + a derived geom / geo_source. Everything reads from raw, so a change to the source schema needs no migration. For convenient typed/indexable access, raw is also unpacked into STORED generated columns (migration 03) — e.g. normalized_status, cluster, region, assigned_team, owner, sla_status, mttr, is_actionable, created_at_service/closed_at (as EAT→timestamptz). These stay in lock-step with raw automatically (no loader change); raw remains the source of truth. geom is resolved: feed coords (raw lat/lng) → location (geocoded location_name) → cluster centroid → none.
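
The resolution order for geom (feed, then location, then cluster centroid, then none) is implemented by a database trigger. The Python below only illustrates the precedence; the function name and the (lat, lng) tuple shape are assumptions.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # (lat, lng); a hypothetical shape for this sketch


def resolve_geom(
    feed: Optional[Point],
    location: Optional[Point],
    cluster: Optional[Point],
) -> Tuple[Optional[Point], str]:
    """Return the first available position and the geo_source label for it."""
    candidates = ((feed, "feed"), (location, "location"), (cluster, "cluster"))
    for point, source in candidates:
        if point is not None:
            return point, source
    return None, "none"
```

The returned label corresponds to the geo_source values listed for tickets.inc, so a consumer can tell a precise position from a coarse centroid.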

Source coordinates are empty in the feed, so geocoding is required:

  • --geocode-clusters: one coordinate per cluster (coarse fallback).
  • --geocode-locations: precise per-location for actionable INC tickets. Strips the network codes from location_name (e.g. NW_, ADR_MNT_, FDT<n>, SDUS), geocodes the real place via a keyed provider (LocationIQ / OpenCage), and rejects any result more than 25 km from the cluster centroid (wrong-city guard). Results cache in tickets.geo_locations.
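
The wrong-city guard amounts to a distance check against the cluster centroid. A minimal sketch using the haversine formula follows; the function names are hypothetical, and the loader may compute the distance differently (for example in PostGIS).

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
MAX_KM_FROM_CENTROID = 25.0  # threshold stated in the README


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two WGS84 points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accept_geocode(result, centroid, max_km=MAX_KM_FROM_CENTROID):
    """True if a geocoded (lat, lng) lies within max_km of the cluster centroid."""
    return haversine_km(*result, *centroid) <= max_km
```

For example, a result in Mombasa for a cluster centred on Nairobi is several hundred kilometres off and would be rejected, while a point a kilometre or two from the centroid passes.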

Columns on tickets.inc

| Column | Type | Notes |
|---|---|---|
| ticket_id | text (PK) | e.g. WOT0715527 |
| raw | jsonb | full source record (the source of truth) |
| normalized_status · raw_status | text | use normalized_status for filtering (canonical) |
| bucket | text | lifecycle: closed / pending |
| is_actionable | boolean | the open/closed flag (open = true) |
| cluster · region · location_name | text | region lowercased; cluster feeds the gazetteer |
| assigned_team · owner | text | closure attribution dimensions |
| sla_status | text | source Compliant/Breached; only meaningful once closed |
| mttr | numeric | minutes (source is decimal hours); null until closed |
| created_at_service · scheduled_at · closed_at · first_seen_at · last_seen_at · source_created_at · source_updated_at | timestamptz | EAT→UTC via tickets.eat_ts(). lifecycle = created_at_service → closed_at; export bookkeeping = first_seen_at/last_seen_at/source_* |
| latitude · longitude | double precision | COALESCE(feed, geocoded); populated from geom |
| geom | geometry(Point,4326) | display / the map |
| geog | geography(Point,4326) | routing: metres-accurate distance (GiST indexed) |
| geo_source | text | precision: feed / location / cluster / none |
| ingested_at | timestamptz | when we last upserted this row |
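
The EAT→UTC conversion performed in SQL by tickets.eat_ts() can be illustrated in Python. This is a sketch under assumptions: the source timestamp format used here is a guess, and the function name is hypothetical. EAT is a fixed UTC+3 offset with no daylight saving, so a constant offset is enough.

```python
from datetime import datetime, timedelta, timezone

EAT = timezone(timedelta(hours=3), "EAT")  # East Africa Time, fixed UTC+3


def eat_to_utc(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a naive EAT wall-clock string and return an aware UTC datetime.

    Returns None for empty input. `fmt` is an assumed source format.
    """
    if not text or not text.strip():
        return None
    naive = datetime.strptime(text.strip(), fmt)
    return naive.replace(tzinfo=EAT).astimezone(timezone.utc)
```

A timestamp shortly after midnight EAT lands on the previous calendar day in UTC, which is why the per-day queries in this README convert back with AT TIME ZONE 'Africa/Nairobi' before truncating to a date.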

Dropped from the unpacked columns (still in raw): service_type, is_alarm, is_auto_created, is_auto_closed (all single-cardinality), plus the ingest-time drops below. reporting.fn_tickets_for_map reads from raw and serves the map; tickets.inc_open_sla is the open-ticket SLA view for dashboards/dispatch.

Setup

uv sync
cp .env.example .env        # fill in DATABASE_URL, RUSTFS_*, GEOCODER_*
python run_migrations.py    # apply the schema (idempotent)

Run

# ingest the newest INC CSV from the bucket (skip-if-unchanged, then archive)
python import_tickets.py --from-bucket --apply

# geocode (needs GEOCODER_API_KEY)
python import_tickets.py --geocode-clusters  --apply   # coarse, once
python import_tickets.py --geocode-locations --apply   # precise, actionable INC

# from a local CSV instead of the bucket (dev)
python import_tickets.py --inc-csv 2026-06-15T17-00-00.csv --apply

Dry-run is the default (omit --apply). import_tickets.py --from-bucket talks to S3 via boto3 using the RUSTFS_* env (path-style addressing; no aws-CLI dependency).
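
A hedged sketch of a boto3 client configured for path-style addressing against an S3-compatible endpoint. The specific RUSTFS_* variable names below are hypothetical (the README only gives the prefix), as are the function names; `boto3.client` and botocore's `Config(s3={"addressing_style": "path"})` are real APIs.

```python
import os


def s3_client_kwargs(env=os.environ):
    """Connection settings from RUSTFS_* variables (names are hypothetical)."""
    return {
        "endpoint_url": env["RUSTFS_ENDPOINT"],
        "aws_access_key_id": env["RUSTFS_ACCESS_KEY"],
        "aws_secret_access_key": env["RUSTFS_SECRET_KEY"],
        "region_name": env.get("RUSTFS_REGION", "us-east-1"),
    }


def make_s3_client(env=os.environ):
    """Build an S3 client that addresses buckets as <endpoint>/<bucket>/<key>."""
    # Imported lazily so the settings helper works without boto3 installed.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "s3",
        config=Config(s3={"addressing_style": "path"}),
        **s3_client_kwargs(env),
    )
```

Path-style addressing matters for self-hosted stores that do not resolve per-bucket hostnames; with it, a call such as `make_s3_client().list_objects_v2(Bucket=..., Prefix="automations/inc/")` targets the endpoint URL directly.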

Deploy (Coolify)

The repo ships a Dockerfile — a small batch worker with no web server. Coolify builds it and keeps the container alive (CMD tail -f /dev/null); the ingest runs as a Scheduled Task, not a system crontab:

  • Command: python import_tickets.py --from-bucket --apply
  • Frequency: 15 7-19 * * * (:15 past each hour, 07:15–19:15 EAT). This Coolify instance runs scheduled tasks in EAT (Africa/Nairobi), so no UTC conversion is needed.
  • Env vars (Coolify → Environment Variables): DATABASE_URL (internal DB host), RUSTFS_*, GEOCODER_*.

Skip-if-unchanged makes a run on an already-ingested snapshot a cheap no-op.
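
The skip-if-unchanged check can be sketched as follows. The function names are hypothetical; the input mirrors the "Contents" entries returned by S3 `list_objects_v2` (dicts with "Key" and "ETag"). Listing responses usually wrap ETags in double quotes, so both sides are normalised before comparing.

```python
def newest_object(objects, prefix="automations/inc/"):
    """Pick the newest .csv directly under `prefix` (not in sub-folders).

    Assumes fixed-width timestamps in the key, so the greatest key is newest.
    """
    direct = [
        o for o in objects
        if o["Key"].startswith(prefix)
        and "/" not in o["Key"][len(prefix):]
        and o["Key"].endswith(".csv")
    ]
    return max(direct, key=lambda o: o["Key"]) if direct else None


def is_unchanged(newest_etag, last_etag):
    """True if the newest file's ETag equals the last processed ETag."""
    if last_etag is None:  # nothing ingested yet
        return False
    return newest_etag.strip('"') == last_etag.strip('"')
```

When `is_unchanged` returns True the run can exit before touching the database, which is what keeps the hourly schedule cheap.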

For a plain host/VM instead of Coolify, run_ingest.sh loads .env and runs the ingest; schedule it with a crontab line (CRON_TZ=Africa/Nairobi / 15 7-19 * * *).

Notes

  • The n8n export writes a full current-state CSV per hour to automations/inc/<EAT-timestamp>.csv — no latest pointer, no metadata envelope, no deltas. The loader lists the prefix, takes the newest file, and ingests it.
  • Skip-if-unchanged: the newest file's S3 ETag is compared to the last processed file's ETag (stored in tickets.import_meta.metadata.source_etag); if equal, the DB write is skipped (the export re-emits byte-identical content most hours).
  • Upsert on ticket_id (PRIMARY KEY) — duplication is impossible; rows are never deleted, so closed-ticket history accumulates. On success the file is moved to automations/inc/processed/.
  • Cleaning at ingest: drop is_alarm=true rows + the EXPORT STOPPED… sentinel; drop week_start/week_end, source_s3_*/source_snapshot_id, department/source_type; normalize region → lowercase and raw_status → UPPERCASE. service_type and bucket (a closed/pending flag) are kept.
  • tickets.import_meta captures snapshot freshness (surfaced as summary.freshness by fn_tickets_for_map).
  • The curated/geocoded coordinates are written verified = false — review tickets.geo_clusters / tickets.geo_locations and flip verified once checked.
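
The ingest-time cleaning rules listed above can be sketched as a single row filter. This is an illustration, not the loader's code: the function name is hypothetical, and because the README does not say which field carries the EXPORT STOPPED sentinel, the sketch checks every value.

```python
DROP_EXACT = {"week_start", "week_end", "source_snapshot_id", "department", "source_type"}
DROP_PREFIXES = ("source_s3_",)
SENTINEL_PREFIX = "EXPORT STOPPED"  # the full sentinel text is elided in the README


def clean_row(row):
    """Return a cleaned copy of one CSV row, or None if the row is dropped."""
    # Drop alarm rows (CSV values arrive as strings).
    if str(row.get("is_alarm", "")).strip().lower() == "true":
        return None
    # Drop the sentinel row; which column holds it is an assumption.
    if any(str(v).startswith(SENTINEL_PREFIX) for v in row.values()):
        return None
    cleaned = {
        k: v for k, v in row.items()
        if k not in DROP_EXACT and not k.startswith(DROP_PREFIXES)
    }
    if cleaned.get("region"):
        cleaned["region"] = cleaned["region"].lower()
    if cleaned.get("raw_status"):
        cleaned["raw_status"] = cleaned["raw_status"].upper()
    return cleaned
```

Rows would typically come from `csv.DictReader`, with `None` results filtered out before the upsert.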

Querying

-- map payload (GeoJSON + summary, incl. summary.freshness) — what dashboard_api serves
SELECT reporting.fn_tickets_for_map();              -- open-only by default
SELECT reporting.fn_tickets_for_map(p_open_only := false);   -- all geocoded tickets

-- open tickets by SLA (derived) + by cluster — via the view
SELECT sla_state, count(*) FROM tickets.inc_open_sla GROUP BY 1;
SELECT cluster, count(*), round(avg(hours_open),1) AS avg_hrs
FROM tickets.inc_open_sla GROUP BY 1 ORDER BY 2 DESC;

-- closures / creations per day (EAT)
SELECT (closed_at AT TIME ZONE 'Africa/Nairobi')::date AS d, count(*)
FROM tickets.inc WHERE closed_at IS NOT NULL GROUP BY 1 ORDER BY 1 DESC;

-- open-backlog-over-time (accrues from first capture; one row per EAT day)
SELECT snapshot_date, open_total, open_breached, closed_today
FROM tickets.inc_daily_snapshot ORDER BY snapshot_date DESC;

-- nearest open tickets to a vehicle (lng, lat) — metres, index-accelerated KNN
SELECT ticket_id, cluster, hours_open,
       round(ST_Distance(geog, ST_SetSRID(ST_MakePoint(:lng,:lat),4326)::geography))::int AS metres
FROM tickets.inc_open_sla
ORDER BY geog <-> ST_SetSRID(ST_MakePoint(:lng,:lat),4326)::geography
LIMIT 10;

Data-quality & SLA notes

Findings to keep in mind (see the PRD for detail):

  • Source sla_status is only meaningful for closed tickets. It reads Compliant for essentially all open tickets, so for open work use the derived state in tickets.inc_open_sla (now() − created_at_service vs the contract's 48h).
  • created_at_service is missing on ~30% of rows (incl. most open ones); the SLA view falls back to first_seen_at and flags it via sla_clock_source.
  • mttr is not wall-clock closed_at − created_at_service, and the source's Breached/Compliant does not match a plain 48h threshold; pin the contract's exact SLA definition before trusting cross-field SLA math.
  • Content lag: the feed's file timestamps are current, but the ticket content trails ~2 days (the underlying …wm_task.xlsx source), so creation/closure dates run a couple of days behind wall-clock.
  • History: tickets.inc is current-state (upsert). Closure/creation/MTTR event series work directly off closed_at/created_at_service. Backlog-over-time now accrues via tickets.inc_daily_snapshot (one row per EAT day, written by tickets.capture_history() each ingest); observed closures log to tickets.closure_events. Past backlog can't be reconstructed — the series builds from the first capture onward.
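
The derived SLA for open tickets (elapsed time against 48h, with the first_seen_at fallback) can be sketched as below. The real logic lives in the tickets.inc_open_sla view; the function name and the sla_state labels used here are hypothetical, and whether the boundary at exactly 48h counts as breached is an assumption.

```python
from datetime import datetime, timedelta, timezone

SLA_HOURS = 48  # the contract threshold named in this README


def derive_sla(created_at_service, first_seen_at, now):
    """Compute hours_open, sla_state and sla_clock_source for an open ticket."""
    if created_at_service is not None:
        clock, source = created_at_service, "created_at_service"
    else:
        clock, source = first_seen_at, "first_seen_at"
    if clock is None:
        return {"hours_open": None, "sla_state": "unknown", "sla_clock_source": None}
    hours_open = (now - clock).total_seconds() / 3600
    # Labels are illustrative; the view's actual sla_state values may differ.
    state = "breached" if hours_open > SLA_HOURS else "within_sla"
    return {"hours_open": hours_open, "sla_state": state, "sla_clock_source": source}
```

Surfacing the clock source alongside the state lets a dashboard mark tickets whose age is measured from first observation rather than from creation.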

Status / roadmap

Live: INC ingestion deployed on Coolify (hourly 15 7-19 * * * EAT), schema + generated columns + geocoding + the inc_open_sla view in tracksolid_db. Next (Phase 2): time-series analytics (closure rate, MTTR/SLA trends), then FleetNow vehicle dispatch off geog, and team closure attribution. CRQ is a separate future project that will reuse this machinery against automations/crq/.