david kiania 565cd592a0 feat: add geography column + GiST index for routing (migration 05) geom is geometry(Point,4326) (planar degrees); add geog = geom::geography (STORED generated) + GiST index so ST_Distance/ST_DWithin/KNN work in real metres for nearest-vehicle and radius queries. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>		2026-06-15 23:33:45 +03:00
migrations	feat: add geography column + GiST index for routing (migration 05)	2026-06-15 23:33:45 +03:00
.dockerignore	feat: S3 via boto3 + Dockerfile for Coolify deploy	2026-06-15 20:08:05 +03:00
.env.example	feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive)	2026-06-15 19:33:16 +03:00
.gitignore	feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive)	2026-06-15 19:33:16 +03:00
Dockerfile	feat: S3 via boto3 + Dockerfile for Coolify deploy	2026-06-15 20:08:05 +03:00
import_tickets.py	feat: S3 via boto3 + Dockerfile for Coolify deploy	2026-06-15 20:08:05 +03:00
n8n-hourly-s3-full-data-exports.md	feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive)	2026-06-15 19:33:16 +03:00
n8n-s3-export-workflows.md	feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive)	2026-06-15 19:33:16 +03:00
pyproject.toml	feat: S3 via boto3 + Dockerfile for Coolify deploy	2026-06-15 20:08:05 +03:00
README.md	feat: add geography column + GiST index for routing (migration 05)	2026-06-15 23:33:45 +03:00
run_ingest.sh	chore: add hourly INC ingest cron wrapper + schedule docs	2026-06-15 19:40:50 +03:00
run_migrations.py	feat: fleettickets — INC/CRQ ticket ingestion, geocoding + read-schema	2026-06-11 20:13:50 +03:00
shared.py	feat: fleettickets — INC/CRQ ticket ingestion, geocoding + read-schema	2026-06-11 20:13:50 +03:00

README.md

fleettickets

Field-ops INC ticket ingestion, geocoding, and read-schema that powers the Tickets map in FleetOps. Extracted from the tracksolid repo into its own module (it previously lived there as migrations 21–23 + tools/import_tickets.py).

INC — incident / customer-fault tickets (this pipeline is strictly INC)
CRQ — new-installation requests (schema kept, but out of scope — not ingested here)

What this owns

Piece	What
`migrations/01_tickets_schema.sql`	The `tickets` schema: `tickets.inc` / `tickets.crq` (raw-jsonb-first), `tickets.geo_clusters` + `tickets.geo_locations` gazetteers, geom-resolution trigger, and `reporting.fn_tickets_for_map` (the GeoJSON read function)
`migrations/02_import_meta.sql`	`tickets.import_meta` (per-dataset snapshot envelope metadata) + `fn_tickets_for_map` re-defined to expose it as `summary.freshness` (same signature — dashboard_api unchanged)
`migrations/03_inc_columns.sql`	Unpacks `tickets.inc.raw` into typed STORED generated columns (status, cluster, region, team, owner, sla_status, mttr, lat/lng, is_* booleans, and EAT→`timestamptz` timestamps via `tickets.eat_ts()`). Computed for all rows + auto-populated on every ingest; `raw` stays the source of truth
`migrations/04_inc_latlng.sql`	Redefines `latitude`/`longitude` to `COALESCE(feed, ST_Y/ST_X(geom))` so they're populated from the geocoded position (feed is always empty); precision per `geo_source` (`location` vs `cluster` centroid)
`migrations/05_inc_geography.sql`	Adds `geog geography(Point,4326)` (= `geom::geography`) + GiST index for routing — `ST_Distance`/`ST_DWithin`/KNN in real metres (nearest-vehicle, radius search)
`import_tickets.py`	Ingests the newest INC CSV from the rustfs `tickets` bucket (`automations/inc/<EAT-timestamp>.csv`) and upserts on `ticket_id`; geocodes clusters + INC locations
`run_migrations.py`	Applies `migrations/*.sql` in order (ledger: `tickets.schema_migrations`)
`shared.py`	Minimal DB/logging helpers (self-contained — no tracksolid dependency)
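As a rough illustration of what the `geog` column from migration 05 enables, the sketch below builds a parameterised radius query that returns tickets within a given distance of a point, nearest first. The `tickets.inc`, `ticket_id` and `geog` names come from the table above; the constant name, the `query_params` helper, the default radius and limit, and the `%(name)s` placeholder style (as used by psycopg) are assumptions for illustration, not code from this repo.

```python
# Hypothetical sketch: radius + nearest-first query against tickets.inc.geog.
# Distances are in metres because both sides are geography, not planar geometry.
NEAREST_TICKETS_SQL = """
SELECT ticket_id,
       ST_Distance(geog, ST_SetSRID(ST_MakePoint(%(lng)s, %(lat)s), 4326)::geography) AS metres
FROM tickets.inc
WHERE ST_DWithin(geog, ST_SetSRID(ST_MakePoint(%(lng)s, %(lat)s), 4326)::geography, %(radius_m)s)
ORDER BY geog <-> ST_SetSRID(ST_MakePoint(%(lng)s, %(lat)s), 4326)::geography
LIMIT %(limit)s
"""


def query_params(lat: float, lng: float, radius_m: float = 5000.0, limit: int = 10) -> dict:
    """Validate the reference point and return the named query parameters."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lng <= 180.0:
        raise ValueError(f"longitude out of range: {lng}")
    if radius_m <= 0 or limit <= 0:
        raise ValueError("radius_m and limit must be positive")
    return {"lat": lat, "lng": lng, "radius_m": radius_m, "limit": limit}
```

With a DB-API cursor that accepts named placeholders, this would be run as `cur.execute(NEAREST_TICKETS_SQL, query_params(-1.2921, 36.8219, radius_m=2000))`. Note that `ST_MakePoint` takes longitude first, then latitude.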

What this does NOT own (stays where it is)

The DB — the tickets schema lives in the shared tracksolid_db.
The read-API — dashboard_api (in the tracksolid stack) serves GET /webhook/tickets, which calls reporting.fn_tickets_for_map (defined here).
The frontend — the Tickets map is a tab in the FleetOps SPA (fleetops repo).

Data model (raw-first)

Each row is ticket_id + raw (the full source record as jsonb) + a derived geom / geo_source. Everything reads from raw, so a change to the source schema needs no migration. For convenient typed/indexable access, raw is also unpacked into STORED generated columns (migration 03) — e.g. normalized_status, cluster, region, assigned_team, owner, sla_status, mttr, is_actionable, created_at_service/closed_at (as EAT→timestamptz). These stay in lock-step with raw automatically (no loader change); raw remains the source of truth. geom is resolved in priority order, taking the first source that yields a coordinate: feed coords (raw lat/lng) → location (geocoded location_name) → cluster centroid → none.
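The resolution order can be pictured with a small Python sketch. In this repo the logic lives in the geom-resolution trigger from migration 01, so the function below is only an illustration: the `resolve_geom` name, the `(lat, lng)` tuple shape, and the `"feed"` / `"none"` labels are assumptions (the README only confirms `location` and `cluster` as `geo_source` values).

```python
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (lat, lng), an assumed shape for this sketch


def resolve_geom(
    feed: Optional[Coord],
    location: Optional[Coord],
    cluster: Optional[Coord],
) -> Tuple[Optional[Coord], str]:
    """Return the first available coordinate and the label of its source."""
    candidates = (("feed", feed), ("location", location), ("cluster", cluster))
    for source, coord in candidates:
        if coord is not None:
            return coord, source
    # Nothing resolved: the ticket has no position yet.
    return None, "none"
```

For example, a ticket with no feed coordinates but a geocoded location resolves to that location, and only falls back to the cluster centroid when the location is missing too.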

Source coordinates are empty in the feed, so geocoding is required:

--geocode-clusters — one coordinate per cluster (coarse fallback).
--geocode-locations — precise per-location for actionable INC tickets: strips the network codes from location_name (e.g. NW_, ADR_MNT_, FDT<n>, SDUS), geocodes the real place via a keyed provider (LocationIQ / OpenCage), and **rejects any result > 25 km from the cluster centroid** (wrong-city guard). Results cache in tickets.geo_locations.
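A minimal sketch of the wrong-city guard, assuming a great-circle (haversine) distance is an acceptable way to measure how far a geocoder result lies from the cluster centroid. The function names and the `(lat, lng)` tuple shape are hypothetical; only the 25 km threshold comes from the description above, and the loader may compute the distance differently.

```python
import math

EARTH_RADIUS_KM = 6371.0088      # mean Earth radius
MAX_KM_FROM_CENTROID = 25.0      # threshold stated in the README


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accept_geocode(result, centroid, max_km: float = MAX_KM_FROM_CENTROID) -> bool:
    """Accept a geocoded (lat, lng) only if it is within max_km of the centroid."""
    return haversine_km(*result, *centroid) <= max_km
```

A result a few kilometres from a Nairobi centroid passes, while a same-named place in Mombasa, hundreds of kilometres away, is rejected and the ticket falls back to the cluster centroid.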

Setup

uv sync
cp .env.example .env        # fill in DATABASE_URL, RUSTFS_*, GEOCODER_*
python run_migrations.py    # apply the schema (idempotent)

Run

# ingest the newest INC CSV from the bucket (skip-if-unchanged, then archive)
python import_tickets.py --from-bucket --apply

# geocode (needs GEOCODER_API_KEY)
python import_tickets.py --geocode-clusters  --apply   # coarse, once
python import_tickets.py --geocode-locations --apply   # precise, actionable INC

# from a local CSV instead of the bucket (dev)
python import_tickets.py --inc-csv 2026-06-15T17-00-00.csv --apply

Dry-run is the default (omit --apply). import_tickets.py --from-bucket talks to S3 via boto3 using the RUSTFS_* env (path-style addressing; no aws-CLI dependency).

Deploy (Coolify)

The repo ships a Dockerfile — a small batch worker with no web server. Coolify builds it and keeps the container alive (CMD tail -f /dev/null); the ingest runs as a Scheduled Task, not a system crontab:

Command: python import_tickets.py --from-bucket --apply
Frequency: 15 7-19 * * * (:15 past each hour, 07:15–19:15 EAT). This Coolify instance runs scheduled tasks in EAT (Africa/Nairobi), so no UTC conversion is needed.
Env vars (Coolify → Environment Variables): DATABASE_URL (internal DB host), RUSTFS_*, GEOCODER_*.

Skip-if-unchanged makes a run on an already-ingested snapshot a cheap no-op.
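The newest-file selection and the ETag comparison could look roughly like the sketch below. It works on plain dicts shaped like the `Contents` entries that boto3's `list_objects_v2` returns (`Key`, `ETag`). The function names are hypothetical, and picking the newest file by sorting keys lexicographically is an assumption that relies on the timestamped filenames sorting in time order; the loader may use `LastModified` instead.

```python
from typing import Iterable, Mapping, Optional

PREFIX = "automations/inc/"


def newest_snapshot(objects: Iterable[Mapping[str, str]]) -> Optional[Mapping[str, str]]:
    """Pick the newest CSV directly under PREFIX, ignoring sub-prefixes such as processed/."""
    candidates = [
        obj for obj in objects
        if obj["Key"].startswith(PREFIX)
        and obj["Key"].endswith(".csv")
        and "/" not in obj["Key"][len(PREFIX):]
    ]
    return max(candidates, key=lambda obj: obj["Key"], default=None)


def should_ingest(newest: Optional[Mapping[str, str]], last_etag: Optional[str]) -> bool:
    """Ingest only when a snapshot exists and its ETag differs from the last processed one."""
    if newest is None:
        return False
    # S3 returns ETags wrapped in double quotes; compare the bare values.
    return newest["ETag"].strip('"') != (last_etag or "").strip('"')
```

When `should_ingest` returns `False`, the run can exit before touching the database, which is what keeps a repeat run cheap.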

For a plain host/VM instead of Coolify, run_ingest.sh loads .env and runs the ingest; schedule it from a crontab by setting CRON_TZ=Africa/Nairobi and using the schedule 15 7-19 * * *.

Notes

The n8n export writes a full current-state CSV per hour to automations/inc/<EAT-timestamp>.csv — no latest pointer, no metadata envelope, no deltas. The loader lists the prefix, takes the newest file, and ingests it.
Skip-if-unchanged: the newest file's S3 ETag is compared to the last processed file's ETag (stored in tickets.import_meta.metadata.source_etag); if equal, the DB write is skipped (the export re-emits byte-identical content most hours).
Upsert on ticket_id (PRIMARY KEY) — duplication is impossible; rows are never deleted, so closed-ticket history accumulates. On success the file is moved to automations/inc/processed/.
Cleaning at ingest: drop is_alarm=true rows + the EXPORT STOPPED… sentinel; drop week_start/week_end, source_s3_*/source_snapshot_id, department/source_type; normalize region → lowercase and raw_status → UPPERCASE. service_type and bucket (a closed/pending flag) are kept.
tickets.import_meta captures snapshot freshness (surfaced as summary.freshness by fn_tickets_for_map).
The curated/geocoded coordinates are written with verified = false; review tickets.geo_clusters / tickets.geo_locations and set verified to true once checked.
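The cleaning rules listed above can be sketched as a per-row function. This is an illustration under stated assumptions rather than the loader's actual code: the `clean_row` name is hypothetical, `is_alarm` is assumed to arrive as the CSV string `true`, and the sentinel is detected by looking for any field value that starts with `EXPORT STOPPED`, since the README does not say which column carries it.

```python
from typing import Dict, Optional

DROP_COLUMNS = {"week_start", "week_end", "source_snapshot_id", "department", "source_type"}
DROP_PREFIXES = ("source_s3_",)
SENTINEL_PREFIX = "EXPORT STOPPED"  # assumed to appear at the start of some field value


def clean_row(row: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the cleaned row, or None when the row should be dropped entirely."""
    # Drop alarm rows (assumes the CSV encodes the flag as the text "true").
    if str(row.get("is_alarm", "")).lower() == "true":
        return None
    # Drop the export sentinel row.
    if any(isinstance(v, str) and v.startswith(SENTINEL_PREFIX) for v in row.values()):
        return None
    # Remove the columns that are not kept in raw.
    out = {
        k: v for k, v in row.items()
        if k not in DROP_COLUMNS and not k.startswith(DROP_PREFIXES)
    }
    # Normalize casing; service_type and bucket pass through untouched.
    if out.get("region"):
        out["region"] = out["region"].lower()
    if out.get("raw_status"):
        out["raw_status"] = out["raw_status"].upper()
    return out
```

Rows for which `clean_row` returns `None` are skipped; the rest are what gets upserted on `ticket_id`.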
