# fleettickets
Field-ops **INC ticket** ingestion, geocoding, and read-schema that powers the
**Tickets** map in FleetOps. Extracted from the `tracksolid` repo into its own module
(it previously lived there as migrations 21–23 + `tools/import_tickets.py`).

- **INC** — incident / customer-fault tickets *(this pipeline is **strictly INC**)*
- **CRQ** — new-installation requests *(schema kept, but **out of scope** — not ingested here)*
## What this owns
| Piece | What |
|---|---|
| `migrations/01_tickets_schema.sql` | The `tickets` schema: `tickets.inc` / `tickets.crq` (raw-jsonb-first), `tickets.geo_clusters` + `tickets.geo_locations` gazetteers, geom-resolution trigger, and `reporting.fn_tickets_for_map` (the GeoJSON read function) |
| `migrations/02_import_meta.sql` | `tickets.import_meta` (per-dataset snapshot envelope metadata) + `fn_tickets_for_map` re-defined to expose it as `summary.freshness` (same signature — `dashboard_api` unchanged) |
| `migrations/03_inc_columns.sql` | Unpacks `tickets.inc.raw` into **typed STORED generated columns** (status, cluster, region, team, owner, sla_status, mttr, lat/lng, is_* booleans, and EAT→`timestamptz` timestamps via `tickets.eat_ts()`). Computed for all rows + auto-populated on every ingest; `raw` stays the source of truth |
| `migrations/04_inc_latlng.sql` | Redefines `latitude`/`longitude` to `COALESCE(feed, ST_Y/ST_X(geom))` so they're **populated from the geocoded position** (feed is always empty); precision per `geo_source` (`location` vs `cluster` centroid) |
| `migrations/05_inc_geography.sql` | Adds `geog geography(Point,4326)` (= `geom::geography`) + GiST index for **routing** — `ST_Distance`/`ST_DWithin`/KNN in real metres (nearest-vehicle, radius search) |
| `migrations/06_inc_mttr_minutes.sql` | `mttr` generated column → integer **minutes** (source is decimal hours); drops the constant `is_alarm`/`is_auto_created`/`is_auto_closed` columns (kept in `raw`). `is_actionable` retained |
| `migrations/07_inc_drop_service_type.sql` | Drops the constant `service_type` column (always `inc`; kept in `raw`) |
| `import_tickets.py` | Ingests the **newest INC CSV** from the rustfs `tickets` bucket (`automations/inc/<EAT-timestamp>.csv`) and upserts on `ticket_id`; geocodes clusters + INC locations |
| `run_migrations.py` | Applies `migrations/*.sql` in order (ledger: `tickets.schema_migrations`) |
| `shared.py` | Minimal DB/logging helpers (self-contained — no tracksolid dependency) |
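
Two of the conversions handled by migrations 03 and 06 can be illustrated outside the database. This is a Python sketch, not the SQL the migrations contain. It assumes the feed's timestamps look like `2026-06-15 17:00:00` (a hypothetical format) and that minutes are rounded to the nearest integer (the table above does not say whether the migration rounds or truncates). The helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# East Africa Time is a fixed UTC+3 offset with no daylight saving.
EAT = timezone(timedelta(hours=3), "EAT")


def eat_to_utc(text: str) -> datetime:
    """Parse a naive EAT wall-clock string and return an aware UTC datetime.

    The "%Y-%m-%d %H:%M:%S" format is an assumption about the feed.
    """
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=EAT).astimezone(timezone.utc)


def mttr_hours_to_minutes(hours: float) -> int:
    """Convert decimal hours to whole minutes (the rounding mode is an assumption)."""
    return round(hours * 60)


print(eat_to_utc("2026-06-15 17:00:00").isoformat())  # 2026-06-15T14:00:00+00:00
print(mttr_hours_to_minutes(1.5))  # 90
```
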
## What this does NOT own (stays where it is)
- **The DB** — the `tickets` schema lives in the shared `tracksolid_db`.
- **The read-API** — `dashboard_api` (in the tracksolid stack) serves
  `GET /webhook/tickets`, which calls `reporting.fn_tickets_for_map` (defined here).
- **The frontend** — the Tickets map is a tab in the **FleetOps** SPA (`fleetops` repo).
## Data model (raw-first)
Each row is `ticket_id` + `raw` (the full source record as `jsonb`) + a derived
`geom` / `geo_source`. Everything reads from `raw`, so a change to the source schema
needs no migration. For convenient typed/indexable access, `raw` is also **unpacked
into STORED generated columns** (migration 03), e.g. `normalized_status`, `cluster`,
`region`, `assigned_team`, `owner`, `sla_status`, `mttr`, `is_actionable`, and
`created_at_service`/`closed_at` (as EAT→`timestamptz`). These stay in lock-step with
`raw` automatically (no loader change); `raw` remains the source of truth.

`geom` is resolved in priority order: **feed** coords (`raw` lat/lng) → **location**
(geocoded `location_name`) → **cluster** centroid → **none**.
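
The fallback chain can be read as "first candidate that exists wins". A minimal Python sketch of that rule follows. The real resolution happens in the database trigger, so `Point` and `resolve_geom` are hypothetical names, and the `geo_source` strings are assumed to match the bold labels above.

```python
from typing import NamedTuple, Optional


class Point(NamedTuple):
    lat: float
    lng: float


def resolve_geom(
    feed: Optional[Point],
    location: Optional[Point],
    cluster: Optional[Point],
) -> tuple[Optional[Point], str]:
    """Return (point, geo_source) using the first candidate that is present."""
    candidates = (("feed", feed), ("location", location), ("cluster", cluster))
    for source, point in candidates:
        if point is not None:
            return point, source
    # Nothing could be resolved: the ticket has no position.
    return None, "none"


# With an empty feed and no geocoded location, the cluster centroid is used.
print(resolve_geom(None, None, Point(-1.29, 36.82))[1])  # cluster
```
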
Source coordinates are empty in the feed, so geocoding is required:
- `--geocode-clusters` — one coordinate per cluster (coarse fallback).
- `--geocode-locations` — precise per-location for **actionable INC** tickets: strips the
  network codes from `location_name` (e.g. `NW_`, `ADR_MNT_`, `FDT<n>`, `SDUS`), geocodes
  the real place via a **keyed** provider (LocationIQ / OpenCage), and **rejects any result
  >25 km from the cluster centroid** (wrong-city guard). Results cache in
  `tickets.geo_locations`.
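
The wrong-city guard boils down to a great-circle distance check against the cluster centroid. The sketch below uses the haversine formula in plain Python; the loader may well compute the distance differently (for example in PostGIS), and the function names are hypothetical. Only the 25 km threshold comes from the description above.

```python
import math

EARTH_RADIUS_KM = 6371.0088   # mean Earth radius
MAX_KM_FROM_CLUSTER = 25.0    # threshold described above


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two WGS84 points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accept_geocode(result: tuple[float, float], centroid: tuple[float, float]) -> bool:
    """Keep a geocoder hit only if it lies within 25 km of the cluster centroid."""
    return haversine_km(*result, *centroid) <= MAX_KM_FROM_CLUSTER


nairobi = (-1.2864, 36.8172)
print(accept_geocode((-1.30, 36.80), nairobi))       # True: a few km away
print(accept_geocode((-4.0435, 39.6682), nairobi))   # False: Mombasa, hundreds of km away
```
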
## Setup
```bash
uv sync
cp .env.example .env # fill in DATABASE_URL, RUSTFS_*, GEOCODER_*
python run_migrations.py # apply the schema (idempotent)
```
## Run
```bash
# ingest the newest INC CSV from the bucket (skip-if-unchanged, then archive)
python import_tickets.py --from-bucket --apply
# geocode (needs GEOCODER_API_KEY)
python import_tickets.py --geocode-clusters --apply   # coarse, once
python import_tickets.py --geocode-locations --apply  # precise, actionable INC
# from a local CSV instead of the bucket (dev)
python import_tickets.py --inc-csv 2026-06-15T17-00-00.csv --apply
```
Dry-run is the default (omit `--apply`). `import_tickets.py --from-bucket` talks to S3
via **boto3** using the `RUSTFS_*` env (path-style addressing; no aws-CLI dependency).
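
For reference, a path-style boto3 client against an S3-compatible endpoint is usually built along these lines. This is a sketch, not the loader's actual code: the variable names `RUSTFS_ENDPOINT`, `RUSTFS_ACCESS_KEY`, `RUSTFS_SECRET_KEY` and `RUSTFS_REGION` are assumptions (check `.env.example` for the real ones), as are the two helper names.

```python
import os


def s3_client_settings(env=os.environ) -> dict:
    """Collect boto3 client keyword arguments from (hypothetical) RUSTFS_* variables."""
    return {
        "endpoint_url": env["RUSTFS_ENDPOINT"],
        "aws_access_key_id": env["RUSTFS_ACCESS_KEY"],
        "aws_secret_access_key": env["RUSTFS_SECRET_KEY"],
        "region_name": env.get("RUSTFS_REGION", "us-east-1"),
    }


def make_s3_client(env=os.environ):
    """Build an S3 client that addresses buckets as <endpoint>/<bucket>/<key>."""
    # Imported lazily so the settings helper works without boto3 installed.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "s3",
        config=Config(s3={"addressing_style": "path"}),
        **s3_client_settings(env),
    )
```

Path-style addressing matters for self-hosted stores because the default virtual-hosted style puts the bucket name in the hostname, which typically does not resolve for an internal endpoint.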
## Deploy (Coolify)
The repo ships a [`Dockerfile`](Dockerfile) — a small batch worker with no web server.
Coolify builds it and keeps the container alive (`CMD tail -f /dev/null`); the ingest
runs as a **Scheduled Task**, not a system crontab:

- **Command:** `python import_tickets.py --from-bucket --apply`
- **Frequency:** `15 7-19 * * *` (`:15` past each hour, **07:15–19:15 EAT**). This
  Coolify instance runs scheduled tasks in **EAT (Africa/Nairobi)**, so no UTC
  conversion is needed.
- **Env vars** (Coolify → Environment Variables): `DATABASE_URL` (internal DB host),
  `RUSTFS_*`, `GEOCODER_*`.

Skip-if-unchanged makes a run on an already-ingested snapshot a cheap no-op.
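
A rough Python sketch of the two decisions behind that no-op: which object counts as "newest", and when the write is skipped. It assumes zero-padded file names like `2026-06-15T17-00-00.csv`, so that string order equals time order, and that ETags may arrive wrapped in double quotes, as S3 listings return them. `newest_csv_key` and `should_skip` are hypothetical names, not functions from `import_tickets.py`.

```python
from typing import Optional

PREFIX = "automations/inc/"


def newest_csv_key(keys: list[str]) -> Optional[str]:
    """Pick the newest snapshot directly under PREFIX, or None if there is none."""
    direct = [
        k for k in keys
        if k.startswith(PREFIX)
        and k.endswith(".csv")
        and "/" not in k[len(PREFIX):]  # ignore processed/ and other subfolders
    ]
    # Lexicographic max equals the latest timestamp for zero-padded names.
    return max(direct, default=None)


def should_skip(newest_etag: str, last_etag: Optional[str]) -> bool:
    """True when the newest file has the same ETag as the last processed one."""
    return last_etag is not None and newest_etag.strip('"') == last_etag.strip('"')


keys = [
    "automations/inc/2026-06-15T16-00-00.csv",
    "automations/inc/2026-06-15T17-00-00.csv",
    "automations/inc/processed/2026-06-15T18-00-00.csv",
]
print(newest_csv_key(keys))  # automations/inc/2026-06-15T17-00-00.csv
```
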

For a plain host/VM instead of Coolify, [`run_ingest.sh`](run_ingest.sh) loads `.env`
and runs the ingest; schedule it with a crontab line
(`CRON_TZ=Africa/Nairobi` / `15 7-19 * * *`).
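
To sanity-check the schedule, the cron expression can be expanded by hand. The Python sketch below lists one day's runs in EAT and shows the UTC equivalents; EAT is modelled as a fixed UTC+3 offset, and `daily_runs_eat` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EAT = timezone(timedelta(hours=3), "EAT")  # fixed UTC+3, no daylight saving


def daily_runs_eat(day: datetime) -> list[datetime]:
    """Expand `15 7-19 * * *` into that day's run times in EAT."""
    return [
        day.replace(hour=h, minute=15, second=0, microsecond=0, tzinfo=EAT)
        for h in range(7, 20)  # hours 7 through 19 inclusive
    ]


runs = daily_runs_eat(datetime(2026, 6, 15))
print(len(runs))                                              # 13
print(runs[0].astimezone(timezone.utc).strftime("%H:%M"))     # 04:15
print(runs[-1].astimezone(timezone.utc).strftime("%H:%M"))    # 16:15
```

On a scheduler that only understands UTC, the same window would therefore be `15 4-16 * * *`.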
## Notes
- The n8n export writes a **full current-state CSV per hour** to
`automations/inc/<EAT-timestamp>.csv` — no `latest` pointer, no metadata envelope, no
deltas. The loader lists the prefix, takes the **newest** file, and ingests it.
- **Skip-if-unchanged:** the newest file's S3 **ETag** is compared to the last processed
  file's ETag (stored in `tickets.import_meta.metadata.source_etag`); if equal, the DB write
  is skipped (the export re-emits byte-identical content most hours).
- **Upsert on `ticket_id`** (PRIMARY KEY) — duplication is impossible; rows are never
  deleted, so closed-ticket history accumulates. On success the file is **moved** to
  `automations/inc/processed/`.
- **Cleaning at ingest:** drop `is_alarm=true` rows + the `EXPORT STOPPED…` sentinel; drop
  `week_start`/`week_end`, `source_s3_*`/`source_snapshot_id`, `department`/`source_type`;
  normalize `region` → lowercase and `raw_status` → UPPERCASE. `service_type` and `bucket`
  (a `closed`/`pending` flag) are kept.
- `tickets.import_meta` captures snapshot freshness (surfaced as `summary.freshness` by
  `fn_tickets_for_map`).
- The curated/geocoded coordinates are written `verified = false` — review
`tickets.geo_clusters` / `tickets.geo_locations` and flip `verified` once checked.
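
The cleaning rules listed in the notes above could look roughly like this when applied to one CSV row. This is an illustrative Python sketch rather than the loader's code: `clean_row` is a hypothetical name, the `is_alarm` value is assumed to arrive as text, and the sentinel test (any cell starting with `EXPORT STOPPED`) is a guess at how that marker row appears.

```python
from typing import Optional

DROP_COLUMNS = {"week_start", "week_end", "source_snapshot_id", "department", "source_type"}
DROP_PREFIXES = ("source_s3_",)


def clean_row(row: dict) -> Optional[dict]:
    """Return the cleaned row, or None when the whole row should be dropped."""
    # Alarm rows are excluded from the ingest.
    if str(row.get("is_alarm", "")).strip().lower() == "true":
        return None
    # Assumed detection of the "EXPORT STOPPED…" sentinel row.
    if any(str(v).startswith("EXPORT STOPPED") for v in row.values()):
        return None

    cleaned = {
        k: v for k, v in row.items()
        if k not in DROP_COLUMNS and not k.startswith(DROP_PREFIXES)
    }
    if cleaned.get("region") is not None:
        cleaned["region"] = str(cleaned["region"]).lower()
    if cleaned.get("raw_status") is not None:
        cleaned["raw_status"] = str(cleaned["raw_status"]).upper()
    return cleaned


row = {"ticket_id": "INC1", "region": "Nairobi", "raw_status": "closed",
       "week_start": "x", "source_s3_key": "y", "service_type": "inc", "bucket": "closed"}
print(clean_row(row))
```
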