fleettickets
Field-ops INC ticket ingestion, geocoding, and read-schema that powers the
Tickets map in FleetOps. Extracted from the tracksolid repo into its own module
(it previously lived there as migrations 21–23 + tools/import_tickets.py).
- INC — incident / customer-fault tickets (this pipeline is strictly INC)
- CRQ — new-installation requests (schema kept, but out of scope — not ingested here)
What this owns
| Piece | What |
|---|---|
| migrations/01_tickets_schema.sql | The tickets schema: tickets.inc / tickets.crq (raw-jsonb-first), tickets.geo_clusters + tickets.geo_locations gazetteers, geom-resolution trigger, and reporting.fn_tickets_for_map (the GeoJSON read function) |
| migrations/02_import_meta.sql | tickets.import_meta (per-dataset snapshot envelope metadata) + fn_tickets_for_map re-defined to expose it as summary.freshness (same signature, so dashboard_api is unchanged) |
| import_tickets.py | Ingests the newest INC CSV from the rustfs tickets bucket (automations/inc/<EAT-timestamp>.csv) and upserts on ticket_id; geocodes clusters + INC locations |
| run_migrations.py | Applies migrations/*.sql in order (ledger: tickets.schema_migrations) |
| shared.py | Minimal DB/logging helpers (self-contained, no tracksolid dependency) |
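As a rough illustration of "applies migrations/*.sql in order" against a ledger, the selection step could look like the sketch below. It assumes the ledger records file names and that files sort correctly by their numeric prefix; `pending_migrations` is a hypothetical helper, not necessarily how run_migrations.py is written.

```python
def pending_migrations(filenames, applied):
    """Return the .sql files not yet recorded in the ledger, in name order.

    filenames: names found under migrations/ (e.g. from os.listdir)
    applied:   names already recorded in tickets.schema_migrations
               (assumption: the ledger stores the file name)
    """
    sql_files = sorted(name for name in filenames if name.endswith(".sql"))
    return [name for name in sql_files if name not in applied]


found = ["02_import_meta.sql", "01_tickets_schema.sql", "README.txt"]
todo = pending_migrations(found, applied={"01_tickets_schema.sql"})
```

Because already-applied files are filtered out, re-running the script with nothing new yields an empty list, which matches the idempotent behaviour described under Setup.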
What this does NOT own (stays where it is)
- The DB: the tickets schema lives in the shared tracksolid_db.
- The read-API: dashboard_api (in the tracksolid stack) serves GET /webhook/tickets, which calls reporting.fn_tickets_for_map (defined here).
- The frontend: the Tickets map is a tab in the FleetOps SPA (fleetops repo).
Data model (raw-first)
Each row is just ticket_id + raw (the full source record as jsonb) + a derived
geom / geo_source. Everything reads from raw, so a change to the source schema
needs no migration. geom is resolved: feed coords (raw lat/lng) → location
(geocoded location_name) → cluster centroid → none.
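The resolution order above is implemented by a database trigger; the Python sketch below only mirrors the priority chain for illustration. The `cluster` key in raw, the two lookup dicts, and the geo_source labels are assumptions made for this example.

```python
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (lat, lng)


def resolve_geom(raw: dict, locations: dict, clusters: dict) -> Tuple[Optional[Coord], str]:
    """Return (coordinate, geo_source) from the first source that has data."""
    lat, lng = raw.get("lat"), raw.get("lng")
    # 1. coordinates supplied by the feed itself
    if lat not in (None, "") and lng not in (None, ""):
        return (float(lat), float(lng)), "feed"
    # 2. a geocoded location_name (stand-in for tickets.geo_locations)
    loc = locations.get(raw.get("location_name"))
    if loc is not None:
        return loc, "location"
    # 3. the cluster centroid (stand-in for tickets.geo_clusters);
    #    "cluster" as the raw key is a hypothetical name
    centroid = clusters.get(raw.get("cluster"))
    if centroid is not None:
        return centroid, "cluster"
    # 4. nothing usable
    return None, "none"
```

Since every input comes from raw plus the two gazetteers, re-resolving after a new geocoding pass needs no change to the stored rows.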
Source coordinates are empty in the feed, so geocoding is required:
- --geocode-clusters: one coordinate per cluster (coarse fallback).
- --geocode-locations: precise per-location geocoding for actionable INC tickets. It strips the network codes from location_name (e.g. NW_, ADR_MNT_, FDT<n>, SDUS), geocodes the real place via a keyed provider (LocationIQ / OpenCage), and **rejects any result more than 25 km from the cluster centroid** (wrong-city guard). Results are cached in tickets.geo_locations.
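A minimal sketch of the two local steps in --geocode-locations, the name cleanup and the wrong-city guard, might look like this. The regex covers only the example codes named above and is an assumption, as are the helper names; the distance check uses the standard haversine formula, which may differ from what the loader actually uses.

```python
import math
import re

MAX_KM = 25.0  # guard radius stated in the text

# Assumed patterns for the example codes only; the real list may be longer.
_CODE_RE = re.compile(
    r"(?<![A-Za-z])(?:ADR_MNT_|NW_|FDT\d+(?![A-Za-z0-9])|SDUS(?![A-Za-z0-9]))",
    re.IGNORECASE,
)


def clean_location_name(name: str) -> str:
    """Remove network codes, then collapse underscores and whitespace."""
    stripped = _CODE_RE.sub(" ", name)
    return re.sub(r"[\s_]+", " ", stripped).strip()


def haversine_km(a, b) -> float:
    """Great-circle distance in km between two (lat, lng) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0088 * math.asin(math.sqrt(h))


def accept_geocode(result, centroid) -> bool:
    """Wrong-city guard: keep a result only if it is within MAX_KM of the centroid."""
    return haversine_km(result, centroid) <= MAX_KM
```

A result that fails the guard would simply leave the ticket on the cluster-centroid fallback rather than place it in the wrong city.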
Setup
uv sync
cp .env.example .env # fill in DATABASE_URL, RUSTFS_*, GEOCODER_*
python run_migrations.py # apply the schema (idempotent)
Run
# ingest the newest INC CSV from the bucket (skip-if-unchanged, then archive)
python import_tickets.py --from-bucket --apply
# geocode (needs GEOCODER_API_KEY)
python import_tickets.py --geocode-clusters --apply # coarse, once
python import_tickets.py --geocode-locations --apply # precise, actionable INC
# from a local CSV instead of the bucket (dev)
python import_tickets.py --inc-csv 2026-06-15T17-00-00.csv --apply
Dry-run is the default (omit --apply). import_tickets.py --from-bucket talks to S3
via boto3 using the RUSTFS_* env (path-style addressing; no aws-CLI dependency).
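For orientation, a path-style boto3 client and a "newest CSV under the prefix" lookup could be sketched as follows. The environment variable names (RUSTFS_ENDPOINT, RUSTFS_ACCESS_KEY, RUSTFS_SECRET_KEY) are hypothetical stand-ins for the RUSTFS_* settings, and choosing the newest file by LastModified is an assumption; the loader may instead sort by the timestamp in the key.

```python
import os


def make_s3_client():
    """Build an S3 client for an S3-compatible endpoint using path-style URLs."""
    import boto3  # imported here so newest_object() stays dependency-free
    from botocore.config import Config

    return boto3.client(
        "s3",
        endpoint_url=os.environ["RUSTFS_ENDPOINT"],             # hypothetical name
        aws_access_key_id=os.environ["RUSTFS_ACCESS_KEY"],      # hypothetical name
        aws_secret_access_key=os.environ["RUSTFS_SECRET_KEY"],  # hypothetical name
        config=Config(s3={"addressing_style": "path"}),
    )


def newest_object(pages, suffix=".csv", skip_prefix="automations/inc/processed/"):
    """Pick the most recently modified CSV from list_objects_v2 pages.

    Keys under the processed/ archive share the listing prefix, so they are skipped.
    """
    newest = None
    for page in pages:
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith(suffix) or key.startswith(skip_prefix):
                continue
            if newest is None or obj["LastModified"] > newest["LastModified"]:
                newest = obj
    return newest


def find_newest(bucket, prefix="automations/inc/"):
    """List the prefix with the paginator and return the newest object dict."""
    s3 = make_s3_client()
    pages = s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix)
    return newest_object(pages)
```

Path-style addressing keeps the bucket in the URL path instead of the hostname, which is what self-hosted S3-compatible stores typically expect.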
Deploy (Coolify)
The repo ships a Dockerfile — a small batch worker with no web server.
Coolify builds it and keeps the container alive (CMD tail -f /dev/null); the ingest
runs as a Scheduled Task, not a system crontab:
- Command: python import_tickets.py --from-bucket --apply
- Frequency: 15 7-19 * * * (:15 past each hour, 07:00–19:00). If Coolify runs scheduled tasks in UTC, use 15 4-16 * * * (EAT is UTC+3); if it exposes a per-task timezone, set Africa/Nairobi and keep 15 7-19 * * *.
- Env vars (Coolify → Environment Variables): DATABASE_URL (internal DB host), RUSTFS_*, GEOCODER_*.
Skip-if-unchanged makes a run on an already-ingested snapshot a cheap no-op.
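The skip check itself is small. A sketch, assuming the stored value and the listing value may differ only in the double quotes that S3 responses wrap around ETags (`should_skip` is a hypothetical helper name):

```python
def should_skip(newest_etag, last_etag):
    """True when the newest object's ETag equals the last processed one."""
    if not newest_etag or not last_etag:
        return False  # nothing to compare against, so ingest
    return newest_etag.strip('"') == last_etag.strip('"')
```

On a first run there is no stored ETag, so the function returns False and the ingest proceeds.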
For a plain host/VM instead of Coolify, run_ingest.sh loads .env
and runs the ingest; schedule it with a crontab line
(CRON_TZ=Africa/Nairobi / 15 7-19 * * *).
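For schedulers that only understand UTC, the hour range can be shifted by the fixed EAT offset. The snippet below is a small arithmetic check of that conversion; it relies on the window not crossing midnight after the shift, which holds for 07:00 to 19:00.

```python
EAT_OFFSET_HOURS = 3  # EAT is UTC+3 with no daylight saving


def eat_hours_to_utc(start, end):
    """Shift an inclusive EAT hour range to the equivalent UTC hours."""
    return [(h - EAT_OFFSET_HOURS) % 24 for h in range(start, end + 1)]


utc_hours = eat_hours_to_utc(7, 19)
cron = f"15 {utc_hours[0]}-{utc_hours[-1]} * * *"
```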
Notes
- The n8n export writes a full current-state CSV per hour to automations/inc/<EAT-timestamp>.csv: no "latest" pointer, no metadata envelope, no deltas. The loader lists the prefix, takes the newest file, and ingests it.
- Skip-if-unchanged: the newest file's S3 ETag is compared to the last processed file's ETag (stored in tickets.import_meta.metadata.source_etag); if equal, the DB write is skipped (the export re-emits byte-identical content most hours).
- Upsert on ticket_id (PRIMARY KEY): duplication is impossible; rows are never deleted, so closed-ticket history accumulates. On success the file is moved to automations/inc/processed/.
- Cleaning at ingest: drop is_alarm=true rows + the EXPORT STOPPED… sentinel; drop week_start/week_end, source_s3_*/source_snapshot_id, department/source_type; normalize region → lowercase and raw_status → UPPERCASE. service_type and bucket (a closed/pending flag) are kept.
- tickets.import_meta captures snapshot freshness (surfaced as summary.freshness by fn_tickets_for_map).
- The curated/geocoded coordinates are written with verified = false; review tickets.geo_clusters / tickets.geo_locations and flip verified once checked.
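The cleaning rules in the notes above can be sketched as a single pass over the CSV rows. This assumes rows arrive as dicts of strings (as from csv.DictReader) and that the sentinel shows up as a cell value beginning with "EXPORT STOPPED"; the full sentinel text and its column are not given here, so that match is an assumption.

```python
DROP_EXACT = {"week_start", "week_end", "source_snapshot_id", "department", "source_type"}
DROP_PREFIXES = ("source_s3_",)
SENTINEL_PREFIX = "EXPORT STOPPED"  # full sentinel text is elided in the notes


def clean_rows(rows):
    """Apply the ingest-time cleaning rules to an iterable of row dicts."""
    cleaned = []
    for row in rows:
        # drop alarm rows
        if str(row.get("is_alarm", "")).strip().lower() == "true":
            continue
        # drop the sentinel row (assumed to be recognisable by a cell prefix)
        if any(str(v).startswith(SENTINEL_PREFIX) for v in row.values()):
            continue
        # drop the excluded columns
        out = {
            k: v for k, v in row.items()
            if k not in DROP_EXACT and not k.startswith(DROP_PREFIXES)
        }
        # normalize case on the two named fields
        if out.get("region"):
            out["region"] = out["region"].lower()
        if out.get("raw_status"):
            out["raw_status"] = out["raw_status"].upper()
        cleaned.append(out)
    return cleaned
```

Every other column, including service_type and bucket, passes through untouched and ends up in raw.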