2026-06-11 17:13:50 +00:00
# fleettickets
feat(crq): add CRQ ingestion via shared engine + thin inc/crq entrypoints
Split the INC-only loader into a dataset-agnostic engine (pipeline.py, renamed
from import_tickets.py) parameterized by a Dataset config, with thin per-type
entrypoints inc/import_inc.py and crq/import_crq.py. CRQ shares INC's identical
32-column source schema and CDC change stream, so the engine is fully shared.
- pipeline.py: Dataset config (name/table/prefixes/key_regex/post_apply); INC
keeps the capture_history post-apply hook, CRQ has none yet. geocode_locations
now unions tickets.crq (geocoding is cross-dataset: one gazetteer/budget).
- crq/import_crq.py: drains automations/crq/changes/ from isptickets into
tickets.crq (data layer + map; SLA/dashboard/history deferred).
- migrations/13_crq_columns.sql: CRQ mirror of 03 — typed STORED generated
columns + indexes on tickets.crq (reuses tickets.eat_ts()).
- Deployment: Dockerfile/run_ingest.sh run both via `python -m`; pyproject
packages inc/crq. Docs (README, implementation, deployment-and-operations,
n8n export ref, phase-1) updated for the split + the one-time CRQ seed runbook.
tickets.crq already exists (mig 01, LIKE tickets.inc) and is unioned into
reporting.fn_tickets_for_map + resolve_ticket_geoms, so CRQ appears on the
existing Tickets map once seeded. Verified locally: ruff-clean new files, engine
lists/parses both streams against live S3 (crq=52 files, inc unaffected).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 20:16:38 +00:00
Field-ops **ticket** ingestion, geocoding, and read-schema that powers the
**Tickets** map in FleetOps. Extracted from the `tracksolid` repo into its own module
(it previously lived there as migrations 21–23 + `tools/import_tickets.py`).
Two ticket types, identical 32-column source schema and CDC change stream, served
through a **shared engine** (`pipeline.py`) with a thin per-type entrypoint each:
- **INC** — incident / customer-fault tickets → `tickets.inc` (`inc/import_inc.py`).
Full feature set: typed columns, geocoding, SLA view, dashboard fn, history capture.
- **CRQ** — new-installation requests → `tickets.crq` (`crq/import_crq.py`). **Data
layer + map** (typed columns, geocoding, appears on the Tickets map via
`fn_tickets_for_map`). SLA view / dashboard fn / history capture are deferred —
installation-lifecycle semantics differ from incidents (see roadmap). CRQ gets its
**own FleetOps tab**, same look & feel as INC.
Geocoding is **cross-dataset** (one gazetteer, one geocoder budget, covers inc + crq)
and is driven from the INC entrypoint.
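
As a rough illustration of the "shared engine + thin entrypoint" split, the per-type configuration could look something like the sketch below. The field names follow the commit description (name / table / prefixes / key_regex / post_apply), but the exact types, the single `changes_prefix` field, the key pattern, and the `capture_history` placeholder are assumptions, not the actual contents of `pipeline.py`.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical per-type config consumed by the shared engine."""
    name: str                    # "inc" or "crq"
    table: str                   # target table, e.g. "tickets.inc"
    changes_prefix: str          # bucket prefix holding the change files
    key_regex: "re.Pattern[str]" # matches one change-file key
    post_apply: Optional[Callable[[], None]] = None  # hook run after --apply


def _key_regex(name: str) -> "re.Pattern[str]":
    # Assumed key layout: automations/<type>/changes/<EAT-ts>.csv
    return re.compile(
        rf"^automations/{name}/changes/"
        rf"(\d{{4}}-\d{{2}}-\d{{2}}T\d{{2}}-\d{{2}}-\d{{2}})\.csv$"
    )


def capture_history() -> None:
    """Placeholder for the INC-only history hook (tickets.capture_history())."""


INC = Dataset("inc", "tickets.inc", "automations/inc/changes/",
              _key_regex("inc"), post_apply=capture_history)
CRQ = Dataset("crq", "tickets.crq", "automations/crq/changes/",
              _key_regex("crq"))  # no post-apply hook yet
```

With this shape, each entrypoint only needs to build its own `Dataset` and hand it to the engine; nothing in the engine branches on the ticket type.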
## What this owns
| Piece | What |
|---|---|
| `migrations/01_tickets_schema.sql` | The `tickets` schema: `tickets.inc` / `tickets.crq` (raw-jsonb-first), `tickets.geo_clusters` + `tickets.geo_locations` gazetteers, geom-resolution trigger, and `reporting.fn_tickets_for_map` (the GeoJSON read function) |
| `migrations/02_import_meta.sql` | `tickets.import_meta` (per-dataset snapshot envelope metadata) + `fn_tickets_for_map` re-defined to expose it as `summary.freshness` (same signature — dashboard_api unchanged) |
| `migrations/03_inc_columns.sql` | Unpacks `tickets.inc.raw` into **typed STORED generated columns** (status, cluster, region, team, owner, sla_status, mttr, lat/lng, is_* booleans, and EAT→`timestamptz` timestamps via `tickets.eat_ts()`). Computed for all rows + auto-populated on every ingest; `raw` stays the source of truth |
| `migrations/04_inc_latlng.sql` | Redefines `latitude`/`longitude` to `COALESCE(feed, ST_Y/ST_X(geom))` so they're **populated from the geocoded position** (feed is always empty); precision per `geo_source` (`location` vs `cluster` centroid) |
| `migrations/05_inc_geography.sql` | Adds `geog geography(Point,4326)` (= `geom::geography`) + GiST index for **routing** — `ST_Distance`/`ST_DWithin`/KNN in real metres (nearest-vehicle, radius search) |
| `migrations/06_inc_mttr_minutes.sql` | `mttr` generated column → integer **minutes** (source is decimal hours); drops the constant `is_alarm`/`is_auto_created`/`is_auto_closed` columns (kept in `raw`). `is_actionable` retained |
| `migrations/07_inc_drop_service_type.sql` | Drops the constant `service_type` column (always `inc`; kept in `raw`) |
| `migrations/08_inc_open_sla_view.sql` | `tickets.inc_open_sla` view — open (`is_actionable`) tickets with **derived SLA** (`hours_open`, `sla_state` vs 48h; clock = `created_at_service` ∥ `first_seen_at`), plus team/cluster/`geog` for dispatch |
feat: reporting.fn_inc_dashboard — INC operations dashboard read-API (migration 09)
One parameterized function returns {window, open GeoJSON, closed GeoJSON, metrics,
freshness} for the FleetOps live INC map:
- open = all is_actionable tickets (live), filtered by cluster/status, with
sla_state/hours_open (from tickets.inc_open_sla)
- closed= closed_at within the selected window (EAT calendar today/week/month or
custom [from,to)), filtered by cluster/status
- metrics= open/closed counts, SLA split (open derived, closed source), by status/
cluster, closure rate + daily series, avg mttr (minutes)
Filters combine with AND; grants to dashboard_ro/grafana_ro. Verified live
(today/month/cluster/status/custom; last-7d closed=913 matches raw).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
| `migrations/09_inc_dashboard_fn.sql` | `reporting.fn_inc_dashboard(cluster, status, window, from, to)` — one JSON payload (`window` / `open` GeoJSON / `closed` GeoJSON / `metrics` / `freshness`) powering the FleetOps live INC map. Open=live, closed=windowed (EAT calendar / custom); filters AND |
| `migrations/10_inc_history_capture.sql` | History for time-series: `tickets.closure_events` (append-only observed closures) + `tickets.inc_daily_snapshot` (per-EAT-day open backlog + flow), populated by `tickets.capture_history()` each ingest. Unlocks **backlog-over-time** |
fix(crq): migration 15 creates tickets.crq (live DB never materialized it)
Live-DB reconciliation before seeding CRQ revealed two divergences:
- tickets.crq did NOT exist: 01_tickets_schema.sql was applied 2026-06-15 from a
version predating its crq section, so the IF-NOT-EXISTS ledger guard has blocked
it ever since (fn_tickets_for_map + resolve_ticket_geoms already reference crq, so
they errored if called — masked because the live INC view uses fn_inc_dashboard).
- The live ledger carries un-versioned 13_inc_search_fn.sql / 14_inc_filter_options.sql
(applied 2026-06-19, absent from this repo).
So 13_crq_columns.sql (ALTER-only, number 13) is replaced by 15_crq_table.sql, which
CREATEs tickets.crq self-containedly (table + geom trigger + raw/typed indexes) and
adds the typed STORED generated columns. Deterministic + idempotent on both the live DB
(crq missing) and a fresh DB (crq minimal from 01). Numbered 15 to sit after the live
ledger's max. Docs/CLI references updated 13->15.
Applied + seeded on the live DB out-of-band (running container, INC image untouched):
39,240 crq rows, 99.99% geocoded (cluster + shared location cache), watermark current,
crq now renders on fn_tickets_for_map.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
| `migrations/15_crq_table.sql` | **Materializes `tickets.crq`** (table + geom trigger + indexes — `01`'s crq section never ran on the live DB) and unpacks `raw` into the same **typed STORED generated columns** as INC's `03` (reuses `tickets.eat_ts()`). Brings CRQ to data-layer parity |
| `pipeline.py` | **Shared engine** — the dataset-agnostic CDC loader (drains `automations/<type>/changes/<EAT-ts>.csv` from the `isptickets` bucket, upserts on `ticket_id` oldest→newest, watermark + per-file archive) and the **cross-dataset** geocoder (clusters + actionable inc/crq locations) |
| `inc/import_inc.py` | INC entrypoint (`python -m inc.import_inc`) — INC `Dataset` config + CLI; runs `tickets.capture_history()` after each `--apply`; hosts the shared geocode commands |
| `crq/import_crq.py` | CRQ entrypoint (`python -m crq.import_crq`) — CRQ `Dataset` config + CLI (ingest only; no history hook yet) |
| `run_migrations.py` | Applies `migrations/*.sql` in order (ledger: `tickets.schema_migrations`) |
| `shared.py` | Minimal DB/logging helpers (self-contained — no tracksolid dependency) |
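
The "upserts on `ticket_id` oldest→newest" behaviour of the shared engine can be pictured with a small in-memory sketch. This is only an illustration under assumptions: a plain dict stands in for the target table, file keys are assumed to sort chronologically by their timestamped names, and `apply_changes` is a hypothetical helper rather than a function in `pipeline.py`.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]


def apply_changes(files: Dict[str, List[Row]],
                  table: Optional[Dict[str, Row]] = None) -> Dict[str, Row]:
    """Apply change files oldest to newest; the newest row per ticket_id wins."""
    table = {} if table is None else table
    for key in sorted(files):          # timestamped keys sort chronologically
        for row in files[key]:
            table[row["ticket_id"]] = row   # upsert keyed on ticket_id
    return table


changes = {
    "automations/inc/changes/2026-06-15T17-20-00.csv": [
        {"ticket_id": "WOT0715527", "status": "closed"},
    ],
    "automations/inc/changes/2026-06-15T17-00-00.csv": [
        {"ticket_id": "WOT0715527", "status": "open"},
        {"ticket_id": "WOT0715528", "status": "open"},
    ],
}
state = apply_changes(changes)
```

Processing in timestamp order matters because a ticket can appear in several change files; applying them out of order would leave an older version of the row in place.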
## What this does NOT own (stays where it is)
- **The DB** — the `tickets` schema lives in the shared `tracksolid_db`.
- **The read-API** — `dashboard_api` (in the tracksolid stack) serves
`GET /webhook/tickets`, which calls `reporting.fn_tickets_for_map` (defined here).
- **The frontend** — the Tickets map is a tab in the **FleetOps** SPA (`fleetops` repo).
## Data model (raw-first)
Each row is `ticket_id` + `raw` (the full source record as `jsonb`) + a derived
`geom` / `geo_source`. Everything reads from `raw`, so a change to the source schema
needs no migration. For convenient typed/indexable access, `raw` is also **unpacked
into STORED generated columns** (migration 03) — e.g. `normalized_status`, `cluster`,
`region`, `assigned_team`, `owner`, `sla_status`, `mttr`, `is_actionable`,
`created_at_service`/`closed_at` (as EAT→`timestamptz`). These stay in lock-step with
`raw` automatically (no loader change); `raw` remains the source of truth.

`geom` is resolved: **feed** coords (`raw` lat/lng) → **location**
(geocoded `location_name`) → **cluster** centroid → **none**.
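
The resolution order above is a first-match fallback chain. A minimal Python sketch of the same logic follows; the real resolution happens in the database (geom-resolution trigger), so `resolve_geom` and the `(lat, lng)` tuple representation are purely illustrative assumptions.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # (lat, lng); illustrative representation


def resolve_geom(feed: Optional[Point],
                 location: Optional[Point],
                 cluster: Optional[Point]) -> Tuple[Optional[Point], str]:
    """Return the first available position and the geo_source it came from."""
    for source, point in (("feed", feed),
                          ("location", location),
                          ("cluster", cluster)):
        if point is not None:
            return point, source
    return None, "none"
```

For example, a ticket with no feed coordinates and no cached location geocode falls through to its cluster centroid and gets `geo_source = "cluster"`.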
Source coordinates are empty in the feed, so geocoding is required:
- `--geocode-clusters` — one coordinate per cluster (coarse fallback).
- `--geocode-locations` — precise per-location for **actionable INC and CRQ** tickets: strips the
network codes from `location_name` (e.g. `NW_`, `ADR_MNT_`, `FDT<n>`, `SDUS`), geocodes
the real place via a **keyed** provider (LocationIQ / OpenCage), and
**rejects any result >25 km from the cluster centroid** (wrong-city guard). Results cache in
`tickets.geo_locations`.
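
The wrong-city guard amounts to a great-circle distance check against the cluster centroid. The sketch below uses the haversine formula as one way to express it; whether the loader computes the distance in Python or in PostGIS is not stated here, so `haversine_km`, `accept_geocode`, and the sample coordinates are illustrative assumptions.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0088     # mean Earth radius
MAX_KM_FROM_CLUSTER = 25.0      # threshold from the description above

Point = Tuple[float, float]     # (lat, lng) in degrees


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two (lat, lng) points, in kilometres."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def accept_geocode(result: Point, centroid: Point,
                   max_km: float = MAX_KM_FROM_CLUSTER) -> bool:
    """Keep a geocoder result only if it lies within max_km of the centroid."""
    return haversine_km(result, centroid) <= max_km


# Illustrative coordinates only: a centroid in Nairobi, one nearby hit,
# and one hit in a different city that should be rejected.
centroid = (-1.2864, 36.8172)
nearby = (-1.3000, 36.8000)
other_city = (-4.0435, 39.6682)
```

A geocoder will often return a same-named place in another town; rejecting anything far from the cluster centroid lets the ticket fall back to the coarser but correct cluster position.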
docs: comprehensive README — column reference, query runbook, DQ/SLA notes, status
Add tickets.inc column reference (typed generated columns + geom/geog), a querying
runbook (map fn, inc_open_sla, closures/day, nearest-vehicle KNN), data-quality &
SLA caveats (source sla_status only valid when closed, ~30% null created_at_service,
mttr semantics, content lag, history gap), and a status/roadmap section.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
### Columns on `tickets.inc`
| Column | Type | Notes |
|---|---|---|
| `ticket_id` | text (PK) | e.g. `WOT0715527` |
| `raw` | jsonb | full source record — the source of truth |
| `normalized_status` · `raw_status` | text | use `normalized_status` for filtering (canonical) |
| `bucket` | text | lifecycle: `closed` / `pending` |
| `is_actionable` | boolean | the open/closed flag (open = `true` ) |
| `cluster` · `region` · `location_name` | text | `region` lowercased; `cluster` feeds the gazetteer |
| `assigned_team` · `owner` | text | closure attribution dimensions |
| `sla_status` | text | source `Compliant`/`Breached` — **only meaningful once closed** |
| `mttr` | numeric | **minutes** (source is decimal hours); null until closed |
| `created_at_service` · `scheduled_at` · `closed_at` · `first_seen_at` · `last_seen_at` · `source_created_at` · `source_updated_at` | timestamptz | EAT→UTC via `tickets.eat_ts()`. **lifecycle** = `created_at_service`→`closed_at`; **export bookkeeping** = `first_seen_at`/`last_seen_at`/`source_*` |
| `latitude` · `longitude` | double precision | `COALESCE(feed, geocoded)` — populated from `geom` |
| `geom` | geometry(Point,4326) | display / the map |
| `geog` | geography(Point,4326) | **routing** — metres-accurate distance (GiST indexed) |
| `geo_source` | text | precision: `feed` / `location` / `cluster` / `none` |
| `ingested_at` | timestamptz | when we last upserted this row |
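
Two of the conversions in the table are easy to get wrong when reading `raw` directly: timestamps are East Africa Time (UTC+3, no daylight saving) and `mttr` arrives as decimal hours. The sketch below mirrors both in Python. It is not the SQL of `tickets.eat_ts()` or of the `mttr` generated column; the input timestamp format and the rounding rule are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EAT = timezone(timedelta(hours=3), "EAT")  # fixed offset, no DST


def eat_to_utc(text: Optional[str],
               fmt: str = "%Y-%m-%d %H:%M:%S") -> Optional[datetime]:
    """Parse a naive EAT timestamp string and return an aware UTC datetime."""
    if not text:
        return None
    return datetime.strptime(text, fmt).replace(tzinfo=EAT).astimezone(timezone.utc)


def mttr_minutes(hours: Optional[str]) -> Optional[int]:
    """Convert decimal hours to whole minutes (rounding rule is an assumption)."""
    if hours in (None, ""):
        return None
    return round(float(hours) * 60)
```

So a source value of `1.5` hours corresponds to 90 minutes, and 17:00 EAT corresponds to 14:00 UTC on the same day.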
Dropped from the unpacked columns (still in `raw`): `service_type`, `is_alarm`,
`is_auto_created`, `is_auto_closed` (all single-cardinality), plus the ingest-time
drops below. **`reporting.fn_tickets_for_map`** reads from `raw` and serves the map;
**`tickets.inc_open_sla`** is the open-ticket SLA view for dashboards/dispatch.
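
The derived SLA in `tickets.inc_open_sla` compares hours open against a 48-hour target, using `created_at_service` as the clock and falling back to `first_seen_at` when it is null. A hedged Python equivalent is sketched below; the state labels (`"breached"` / `"within"`) and the treatment of exactly 48 hours are assumptions, since the view's actual `sla_state` values are not listed here.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

SLA_HOURS = 48.0  # target from the view description


def open_sla(created_at_service: Optional[datetime],
             first_seen_at: datetime,
             now: datetime) -> Tuple[float, str]:
    """Return (hours_open, sla_state) for an open ticket."""
    clock = created_at_service or first_seen_at   # fallback when source is null
    hours_open = (now - clock).total_seconds() / 3600
    state = "breached" if hours_open > SLA_HOURS else "within"  # labels assumed
    return hours_open, state


now = datetime(2026, 6, 15, 12, 0, tzinfo=timezone.utc)
seen = datetime(2026, 6, 14, 12, 0, tzinfo=timezone.utc)      # 24 h ago
created = datetime(2026, 6, 12, 12, 0, tzinfo=timezone.utc)   # 72 h ago
```

The fallback matters in practice because a ticket without `created_at_service` would otherwise have no SLA clock at all.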
## Setup
```bash
uv sync
cp .env.example .env # fill in DATABASE_URL, RUSTFS_*, GEOCODER_*
python run_migrations.py # apply the schema (idempotent)
```
## Run
Run from the repo root so that the `inc`/`crq` packages and `pipeline.py`/`shared.py` are importable.
```bash
# drain the incremental change streams (every new file oldest→newest, then archive)
python -m inc.import_inc --from-bucket --apply
python -m crq.import_crq --from-bucket --apply
# geocode — CROSS-DATASET (covers inc + crq); driven from the INC entrypoint, needs GEOCODER_API_KEY
python -m inc.import_inc --geocode-clusters --apply # coarse, once
python -m inc.import_inc --geocode-locations --apply # precise, actionable inc+crq
# from a local CSV instead of the bucket (dev)
python -m inc.import_inc --inc-csv 2026-06-15T17-00-00.csv --apply
python -m crq.import_crq --crq-csv 2026-06-24T12-55-44.csv --apply
```
Dry-run is the default (omit `--apply`). `--from-bucket` talks to S3 via **boto3** using
the `RUSTFS_*` env (path-style addressing; no aws-CLI dependency).
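
The dry-run-by-default convention is typically implemented with a boolean flag that is off unless passed. The sketch below shows that pattern with `argparse`, using only the `--from-bucket` and `--apply` flags named above; it is not the entrypoints' actual CLI code, and `build_parser` / `mode` are hypothetical names.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="inc.import_inc")
    parser.add_argument("--from-bucket", action="store_true",
                        help="drain the change stream from the bucket")
    parser.add_argument("--apply", action="store_true",
                        help="write to the database (default: dry-run)")
    return parser


def mode(argv: List[str]) -> str:
    """Report which mode a given command line would run in."""
    args = build_parser().parse_args(argv)
    return "apply" if args.apply else "dry-run"
```

Making the destructive path opt-in means a mistyped or incomplete command can at worst list what it would have done.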
## Deploy (Coolify)
The repo ships a [`Dockerfile`](Dockerfile) — a small batch worker with no web server.
Coolify builds it and keeps the container alive (`CMD tail -f /dev/null`); each ingest
runs as its own **Scheduled Task**, not a system crontab:
- **`inc_tickets`:** `python -m inc.import_inc --from-bucket --apply`
- **`crq_tickets`:** `python -m crq.import_crq --from-bucket --apply`
- **Frequency:** both `*/20 6-20 * * *` (every 20 min, **06:00–20:40 EAT**). This
Coolify instance runs scheduled tasks in **EAT (Africa/Nairobi)**, so no UTC
conversion is needed.
- **Env vars** (Coolify → Environment Variables): `DATABASE_URL` (internal DB host),
`RUSTFS_*` (the `isptickets` bucket credentials), `GEOCODER_*`. The same bucket holds
both `automations/inc/` and `automations/crq/`, so one credential set serves both tasks.
The watermark makes a run with no new change files a cheap no-op.
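
One way to picture the watermark: it is the last change-file key already ingested, and a run only processes keys that sort after it. The sketch below assumes the watermark is compared lexicographically against timestamped keys and that `--reseed` simply bypasses the comparison; `pending_keys` is a hypothetical helper, not the engine's actual function.

```python
from typing import Iterable, List, Optional


def pending_keys(keys: Iterable[str],
                 watermark: Optional[str],
                 reseed: bool = False) -> List[str]:
    """Return the change-file keys still to ingest, oldest to newest."""
    ordered = sorted(keys)               # timestamped names sort chronologically
    if reseed or watermark is None:
        return ordered                   # first run or forced full drain
    return [key for key in ordered if key > watermark]


bucket = [
    "automations/inc/changes/2026-06-15T17-00-00.csv",
    "automations/inc/changes/2026-06-15T17-20-00.csv",
]
latest = "automations/inc/changes/2026-06-15T17-20-00.csv"
```

When the watermark already equals the newest key, the list is empty and the run has nothing to download or upsert. The same comparison explains why a watermark carried over from a different stream can hide files whose timestamps are older than it, and why `--reseed` ignores it.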
For a plain host/VM instead of Coolify, [`run_ingest.sh`](run_ingest.sh) loads `.env`
and runs **both** ingests; schedule it with a crontab line
(`CRON_TZ=Africa/Nairobi` / `*/20 6-20 * * *`).
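
To sanity-check the schedule, the cron expression `*/20 6-20 * * *` can be expanded by hand: minutes 0, 20 and 40 of every hour from 06 through 20. The short sketch below enumerates those fire times; it hard-codes this one expression rather than parsing cron syntax, and `fire_times` is an illustrative name.

```python
from datetime import time
from typing import List


def fire_times(minute_step: int = 20,
               first_hour: int = 6,
               last_hour: int = 20) -> List[time]:
    """Expand '*/<step> <first>-<last> * * *' into its daily fire times."""
    return [time(hour, minute)
            for hour in range(first_hour, last_hour + 1)
            for minute in range(0, 60, minute_step)]


runs = fire_times()
```

This gives 45 runs per day, the first at 06:00 and the last at 20:40, all interpreted in the scheduler's time zone (EAT here).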
Full operational runbook — container, env management (encrypted; via the UI or
`artisan tinker`), the **Forgejo → Coolify auto-deploy webhook**, manual deploys, and the
source-bucket cutover procedure — is in
[`docs/deployment-and-operations.md`](docs/deployment-and-operations.md).
### Bucket cutover (one-time reseed)
When the source provider moves the feed to a new bucket (e.g. `tickets` → `isptickets`),
the stored watermark holds a key from the *old* bucket's stream. Its timestamp may be
newer than the new bucket's earliest files, which a normal run would then skip. Point the
`RUSTFS_*` creds + `TICKETS_BUCKET` at the new bucket, then drain it once with `--reseed`,
which ignores the stored watermark and ingests **every** file in `changes/` oldest→newest:
```bash
python -m inc.import_inc --from-bucket --reseed # dry-run first (or -m crq.import_crq)
python -m inc.import_inc --from-bucket --reseed --apply # commit + archive
```
Upserts are idempotent (`ticket_id` PK, rows never deleted) and the new stream's periodic
full-state re-emissions re-assert current state, so this is non-destructive and converges
even across the cutover gap. After it, the watermark is current, so resume normal
`--from-bucket --apply` runs (no `--reseed`). The old bucket is left untouched.
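The skip behaviour that makes `--reseed` necessary can be illustrated with a minimal Python sketch. It is not the loader's actual code: `select_keys` and the timestamp format in the example keys are hypothetical, and it assumes the timestamped file names sort chronologically as plain strings.

```python
from typing import Optional


def select_keys(keys: list[str], watermark: Optional[str], reseed: bool = False) -> list[str]:
    """Return the change-file keys to ingest, oldest to newest.

    Hypothetical helper: assumes keys embed a timestamp that sorts
    chronologically under plain string comparison.
    """
    ordered = sorted(keys)
    if reseed or watermark is None:
        # --reseed (or an empty watermark) drains every file in changes/.
        return ordered
    # Normal run: skip anything at or older than the stored watermark.
    return [key for key in ordered if key > watermark]


# Watermark left over from the OLD bucket (illustrative key names only).
old_watermark = "automations/inc/changes/2026-06-25T12-00-00.csv"
new_bucket_keys = [
    "automations/inc/changes/2026-06-25T13-00-00.csv",
    "automations/inc/changes/2026-06-24T08-00-00.csv",
    "automations/inc/changes/2026-06-25T09-00-00.csv",
]

normal_run = select_keys(new_bucket_keys, old_watermark)
reseed_run = select_keys(new_bucket_keys, old_watermark, reseed=True)
```

Under these assumptions the normal run picks up only the `13-00-00` file and silently skips the two earlier ones, while the reseed run returns all three in chronological order.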
## Notes
- The n8n export writes an **incremental CDC change stream** to
`automations/inc/changes/<EAT-timestamp>.csv`: a full-state baseline followed by files
holding only the rows that changed (with periodic full-state re-emissions). No `latest`
pointer, no metadata envelope. The loader drains **every not-yet-processed file
oldest→newest**, because taking only the newest would drop intermediate deltas.
- **Watermark:** the newest file already applied is recorded in
`tickets.import_meta.metadata.source_max_key`; runs skip anything at/older than it, so
reruns are cheap no-ops. `--reseed` ignores it for a one-time bucket cutover.
- **Upsert on `ticket_id`** (PRIMARY KEY): duplication is impossible; rows are never
deleted, so closed-ticket history accumulates. On success each file is **moved** to
`automations/inc/processed/`.
- **Cleaning at ingest:** drop `is_alarm=true` rows + the `EXPORT STOPPED…` sentinel; drop
`week_start`/`week_end`, `source_s3_*`/`source_snapshot_id`, `department`/`source_type`;
normalize `region` → lowercase and `raw_status` → UPPERCASE. `service_type` and `bucket`
(a `closed`/`pending` flag) are kept.
- `tickets.import_meta` captures snapshot freshness (surfaced as `summary.freshness` by
`fn_tickets_for_map`).
- The curated/geocoded coordinates are written `verified = false` — review
`tickets.geo_clusters` / `tickets.geo_locations` and flip `verified` once checked.
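The cleaning rules in the notes above can be sketched in Python as follows. This is an illustration under assumptions, not the engine's actual code: `clean_row` is a hypothetical name, rows are assumed to arrive as string dictionaries (as from `csv.DictReader`), and the sentinel is assumed to be detectable as any cell beginning with `EXPORT STOPPED`.

```python
from typing import Optional

# Columns dropped at ingest, per the notes above.
DROPPED_COLUMNS = {
    "week_start", "week_end",
    "source_snapshot_id",
    "department", "source_type",
}
DROPPED_PREFIXES = ("source_s3_",)  # covers the source_s3_* family


def clean_row(row: dict[str, str]) -> Optional[dict[str, str]]:
    """Return the cleaned row, or None when the whole row should be dropped."""
    # Drop alarm rows entirely.
    if row.get("is_alarm", "").strip().lower() == "true":
        return None
    # Assumption: the sentinel shows up as a cell starting with this text.
    if any(value.startswith("EXPORT STOPPED") for value in row.values() if value):
        return None

    cleaned = {
        column: value
        for column, value in row.items()
        if column not in DROPPED_COLUMNS and not column.startswith(DROPPED_PREFIXES)
    }
    # Normalize casing; service_type and bucket pass through unchanged.
    if cleaned.get("region"):
        cleaned["region"] = cleaned["region"].lower()
    if cleaned.get("raw_status"):
        cleaned["raw_status"] = cleaned["raw_status"].upper()
    return cleaned


example = clean_row({
    "ticket_id": "INC1", "is_alarm": "false", "region": "Nairobi",
    "raw_status": "closed", "week_start": "x", "source_s3_key": "k",
    "service_type": "fibre", "bucket": "closed",
})
```

The ticket id and values in `example` are made up for illustration; in the result `region` is lowercased, `raw_status` is uppercased, and the `week_start` and `source_s3_key` columns are gone.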
## Querying
```sql
-- map payload (GeoJSON + summary, incl. summary.freshness) — what dashboard_api serves
SELECT reporting.fn_tickets_for_map(); -- open-only by default
SELECT reporting.fn_tickets_for_map(p_open_only := false); -- all geocoded tickets
-- open tickets by SLA (derived) + by cluster — via the view
SELECT sla_state, count(*) FROM tickets.inc_open_sla GROUP BY 1;
SELECT cluster, count(*), round(avg(hours_open),1) AS avg_hrs
FROM tickets.inc_open_sla GROUP BY 1 ORDER BY 2 DESC;
-- closures / creations per day (EAT)
SELECT (closed_at AT TIME ZONE 'Africa/Nairobi')::date AS d, count(*)
FROM tickets.inc WHERE closed_at IS NOT NULL GROUP BY 1 ORDER BY 1 DESC;
-- open-backlog-over-time (accrues from first capture; one row per EAT day)
SELECT snapshot_date, open_total, open_breached, closed_today
FROM tickets.inc_daily_snapshot ORDER BY snapshot_date DESC;
-- nearest open tickets to a vehicle (lng, lat) — metres, index-accelerated KNN
SELECT ticket_id, cluster, hours_open,
round(ST_Distance(geog, ST_SetSRID(ST_MakePoint(:lng,:lat),4326)::geography))::int AS metres
FROM tickets.inc_open_sla
ORDER BY geog <-> ST_SetSRID(ST_MakePoint(:lng,:lat),4326)::geography
LIMIT 10;
```
## Data-quality & SLA notes
Findings to keep in mind (see the PRD for detail):
- **Source `sla_status` is only meaningful for *closed* tickets.** It reads
`Compliant` for essentially all *open* tickets, so for open work use the **derived**
state in `tickets.inc_open_sla` (`now() − created_at_service` vs the contract's 48h).
- **`created_at_service` is missing on ~30% of rows** (incl. most open ones); the SLA
view falls back to `first_seen_at` and flags it via `sla_clock_source`.
- **`mttr` is not the wall-clock difference** `closed_at − created_at_service`, and the
source's `Breached`/`Compliant` does **not** match a plain 48h threshold. Pin the
contract's exact SLA definition before trusting cross-field SLA math.
- **Content lag:** the feed's *file* timestamps are current, but the ticket *content*
trails ~2 days (the underlying `…wm_task.xlsx` source), so creation/closure dates
run a couple of days behind wall-clock.
- **History:** `tickets.inc` is current-state (upsert). Closure/creation/MTTR
*event* series work directly off `closed_at`/`created_at_service`. **Backlog-over-time**
now accrues via `tickets.inc_daily_snapshot` (one row per EAT day, written by
`tickets.capture_history()` each ingest); observed closures log to
`tickets.closure_events`. Past backlog can't be reconstructed: the series builds
from the first capture onward.
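The derived SLA state for open tickets, including the `first_seen_at` fallback described above, can be sketched in Python. This is only an illustration of the logic, not the definition of the `tickets.inc_open_sla` view: `derive_sla` and the state labels `breached` / `within_sla` are hypothetical, and the 48h window is the plain contract threshold quoted in these notes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SLA_WINDOW = timedelta(hours=48)  # contract threshold quoted in the notes


def derive_sla(
    created_at_service: Optional[datetime],
    first_seen_at: datetime,
    now: datetime,
) -> tuple[str, str, float]:
    """Return (state, clock_source, hours_open) for an open ticket.

    The state labels are placeholders; the view's real values may differ.
    """
    if created_at_service is not None:
        start, clock_source = created_at_service, "created_at_service"
    else:
        # Fallback used when the source timestamp is missing.
        start, clock_source = first_seen_at, "first_seen_at"

    elapsed = now - start
    state = "breached" if elapsed > SLA_WINDOW else "within_sla"
    return state, clock_source, round(elapsed.total_seconds() / 3600, 1)


now = datetime(2026, 6, 25, 12, 0, tzinfo=timezone.utc)
fallback_case = derive_sla(None, now - timedelta(hours=50), now)
normal_case = derive_sla(now - timedelta(hours=10), now - timedelta(hours=50), now)
```

Returning the clock source alongside the state mirrors the role of `sla_clock_source`: a consumer can tell which rows were timed from the weaker `first_seen_at` fallback.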
## Status / roadmap
Live: INC ingestion deployed on Coolify (every 20 min `*/20 6-20 * * *` EAT), schema +
generated columns + geocoding + the `inc_open_sla` view in `tracksolid_db`.
**CRQ (this milestone):** data layer + map — `tickets.crq` fed from
`automations/crq/changes/` by `crq/import_crq.py`, the `tickets.crq` table + typed columns (migration 15),
cross-dataset geocoding, and visibility on the Tickets map via `fn_tickets_for_map`.
One-time seed: drain the isptickets CRQ stream
(`python -m crq.import_crq --from-bucket --apply`), then run the shared geocode once.
With an empty watermark every file is ingested, and the stream's periodic full-state
snapshots make the table converge to current state. See
[`docs/deployment-and-operations.md`](docs/deployment-and-operations.md).
Next (Phase 2): bring CRQ to full INC parity once installation-lifecycle semantics are
confirmed: a `crq_open_sla` view, `fn_crq_dashboard`, and CRQ history capture (the CRQ
analogues of INC migrations 08/09/10). Then time-series analytics (closure rate, MTTR/SLA
trends), FleetNow vehicle **dispatch** off `geog`, and **team closure attribution**.