fleettickets/crq/import_crq.py
david kiania 066d866b90 fix(crq): migration 15 creates tickets.crq (live DB never materialized it)
Live-DB reconciliation before seeding CRQ revealed two divergences:
- tickets.crq did NOT exist: 01_tickets_schema.sql was applied 2026-06-15 from a
  version predating its crq section, so the IF-NOT-EXISTS ledger guard has blocked
  it ever since (fn_tickets_for_map + resolve_ticket_geoms already reference crq, so
  they errored if called — masked because the live INC view uses fn_inc_dashboard).
- The live ledger carries un-versioned 13_inc_search_fn.sql / 14_inc_filter_options.sql
  (applied 2026-06-19, absent from this repo).

So 13_crq_columns.sql (ALTER-only, number 13) is replaced by 15_crq_table.sql, which
CREATEs tickets.crq self-containedly (table + geom trigger + raw/typed indexes) and
adds the typed STORED generated columns. Deterministic + idempotent on both the live DB
(crq missing) and a fresh DB (crq minimal from 01). Numbered 15 to sit after the live
ledger's max. Docs/CLI references updated 13->15.

Applied + seeded on the live DB out-of-band (running container, INC image untouched):
39,240 crq rows, 99.99% geocoded (cluster + shared location cache), watermark current,
crq now renders on fn_tickets_for_map.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 23:55:17 +03:00

61 lines
3.2 KiB
Python

"""
crq/import_crq.py — Fireside Communications · CRQ (new-installation) ingestion.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Thin entrypoint over the shared engine (`pipeline.py`) for the CRQ dataset:
tickets.crq — new-installation requests (FleetOps "Tickets" CRQ tab)
CRQ mirrors INC at the data layer — IDENTICAL 32-column CSV schema and the same
incremental CDC change stream automations/crq/changes/<EAT-ts>.csv in the
`isptickets` bucket. This loader upserts on ticket_id, advances the per-dataset
watermark (tickets.import_meta dataset='crq'), and archives each consumed file to
automations/crq/processed/. CRQ flows onto the existing Tickets map via
reporting.fn_tickets_for_map (which already unions tickets.crq).
Scope (current): data layer + map only. CRQ has NO post-apply history capture yet
(installation-lifecycle SLA/backlog semantics differ from incidents — a future
migration). Geocoding is CROSS-DATASET and run from the INC entrypoint
(python -m inc.import_inc --geocode-clusters / --geocode-locations) against the
shared gazetteer, which covers both inc and crq.
Usage (needs DATABASE_URL + RUSTFS_* env; see .env.example):
python -m crq.import_crq --from-bucket --apply
python -m crq.import_crq --from-bucket --reseed --apply # one-time bucket cutover
python -m crq.import_crq --crq-csv 2026-06-24T12-55-44.csv --apply
Pre-requisite: migrations applied (run_migrations.py) — tickets.crq + its typed
columns (15_crq_table.sql) + geo_clusters/geo_locations + fn_tickets_for_map.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
"""
from __future__ import annotations
import argparse
import pipeline
# CRQ has no post-apply hook yet (history capture is INC-only — see module docstring).
DATASET = pipeline.make_dataset("crq", post_apply=None)
def main() -> None:
ap = argparse.ArgumentParser(
description="Ingest CRQ (installation) tickets from CSV (raw-first)")
ap.add_argument("--apply", action="store_true", help="Write to DB (default: dry-run)")
ap.add_argument("--from-bucket", action="store_true",
help="Drain the incremental CRQ change stream (automations/crq/changes/) "
"from the isptickets S3 bucket: every not-yet-processed file "
"oldest→newest, upsert on ticket_id, advance the watermark, archive")
ap.add_argument("--reseed", action="store_true",
help="Ignore the stored watermark and drain every file in changes/ once "
"(one-time bucket cutover / reseed). Use with --from-bucket --apply")
ap.add_argument("--crq-csv", dest="local_csv", default=None,
help="Local CRQ tickets CSV file (dev)")
args = ap.parse_args()
if not (args.from_bucket or args.local_csv):
ap.error("provide --from-bucket or --crq-csv")
pipeline.ingest(DATASET, args)
if __name__ == "__main__":
main()