Commit graph

5 commits

Author SHA1 Message Date
david kiania
bb38d354e5 fix(geocode): precise location geoms survive delta re-upserts (FT-BUG-01)
The tg_ticket_geom trigger resolved feed coords -> cluster centroid -> none,
never consulting tickets.geo_locations, so every 20-min delta ingest re-upserted
changed rows and downgraded previously-resolved 'location' geoms back to the
cluster centroid. Live effect: only 51 of 114k INC (and 0 of 42k CRQ) rows kept
the precise geocode the LocationIQ budget paid for.

- migration 18: trigger now resolves feed -> geo_locations (precise) -> cluster
  -> none, mirroring resolve_ticket_geoms() precedence; ends with one resolve
  pass to repair the backlog. Dry-run against the live DB (rolled back) repaired
  7,481 rows: INC location 51 -> 5,339, CRQ 0 -> 2,193.
- pipeline.ingest(): re-resolve after every applied run that ingested files, so
  geoms self-heal even before migration 18 lands.
- run_ingest.sh: chain an incremental --geocode-clusters pass (0 API calls when
  no new clusters) so new clusters map without a manual command (FT-BUG-02).
- Dockerfile/.dockerignore: pinned installs from uv.lock, non-root user (FT-SEC-02).
- 20260618_bug.txt removed (stale review of a since-rewritten file).

Numbered 18 to coexist with 17_drop_unused_geo_indexes.sql (parallel 260702
change). Audit + plan + work log in docs/260702_*. Local only; not applied to prod.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 09:47:15 +03:00
david kiania
5f5d71d500 feat(crq): add CRQ ingestion via shared engine + thin inc/crq entrypoints
Split the INC-only loader into a dataset-agnostic engine (pipeline.py, renamed
from import_tickets.py) parameterized by a Dataset config, with thin per-type
entrypoints inc/import_inc.py and crq/import_crq.py. CRQ shares INC's identical
32-column source schema and CDC change stream, so the engine is fully shared.

- pipeline.py: Dataset config (name/table/prefixes/key_regex/post_apply); INC
  keeps the capture_history post-apply hook, CRQ has none yet. geocode_locations
  now unions tickets.crq (geocoding is cross-dataset: one gazetteer/budget).
- crq/import_crq.py: drains automations/crq/changes/ from isptickets into
  tickets.crq (data layer + map; SLA/dashboard/history deferred).
- migrations/13_crq_columns.sql: CRQ mirror of 03 — typed STORED generated
  columns + indexes on tickets.crq (reuses tickets.eat_ts()).
- Deployment: Dockerfile/run_ingest.sh run both via `python -m`; pyproject
  packages inc/crq. Docs (README, implementation, deployment-and-operations,
  n8n export ref, phase-1) updated for the split + the one-time CRQ seed runbook.

tickets.crq already exists (mig 01, LIKE tickets.inc) and is unioned into
reporting.fn_tickets_for_map + resolve_ticket_geoms, so CRQ appears on the
existing Tickets map once seeded. Verified locally: ruff-clean new files, engine
lists/parses both streams against live S3 (crq=52 files, inc unaffected).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 23:16:38 +03:00
david kiania
7d3bba8d78 chore(schedule): INC ingest cron -> every 20 min, 06:00-20:40 EAT
Was hourly at :15 (15 7-19 * * *); now */20 6-20 * * * for fresher ticket
data through the working day. Updates the documented schedule in the Coolify
Scheduled Task command, run_ingest.sh, Dockerfile, README, and implementation
notes (the live schedule is set in the Coolify UI).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 18:23:17 +03:00
david kiania
dca2c94c75 fix: address valid findings from 20260618 bug report
Verified each finding against the code (+ profiled the 31k-row CSV sample);
implemented only the genuinely valid fixes:

- import_tickets.py: fold _record_meta into the upsert transaction so rows +
  snapshot meta commit atomically (BUG 2); guard _ts_from_key against
  regex-matching-but-invalid dates so the sort can't crash (BUG 11);
  extract_place now splits glued NW prefixes (~1.7k rows, e.g. NWKIAMBU→KIAMBU)
  and only drops a trailing '-<seg>' when it's a unit/instruction code, keeping
  real-word tails like '-MALL' (BUG 14). Scoped glued-split to NW only —
  CO/NE/SE begin real words (COAST/NEW/SEASONS) per the data.
- Dockerfile + pyproject.toml: install from pyproject (single source of truth)
  instead of mirroring deps; add build-system + py-modules so `pip install .`
  works for the flat-module layout (BUG 9).
- migrations/03_inc_columns.sql: document the eat_ts IMMUTABLE/tzdata footgun
  and the manual-recompute path (BUG 6).
- .gitignore: narrow *.json → *.local.json so real fixtures can be versioned;
  ignore build/ and *.egg-info/ (BUG 10).

Reclassified/skipped as invalid or by-design: BUG 1, 3, 4, 5, 7, 8, 12, 13.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 13:41:38 +03:00
david kiania
68f2b99cd3 feat: S3 via boto3 + Dockerfile for Coolify deploy
- Replace the aws-CLI subprocess calls with boto3 (list_objects_v2 paginator,
  get_object, copy_object+delete_object) using path-style addressing + RUSTFS_*
  env. Removes the external aws-CLI dependency so it runs in a slim container.
- Add boto3 to pyproject dependencies.
- Add Dockerfile (python:3.12-slim, deps, TZ=Africa/Nairobi, keep-alive CMD) and
  .dockerignore for Coolify; document Coolify Scheduled Task setup in README.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 20:08:05 +03:00