fleettickets

Author	SHA1	Message	Date
david kiania	5f5d71d500	feat(crq): add CRQ ingestion via shared engine + thin inc/crq entrypoints Split the INC-only loader into a dataset-agnostic engine (pipeline.py, renamed from import_tickets.py) parameterized by a Dataset config, with thin per-type entrypoints inc/import_inc.py and crq/import_crq.py. CRQ shares INC's identical 32-column source schema and CDC change stream, so the engine is fully shared. - pipeline.py: Dataset config (name/table/prefixes/key_regex/post_apply); INC keeps the capture_history post-apply hook, CRQ has none yet. geocode_locations now unions tickets.crq (geocoding is cross-dataset: one gazetteer/budget). - crq/import_crq.py: drains automations/crq/changes/ from isptickets into tickets.crq (data layer + map; SLA/dashboard/history deferred). - migrations/13_crq_columns.sql: CRQ mirror of 03 — typed STORED generated columns + indexes on tickets.crq (reuses tickets.eat_ts()). - Deployment: Dockerfile/run_ingest.sh run both via `python -m`; pyproject packages inc/crq. Docs (README, implementation, deployment-and-operations, n8n export ref, phase-1) updated for the split + the one-time CRQ seed runbook. tickets.crq already exists (mig 01, LIKE tickets.inc) and is unioned into reporting.fn_tickets_for_map + resolve_ticket_geoms, so CRQ appears on the existing Tickets map once seeded. Verified locally: ruff-clean new files, engine lists/parses both streams against live S3 (crq=52 files, inc unaffected). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 23:16:38 +03:00
david kiania	509338c076	feat(import_tickets): migrate INC ingest to isptickets bucket + --reseed cutover Provider moved the INC CDC feed to a new bucket (tickets -> isptickets, new per-bucket creds; same s3.rahamafresh.com endpoint, identical 32-col schema). This is config + a one-time reseed, not a rewrite — the loader already drains automations/inc/changes/ oldest->newest with a source_max_key watermark. - default _BUCKET -> isptickets (TICKETS_BUCKET still overrides) - add --reseed: ignore the stored watermark and drain every changes/ file once (the old-bucket watermark may post-date the new bucket's first file). Crash-safe via the existing per-file watermark-advance + archive loop. - refresh stale "newest-file / full-snapshot-per-hour" docstring/comments to the CDC reality; .env.example + README updated (new bucket + reseed runbook). Verified live dry-run: 41/41 files drained (watermark None), alarm/sentinel filter active, exit 0. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 18:20:15 +03:00
david kiania	a4b90a33d8	fix(inc): ingest the incremental changes/ stream (baseline + deltas) The S3 source switched from full hourly snapshots at automations/inc/<ts>.csv to an incremental CDC stream at automations/inc/changes/<ts>.csv (first file = full baseline, each later file = only the rows that changed, keyed by ticket_id; no deletions). The loader still pointed at the old root path and only ingested the single newest file, so after the switch it found nothing (no new tickets ingested) and, even with the path fixed, would silently drop intermediate deltas. Changes: - point ingestion at automations/inc/changes/ (_CHANGE_KEY_RE) - ingest EVERY not-yet-processed file in ascending timestamp order (baseline first, then each delta), upserting each - replace the single-ETag skip with a per-file timestamp watermark (import_meta.metadata->>'source_max_key'); rows + watermark commit in one txn per file, then archive to processed/ — so a mid-run failure leaves a consistent, resumable state - docs: rename n8n-hourly-s3-full-data-exports.md -> n8n-s3-ticket-exports.md and rewrite it for the incremental stream; fix the reference in docs/phase-1-ingestion.md Verified live against prod: re-seeded baseline + 5 deltas (26,529 rows), files archived to processed/, watermark advanced, re-run is a no-op. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 14:37:17 +03:00
david kiania	e71c8914f1	feat(geocode): two-pass estate fallback for building-level location_names Building-level names (e.g. 'KAHAWA WENDANI ALVO HOUSE') aren't in OSM, so the precise forward-geocode 404s and tickets stay on the bare cluster centroid (observed 0/133 placed). geocode_locations now tries an ordered set of candidates per location (compose_queries): full precise -> estate (leading 2 tokens) -> leading token, each constrained by the existing cluster viewbox + 25km distance check, accepting the FIRST in-range hit. This places tickets in the right neighbourhood (e.g. 'KAHAWA WENDANI', 'BAMBURI') instead of the broad cluster centroid. Wrong-area matches for ambiguous coarse tokens are rejected by the distance check and fall through; genuinely unmatchable tickets keep the honest cluster-centroid fallback (no pure-cluster candidate, which would only mislabel the centroid as geo_source='location'). Verified the cascade finds hits against live LocationIQ on real samples. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-18 18:51:58 +03:00
david kiania	dca2c94c75	fix: address valid findings from 20260618 bug report Verified each finding against the code (+ profiled the 31k-row CSV sample); implemented only the genuinely valid fixes: - import_tickets.py: fold _record_meta into the upsert transaction so rows + snapshot meta commit atomically (BUG 2); guard _ts_from_key against regex-matching-but-invalid dates so the sort can't crash (BUG 11); extract_place now splits glued NW prefixes (~1.7k rows, e.g. NWKIAMBU→KIAMBU) and only drops a trailing '-<seg>' when it's a unit/instruction code, keeping real-word tails like '-MALL' (BUG 14). Scoped glued-split to NW only — CO/NE/SE begin real words (COAST/NEW/SEASONS) per the data. - Dockerfile + pyproject.toml: install from pyproject (single source of truth) instead of mirroring deps; add build-system + py-modules so `pip install .` works for the flat-module layout (BUG 9). - migrations/03_inc_columns.sql: document the eat_ts IMMUTABLE/tzdata footgun and the manual-recompute path (BUG 6). - .gitignore: narrow .json → .local.json so real fixtures can be versioned; ignore build/ and *.egg-info/ (BUG 10). Reclassified/skipped as invalid or by-design: BUG 1, 3, 4, 5, 7, 8, 12, 13. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-18 13:41:38 +03:00
david kiania	764dee986f	feat: history capture — closure_events + daily backlog snapshot (migration 10) - tickets.closure_events: append-only observed closures (PK ticket_id, closed_at; observed_at = first sighting; survives row churn). - tickets.inc_daily_snapshot: one row per EAT day — open backlog (+ SLA split, by cluster/status) and created/closed flow; upserted each run. - tickets.capture_history(): appends new closures + upserts today's snapshot. - import_tickets calls it after each --apply run (ingest or skip); add --capture-history CLI flag for standalone runs. Verified: backfilled 21,282 closures; today's snapshot recorded (open_total 30). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-16 01:19:23 +03:00
david kiania	68f2b99cd3	feat: S3 via boto3 + Dockerfile for Coolify deploy - Replace the aws-CLI subprocess calls with boto3 (list_objects_v2 paginator, get_object, copy_object+delete_object) using path-style addressing + RUSTFS_* env. Removes the external aws-CLI dependency so it runs in a slim container. - Add boto3 to pyproject dependencies. - Add Dockerfile (python:3.12-slim, deps, TZ=Africa/Nairobi, keep-alive CMD) and .dockerignore for Coolify; document Coolify Scheduled Task setup in README. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 20:08:05 +03:00
david kiania	df054c92be	feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive) Rework import_tickets.py from the retired JSON `latest.json` model to the new hourly full-snapshot CSV export. Strictly INC (CRQ out of scope). - Ingest the newest automations/inc/<EAT-timestamp>.csv; skip-if-unchanged by comparing S3 ETag to tickets.import_meta.metadata.source_etag. - Upsert on ticket_id (PK; no dups, never delete -> closure history accrues). No truncate. On success, move processed files to automations/inc/processed/. - Clean at ingest: drop is_alarm=true + the "EXPORT STOPPED..." sentinel; drop week_, source_s3_/source_snapshot_id, department/source_type; lowercase region, uppercase raw_status; keep service_type + bucket. - Force path-style S3 addressing; --inc-csv for local dev; --from-bucket for cron. - Add migrations/02 (import_meta + freshness); refresh README/.env.example/docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 19:33:16 +03:00
david kiania	4631cc6382	feat: fleettickets — INC/CRQ ticket ingestion, geocoding + read-schema Standalone module extracted from the tracksolid repo (was migrations 21-23 + tools/import_tickets.py). Owns the `tickets` schema in the shared tracksolid_db. - migrations/01_tickets_schema.sql: consolidated final-state schema (tickets.inc/ crq raw-jsonb-first, geo_clusters + geo_locations gazetteers, geom trigger, reporting.fn_tickets_for_map) - import_tickets.py: rustfs bucket ingest + cluster/location geocoding (LocationIQ/OpenCage, viewbox-bounded + cluster-distance guard) - run_migrations.py, shared.py (self-contained), pyproject, .env.example, README The DB stays in tracksolid_db; dashboard_api keeps serving /webhook/tickets; the Tickets map stays a FleetOps tab. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 20:13:50 +03:00
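
Note on e71c8914f1: the candidate cascade described in that commit (full precise name, then the leading two tokens, then the leading token, accepting the first hit within 25 km of the cluster centroid) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the repository's code: only the name compose_queries and the 25 km limit come from the commit message. The helper names haversine_km and first_in_range, the lookup callback, and the tuple-based coordinates are hypothetical, and the viewbox constraint is left out.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Callable, List, Optional, Tuple

LatLon = Tuple[float, float]
MAX_KM = 25.0  # distance guard quoted in the commit message


def compose_queries(location_name: str) -> List[str]:
    """Ordered candidates: full name, leading two tokens, leading token."""
    tokens = location_name.split()
    raw = [" ".join(tokens), " ".join(tokens[:2]), " ".join(tokens[:1])]
    ordered: List[str] = []
    for candidate in raw:
        # Skip empties and duplicates (short names collapse to fewer candidates).
        if candidate and candidate not in ordered:
            ordered.append(candidate)
    return ordered


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in km (hypothetical helper)."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def first_in_range(
    location_name: str,
    centroid: LatLon,
    lookup: Callable[[str], Optional[LatLon]],
    max_km: float = MAX_KM,
) -> Optional[Tuple[str, LatLon]]:
    """Return the first candidate whose hit lies within max_km of the centroid.

    `lookup` stands in for the real geocoder call (assumption). Out-of-range
    hits are rejected and the cascade falls through to the next candidate;
    None means the caller keeps the cluster-centroid fallback.
    """
    for query in compose_queries(location_name):
        hit = lookup(query)
        if hit is not None and haversine_km(hit, centroid) <= max_km:
            return query, hit
    return None
```

Passing the geocoder in as a callback keeps the ordering and distance logic testable without network access; the real geocode_locations may be structured differently.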

9 commits
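
Note on a4b90a33d8, dca2c94c75 and 509338c076: the file selection those commits describe (process every changes/ file newer than the stored source_max_key watermark in ascending timestamp order, tolerate keys whose timestamp matches the pattern but is not a real date, and let --reseed ignore the watermark) might look roughly like the sketch below. The key pattern and the YYYYMMDDTHHMMSS timestamp format are assumptions, since the log only shows `<ts>.csv`; the names CHANGE_KEY_RE, ts_from_key and pending_keys are hypothetical stand-ins for the loader's own helpers.

```python
import re
from datetime import datetime
from typing import Iterable, List, Optional

# Assumed key shape; the real _CHANGE_KEY_RE is not shown in the log.
CHANGE_KEY_RE = re.compile(r"^automations/inc/changes/(\d{8}T\d{6})\.csv$")


def ts_from_key(key: str) -> Optional[datetime]:
    """Parse the timestamp from a changes/ key, or None if it is unusable."""
    match = CHANGE_KEY_RE.match(key)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
    except ValueError:
        # Digits matched the regex but are not a valid date (e.g. month 13),
        # so the key is skipped instead of crashing the sort.
        return None


def pending_keys(
    keys: Iterable[str],
    watermark: Optional[str],
    reseed: bool = False,
) -> List[str]:
    """Keys still to ingest, oldest first.

    `watermark` is the last fully committed key (source_max_key). With
    reseed=True, or with no usable watermark, every valid file is returned.
    """
    dated = []
    for key in keys:
        ts = ts_from_key(key)
        if ts is not None:
            dated.append((ts, key))
    dated.sort()

    mark = None if reseed or watermark is None else ts_from_key(watermark)
    return [key for ts, key in dated if mark is None or ts > mark]
```

In the flow the commits describe, each returned key would then be upserted and the watermark advanced in one transaction before the file is archived to processed/, so a failure mid-run resumes from the last committed file.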