Commit graph

22 commits

Author SHA1 Message Date
david kiania
7d3bba8d78 chore(schedule): INC ingest cron -> every 20 min, 06:00-20:40 EAT
Was hourly at :15 (15 7-19 * * *); now */20 6-20 * * * for fresher ticket
data through the working day. Updates the documented schedule in the Coolify
Scheduled Task command, run_ingest.sh, Dockerfile, README, and implementation
notes (the live schedule is set in the Coolify UI).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 18:23:17 +03:00
david kiania
509338c076 feat(import_tickets): migrate INC ingest to isptickets bucket + --reseed cutover
Provider moved the INC CDC feed to a new bucket (tickets -> isptickets, new
per-bucket creds; same s3.rahamafresh.com endpoint, identical 32-col schema).
This is config + a one-time reseed, not a rewrite — the loader already drains
automations/inc/changes/ oldest->newest with a source_max_key watermark.

- default _BUCKET -> isptickets (TICKETS_BUCKET still overrides)
- add --reseed: ignore the stored watermark and drain every changes/ file once
  (the old-bucket watermark may post-date the new bucket's first file). Crash-safe
  via the existing per-file watermark-advance + archive loop.
- refresh stale "newest-file / full-snapshot-per-hour" docstring/comments to the
  CDC reality; .env.example + README updated (new bucket + reseed runbook).

Verified live dry-run: 41/41 files drained (watermark None), alarm/sentinel
filter active, exit 0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 18:20:15 +03:00
david kiania
a4b90a33d8 fix(inc): ingest the incremental changes/ stream (baseline + deltas)
The S3 source switched from full hourly snapshots at
automations/inc/<ts>.csv to an incremental CDC stream at
automations/inc/changes/<ts>.csv (first file = full baseline, each later
file = only the rows that changed, keyed by ticket_id; no deletions).

The loader still pointed at the old root path and only ingested the single
newest file, so after the switch it found nothing (no new tickets ingested)
and, even with the path fixed, would silently drop intermediate deltas.

Changes:
- point ingestion at automations/inc/changes/ (_CHANGE_KEY_RE)
- ingest EVERY not-yet-processed file in ascending timestamp order
  (baseline first, then each delta), upserting each
- replace the single-ETag skip with a per-file timestamp watermark
  (import_meta.metadata->>'source_max_key'); rows + watermark commit in one
  txn per file, then archive to processed/ — so a mid-run failure leaves a
  consistent, resumable state
- docs: rename n8n-hourly-s3-full-data-exports.md -> n8n-s3-ticket-exports.md
  and rewrite it for the incremental stream; fix the reference in
  docs/phase-1-ingestion.md

Verified live against prod: re-seeded baseline + 5 deltas (26,529 rows),
files archived to processed/, watermark advanced, re-run is a no-op.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 14:37:17 +03:00
david kiania
e71c8914f1 feat(geocode): two-pass estate fallback for building-level location_names
Building-level names (e.g. 'KAHAWA WENDANI ALVO HOUSE') aren't in OSM, so the
precise forward-geocode 404s and tickets stay on the bare cluster centroid
(observed 0/133 placed). geocode_locations now tries an ordered set of
candidates per location (compose_queries): full precise -> estate (leading 2
tokens) -> leading token, each constrained by the existing cluster viewbox +
25km distance check, accepting the FIRST in-range hit. This places tickets in
the right neighbourhood (e.g. 'KAHAWA WENDANI', 'BAMBURI') instead of the broad
cluster centroid. Wrong-area matches for ambiguous coarse tokens are rejected by
the distance check and fall through; genuinely unmatchable tickets keep the
honest cluster-centroid fallback (no pure-cluster candidate, which would only
mislabel the centroid as geo_source='location'). Verified the cascade finds
hits against live LocationIQ on real samples.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 18:51:58 +03:00
david kiania
cdb6186dca feat(reporting): add closure-by-engineer analytics to fn_inc_dashboard (migration 12)
CREATE OR REPLACE reporting.fn_inc_dashboard (supersedes 09) adding
metrics.by_owner — a closures-by-engineer leaderboard over the windowed closed
set: array of { owner, closed, breached, avg_mttr_min } sorted desc. owner is
CASE-NORMALIZED (initcap(lower(...))) to merge the offline closing system's
mixed-case duplicates (observed 102 -> 58 distinct engineers; e.g.
'Elikana Mabonga' + 'ELIKANA MABONGA' -> one). owner on the open/closed GeoJSON
features is normalized the same way so the leaderboard and per-ticket drill-down
agree. Everything else unchanged; the dashboard_api route is a passthrough so no
API change is needed. Validated in a rolled-back tx against prod.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 17:53:32 +03:00
david kiania
dca2c94c75 fix: address valid findings from 20260618 bug report
Verified each finding against the code (+ profiled the 31k-row CSV sample);
implemented only the genuinely valid fixes:

- import_tickets.py: fold _record_meta into the upsert transaction so rows +
  snapshot meta commit atomically (BUG 2); guard _ts_from_key against
  regex-matching-but-invalid dates so the sort can't crash (BUG 11);
  extract_place now splits glued NW prefixes (~1.7k rows, e.g. NWKIAMBU→KIAMBU)
  and only drops a trailing '-<seg>' when it's a unit/instruction code, keeping
  real-word tails like '-MALL' (BUG 14). Scoped glued-split to NW only —
  CO/NE/SE begin real words (COAST/NEW/SEASONS) per the data.
- Dockerfile + pyproject.toml: install from pyproject (single source of truth)
  instead of mirroring deps; add build-system + py-modules so `pip install .`
  works for the flat-module layout (BUG 9).
- migrations/03_inc_columns.sql: document the eat_ts IMMUTABLE/tzdata footgun
  and the manual-recompute path (BUG 6).
- .gitignore: narrow *.json → *.local.json so real fixtures can be versioned;
  ignore build/ and *.egg-info/ (BUG 10).

Reclassified/skipped as invalid or by-design: BUG 1, 3, 4, 5, 7, 8, 12, 13.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 13:41:38 +03:00
david kiania
764dee986f feat: history capture — closure_events + daily backlog snapshot (migration 10)
- tickets.closure_events: append-only observed closures (PK ticket_id, closed_at;
  observed_at = first sighting; survives row churn).
- tickets.inc_daily_snapshot: one row per EAT day — open backlog (+ SLA split, by
  cluster/status) and created/closed flow; upserted each run.
- tickets.capture_history(): appends new closures + upserts today's snapshot.
- import_tickets calls it after each --apply run (ingest or skip); add
  --capture-history CLI flag for standalone runs.
Verified: backfilled 21,282 closures; today's snapshot recorded (open_total 30).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:19:23 +03:00
david kiania
da6da9d26f docs: dashboard_api endpoint contract for fn_inc_dashboard (handoff)
GET /webhook/inc-dashboard wrapper spec: query params (cluster/status/window/from/to)
-> SQL passthrough, full response schema, field semantics (open=live vs closed=window,
mttr minutes, derived vs source SLA, map/metrics geocoding gap), examples, caching/auth.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:13:09 +03:00
david kiania
752ac9e418 feat: reporting.fn_inc_dashboard — INC operations dashboard read-API (migration 09)
One parameterized function returns {window, open GeoJSON, closed GeoJSON, metrics,
freshness} for the FleetOps live INC map:
- open  = all is_actionable tickets (live), filtered by cluster/status, with
  sla_state/hours_open (from tickets.inc_open_sla)
- closed= closed_at within the selected window (EAT calendar today/week/month or
  custom [from,to)), filtered by cluster/status
- metrics= open/closed counts, SLA split (open derived, closed source), by status/
  cluster, closure rate + daily series, avg mttr (minutes)
Filters combine with AND; grants to dashboard_ro/grafana_ro. Verified live
(today/month/cluster/status/custom; last-7d closed=913 matches raw).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:10:18 +03:00
david kiania
f2408f113e docs: add docs/ — phase-1/phase-2 PRDs + implementation record
- docs/phase-1-ingestion.md  — Phase 1 PRD (INC hourly CSV ingestion; deployed)
- docs/phase-2-dashboard.md  — Phase 2 PRD (inc_dashboard read-API for FleetOps map)
- docs/implementation.md     — as-built record (pipeline, migrations 01-08, deploy, DQ)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:05:18 +03:00
david kiania
e17553ccbf docs: comprehensive README — column reference, query runbook, DQ/SLA notes, status
Add tickets.inc column reference (typed generated columns + geom/geog), a querying
runbook (map fn, inc_open_sla, closures/day, nearest-vehicle KNN), data-quality &
SLA caveats (source sla_status only valid when closed, ~30% null created_at_service,
mttr semantics, content lag, history gap), and a status/roadmap section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 00:10:27 +03:00
david kiania
8e119e2328 feat: tickets.inc_open_sla view + lowercase legacy region
- Add tickets.inc_open_sla: open tickets with derived SLA (hours_open, sla_state
  vs 48h; clock = created_at_service or first_seen_at fallback) + team/cluster/geog
  for dispatch. (One-time legacy region->lowercase backfill applied to live data.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 00:03:55 +03:00
david kiania
e54e2b7c56 feat: drop constant service_type column (migration 07)
service_type is always 'inc' (cardinality 1) — zero info, redundant in an INC-only
table. Drop the generated column; stays in raw for audit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:54:43 +03:00
david kiania
b86c0b2d13 feat: mttr -> minutes; drop constant alarm/auto flags (migration 06)
mttr generated column is now integer minutes (source raw.mttr is decimal hours),
analytics-friendly. Drop is_alarm/is_auto_created/is_auto_closed generated columns
— all constant `false` in tickets.inc since alarms are filtered at ingest (still
present in raw for audit; loader still filters on raw->>'is_alarm'). is_actionable
kept.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:51:28 +03:00
david kiania
565cd592a0 feat: add geography column + GiST index for routing (migration 05)
geom is geometry(Point,4326) (planar degrees); add geog = geom::geography (STORED
generated) + GiST index so ST_Distance/ST_DWithin/KNN work in real metres for
nearest-vehicle and radius queries.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:33:45 +03:00
david kiania
b323e8a1ac feat: populate inc latitude/longitude from geocoded geom (migration 04)
Feed coords are always empty; redefine the latitude/longitude generated columns to
COALESCE(feed, ST_Y/ST_X(geom)) so they carry the resolved/geocoded position for
every geocoded ticket (precision indicated by geo_source). STORED, recomputes when
geom changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:26:39 +03:00
david kiania
073db9b5b8 feat: unpack tickets.inc.raw into typed generated columns (migration 03)
Add STORED generated columns derived from raw (text/numeric/bool/double + EAT
timestamptz via an IMMUTABLE tickets.eat_ts() wrapper). Computed for all existing
rows and auto-populated on every future ingest — raw stays the source of truth,
no loader change. Indexes on status/cluster/team/closed_at/is_actionable for the
SLA/team/closure queries.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:08:31 +03:00
david kiania
4be5159d21 docs: Coolify scheduled tasks run in EAT — cron 15 7-19 (no UTC shift)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 22:43:01 +03:00
david kiania
68f2b99cd3 feat: S3 via boto3 + Dockerfile for Coolify deploy
- Replace the aws-CLI subprocess calls with boto3 (list_objects_v2 paginator,
  get_object, copy_object+delete_object) using path-style addressing + RUSTFS_*
  env. Removes the external aws-CLI dependency so it runs in a slim container.
- Add boto3 to pyproject dependencies.
- Add Dockerfile (python:3.12-slim, deps, TZ=Africa/Nairobi, keep-alive CMD) and
  .dockerignore for Coolify; document Coolify Scheduled Task setup in README.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 20:08:05 +03:00
david kiania
4532643247 chore: add hourly INC ingest cron wrapper + schedule docs
run_ingest.sh loads .env and runs `import_tickets.py --from-bucket --apply`.
Documented crontab: `15 7-19 * * *` in Africa/Nairobi (ingest at :15, 07:00–19:00).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 19:40:50 +03:00
david kiania
df054c92be feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive)
Rework import_tickets.py from the retired JSON `latest.json` model to the new
hourly full-snapshot CSV export. Strictly INC (CRQ out of scope).

- Ingest the newest automations/inc/<EAT-timestamp>.csv; skip-if-unchanged by
  comparing S3 ETag to tickets.import_meta.metadata.source_etag.
- Upsert on ticket_id (PK; no dups, never delete -> closure history accrues).
  No truncate. On success, move processed files to automations/inc/processed/.
- Clean at ingest: drop is_alarm=true + the "EXPORT STOPPED..." sentinel; drop
  week_*, source_s3_*/source_snapshot_id, department/source_type; lowercase
  region, uppercase raw_status; keep service_type + bucket.
- Force path-style S3 addressing; --inc-csv for local dev; --from-bucket for cron.
- Add migrations/02 (import_meta + freshness); refresh README/.env.example/docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 19:33:16 +03:00
david kiania
4631cc6382 feat: fleettickets — INC/CRQ ticket ingestion, geocoding + read-schema
Standalone module extracted from the tracksolid repo (was migrations 21-23 +
tools/import_tickets.py). Owns the `tickets` schema in the shared tracksolid_db.

- migrations/01_tickets_schema.sql: consolidated final-state schema (tickets.inc/
  crq raw-jsonb-first, geo_clusters + geo_locations gazetteers, geom trigger,
  reporting.fn_tickets_for_map)
- import_tickets.py: rustfs bucket ingest + cluster/location geocoding
  (LocationIQ/OpenCage, viewbox-bounded + cluster-distance guard)
- run_migrations.py, shared.py (self-contained), pyproject, .env.example, README

The DB stays in tracksolid_db; dashboard_api keeps serving /webhook/tickets; the
Tickets map stays a FleetOps tab.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:13:50 +03:00