fleettickets/docs/phase-1-ingestion.md
david kiania f2408f113e docs: add docs/ — phase-1/phase-2 PRDs + implementation record
- docs/phase-1-ingestion.md  — Phase 1 PRD (INC hourly CSV ingestion; deployed)
- docs/phase-2-dashboard.md  — Phase 2 PRD (inc_dashboard read-API for FleetOps map)
- docs/implementation.md     — as-built record (pipeline, migrations 01-08, deploy, DQ)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:05:18 +03:00

98 lines
5.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# PRD (Phase 1) — INC hourly CSV ingestion → tracksolid_db → FleetOps Tickets map
> Status: **complete and deployed** (migrations 0108, boto3 loader, geocoding,
> Coolify hourly `15 7-19 * * *` EAT). This document is the record of the Phase-1
> plan; see `README.md` and `docs/implementation.md` for the as-built state.
## Scope: INC only
**This workflow is strictly for INC** (incident / customer-fault tickets). It
ingests **only** `automations/inc/<EAT-timestamp>.csv`. CRQ (new-installation)
exports at `automations/crq/` are **out of scope** and are not processed here; the
field transforms below are likewise INC-only.
## Context
The client (Rahamafresh / Fireside) runs an n8n workflow that exports field-ops
tickets to our S3-compatible bucket **every hour**:
- `automations/inc/<EAT-timestamp>.csv`**incidents / customer faults** *(in scope)*
- `automations/crq/<EAT-timestamp>.csv` — new-installation requests *(out of scope)*
(See `n8n-hourly-s3-full-data-exports.md`. Sample: `2026-06-15T17-00-00.csv`.)
`fleettickets` owns the **downstream**: the `tickets` schema in the shared
`tracksolid_db` (raw-jsonb-first `tickets.inc`, geocoding gazetteers, and
`reporting.fn_tickets_for_map`, which `dashboard_api` serves to the FleetOps
"Tickets" tab). `tickets.crq` keeps existing but is not fed by this pipeline.
**The problem:** the loader was written for the *old* export model — JSON
`{metadata, records}` envelopes at a stable `automations/inc/latest.json`. That
model is gone; the new exports are **flat CSV, timestamped per hour, with no
`latest` pointer, no envelope, and no deltas** — every hourly file is a **full
current-state snapshot**.
**Two driving objectives this pipeline feeds:**
1. **SLA tracking** — contract requires tickets closed within **48h of
`created_at_service`**; closed carry source `sla_status` + `mttr`, open need a
derived state (`now created_at_service` ≥48h breached / ≥36h at-risk).
2. **Vehicle routing (most important)** — accurately geocoded open tickets so
FleetNow can route nearest vehicles; subsequent: team closure attribution.
## Data contract (verified against live snapshots)
- 32 columns; header + double-quoted values. INC sample = 31,434 rows.
- `ticket_id` is the **primary key**; the same ticket recurs across snapshots as it
moves `open → closed`. Verified: 31,434 distinct ids per file, **0 in-file dups**,
same id set every hour (0 added/dropped) → **upsert is the dedup mechanism, no
TRUNCATE**. Consecutive files are often byte-identical → skip-if-unchanged.
- `is_alarm=true` (~10,132 rows, all `is_actionable=false`) → **dropped**.
- `latitude`/`longitude` are **empty** in the feed → geocoding required.
- A garbage **sentinel row** (`ticket_id = "EXPORT STOPPED DUE TO EXCESSIVE SIZE…"`)
is commonly the first data line → filtered by `ticket_id` prefix.
- Timestamps (filenames + data) are **EAT (Africa/Nairobi, UTC+3)**.
- `bucket` is meaningful (`closed`/`pending`), distinct from `source_s3_bucket`.
## Approach
Keep the **raw-jsonb-first** model and everything downstream; only the loader's
input path changes: JSON-`latest` → **newest timestamped CSV**, plus move-on-success.
- **Newest file** per `automations/inc/` (parse `YYYY-MM-DDTHH-mm-ss.csv`), via
**boto3** (path-style; no aws-CLI dependency).
- **Skip-if-unchanged**: compare newest S3 **ETag** to the last processed ETag
(`tickets.import_meta.metadata.source_etag`); equal → skip DB write.
- **Cleaning at ingest**: drop `is_alarm=true` + sentinel; drop `week_start`,
`week_end`, `source_s3_bucket`, `source_s3_key`, `source_snapshot_id`,
`department`, `source_type`; normalize `region`→lowercase, `raw_status`→UPPERCASE;
keep `service_type`* and `bucket`. (*`service_type` later dropped as constant.)
- **Upsert** on `ticket_id` (`ON CONFLICT DO UPDATE`); never delete → closure
history accumulates. On success **move** the file(s) to
`automations/inc/processed/`.
- Record snapshot freshness in `tickets.import_meta`.
- Geocoding unchanged: `--geocode-clusters` (coarse) + `--geocode-locations`
(precise, actionable INC; keyed LocationIQ; 25 km wrong-city guard).
## Orchestration
Deployed on **Coolify** (own app, `Dockerfile`, keep-alive worker). Ingest runs as a
**Scheduled Task**: `python import_tickets.py --from-bucket --apply`, cron
`15 7-19 * * *` in **EAT**. Env: `DATABASE_URL`, `RUSTFS_*`, `GEOCODER_*`.
## Data-quality findings (carried into Phase 2)
- Source `sla_status` ≠ a plain 48h rule, and `mttr` is not wall-clock — pin the
contract's SLA definition before trusting cross-field SLA math.
- `created_at_service` is null on ~30% of rows (incl. most open) → needs a fallback
clock (`first_seen_at`).
- Split timestamp semantics: lifecycle = `created_at_service`→`closed_at`; export
bookkeeping = `created_at`/`updated_at`/`first_seen_at`/`last_seen_at`.
- `assigned_team` missing ~34% (`owner` better).
- Content lag ~2 days (underlying `…wm_task.xlsx` source date).
## Outcome (as built)
Live in `tracksolid_db`: `tickets.inc` (raw + typed generated columns), geocoded to
~99.99%, alarm/sentinel removed, hourly refresh with ETag skip + archive. See
`docs/implementation.md`.