Was hourly at :15 (15 7-19 * * *); now */20 6-20 * * * for fresher ticket
data through the working day. Updates the documented schedule in the Coolify
Scheduled Task command, run_ingest.sh, Dockerfile, README, and implementation
notes (the live schedule is set in the Coolify UI).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Verified each finding against the code (+ profiled the 31k-row CSV sample);
implemented only the genuinely valid fixes:
- import_tickets.py: fold _record_meta into the upsert transaction so rows +
snapshot meta commit atomically (BUG 2); guard _ts_from_key against
regex-matching-but-invalid dates so the sort can't crash (BUG 11);
extract_place now splits glued NW prefixes (~1.7k rows, e.g. NWKIAMBU→KIAMBU)
and only drops a trailing '-<seg>' when it's a unit/instruction code, keeping
real-word tails like '-MALL' (BUG 14). Scoped glued-split to NW only —
CO/NE/SE begin real words (COAST/NEW/SEASONS) per the data.
- Dockerfile + pyproject.toml: install from pyproject (single source of truth)
instead of mirroring deps; add build-system + py-modules so `pip install .`
works for the flat-module layout (BUG 9).
- migrations/03_inc_columns.sql: document the eat_ts IMMUTABLE/tzdata footgun
and the manual-recompute path (BUG 6).
- .gitignore: narrow *.json → *.local.json so real fixtures can be versioned;
ignore build/ and *.egg-info/ (BUG 10).
Reclassified/skipped as invalid or by-design: BUG 1, 3, 4, 5, 7, 8, 12, 13.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Replace the aws-CLI subprocess calls with boto3 (list_objects_v2 paginator,
get_object, copy_object+delete_object) using path-style addressing + RUSTFS_*
env. Removes the external aws-CLI dependency so it runs in a slim container.
- Add boto3 to pyproject dependencies.
- Add Dockerfile (python:3.12-slim, deps, TZ=Africa/Nairobi, keep-alive CMD) and
.dockerignore for Coolify; document Coolify Scheduled Task setup in README.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>