The tg_ticket_geom trigger resolved feed coords -> cluster centroid -> none, never consulting tickets.geo_locations, so every 20-min delta ingest re-upserted changed rows and downgraded previously-resolved 'location' geoms back to the cluster centroid. Live effect: only 51 of 114k INC (and 0 of 42k CRQ) rows kept the precise geocode the LocationIQ budget paid for. - migration 18: trigger now resolves feed -> geo_locations (precise) -> cluster -> none, mirroring resolve_ticket_geoms() precedence; ends with one resolve pass to repair the backlog. Dry-run against the live DB (rolled back) repaired 7,481 rows: INC location 51 -> 5,339, CRQ 0 -> 2,193. - pipeline.ingest(): re-resolve after every applied run that ingested files, so geoms self-heal even before migration 18 lands. - run_ingest.sh: chain an incremental --geocode-clusters pass (0 API calls when no new clusters) so new clusters map without a manual command (FT-BUG-02). - Dockerfile/.dockerignore: pinned installs from uv.lock, non-root user (FT-SEC-02). - 20260618_bug.txt removed (stale review of a since-rewritten file). Numbered 18 to coexist with 17_drop_unused_geo_indexes.sql (parallel 260702 change). Audit + plan + work log in docs/260702_*. Local only; not applied to prod. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
55 lines
3 KiB
PL/PgSQL
55 lines
3 KiB
PL/PgSQL
-- 18_trigger_location_geom.sql — fleettickets · precise geoms survive re-upserts.
|
|
-- (migration 17 is 17_drop_unused_geo_indexes.sql — a parallel 260702 change.)
|
|
-- ─────────────────────────────────────────────────────────────────────────────
|
|
-- FT-BUG-01 (260702 audit): tg_ticket_geom resolved feed coords → cluster
|
|
-- centroid → none, NEVER consulting tickets.geo_locations. So the precise
|
|
-- location geocodes (808 keys, paid LocationIQ budget) were only applied by the
|
|
-- manual --geocode-* commands' resolve pass — and the very next 20-minute delta
|
|
-- ingest re-upserted changed rows and the trigger downgraded them back to the
|
|
-- cluster centroid. Live effect: 51 of 114k INC rows (and 0 CRQ) carried
|
|
-- geo_source='location'.
|
|
--
|
|
-- Fix: the trigger now resolves feed → LOCATION (geo_locations by
|
|
-- norm_cluster(location_name)) → cluster → none, mirroring
|
|
-- resolve_ticket_geoms()'s precedence exactly. Then one resolve pass repairs the
|
|
-- whole backlog immediately.
|
|
-- Idempotent — CREATE OR REPLACE + a re-runnable resolve.
|
|
-- ─────────────────────────────────────────────────────────────────────────────
|
|
|
|
CREATE OR REPLACE FUNCTION tickets.tg_ticket_geom()
|
|
RETURNS trigger LANGUAGE plpgsql AS $fn$
|
|
DECLARE
|
|
v_lat double precision := NULLIF(NEW.raw->>'latitude','')::double precision;
|
|
v_lng double precision := NULLIF(NEW.raw->>'longitude','')::double precision;
|
|
g geometry(Point, 4326);
|
|
BEGIN
|
|
IF TG_OP = 'UPDATE' AND NEW.raw IS NOT DISTINCT FROM OLD.raw THEN
|
|
RETURN NEW; -- geom/geo_source-only update — keep caller's value
|
|
END IF;
|
|
IF v_lat IS NOT NULL AND v_lng IS NOT NULL
|
|
AND v_lat BETWEEN -90 AND 90 AND v_lng BETWEEN -180 AND 180
|
|
AND NOT (v_lat = 0 AND v_lng = 0) THEN
|
|
NEW.geom := ST_SetSRID(ST_MakePoint(v_lng, v_lat), 4326);
|
|
NEW.geo_source := 'feed';
|
|
RETURN NEW;
|
|
END IF;
|
|
-- [mig-18] precise location geocode BEFORE the coarse cluster centroid
|
|
SELECT gl.geom INTO g FROM tickets.geo_locations gl
|
|
WHERE gl.query_key = tickets.norm_cluster(NEW.raw->>'location_name')
|
|
AND gl.geom IS NOT NULL LIMIT 1;
|
|
IF g IS NOT NULL THEN
|
|
NEW.geom := g; NEW.geo_source := 'location';
|
|
RETURN NEW;
|
|
END IF;
|
|
SELECT gc.geom INTO g FROM tickets.geo_clusters gc
|
|
WHERE gc.cluster_key = tickets.norm_cluster(NEW.raw->>'cluster')
|
|
AND gc.geom IS NOT NULL LIMIT 1;
|
|
IF g IS NOT NULL THEN NEW.geom := g; NEW.geo_source := 'cluster';
|
|
ELSE NEW.geom := NULL; NEW.geo_source := 'none'; END IF;
|
|
RETURN NEW;
|
|
END $fn$;
|
|
|
|
-- Repair the backlog now: re-resolve every non-feed row against the gazetteer
|
|
-- (resolve_ticket_geoms already prefers geo_locations over geo_clusters and only
|
|
-- touches rows whose geom/geo_source would actually change).
|
|
SELECT tickets.resolve_ticket_geoms();
|