Six service connections run as the postgres SUPERUSER across two databases on the shared 100-connection server. That is the root of the "too many connections" peaks and a standing least-privilege risk. Superuser sessions ignore per-role CONNECTION LIMIT and can consume the superuser-reserved slots.

Drafts (apply as postgres; nothing applied here):

- scripts/app_roles_tracksolid_db.sql: webhook_app, ingest_app, worker_app, dashboard_app. Capability groups (ts_app_read / ts_app_write), per-app NOSUPERUSER login roles with a hard CONNECTION LIMIT and bounded GUCs (statement_timeout, idle_session_timeout, idle_in_transaction_session_timeout, lock_timeout).
- scripts/app_roles_fleet_platform.sql: gateway_app, cron_app (the apps on the separate fleet_platform DB), fp_app_rw group over its schemas.
- scripts/MIGRATE_APPS_OFF_SUPERUSER.md: runbook covering discovery (what each app actually writes and whether it runs DDL), the connection-budget table (sum ≈ 81 < 100), the object-ownership step for migration-running apps (reassign app schemas to the existing tracksolid_owner; scoped, never REASSIGN OWNED globally), one-at-a-time cutover, and instant rollback (DATABASE_URL only).

Grants are best-effort by app function and explicitly call out where to verify before cutover; all objects are postgres-owned, so row DML works but DDL needs the ownership step. See the runbook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
119 lines
7 KiB
SQL
-- app_roles_tracksolid_db.sql: dedicated NON-SUPERUSER login roles for the apps
-- that currently connect to tracksolid_db as the `postgres` SUPERUSER.
-- ─────────────────────────────────────────────────────────────────────────────
-- WHY: six stack services connect to this Postgres server as the postgres superuser
-- (webhook_receiver, ingest_worker, worker, the prod dashboard_api backend on
-- tracksolid_db; gateway + cron on fleet_platform, see the sibling file). That is
-- both a least-privilege problem AND the root of the "too many connections" error:
-- superuser sessions ignore per-role connection caps and can exhaust the 100-slot
-- ceiling (incl. the superuser-reserved slots). Dedicated roles let us pin a hard
-- CONNECTION LIMIT and timeouts per app.
--
-- WHAT THIS DOES (run as the postgres SUPERUSER, on tracksolid_db):
--   * creates capability GROUP roles (NOLOGIN) for read vs. read-write,
--   * creates one LOGIN role per app, NOSUPERUSER, with a CONNECTION LIMIT and
--     bounded GUCs, as a member of the group it needs,
--   * grants the groups SELECT / DML on the operational schemas.
--
-- WHAT IT DOES *NOT* DO: change object ownership. All objects here are owned by
-- `postgres`, so a non-superuser role can write ROWS but cannot ALTER/DROP existing
-- tables (i.e. run migrations). If an app runs DDL at deploy, see step 3 in
-- MIGRATE_APPS_OFF_SUPERUSER.md (reassign the app schemas to `tracksolid_owner` and
-- add the app role to it). Roles here INHERIT, so membership grants apply directly.
--
-- Idempotent. Passwords are supplied as psql vars (never stored in the repo):
--   docker exec -i <timescale_db> psql -U postgres -d tracksolid_db -v ON_ERROR_STOP=1 \
--     -v webhook_pw="$(cat ~/.webhook_app.pw)" \
--     -v ingest_pw="$(cat ~/.ingest_app.pw)" \
--     -v worker_pw="$(cat ~/.worker_app.pw)" \
--     -v dash_pw="$(cat ~/.dashboard_app.pw)" \
--     < scripts/app_roles_tracksolid_db.sql

\set ON_ERROR_STOP on

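-- Pre-flight sketch (an addition, not part of the original draft): the header
-- states that all objects here are owned by `postgres`. This read-only query is
-- one way to confirm that before relying on it. The schema list is an assumption
-- taken from the grants in section 1.
SELECT n.nspname                   AS schema_name,
       pg_get_userbyid(c.relowner) AS owner,
       c.relkind,
       count(*)                    AS objects
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname IN ('tracksolid', 'reporting')
  AND c.relkind IN ('r', 'p', 'v', 'm', 'S')  -- tables, partitioned tables, views, matviews, sequences
GROUP BY n.nspname, pg_get_userbyid(c.relowner), c.relkind
ORDER BY 1, 2, 3;
-- An owner other than postgres (or tracksolid_owner after the runbook's ownership
-- step) is worth investigating before cutover.
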
-- ── 1. Capability groups (NOLOGIN; apps inherit privileges via membership) ──────
DO $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='ts_app_read') THEN CREATE ROLE ts_app_read NOLOGIN; END IF;
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='ts_app_write') THEN CREATE ROLE ts_app_write NOLOGIN; END IF;
END $$;

-- Read surface: telemetry + curated reporting layer.
GRANT USAGE ON SCHEMA tracksolid, reporting TO ts_app_read;
GRANT SELECT ON ALL TABLES IN SCHEMA tracksolid, reporting TO ts_app_read;
GRANT SELECT ON reporting.v_trips TO ts_app_read;  -- matview; ALL TABLES should already cover it, kept explicit
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA reporting TO ts_app_read;
-- Default privileges cover only objects later created by postgres; repeat FOR ROLE
-- tracksolid_owner if ownership moves (runbook step 3).
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA tracksolid, reporting GRANT SELECT ON TABLES TO ts_app_read;

-- Write surface for ingestion: row DML on telemetry (NOT DDL; see header).
GRANT ts_app_read TO ts_app_write;  -- write implies read
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA tracksolid TO ts_app_write;
GRANT USAGE, SELECT, UPDATE ON ALL SEQUENCES IN SCHEMA tracksolid TO ts_app_write;
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA tracksolid
  GRANT INSERT, UPDATE, DELETE ON TABLES TO ts_app_write;
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA tracksolid
  GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO ts_app_write;

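-- Coverage sketch (an addition, not part of the original draft): lists any relation
-- in the two schemas that the groups still cannot use after the grants above.
-- Zero rows is the expected result; anything listed needs a grant before cutover.
SELECT c.relnamespace::regnamespace AS schema_name,
       c.relname,
       c.relkind,
       has_table_privilege('ts_app_read',  c.oid, 'SELECT') AS read_can_select,
       has_table_privilege('ts_app_write', c.oid, 'INSERT') AS write_can_insert
FROM pg_class c
WHERE c.relnamespace IN ('tracksolid'::regnamespace, 'reporting'::regnamespace)
  AND c.relkind IN ('r', 'p', 'v', 'm', 'f')  -- tables, partitioned, views, matviews, foreign tables
  AND (NOT has_table_privilege('ts_app_read', c.oid, 'SELECT')
       -- INSERT is only expected on plain/partitioned tables in the tracksolid schema
       OR (c.relnamespace = 'tracksolid'::regnamespace
           AND c.relkind IN ('r', 'p')
           AND NOT has_table_privilege('ts_app_write', c.oid, 'INSERT')))
ORDER BY 1, 2;
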
-- ── 2. Per-app LOGIN roles ──────────────────────────────────────────────────────
-- CONNECTION LIMIT is the hard budget cap (sum across all roles must stay < 100).
-- GUCs are belt-and-braces and tunable per app: they are per-role session defaults
-- that a session can still override with SET. idle_session_timeout needs PostgreSQL 14+.

-- webhook_receiver: ingests Tracksolid webhooks (writes telemetry; may run migrations).
DO $$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='webhook_app') THEN
    CREATE ROLE webhook_app LOGIN INHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;
  END IF; END $$;
ALTER ROLE webhook_app WITH LOGIN PASSWORD :'webhook_pw' CONNECTION LIMIT 10;
GRANT CONNECT ON DATABASE tracksolid_db TO webhook_app;
GRANT ts_app_write TO webhook_app;
ALTER ROLE webhook_app SET statement_timeout = '120s';  -- bulk inserts
ALTER ROLE webhook_app SET idle_in_transaction_session_timeout = '120s';
ALTER ROLE webhook_app SET idle_session_timeout = '10min';
ALTER ROLE webhook_app SET lock_timeout = '5s';

-- ingest_worker: background ingestion/normalisation (writes telemetry).
DO $$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='ingest_app') THEN
    CREATE ROLE ingest_app LOGIN INHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;
  END IF; END $$;
ALTER ROLE ingest_app WITH LOGIN PASSWORD :'ingest_pw' CONNECTION LIMIT 10;
GRANT CONNECT ON DATABASE tracksolid_db TO ingest_app;
GRANT ts_app_write TO ingest_app;
ALTER ROLE ingest_app SET statement_timeout = '120s';
ALTER ROLE ingest_app SET idle_in_transaction_session_timeout = '120s';
ALTER ROLE ingest_app SET idle_session_timeout = '10min';
ALTER ROLE ingest_app SET lock_timeout = '5s';
-- If ingestion REFRESHes reporting.v_trips, add it to the existing refresher role:
--   GRANT reporting_refresher TO ingest_app;  -- (uncomment after confirming)

-- worker: fleet_platform worker that also reads tracksolid_db. Assumed READ-ONLY
-- here; widen to ts_app_write only if it actually writes telemetry.
DO $$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='worker_app') THEN
    CREATE ROLE worker_app LOGIN INHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;
  END IF; END $$;
ALTER ROLE worker_app WITH LOGIN PASSWORD :'worker_pw' CONNECTION LIMIT 5;
GRANT CONNECT ON DATABASE tracksolid_db TO worker_app;
GRANT ts_app_read TO worker_app;
ALTER ROLE worker_app SET statement_timeout = '60s';
ALTER ROLE worker_app SET idle_in_transaction_session_timeout = '60s';
ALTER ROLE worker_app SET idle_session_timeout = '10min';
ALTER ROLE worker_app SET lock_timeout = '5s';

-- dashboard_api (PROD backend, currently postgres). If it only reads, prefer the
-- existing dashboard_ro. This role is for a backend that ALSO writes app state;
-- start read-only and widen per discovery.
DO $$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='dashboard_app') THEN
    CREATE ROLE dashboard_app LOGIN INHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;
  END IF; END $$;
ALTER ROLE dashboard_app WITH LOGIN PASSWORD :'dash_pw' CONNECTION LIMIT 8;
GRANT CONNECT ON DATABASE tracksolid_db TO dashboard_app;
GRANT ts_app_read TO dashboard_app;
ALTER ROLE dashboard_app SET statement_timeout = '30s';
ALTER ROLE dashboard_app SET idle_in_transaction_session_timeout = '60s';
ALTER ROLE dashboard_app SET idle_session_timeout = '5min';
ALTER ROLE dashboard_app SET lock_timeout = '5s';

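-- Settings sketch (an addition, not part of the original draft): shows the
-- CONNECTION LIMIT and per-role GUC defaults stored for the four roles above.
-- Role-level settings are applied at login, so SET ROLE from a postgres session
-- will not pick them up; test the timeouts by connecting as the app role itself.
SELECT rolname, rolsuper, rolconnlimit, rolconfig
FROM pg_roles
WHERE rolname IN ('webhook_app', 'ingest_app', 'worker_app', 'dashboard_app')
ORDER BY rolname;
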
-- ── 3. Verify ───────────────────────────────────────────────────────────────────
-- \du+   -- inspect roles, CONNECTION LIMIT, and memberships
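-- Verification sketch (an addition, not part of the original draft). Both queries
-- are read-only and assume they run as postgres. The role list covers only this
-- file; the fleet_platform roles from the sibling file would need adding to check
-- the full budget.

-- Group membership per app role.
SELECT m.rolname AS app_role,
       g.rolname AS member_of
FROM pg_auth_members am
JOIN pg_roles m ON m.oid = am.member
JOIN pg_roles g ON g.oid = am.roleid
WHERE m.rolname IN ('webhook_app', 'ingest_app', 'worker_app', 'dashboard_app')
ORDER BY 1, 2;

-- Budgeted connections per role against the server ceiling, plus live usage.
SELECT r.rolname,
       r.rolconnlimit                                         AS budget,
       count(a.pid)                                           AS in_use,
       current_setting('max_connections')::int                AS max_connections,
       current_setting('superuser_reserved_connections')::int AS reserved
FROM pg_roles r
LEFT JOIN pg_stat_activity a ON a.usename = r.rolname
WHERE r.rolname IN ('webhook_app', 'ingest_app', 'worker_app', 'dashboard_app')
GROUP BY r.rolname, r.rolconnlimit
ORDER BY r.rolname;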