fleetanalytics_mcp/scripts/app_roles_fleet_platform.sql
kiania e1472adc3a infra(db-roles): dedicated non-superuser roles for the six apps on postgres
Six service connections run as the postgres SUPERUSER across two databases on the
shared 100-connection server — the root of the "too many connections" peaks and a
standing least-privilege risk. Superuser sessions ignore per-role CONNECTION LIMIT
and can consume the superuser-reserved slots.

Drafts (apply as postgres; nothing applied here):
- scripts/app_roles_tracksolid_db.sql — webhook_app, ingest_app, worker_app,
  dashboard_app. Capability groups (ts_app_read / ts_app_write), per-app NOSUPERUSER
  login roles with hard CONNECTION LIMIT + bounded GUCs (statement_timeout,
  idle_session_timeout, idle_in_transaction, lock_timeout).
- scripts/app_roles_fleet_platform.sql — gateway_app, cron_app (the apps on the
  separate fleet_platform DB), fp_app_rw group over its schemas.
- scripts/MIGRATE_APPS_OFF_SUPERUSER.md — runbook: discovery (what each app actually
  writes / whether it runs DDL), connection-budget table (sum ≈ 81 < 100), the
  object-ownership step for migration-running apps (reassign app schemas to the
  existing tracksolid_owner — scoped, never REASSIGN OWNED globally), one-at-a-time
  cutover, and instant rollback (DATABASE_URL only).

Grants are best-effort by app function and explicitly call out where to verify before
cutover; all objects are postgres-owned, so row DML works but DDL needs the ownership
step. See the runbook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 23:51:52 +03:00

71 lines
4.3 KiB
SQL

-- app_roles_fleet_platform.sql — dedicated NON-SUPERUSER login roles for the apps
-- that connect to the fleet_platform database as the `postgres` SUPERUSER.
-- ─────────────────────────────────────────────────────────────────────────────
-- Sibling of app_roles_tracksolid_db.sql, for the OTHER database on the same server.
-- gateway + cron (the fleet_platform Coolify app) connect here as postgres. Same
-- rationale: least privilege + a hard per-role CONNECTION LIMIT so they can't
-- exhaust the server-wide 100-connection ceiling.
--
-- Schemas in fleet_platform: auth, domain, events, geo, ops, serve, slo, state
-- (all owned by postgres). gateway (the API) and cron (scheduled jobs) almost
-- certainly READ+WRITE app state across these, so they get DML; widen/narrow per
-- the discovery step in MIGRATE_APPS_OFF_SUPERUSER.md. As with the sibling file,
-- this does NOT change object ownership, so it does not grant DDL on existing
-- (postgres-owned) objects — see step 3 of the runbook if these apps run migrations.
--
-- Run as the postgres SUPERUSER, on the fleet_platform database:
-- docker exec -i <timescale_db> psql -U postgres -d fleet_platform -v ON_ERROR_STOP=1 \
-- -v gateway_pw="$(cat ~/.gateway_app.pw)" \
-- -v cron_pw="$(cat ~/.cron_app.pw)" \
-- < scripts/app_roles_fleet_platform.sql
\set ON_ERROR_STOP on
-- ── 1. Capability group (read + write across the app schemas) ───────────────────
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='fp_app_rw') THEN CREATE ROLE fp_app_rw NOLOGIN; END IF;
END $$;
DO $grants$
DECLARE s text;
BEGIN
FOREACH s IN ARRAY ARRAY['auth','domain','events','geo','ops','serve','slo','state'] LOOP
EXECUTE format('GRANT USAGE ON SCHEMA %I TO fp_app_rw', s);
EXECUTE format('GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA %I TO fp_app_rw', s);
EXECUTE format('GRANT USAGE, SELECT, UPDATE ON ALL SEQUENCES IN SCHEMA %I TO fp_app_rw', s);
EXECUTE format('GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA %I TO fp_app_rw', s);
EXECUTE format('ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA %I GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO fp_app_rw', s);
EXECUTE format('ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA %I GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO fp_app_rw', s);
EXECUTE format('ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA %I GRANT EXECUTE ON FUNCTIONS TO fp_app_rw', s);
END LOOP;
END $grants$;
-- ── 2. Per-app LOGIN roles ──────────────────────────────────────────────────────
-- gateway — the request-facing API (latency-sensitive: short statement_timeout).
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='gateway_app') THEN
CREATE ROLE gateway_app LOGIN INHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;
END IF; END $$;
ALTER ROLE gateway_app WITH LOGIN PASSWORD :'gateway_pw' CONNECTION LIMIT 15;
GRANT CONNECT ON DATABASE fleet_platform TO gateway_app;
GRANT fp_app_rw TO gateway_app;
ALTER ROLE gateway_app SET statement_timeout = '15s';
ALTER ROLE gateway_app SET idle_in_transaction_session_timeout = '30s';
ALTER ROLE gateway_app SET idle_session_timeout = '5min';
ALTER ROLE gateway_app SET lock_timeout = '3s';
-- cron — scheduled/background jobs (longer queries tolerated).
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname='cron_app') THEN
CREATE ROLE cron_app LOGIN INHERIT NOSUPERUSER NOCREATEDB NOCREATEROLE;
END IF; END $$;
ALTER ROLE cron_app WITH LOGIN PASSWORD :'cron_pw' CONNECTION LIMIT 5;
GRANT CONNECT ON DATABASE fleet_platform TO cron_app;
GRANT fp_app_rw TO cron_app;
ALTER ROLE cron_app SET statement_timeout = '120s';
ALTER ROLE cron_app SET idle_in_transaction_session_timeout = '120s';
ALTER ROLE cron_app SET idle_session_timeout = '10min';
ALTER ROLE cron_app SET lock_timeout = '5s';
-- ── 3. Verify ───────────────────────────────────────────────────────────────────
-- \du+