fleetanalytics_mcp/scripts/analytics_ro_role.sql
kiania c02c127798 fix(connections): shrink MCP DB-connection footprint on a shared 100-conn DB
The DB is at max_connections=100 and several stack services hold persistent
pools (several as the postgres superuser, idle for hours), so peaks hit
"too many connections". The MCP is a minor contributor but easy to bound:

- Dockerfile: uvicorn --workers 2 → 1. The MCP's connection budget is
  workers × MCP_POOL_MAX, so this caps it at 8 backends instead of 16. Scale
  via MCP_POOL_MAX, not workers, so the budget stays obvious. (Pairs with the
  minconn=0 lazy pool already on this branch: 0 connections held when idle.)
- analytics_ro_role.sql: add idle_session_timeout=5min so the DB reaps the
  MCP's idle POOLED connections (idle_in_transaction never reaps them — they're
  idle outside a txn) and returns the slots. Safe because the server now
  discards + transparently retries a reaped connection instead of erroring.

Note: the dominant fix is stack-wide (get the superuser app pools onto bounded,
timed roles; consider PgBouncer; or raise max_connections) — out of this repo's
scope but documented in the review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 23:38:22 +03:00

70 lines
4.4 KiB
SQL

-- analytics_ro_role.sql — dedicated read-only LOGIN role for the analytics MCP server.
--
-- Sibling of dashboard_ro_role.sql, but for the decision & analytics team's MCP
-- server (analytics_mcp_rev.py) rather than the dashboard bridge. A separate role
-- keeps the two access paths independently revocable and lets us apply tighter,
-- human-ad-hoc-query guards (statement_timeout, idle-txn timeout) without touching
-- the dashboard bridge's credential.
--
-- Run as the postgres SUPERUSER (CREATE ROLE), NOT via run_migrations.py (which
-- connects as the app role and may lack CREATEROLE). Apply with
-- scripts/bootstrap_analytics_ro.sh, which supplies the password as the psql
-- variable :ro_pw from a host-only 0600 file — so no secret lives in this repo.
--
-- It grants exactly the read surface the MCP server needs:
-- * SELECT on reporting.* and tracksolid.* (tables + views)
-- * SELECT on the reporting.v_trips MATERIALIZED VIEW — matviews are NOT
-- covered by GRANT ... ON ALL TABLES, so it must be named explicitly
-- * EXECUTE on the reporting.fn_* functions (so analysts can SELECT reporting.fn_...)
-- * DEFAULT PRIVILEGES so future objects created by the migration role are
-- auto-readable (no re-grant when we add views)
-- Read-only: no INSERT/UPDATE/DELETE and not the matview owner, so analytics_ro
-- can never write or REFRESH. Idempotent -> safe to re-apply (also rotates pw).
\set ON_ERROR_STOP on
DO $role$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'analytics_ro') THEN
CREATE ROLE analytics_ro LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE;
END IF;
END $role$;
ALTER ROLE analytics_ro WITH LOGIN PASSWORD :'ro_pw';
GRANT CONNECT ON DATABASE tracksolid_db TO analytics_ro;
GRANT USAGE ON SCHEMA reporting, tracksolid, tickets, fuel TO analytics_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO analytics_ro; -- tables + views
GRANT SELECT ON ALL TABLES IN SCHEMA tracksolid TO analytics_ro; -- tables + views
GRANT SELECT ON ALL TABLES IN SCHEMA tickets TO analytics_ro; -- INC/CRQ tickets: tables + views
GRANT SELECT ON ALL TABLES IN SCHEMA fuel TO analytics_ro; -- fuel: tables + views
GRANT SELECT ON reporting.v_trips TO analytics_ro; -- MATERIALIZED VIEW (not in ALL TABLES)
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA reporting TO analytics_ro;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA tickets TO analytics_ro;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA fuel TO analytics_ro;
-- "dynamic": future objects are auto-granted. reporting/tracksolid are created by the
-- migration role (tracksolid_owner); tickets/fuel are owned by postgres, so their
-- default-privilege grants must be keyed to postgres or new objects won't be readable.
-- NOTE: matviews are still never covered — a new matview needs its own explicit GRANT.
ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting GRANT SELECT ON TABLES TO analytics_ro;
ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA tracksolid GRANT SELECT ON TABLES TO analytics_ro;
ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting GRANT EXECUTE ON FUNCTIONS TO analytics_ro;
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA tickets GRANT SELECT ON TABLES TO analytics_ro;
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA fuel GRANT SELECT ON TABLES TO analytics_ro;
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA tickets GRANT EXECUTE ON FUNCTIONS TO analytics_ro;
ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA fuel GRANT EXECUTE ON FUNCTIONS TO analytics_ro;
-- Extra hardening over dashboard_ro: this role serves ad-hoc HUMAN queries via the
-- MCP server, so pin read-only at the role level and cap runaway work. These are
-- belt-and-braces alongside the read-only txn the server itself uses.
ALTER ROLE analytics_ro SET default_transaction_read_only = on;
ALTER ROLE analytics_ro SET statement_timeout = '30s';
ALTER ROLE analytics_ro SET idle_in_transaction_session_timeout = '60s';
-- Cap idle POOLED connections too (these sit idle but NOT in a transaction, so the
-- idle_in_transaction timeout never reaps them). On a shared 100-connection DB this
-- returns slots the MCP isn't using. Safe with the server's dead-connection handling:
-- a reaped pooled connection is discarded + transparently retried, not surfaced as an
-- error. (Requires PostgreSQL 14+.)
ALTER ROLE analytics_ro SET idle_session_timeout = '5min';