pgbouncer + pgAdmin4 sidecar deployment
Date: 2026-05-07
Branch: quality-program-2026-04-12
Status: Plan approved; implementation pending
Context
Driver: pgAdmin4 running on the maintainer's laptop has been exhausting
tracksolid_db's max_connections. Each Query Tool tab in pgAdmin holds its
own long-lived backend connection; combined with the existing peak of ~50–60
connections from the ingest pipeline, the budget tips over and cascades —
pgcli (and anything else trying to connect) starts failing.
Goal: Add pgbouncer in front of timescale_db to enforce a connection
budget via transaction-mode pooling, and deploy pgAdmin4 as a Coolify-managed
sidecar that connects through pgbouncer over the Docker network. Net effect:
pgAdmin sprawl is multiplexed onto a small fixed pool of backends, admin
tooling moves on-VM (lower latency, persistent workspace, smaller external
attack surface), and host port 5433 becomes optional/closeable in a follow-up.
Frozen scope (unchanged this round):
- DWH bronze pipeline (dwh/*.sql, tracksolid_dwh@31.97.44.246:5888)
- n8n DWH workflows (n8n-workflows/dwh_extract*, dwh_load_bronze*)
- Grafana provisioning (grafana/provisioning/datasources/...)
- Python ingest containers (ingest_movement_rev.py, ingest_events_rev.py, webhook_receiver_rev.py): they keep talking to timescale_db:5432 directly. Cutover, if desired, is a separate plan.
- db_backup sidecar: pg_dump is incompatible with transaction-mode pooling and stays on timescale_db:5432.
Phase 1 — pgbouncer sidecar, no client cutover
Add a new service to docker-compose.yaml. Internal Docker network only;
no host port binding.
    pgbouncer:
      image: edoburu/pgbouncer
      restart: always
      depends_on:
        timescale_db:
          condition: service_healthy
      env_file: .env
      environment:
        - DB_HOST=timescale_db
        - DB_PORT=5432
        - DB_USER=${POSTGRES_USER}
        - DB_PASSWORD=${POSTGRES_PASSWORD}
        - DB_NAME=${POSTGRES_DB}
        - POOL_MODE=transaction
        - AUTH_TYPE=scram-sha-256
        - MAX_CLIENT_CONN=200
        - DEFAULT_POOL_SIZE=15
        - MIN_POOL_SIZE=2
        - RESERVE_POOL_SIZE=5
        - SERVER_RESET_QUERY=DISCARD ALL
        - SERVER_IDLE_TIMEOUT=600
        - ADMIN_USERS=${POSTGRES_USER}
        - LISTEN_PORT=6432
        - AUTH_USER=pgbouncer
        - AUTH_QUERY=SELECT uname, phash FROM public.user_lookup($$1)
      healthcheck:
        test: ["CMD-SHELL", "pg_isready -h 127.0.0.1 -p 6432 -U ${POSTGRES_USER}"]
        interval: 30s
        timeout: 5s
        retries: 3
Why these values:
- POOL_MODE=transaction: returns the backend to the pool at every transaction boundary. Cuts pgAdmin's per-tab cost from 1 backend to ~0 while the tab is idle.
- DEFAULT_POOL_SIZE=15: total backend slots per (user, db) pair. Sits comfortably under Postgres max_connections (default 100), leaving room for ingest's existing ~50–60.
- MAX_CLIENT_CONN=200: pgAdmin can open up to 200 client connections; once the pool is busy, extra tabs queue rather than fail.
- RESERVE_POOL_SIZE=5: emergency slack when default_pool_size saturates.
- SERVER_RESET_QUERY=DISCARD ALL: wipes session state so leaked SETs from one client don't bleed into the next. Note that pgbouncer skips the reset query in transaction mode unless server_reset_query_always is also enabled, so with the settings above this is a safeguard only if that option is turned on.
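The sizing argument above is simple arithmetic, and a short sketch makes it easy to re-check when a value changes. The names `PoolBudget` and `headroom` are illustrative only, and the model deliberately ignores extras such as superuser_reserved_connections or any additional (user, db) pools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolBudget:
    """Worst-case backend usage for one pgbouncer (user, db) pool."""

    default_pool_size: int = 15
    reserve_pool_size: int = 5

    def max_backends(self, pools: int = 1) -> int:
        # Each (user, db) pair may hold default_pool_size backends,
        # plus the reserve once clients have been waiting.
        return pools * (self.default_pool_size + self.reserve_pool_size)


def headroom(max_connections: int, direct_peak: int,
             budget: PoolBudget, pools: int = 1) -> int:
    """Connections left over after direct clients and pgbouncer at worst case."""
    return max_connections - direct_peak - budget.max_backends(pools)


if __name__ == "__main__":
    # Figures from this runbook: max_connections=100, ingest peak of ~60.
    print(headroom(max_connections=100, direct_peak=60, budget=PoolBudget()))  # 20
```

With the runbook's numbers one pool leaves 20 connections of slack; a second (user, db) pool of the same size would consume all of it, which is worth keeping in mind before adding more pooled users.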
Auth: SCRAM passthrough via auth_query
Avoids hand-maintaining userlist.txt. pgbouncer authenticates as a
dedicated pgbouncer Postgres role and looks up SCRAM hashes for the
requesting user via a SECURITY DEFINER function.
New migration 10_pgbouncer_auth.sql (08 and 09 are taken by
08_analytics_config.sql and 09_trips_enrichment.sql):
    -- Role created with placeholder password; run_migrations.py:sync_role_passwords
    -- replaces it with PGBOUNCER_AUTH_PASSWORD on every container startup.
    -- Same convention used today for grafana_ro.
    DO $$
    BEGIN
      IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'pgbouncer') THEN
        CREATE ROLE pgbouncer LOGIN PASSWORD 'SET_PASSWORD_IN_ENV';
      END IF;
    END
    $$;

    CREATE OR REPLACE FUNCTION public.user_lookup(in_user text,
      OUT uname text, OUT phash text) RETURNS record AS $$
    BEGIN
      SELECT usename, passwd FROM pg_catalog.pg_shadow
        WHERE usename = in_user INTO uname, phash;
      RETURN;
    END;
    $$ LANGUAGE plpgsql SECURITY DEFINER;

    REVOKE ALL ON FUNCTION public.user_lookup(text) FROM public;
    GRANT EXECUTE ON FUNCTION public.user_lookup(text) TO pgbouncer;
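Because AUTH_TYPE is scram-sha-256, the passthrough is expected to work only for roles whose stored phash is a SCRAM verifier rather than a legacy md5 hash. The sketch below is one way to sanity-check a value returned by public.user_lookup; `is_scram_verifier` is a hypothetical helper, and it assumes the standard PostgreSQL verifier layout `SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>` with base64-encoded fields.

```python
import base64
import re

# Layout of a PostgreSQL SCRAM-SHA-256 verifier as stored in pg_shadow.passwd.
_SCRAM_RE = re.compile(
    r"^SCRAM-SHA-256\$(\d+):([A-Za-z0-9+/=]+)\$([A-Za-z0-9+/=]+):([A-Za-z0-9+/=]+)$"
)


def is_scram_verifier(phash):
    """Return True if phash looks like a well-formed SCRAM-SHA-256 verifier."""
    if not phash:
        return False
    match = _SCRAM_RE.match(phash)
    if match is None:
        return False
    try:
        stored_key = base64.b64decode(match.group(3), validate=True)
        server_key = base64.b64decode(match.group(4), validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return False
    # Both keys are SHA-256 outputs, so each must decode to 32 bytes.
    return len(stored_key) == 32 and len(server_key) == 32
```

A role that fails this check (for example one whose hash starts with `md5`) would likely need its password re-set under password_encryption = scram-sha-256 before it can log in through pgbouncer.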
Two changes to run_migrations.py:
- Append "10_pgbouncer_auth.sql" to MIGRATIONS.
- Extend sync_role_passwords() roles dict with "pgbouncer": os.getenv("PGBOUNCER_AUTH_PASSWORD").
The migration is applied by the next ingest container restart and recorded
in tracksolid.schema_migrations. sync_role_passwords then ALTER ROLEs
the password from the env var so the placeholder is never live.
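A minimal sketch of the two run_migrations.py changes, written as standalone code since the real file is not reproduced here. The list contents before 08, the helper name `roles_to_sync`, and the env var name `GRAFANA_RO_PASSWORD` are assumptions for illustration; only `MIGRATIONS`, the `pgbouncer` key, and `PGBOUNCER_AUTH_PASSWORD` come from this runbook.

```python
import os

# Hypothetical shape of the list in run_migrations.py; earlier entries elided.
MIGRATIONS = [
    "08_analytics_config.sql",
    "09_trips_enrichment.sql",
    "10_pgbouncer_auth.sql",  # new in Phase 1
]


def roles_to_sync(env=None):
    """Map role name -> password for every role whose env var is set.

    Roles with a missing or empty variable are skipped, so an unset
    PGBOUNCER_AUTH_PASSWORD never overwrites a password with None.
    """
    env = os.environ if env is None else env
    roles = {
        "grafana_ro": env.get("GRAFANA_RO_PASSWORD"),  # assumed variable name
        "pgbouncer": env.get("PGBOUNCER_AUTH_PASSWORD"),
    }
    return {role: password for role, password in roles.items() if password}
```

Skipping unset variables is a design choice of this sketch: it means a missing PGBOUNCER_AUTH_PASSWORD leaves the placeholder in place, so startup checks may want to treat that case as an error instead.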
New env vars in .env
- PGBOUNCER_AUTH_PASSWORD: password for the new pgbouncer Postgres role
- (existing vars reused: POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB)
Phase 1 verification
- Apply migration via ingest container restart; confirm in tracksolid.schema_migrations that 10_pgbouncer_auth.sql is recorded.
- docker compose up -d pgbouncer.
- From inside any compose service:
    psql -h pgbouncer -p 6432 -U postgres -d tracksolid_db -c 'SELECT 1'
- From the pgbouncer container's admin console:
    psql -h 127.0.0.1 -p 6432 -U postgres -d pgbouncer -c 'SHOW POOLS;'
  Confirm pool mode = transaction, server connections within default_pool_size.
- SHOW STATS; and SHOW CLIENTS; should both respond.
- Confirm no client has cut over: tracksolid.ingestion_log continues accumulating; Grafana panels keep refreshing.
Phase 2 — pgAdmin4 sidecar pointed at pgbouncer
Coolify UI maps an HTTPS subdomain (e.g. pgadmin.stage.rahamafresh.com) to
internal port 80, mirroring the Grafana pattern at docker-compose.yaml:78–80.
    pgadmin:
      image: dpage/pgadmin4:8.14
      restart: always
      depends_on:
        pgbouncer:
          condition: service_healthy
      env_file: .env
      environment:
        - PGADMIN_DEFAULT_EMAIL=${PGADMIN_DEFAULT_EMAIL}
        - PGADMIN_DEFAULT_PASSWORD=${PGADMIN_DEFAULT_PASSWORD}
        - PGADMIN_CONFIG_SERVER_MODE=True
        - PGADMIN_CONFIG_MASTER_PASSWORD_REQUIRED=False
        - PGADMIN_DISABLE_POSTFIX=True
      volumes:
        - pgadmin-data:/var/lib/pgadmin
        - ./pgadmin/servers.json:/pgadmin4/servers.json:ro
      # COOLIFY DOMAIN LOGIC:
      # Set the actual URL in the Coolify UI; service exposes port 80 internally.
Add pgadmin-data to the volumes: block at the bottom of the compose file.
Pre-registered server (pgadmin/servers.json)
    {
      "Servers": {
        "1": {
          "Name": "tracksolid_db (via pgbouncer)",
          "Group": "Servers",
          "Host": "pgbouncer",
          "Port": 6432,
          "MaintenanceDB": "tracksolid_db",
          "Username": "postgres",
          "SSLMode": "disable",
          "ConnectionParameters": {
            "sslmode": "disable",
            "connect_timeout": 10
          }
        }
      }
    }
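Since a malformed servers.json tends to surface only as a missing server in the pgAdmin tree, it may help to lint the file before mounting it. The sketch below is a hedged example: `validate_servers` is a hypothetical helper, and the required-key list is an assumption based on pgAdmin's import/export documentation rather than something this runbook specifies.

```python
import json

# Assumed required attributes for a pgAdmin server definition.
REQUIRED_KEYS = ("Name", "Group", "Port", "Username", "SSLMode", "MaintenanceDB")
# At least one of these must identify where to connect.
HOST_KEYS = ("Host", "HostAddr", "Service")


def validate_servers(text):
    """Return a list of human-readable problems found in servers.json text."""
    problems = []
    servers = json.loads(text).get("Servers", {})
    if not servers:
        problems.append("no entries under 'Servers'")
    for key, server in servers.items():
        for field in REQUIRED_KEYS:
            if field not in server:
                problems.append(f"server {key}: missing {field}")
        if not any(name in server for name in HOST_KEYS):
            problems.append(f"server {key}: needs one of {', '.join(HOST_KEYS)}")
        if "Port" in server and not isinstance(server["Port"], int):
            problems.append(f"server {key}: Port must be an integer")
    return problems
```

Running it as `validate_servers(open("pgadmin/servers.json").read())` should return an empty list for the definition above.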
New env vars in .env
- PGADMIN_DEFAULT_EMAIL
- PGADMIN_DEFAULT_PASSWORD
Phase 2 verification
- In the Coolify UI, point a subdomain at the pgadmin service, port 80.
- Open the URL, log in with PGADMIN_DEFAULT_EMAIL / PGADMIN_DEFAULT_PASSWORD.
- The pre-registered "tracksolid_db (via pgbouncer)" server appears in the left tree. Connect; provide the postgres password when prompted (pgAdmin stores it in its own keyring after first use).
- Open a Query Tool, run SELECT now(), current_user;.
- From the pgbouncer container admin console: SHOW POOLS; cl_active should reflect the open pgAdmin tab(s); sv_active/sv_idle should sum to ≤ default_pool_size (15).
- Stress test: open ~30 Query Tool tabs, run SELECT pg_sleep(0.1); in each. Confirm tracksolid_db total connection count stays bounded:
    SELECT count(*) FROM pg_stat_activity;
  Should be default_pool_size + reserve_pool_size + (other clients), not the number of pgAdmin tabs.
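The SHOW POOLS checks above can also be scripted. This is a sketch only: `check_pools` is a hypothetical helper, and it assumes the rows have already been fetched as dictionaries keyed by the SHOW POOLS column names (database, user, sv_active, sv_idle, pool_mode), for example through a dict-style cursor on the admin console.

```python
def check_pools(rows, limit=15):
    """Return problems found in SHOW POOLS rows.

    limit is the expected ceiling on sv_active + sv_idle per pool;
    15 matches default_pool_size in this runbook.
    """
    problems = []
    for row in rows:
        if row["database"] == "pgbouncer":
            continue  # admin console pseudo-database, not a real pool
        name = f'{row["user"]}@{row["database"]}'
        if row["pool_mode"] != "transaction":
            problems.append(f"{name}: pool_mode is {row['pool_mode']}")
        servers = row["sv_active"] + row["sv_idle"]
        if servers > limit:
            problems.append(f"{name}: {servers} server connections exceed {limit}")
    return problems
```

During the 30-tab stress test the reserve pool may legitimately push a pool past 15, so passing `limit=20` (default_pool_size + reserve_pool_size) is the more forgiving setting for that step.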
Files to modify / create
| Path | Change |
|---|---|
| 260507_pgbouncer_deployment.md | THIS FILE: runbook for the rollout |
| docker-compose.yaml | Add pgbouncer and pgadmin services; add pgadmin-data volume |
| 10_pgbouncer_auth.sql | NEW: creates pgbouncer role + public.user_lookup SECURITY DEFINER function |
| pgadmin/servers.json | NEW: pre-registers pgbouncer:6432 as the default server |
| .env | Add PGBOUNCER_AUTH_PASSWORD, PGADMIN_DEFAULT_EMAIL, PGADMIN_DEFAULT_PASSWORD (do not commit values) |
| docs/CONNECTIONS.md | Add a "pgbouncer + pgAdmin" section: pool mode, exposure, who uses it, how to connect for ad-hoc admin |
| CLAUDE.md §3 / §4 | Note that admin tooling now goes through pgbouncer:6432; ingest/grafana/backup remain direct; reference this runbook |
Files NOT to modify (frozen scope)
- grafana/provisioning/datasources/tracksolid_postgres.yaml
- n8n-workflows/dwh_extract*.json, n8n-workflows/dwh_load_bronze*.json
- dwh/*.sql
- ingest_movement_rev.py, ingest_events_rev.py, webhook_receiver_rev.py, ts_shared_rev.py
- backup/: pg_dump keeps using timescale_db:5432 directly
Reused conventions and utilities
- run_migrations.py already applies new NN_*.sql files in order against tracksolid_db and tracks them in tracksolid.schema_migrations. Phase 1 adds 10_pgbouncer_auth.sql to this flow; no new tooling needed.
- env_file: .env + depends_on: <svc> condition: service_healthy mirrors the existing pattern at docker-compose.yaml:28–31, 39–42, 50–53, 67–70, 87–90.
- Coolify domain-via-UI mirrors the Grafana comment at docker-compose.yaml:78–80 and the webhook_receiver comment at docker-compose.yaml:54–55.
- Container-name resolution rule from CLAUDE.md §3 still applies for any docker exec against the new services:
    docker ps --filter name=pgbouncer --format "{{.Names}}" | head -1
    docker ps --filter name=pgadmin --format "{{.Names}}" | head -1
Out-of-scope follow-ups (separate plans)
- Cut over Python ingest to pgbouncer. Change DATABASE_URL in .env from timescale_db:5432 to pgbouncer:6432. Requires verifying psycopg2 pool + SAVEPOINTs against transaction-mode pgbouncer (low risk per exploration: no LISTEN/NOTIFY, no advisory locks across statements, no prepared statements in the codebase).
- Close host port 5433 on timescale_db once pgAdmin web UI is the established admin path. Removes the public-IP Postgres exposure entirely.
- Rotate dwh_owner / grafana_ro plaintext passwords still in dwh/260423_dwh_ddl_v1.sql (pre-existing item from CLAUDE.md §10).
Rollback
If pgbouncer or pgAdmin misbehaves:
- Stop the new services without touching the rest of the stack:
    docker compose stop pgbouncer pgadmin
    docker compose rm -f pgbouncer pgadmin
  Ingest, Grafana, webhook, backup are unaffected; they were never cut over.
- Revert the SQL migration if needed:
    DROP FUNCTION public.user_lookup(text);
    DROP ROLE pgbouncer;
    DELETE FROM tracksolid.schema_migrations WHERE filename = '10_pgbouncer_auth.sql';
- Revert compose changes by checking out the prior docker-compose.yaml.
End-to-end verification checklist
- 10_pgbouncer_auth.sql applied: visible in tracksolid.schema_migrations
- pgbouncer service healthy: docker compose ps shows healthy
- psql -h pgbouncer -p 6432 -U postgres -d tracksolid_db -c 'SELECT 1' from inside the network
- SHOW POOLS; in pgbouncer admin shows transaction mode
- pgadmin service healthy: Coolify domain reachable over HTTPS
- Login + query through pgAdmin succeeds
- SELECT count(*) FROM pg_stat_activity; stays bounded under 30-tab stress test
- Existing pipelines unaffected: tracksolid.ingestion_log continues growing at current rate; Grafana dashboards still render
- pgcli no longer hits "too many connections" when used alongside pgAdmin