tracksolid_timescale_grafan.../docker-compose.yaml
david kiania 76f6915e61 feat(stack): consolidate 7→4 services (merge pollers, drop pgbouncer/grafana)
Collapse the backend from 7 Coolify services to 4 app services + the DB.

- Merge ingest_movement + ingest_events into a single ingest_worker:
  split each poller's main() into reusable startup_catchup()/register_jobs()
  and drive both from one schedule loop in new ingest_worker_rev.py
  (standalone entrypoints retained for local debug).
- docker-compose.yaml: replace the two poller services with ingest_worker;
  remove the pgbouncer service (dormant; transaction-mode pooling is unsafe
  for the advisory-lock'd v_trips refresher) and the grafana service +
  grafana-data volume (redundant with the FleetOps SPA).
- Add reporting.v_ingest_health (migration 19) + dashboard_api GET
  /health/ingest as the pipeline-freshness surface that replaces Grafana's
  health panels.

webhook_receiver stays isolated so a poller fault can't drop inbound pushes.
timescale_db and db_backup are unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:41:05 +03:00

117 lines
4.6 KiB
YAML

services:
timescale_db:
image: timescale/timescaledb-ha:pg16-ts2.15
restart: always
environment:
- POSTGRES_DB=${POSTGRES_DB}
- POSTGRES_USER=${POSTGRES_USER}
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
# HA image's PGDATA is /home/postgres/pgdata/data, not /var/lib/postgresql/data.
# Mount the named volume there so data survives container rebuilds.
- PGDATA=/home/postgres/pgdata/data
ports:
- "5433:5432"
volumes:
- timescale-data:/home/postgres/pgdata
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
interval: 10s
timeout: 5s
retries: 5
ingest_worker:
# Merged movement + events pollers (was ingest_movement + ingest_events).
# Both pipelines run in one process via ingest_worker_rev.py — same image,
# same shared connection pool, one `schedule` loop. See ingest_worker_rev.py.
build:
context: .
dockerfile: Dockerfile
command: sh -c "python run_migrations.py && python ingest_worker_rev.py"
restart: always
depends_on:
timescale_db:
condition: service_healthy
env_file: .env
webhook_receiver:
build:
context: .
dockerfile: Dockerfile
command: sh -c "python run_migrations.py && uvicorn webhook_receiver_rev:app --host 0.0.0.0 --port 8888 --workers 2"
restart: always
depends_on:
timescale_db:
condition: service_healthy
env_file: .env
# No host port binding — Coolify's Traefik proxy routes traffic internally.
# Set the webhook domain in Coolify UI pointing to this service on port 8888.
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8888/health"]
interval: 30s
timeout: 5s
retries: 3
dashboard_api:
# Stable read-API for the Live Position + Fleet Trips map dashboards.
# Replaces the n8n webhooks (n8n was only a thin HTTP->SQL proxy).
# Calls reporting.fn_live_positions / fn_vehicle_track / fn_trips_for_map.
build:
context: .
dockerfile: Dockerfile
command: sh -c "uvicorn dashboard_api_rev:app --host 0.0.0.0 --port 8890 --workers 2"
restart: always
depends_on:
timescale_db:
condition: service_healthy
env_file: .env
environment:
# Browser origins allowed to call this API (the dashboard domains).
- DASHBOARD_CORS_ORIGINS=${DASHBOARD_CORS_ORIGINS:-https://liveposition.rahamafresh.com,https://fleetintelligence.rahamafresh.com}
# No host port binding — set a domain (e.g. fleetapi.rahamafresh.com) in the
# Coolify UI pointing to this service on port 8890. The dashboards then point
# their N8N_BASE at that domain; paths (/webhook/...) are unchanged.
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8890/health"]
interval: 30s
timeout: 5s
retries: 3
# grafana — REMOVED 2026-06-10. Fleet visualisation/KPIs are now served by the
# FleetOps SPA (own repo) via the dashboard_api read layer. Pipeline freshness
# (the one thing only Grafana surfaced) is replaced by reporting.v_ingest_health
# (migration 19) exposed on the read-API. The grafana_ro role + reporting.*
# grants are retained (harmless, reusable). Provisioning kept in ./grafana for
# reference. To restore, re-add this service block.
# pgbouncer — REMOVED 2026-06-10. It was deployed but dormant (zero clients
# pointed at :6432; every service connects directly to timescale_db:5432).
# In-process pooling (ts_shared_rev ThreadedConnectionPool) is more than
# sufficient at this scale, and transaction-mode pooling is unsafe for the
# advisory-lock'd v_trips refresher (FIX-D02). Migration 10 (pgbouncer role +
# user_lookup()) is left applied but inert. To restore, re-add this service block.
db_backup:
build:
context: ./backup
dockerfile: Dockerfile
restart: always
depends_on:
timescale_db:
condition: service_healthy
env_file: .env
environment:
# pg_dump → rustfs. Credentials from .env (RUSTFS_*).
# BACKUP_TIMES: comma-separated HH:MM list in local TZ (default Africa/Nairobi).
- TZ=${TZ:-Africa/Nairobi}
- BACKUP_TIMES=${BACKUP_TIMES:-02:30,08:30,14:30,20:30}
- BACKUP_KEEP_DAYS=${BACKUP_KEEP_DAYS:-30}
- BACKUP_RUN_ON_START=${BACKUP_RUN_ON_START:-0}
- RUSTFS_ENDPOINT=${RUSTFS_ENDPOINT}
- RUSTFS_ACCESS_KEY=${RUSTFS_ACCESS_KEY}
- RUSTFS_SECRET_KEY=${RUSTFS_SECRET_KEY}
- RUSTFS_BUCKET=${RUSTFS_BUCKET:-fleet-db}
volumes:
timescale-data:
name: timescale-data
# grafana-data removed with the grafana service (2026-06-10).