Commit graph

24 commits

Author SHA1 Message Date
david kiania
5703d70aa6 feat(api): dedicated FastAPI read-API for map dashboards (replaces n8n)
Some checks failed
Static Analysis / static (push) Has been cancelled
Tests / test (push) Has been cancelled
n8n was a thin HTTP->SQL proxy for the Live Position and Fleet Trips maps and
proved fragile (credential reloads, :latest drift, shared connection limits).
This service calls the same proven reporting.* functions directly, reusing the
existing psycopg2 pool / Docker image / Coolify deploy.

Endpoints mirror the n8n webhook paths so the only frontend change is N8N_BASE:
  GET  /webhook/live-positions  -> {summary, geojson}   (fn_live_positions)
  GET  /webhook/vehicle-track   -> GeoJSON Feature       (fn_vehicle_track)
  GET  /webhook/fleet-dashboard -> filter options
  POST /webhook/fleet-dashboard -> trips payload         (fn_trips_for_map)

Response shapes replicate the n8n "Build response JSON" nodes exactly; empty
filters/sentinels ('', null, undefined) normalize to SQL wildcards. CORS limited
to the dashboard origins. Added dashboard_api service to docker-compose (port
8890, Coolify-routed). SQL contracts validated against prod.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 04:23:37 +03:00
david kiania
e5b0e192d8 chore(repo): reorganize tree into migrations/ data/ legacy/ docs/
Group root-level files (accreted from incremental changes) by purpose
without moving any deployment entrypoint or breaking imports:

- migrations/  : numbered SQL 02-10
- data/        : source CSVs
- legacy/      : superseded pre-_rev scripts + old pipeline notes (not deployed)
- docs/{manuals,reference,reports}/ : loose manuals, references, reports
- strip stray ** / *** prefixes from 5 doc filenames
- delete empty documents.txt / push_webhook.md

Reference updates so nothing breaks:
- run_migrations.py  -> /app/migrations/<file>
- run_migrations.sh  -> $SCRIPT_DIR/migrations
- import_drivers_csv.py -> data/<csv>
- docker-compose.yaml -> runbook path comment
- CLAUDE.md -> codebase map + inline doc references

Deployed Python (3 services + ts_shared_rev + run_migrations) and the
documented ops one-shots stay at root, preserving the flat-import layout
and all documented commands. Verified: py_compile clean across all modules,
every MIGRATIONS entry resolves under migrations/, CI-referenced paths
(tests/, mypy targets, db_audit) and the grafana build context intact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 02:27:30 +03:00
David Kiania
3b79d5a62e revert(infra): remove pgAdmin4 sidecar and configs
Some checks failed
Static Analysis / static (push) Has been cancelled
Tests / test (push) Has been cancelled
Reverts the Phase 2 pgAdmin web sidecar from bc020cb. pgbouncer (Phase 1)
stays in place. On the instance the pgadmin container has been stopped
and removed and the pgadmin-data volume dropped; Coolify subdomain and
PGADMIN_DEFAULT_* env vars to be removed in the UI separately.

Files:
- docker-compose.yaml: drop pgadmin service block + pgadmin-data volume
- pgadmin/servers.json: delete (directory removed)
- 260507_pgbouncer_deployment.md: strip Phase 2, runbook is pgbouncer-only

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-08 00:34:10 +03:00
David Kiania
bc020cb1a8 feat(infra): add pgAdmin4 web sidecar pointed at pgbouncer
Some checks are pending
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
Phase 2 of the pgbouncer + pgAdmin rollout. pgAdmin4 runs as a Coolify-
managed container on the same Docker network as pgbouncer, with a
pre-registered server entry so the tracksolid_db (via pgbouncer) tree
appears immediately on first login.

Net effect: admin tooling moves on-VM (low latency, persistent workspace
in pgadmin-data volume) and connects through pgbouncer:6432 in transaction
mode, so opening many Query Tool tabs no longer exhausts max_connections.
The desktop pgAdmin can be retired once this is verified live, after
which host port 5433 can also be closed.

Requires PGADMIN_DEFAULT_EMAIL and PGADMIN_DEFAULT_PASSWORD in the
Coolify env, plus a subdomain mapping to this service on port 80.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-07 14:03:32 +03:00
David Kiania
f3ad612a1c fix(infra): drop pgbouncer image tag — pin 1.23.1 unavailable
Some checks are pending
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
The pinned tag failed to pull on Coolify deploy. Switching to the
untagged edoburu/pgbouncer (rolling latest) so the sidecar can come up.
Will revisit pinning to a known-good tag once verified live.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-07 13:48:31 +03:00
David Kiania
e811dd8f34 feat(infra): add pgbouncer sidecar to cap tracksolid_db connections
Some checks are pending
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
Phase 1 of the pgbouncer + pgAdmin rollout (runbook:
260507_pgbouncer_deployment.md). pgAdmin4 on the maintainer's laptop has
been exhausting tracksolid_db's max_connections, cascading to pgcli and
operations. Adds an internal-only pgbouncer service in transaction mode
with a small backend pool (default 15) so admin-tool sprawl can no
longer starve the ingest pipeline.

No client cutover this round - ingest, Grafana, webhook, and backup all
keep talking to timescale_db:5432 directly. SCRAM passthrough is wired
via a new pgbouncer role + public.user_lookup() function (migration 10).
The role is created with a placeholder password; sync_role_passwords()
in run_migrations.py replaces it from PGBOUNCER_AUTH_PASSWORD on every
container startup, mirroring the existing grafana_ro convention.

Requires PGBOUNCER_AUTH_PASSWORD to be set in .env before deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-07 13:21:35 +03:00
David Kiania
85cb408dea feat(backup): timestamp and schedule in Africa/Nairobi local time
Some checks failed
Static Analysis / static (push) Has been cancelled
Tests / test (push) Has been cancelled
Static Analysis / static (pull_request) Has been cancelled
Tests / test (pull_request) Has been cancelled
- Default TZ=Africa/Nairobi baked into the sidecar image; override via
  compose TZ env var if another region is ever needed.
- Rename BACKUP_TIMES_UTC → BACKUP_TIMES (legacy var still honored for
  back-compat). Times are now interpreted in the container's local TZ,
  so "02:30" means 02:30 EAT, not UTC.
- Log timestamps and dump filenames use %FT%T%z / %Y%m%d_%H%M%S_%Z
  (e.g. tracksolid_db_20260424_115729_EAT.sql.gz) so the TZ is visible
  on every artifact.
- Prune cutoff computed in local time; YYYYMMDD regex unchanged so it
  still matches legacy UTC filenames during the transition.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 11:30:20 +03:00
David Kiania
c585e67482 feat(backup): run pg_dump multiple times per day via BACKUP_TIMES_UTC
Some checks failed
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
Static Analysis / static (pull_request) Has been cancelled
Tests / test (pull_request) Has been cancelled
Replace the single BACKUP_HOUR/BACKUP_MINUTE slot with a comma-separated
list of UTC times. Scheduler walks all slots and sleeps until the soonest
future one, so four daily backups become a one-line env change:

    BACKUP_TIMES_UTC=02:30,08:30,14:30,20:30  (default)

Legacy BACKUP_HOUR/BACKUP_MINUTE still honored as a single slot for
backwards compatibility with existing .env files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 11:00:02 +03:00
David Kiania
3807d9554c fix(db): mount TimescaleDB HA volume at correct PGDATA path
The timescale/timescaledb-ha image uses /home/postgres/pgdata/data as
PGDATA, not /var/lib/postgresql/data. The previous mount pointed at an
empty directory that postgres never wrote to, so Coolify redeploys
destroyed all data with the container's overlay filesystem.

Pin PGDATA explicitly and move the named timescale-data volume to
/home/postgres/pgdata so the real data dir is persisted.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 10:59:53 +03:00
David Kiania
108c1be057 feat: nightly pg_dump sidecar uploads to rustfs fleet-db bucket
Some checks failed
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
Static Analysis / static (pull_request) Has been cancelled
Tests / test (pull_request) Has been cancelled
Adds a `db_backup` sidecar that dumps tracksolid_db every night at
02:30 UTC (configurable via BACKUP_HOUR/BACKUP_MINUTE), gzips the
output, and uploads to s3://fleet-db/daily/<dbname>_<ts>.sql.gz on
the rustfs S3-compatible instance (s3.rahamafresh.com). Prunes
objects older than BACKUP_KEEP_DAYS (default 30).

Required .env additions (Coolify UI):
  RUSTFS_ENDPOINT=https://s3.rahamafresh.com
  RUSTFS_ACCESS_KEY=...
  RUSTFS_SECRET_KEY=...
  RUSTFS_BUCKET=fleet-db

Mitigates data loss when Coolify service recreation wipes the
service-ID-scoped timescale-data volume.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 12:53:23 +03:00
David Kiania
07ef491695 fix: change DB host port 5888→5433 (5888 already allocated by legacy DB)
Some checks failed
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
Static Analysis / static (pull_request) Has been cancelled
Tests / test (pull_request) Has been cancelled
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 14:19:20 +03:00
David Kiania
160f477318 infra: expose timescale_db port 5888 for direct pgcli access
Some checks failed
Static Analysis / static (push) Waiting to run
Tests / test (push) Waiting to run
Static Analysis / static (pull_request) Has been cancelled
Tests / test (pull_request) Has been cancelled
Maps host port 5888 → container port 5432 so the DB can be reached
directly from the MacBook (requires UFW allow 5888/tcp on the server).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 14:14:32 +03:00
David Kiania
fcc745f09d Fix Grafana provisioning: bake datasource/dashboard config into custom image
Coolify only copies docker-compose.yaml and .env to its working directory —
the ./grafana/provisioning bind mount source was always empty on the server,
so Grafana started with no datasource or dashboard configured (causing the
'Failed to load home dashboard' error).

Fix: build a custom Grafana image (grafana/Dockerfile) that COPYs the
provisioning directory at image build time. Grafana substitutes
${GRAFANA_DB_RO_PASSWORD} at startup from the env var now in Coolify's store.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 22:18:44 +03:00
David Kiania
cd6b2ca81a Add Grafana NOC fleet dashboard with provisioning
Adds a fully-provisioned Grafana dashboard for NOC operators to monitor
80 vehicles in real-time: live geomap with direction arrows, speed, driver
info, and color-coded plates. Includes datasource and dashboard provider
YAMLs, dashboard JSON (schemaVersion 39 / Grafana 11.0.0), and
docker-compose updates to mount provisioning at container start.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 00:01:52 +03:00
David Kiania
2f3879aa2a Add n8n workflow templates and change webhook port to 8888
Port 8000 was already in use on the host. Updated uvicorn to listen
on 8888. Added 6 importable n8n workflow JSON files for Jimi push
data forwarding (OBD, faults, alarms, GPS, heartbeats, trips).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 18:54:42 +03:00
David Kiania
326764e1a0 Fix migration failures: switch to full TimescaleDB + use psql runner
- Change image from timescaledb-ha:pg16-ts2.15-oss to pg16-ts2.15
  (OSS edition lacks compression, retention, continuous aggregates)
- Add postgresql-client to Dockerfile for psql binary
- Rewrite run_migrations.py to use psql instead of psycopg2
  (psql runs each statement independently; psycopg2 wraps the
  entire file in one transaction so one error rolls back everything)
- Add schema verification: exits 1 if critical tables missing,
  preventing services from starting with broken schema

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 17:17:58 +03:00
David Kiania
3bbf3b777d Run migrations inline at each service startup instead of init service
Coolify doesn't support service_completed_successfully dependency.
Each Python service now runs 'python run_migrations.py' before its
main process. SQL is idempotent so concurrent runs are safe.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 17:11:12 +03:00
David Kiania
4a31de30b1 Add db_migrate init service to auto-run SQL schema on deploy
- New run_migrations.py: executes 02_*.sql and 03_*.sql in order
- New db_migrate service: runs once before all other services start
- All services now depend on db_migrate (service_completed_successfully)
- Tolerates re-deploy: catches errors from already-existing objects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 17:02:09 +03:00
David Kiania
b59616c7aa Remove webhook_receiver host port binding (Coolify proxy handles routing)
Port 8000 was already allocated on the host. On Coolify, Traefik routes
external traffic to the container internally — no host port needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 16:49:52 +03:00
David Kiania
2fbd286d29 Fix timescale_db: remove empty ports key causing Coolify deploy failure
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 16:42:20 +03:00
David Kiania
de70972d6a Add webhook receiver, consolidate shared utilities, expand telemetry coverage
- Add FastAPI webhook receiver (webhook_receiver_rev.py) for Jimi push data:
  OBD diagnostics, DTC fault codes, alarms, GPS, heartbeats, trip reports
- Add schema migration (03_webhook_schema_migration.sql) for webhook tables:
  fault_codes, heartbeats, expanded obd_readings/trips/position_history/alarms
- Consolidate duplicated _safe/_shutdown into shared safe_task/setup_shutdown
  in ts_shared_rev.py (DRY refactor)
- Add auto-commit to get_conn() context manager (prevents forgotten commits)
- Fix poll_trips to capture runTimeSecond and maxSpeed from API
- Add poll_parking via jimi.open.platform.report.parking
- Remove broken poll_obd (OBD is push-only, no polling endpoint exists)
- Fix alarms schema: add lat/lng/acc_status columns + dedup constraint
- Fix obd_readings schema: add dedup constraint
- Fix trigger DO block: replace nonexistent has_column with information_schema
- Narrow api_post exception handling to RequestException/ValueError
- Add webhook_receiver service to docker-compose.yaml
- Add fastapi/uvicorn/python-multipart to pyproject.toml
- Add clean_ts timestamp validator to ts_shared_rev.py
- Add Tracksolid Pro API documentation (tracksolidApiDocumentation.md)
- Populate .gitignore with Python/OS/secrets patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 16:31:17 +03:00
85b50db71a Update docker-compose.yaml 2026-04-07 21:53:11 +00:00
bdd26472e7 Update docker-compose.yaml
Changing port to 5599 to avoid confilict
2026-04-07 20:00:16 +00:00
David Kiania
6205c483ee Deploy v2.0 Production Telemetry Stack 2026-04-07 21:34:40 +03:00