Staging environment + FleetOps split #17
1 changed files with 224 additions and 0 deletions
docs/STAGING_FLEETOPS_ARCHITECTURE.md
# Staging Environment & FleetOps Split: Architecture

**Status:** approved 2026-06-10 · **Owner:** kianiadee · **Audience:** both developers (mixed
technical/ops background; readable without prior context).

This document describes how we (a) introduce a **staging environment** under the
`fivetitude.com` umbrella so the production FleetNow map is never edited directly, and (b)
**split the product** into two surfaces: **FleetNow** (live tracking) and **FleetOps** (fleet
operations: fuel, analytics, KPIs).

> **No secrets here.** All connection values come from `.env` at runtime; see
> [`CONNECTIONS.md`](CONNECTIONS.md).

---

## 1. Why this change

FleetNow (`fleetnow.rahamafresh.com`) is now the client's **production** map, so we can no
longer make feature changes or run tests directly against it. Separately, the client asked us
to separate **fleet tracking** from **fleet operations** (fuel management, analytics). That
gives us two needs:

1. A **staging environment** that mirrors production for safe development and testing.
2. A **new FleetOps surface** (`fleetops.rahamafresh.com`) distinct from the tracking map.

### Decisions on record

| Decision | Choice |
|---|---|
| Staging umbrella domain | **`fivetitude.com`**. DNS is a **wildcard** (`*.fivetitude.com` → the VPS), so staging subdomains need **no per-host DNS records**, only Traefik/Coolify host rules |
| FleetOps surface | **New custom SPA** (FleetNow-style), consuming an extended `dashboard_api`; *not* Grafana |
| Staging data backing | **Full stack reading the shared production `reporting.*` read-layer** (read-only, no DB duplication) |
| Deploy mechanism | **Forgejo → Coolify webhook deploys** across all Coolify apps (replaces polling/manual) |
| FleetOps web server | **Caddy** (greenfield) for the cleaner Caddyfile + native `{env.*}` API-base injection. Chosen for config ergonomics, **not** TLS, since Traefik already terminates TLS. Existing nginx SPAs stay as-is (mixed fleet until FleetNow's next touch) |

---

## 2. Target topology

| Environment | FleetNow (tracking) | FleetOps (operations) | Read-API |
|---|---|---|---|
| **Production** (`rahamafresh.com`) | `fleetnow.rahamafresh.com` (*frozen*) | `fleetops.rahamafresh.com` (**new**) | `fleetapi.rahamafresh.com` |
| **Staging** (`fivetitude.com`) | `fleetnow.fivetitude.com` | `fleetops.fivetitude.com` | `fleetapi.fivetitude.com` |

- Every product surface (FleetNow/FleetOps × prod/staging) is a **Coolify app** (Dockerfile →
  static web server), one app per cell, each bound to its own git branch. **FleetOps uses
  Caddy** (clean Caddyfile, native `{env.*}` for the per-env API base); the existing FleetNow
  and the two legacy SPAs remain on **nginx**. Caddy and nginx both run as plain `:80` file
  servers; **Traefik terminates TLS**, so Caddy's auto-HTTPS is intentionally unused.
- The read-API (`dashboard_api`) is a **standalone Traefik-labelled bridge container**, *not*
  Coolify-managed. It is deployed by a host script and gains a **second staging instance**.
- **Staging reads the same production TimescaleDB** over the internal Docker network, but as a
  **read-only role** with the materialized-view refresher **disabled** (see §6).

```
                     ┌──────────────────────────── VPS (31.97.44.246) ─────────────────────────────┐
PRODUCTION           │                                                                             │
fleetnow.raha… ──────┼─► Coolify app (FleetNow:main)  ─┐                                           │
fleetops.raha… ──────┼─► Coolify app (FleetOps:main)  ─┼─► fleetapi.rahamafresh.com (bridge:8890)  │
                     │                                 │           │ app role (rw) + refresher     │
                     │                                 │           ▼                               │
STAGING              │                                 │   ┌──────────────────────────┐            │
fleetnow.fivet… ─────┼─► Coolify app (FleetNow:staging)┼──►│ tracksolid_db            │            │
fleetops.fivet… ─────┼─► Coolify app (FleetOps:staging)┼─┐ │ reporting.* / v_trips MV │            │
                     │                                 │ └►│ tracksolid.v_*           │            │
                     │   fleetapi.fivetitude.com ──────┘   └──────────────────────────┘            │
                     │   (bridge:8891, read-only role, refresher OFF)                              │
                     └─────────────────────────────────────────────────────────────────────────────┘
```

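The topology above can be captured as a small lookup, which may be useful for smoke tests or scripts that need the right API base per environment. This is a minimal sketch: the `TOPOLOGY`, `api_base`, and `environment_of` names are illustrative and not part of the repo, and the `https://` scheme is an assumption based on Traefik terminating TLS.

```python
# Hypothetical lookup mirroring the topology table; names are illustrative only.
TOPOLOGY = {
    "production": {
        "fleetnow": "fleetnow.rahamafresh.com",
        "fleetops": "fleetops.rahamafresh.com",
        "fleetapi": "fleetapi.rahamafresh.com",
    },
    "staging": {
        "fleetnow": "fleetnow.fivetitude.com",
        "fleetops": "fleetops.fivetitude.com",
        "fleetapi": "fleetapi.fivetitude.com",
    },
}


def api_base(environment: str) -> str:
    """Return the read-API base URL for an environment (https assumed)."""
    return f"https://{TOPOLOGY[environment]['fleetapi']}"


def environment_of(host: str) -> str:
    """Map a hostname back to its environment, or raise if it is unknown."""
    for environment, hosts in TOPOLOGY.items():
        if host in hosts.values():
            return environment
    raise KeyError(f"unknown host: {host}")
```

For example, `api_base("staging")` yields the staging read-API URL, and `environment_of` lets a check script refuse to run against a production hostname.
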
---

## 3. The two read-API instances

The API code is `dashboard_api_rev.py` **in this repo**. Production is deployed by
`~/deploy_dashboard_api.sh` (bind-mounts `~/dashboard_api/dashboard_api_rev.py`, **port 8890**,
Traefik host `fleetapi.rahamafresh.com`). Staging mirrors it:

| Setting | Production | Staging |
|---|---|---|
| Host rule | `fleetapi.rahamafresh.com` | `fleetapi.fivetitude.com` |
| Port | 8890 | **8891** |
| Code mount | `~/dashboard_api/` | `~/dashboard_api_staging/` (WIP checkout) |
| Deploy script | `~/deploy_dashboard_api.sh` | **`deploy_dashboard_api_staging.sh`** (checked into this repo) |
| DB role | app role (read/write) | **read-only** (`dashboard_ro` / `grafana_ro`) |
| `v_trips` refresher | **owns it** | **disabled** |
| CORS origins | `fleetnow.rahamafresh.com`, `fleetintelligence.…`, `liveposition.…`, **+ `fleetops.rahamafresh.com`** | `fleetnow.fivetitude.com`, `fleetops.fivetitude.com` |

> **CORS must be set unconditionally** in the deploy script (strip any inherited value); this
> is the [FIX-D03](../CLAUDE.md) lesson. Env/CORS changes require a container **recreate**, not
> a restart.

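The "set unconditionally" rule can be illustrated with a short sketch. It assumes the API reads its allowed origins from a single environment variable, here called `CORS_ORIGINS`; that name, the `container_env` helper, and the `https://` scheme on each origin are assumptions for illustration, so check the deploy script for the real values. Only the staging origins are shown because the production list contains abbreviated hostnames.

```python
# Assumption: the API reads its allowed origins from one env var, here called
# CORS_ORIGINS (hypothetical name; check the deploy script for the real one).
STAGING_ORIGINS = (
    "https://fleetnow.fivetitude.com",
    "https://fleetops.fivetitude.com",
)


def container_env(inherited: dict, origins: tuple = STAGING_ORIGINS) -> dict:
    """Build the container environment with CORS set unconditionally.

    Any inherited CORS_ORIGINS value is dropped first, so a stale value from
    the host shell or an older container cannot leak through.
    """
    env = {k: v for k, v in inherited.items() if k != "CORS_ORIGINS"}
    env["CORS_ORIGINS"] = ",".join(origins)
    return env
```

Because the environment is fixed when the container is created, a value built this way only takes effect after a recreate, which matches the note above.
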
### Analytics endpoints (FleetOps)

FleetOps consumes new **read-only** routes added to `dashboard_api_rev.py`, reusing the
existing psycopg2 pool (`ts_shared_rev.py`), the Content-Type body-parse pattern (FIX-D01), and
the JSONB/GeoJSON return style of the existing `/webhook/*` routes:

| Route | Backed by |
|---|---|
| `GET /analytics/fleet-summary` | `reporting.v_daily_summary` / `v_weekly_summary` / `v_monthly_summary` + `v_daily_cost_centre` |
| `GET /analytics/utilisation` | derived from the `reporting` summaries (idle_pct, km/day) |
| `GET /analytics/driver-behaviour` | `tracksolid.v_driver_aggregates_daily` |
| `GET /analytics/fuel` | `reporting.v_trips.fuel_consumed_l` + `devices.fuel_100km`; **data-gated** (returns "needs data" flags until populated) |
| `GET /analytics/filters` | `reporting.v_filter_*` (mirrors `GET /webhook/fleet-dashboard`) |

Any aggregation that isn't a thin wrapper becomes a **new numbered migration**
(`migrations/15_*.sql`); never edit an applied migration.

> **Reuse the existing reporting layer.** The analytics building blocks are `reporting.*`
> (migrations 11/14) and the surviving `tracksolid.v_*` views (migration 07). The `ops.*` and
> `dwh_gold.*` schemas were **purged 2026-06-05** (migrations 12/13); do **not** reference
> `ops.*`, `dwh_gold.*`, `v_utilisation_daily`, or `v_sla_inflight`.

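The data-gated behaviour of `GET /analytics/fuel` could look roughly like the sketch below. Only the two inputs (`reporting.v_trips.fuel_consumed_l` and `devices.fuel_100km`) come from this document; the `fuel_payload` helper and the response field names (`needs_data`, `missing`) are hypothetical and only show one possible shape for the "needs data" flags.

```python
from typing import Optional


def fuel_payload(fuel_consumed_l: Optional[float], fuel_100km: Optional[float]) -> dict:
    """Shape a data-gated fuel response for one vehicle.

    Field names in the returned dict are illustrative; only the two inputs
    (reporting.v_trips.fuel_consumed_l, devices.fuel_100km) come from the doc.
    """
    missing = []
    if fuel_consumed_l is None:
        missing.append("fuel_consumed_l")
    if fuel_100km is None:
        missing.append("fuel_100km")
    return {
        "needs_data": bool(missing),  # True until both inputs are populated
        "missing": missing,           # which inputs are still NULL
        "fuel_consumed_l": fuel_consumed_l,
        "fuel_100km": fuel_100km,
    }
```

Returning an explicit flag, rather than an empty result, lets the SPA render a "needs data" panel instead of a misleading zero.
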
---

## 4. Deploy & promotion (Forgejo → Coolify webhooks)

All Coolify apps move from polling/manual to **webhook-driven** deploys. For each app, take
Coolify's per-app **deploy webhook URL** (+ token) and register it as a **push webhook in the
matching Forgejo repo**, scoped to the bound branch.

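Registering the push webhook can also be scripted against Forgejo's repository hooks API (`POST /api/v1/repos/{owner}/{repo}/hooks`). The sketch below only builds the request body and does not send it. The `push_hook_payload` helper is hypothetical, the Coolify URL is a placeholder to be copied from the app's settings, and the field names should be verified against the Forgejo version in use. How Coolify's token is presented to the webhook is not covered here.

```python
import json


def push_hook_payload(coolify_deploy_url: str, branch: str) -> dict:
    """Build a request body for creating a branch-scoped push webhook."""
    return {
        "type": "forgejo",        # assumption: may need "gitea" on some instances
        "active": True,
        "events": ["push"],
        "branch_filter": branch,  # only pushes to this branch trigger the hook
        "config": {
            "url": coolify_deploy_url,
            "content_type": "json",
        },
    }


# Placeholder URL; use the deploy webhook shown in the Coolify app.
body = push_hook_payload("https://coolify.example/deploy-webhook", "staging")
print(json.dumps(body, indent=2))
```

One hook per app, each with its own `branch_filter`, keeps a push to `staging` from triggering the production app.
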
**Promotion model** (both FleetNow and FleetOps):

```
feature branch ──merge──► staging ──(Forgejo webhook)──► Coolify deploys *.fivetitude.com
                                   │ validate
main ◄──merge──────────────────────┘
  │
  └──(Forgejo webhook)──► Coolify deploys *.rahamafresh.com (prod)
```

Production is touched **only** by a merge to `main`. That branch discipline is what satisfies
"no direct changes to production FleetNow."

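The branch discipline in the promotion model reduces to a two-entry mapping. The sketch below is illustrative only; `DEPLOY_TARGETS` and `deploy_target` are hypothetical names, and the real routing is done by the per-branch webhooks rather than by application code.

```python
from typing import Optional

# Branch-to-domain mapping taken from the promotion diagram.
DEPLOY_TARGETS = {
    "staging": "*.fivetitude.com",
    "main": "*.rahamafresh.com",
}


def deploy_target(branch: str) -> Optional[str]:
    """Return the domain family a push to `branch` deploys, or None.

    Feature branches map to None: they never deploy on their own and must be
    merged into staging first.
    """
    return DEPLOY_TARGETS.get(branch)
```

A check like `deploy_target(branch) is None` is a cheap way for a script to confirm that a feature branch cannot reach either environment directly.
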
> **Exception:** the `dashboard_api` bridge is **not** Coolify-managed and does **not** deploy
> via Forgejo webhook; it is deployed by its host script (`deploy_dashboard_api*.sh`). The API
> code's source of truth is this repo; the staging instance bind-mounts a WIP checkout so new
> endpoints are validated on `fleetapi.fivetitude.com` before the file is promoted to
> `~/dashboard_api/` on prod.

---

## 5. FleetOps SPA (new repo)

- **Remote:** `https://repo.rahamafresh.com/kianiadee/fleetops.git`
- **Local working copy:** `~/Downloads/projects/15_fleetops` (scaffolded from empty)
- **Shape:** FleetNow-style deploy flow, but **Dockerfile → Caddy** via Coolify; branded for
  operations/analytics. The Caddyfile is a ~5-line SPA server (`try_files {path} /index.html`,
  `encode zstd gzip`) on `:80` behind Traefik.
- **API base URL is build/runtime configurable** via Caddy's native `{env.API_BASE}`
  substitution (set per Coolify app): staging → `fleetapi.fivetitude.com`, prod →
  `fleetapi.rahamafresh.com`.
- **FleetNow** gets the same treatment in *its own* repo: a `staging` branch and a
  parameterized API base URL (assumed currently hardcoded to `fleetapi.rahamafresh.com`).

---

## 6. Safety: staging on the shared production read-layer

Staging hits the **production database**, so isolation is enforced at the **DB-role level**,
not by a separate DB:

- The staging `dashboard_api` connects as a **read-only role**: reuse `grafana_ro`, or add a
  dedicated `dashboard_ro` with `GRANT SELECT` on `reporting.*` and the `tracksolid.v_*`
  views. Accidental writes from staging are then rejected by the database.
- The **`reporting.v_trips` materialized-view refresher is disabled on staging**; production
  owns it. The refresher needs write perms and is already pg-advisory-lock guarded (key
  `920_145`, FIX-D02); a read-only staging role would only log errors, so disable it explicitly
  (refresh interval `0` / env guard).
- New `/analytics/*` queries stay backed by the **indexed `reporting.*` views / matview**, not
  raw hypertable scans, so staging traffic adds little load to the prod DB.

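The "refresh interval `0` / env guard" idea can be sketched as follows. The variable name `REFRESH_INTERVAL_S` and both helper functions are hypothetical; the actual refresher in `dashboard_api_rev.py` may be wired differently. The sketch fails closed: an unset or invalid value is treated as disabled, so production would have to set a positive interval explicitly.

```python
import os


def refresher_interval_seconds(environ=os.environ) -> int:
    """Read the refresh interval; 0 (or unset/invalid) means disabled.

    REFRESH_INTERVAL_S is a hypothetical variable name for this sketch.
    """
    raw = environ.get("REFRESH_INTERVAL_S", "0")
    try:
        return max(int(raw), 0)
    except ValueError:
        return 0


def should_start_refresher(environ=os.environ) -> bool:
    """Start the v_trips refresher only when a positive interval is set."""
    return refresher_interval_seconds(environ) > 0
```

Under this assumption the staging deploy script would export `REFRESH_INTERVAL_S=0` (or omit it), so the refresher thread never starts and no permission errors reach the logs.
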
---

## 7. Phased rollout

Ordered by dependency and risk: prove the foundation and the deploy pipeline first; touch the
client's production domains **last**.

| Phase | Scope | Exit criterion |
|---|---|---|
| **0: Foundation** | This document; migrate all Coolify apps to Forgejo webhook deploys; provision the read-only DB role | Every existing Coolify app redeploys via webhook; read-only role can `SELECT` `reporting.*` + `tracksolid.v_*` and nothing else |
| **1: Staging backbone** | Staging `dashboard_api` bridge (`deploy_dashboard_api_staging.sh`, 8891, `fleetapi.fivetitude.com`, read-only, refresher off, staging CORS) | `curl https://fleetapi.fivetitude.com/health` ok; verifiably read-only; no staging rows in `reporting.refresh_log` |
| **2: FleetNow staging** | FleetNow repo: `staging` branch + parameterized API base + `fleetnow.fivetitude.com` Coolify app | Renders against staging API; `staging` push deploys staging only, `main` merge deploys prod only; prod FleetNow untouched |
| **3: FleetOps backend** | `/analytics/*` endpoints in `dashboard_api_rev.py`; `migrations/15_*` if needed; tested on the staging API | Every route returns correct shape on `fleetapi.fivetitude.com`; fuel route returns "needs data" flags |
| **4: FleetOps SPA** | Scaffold `15_fleetops` (git init + remote + SPA/Dockerfile); `fleetops.fivetitude.com` Coolify app | Renders fuel/analytics/utilisation/driver panels from staging endpoints; CORS clean |
| **5: Production cutover** | Promote API to prod + prod CORS add; `fleetops.rahamafresh.com` Coolify app; prod DNS record; update `CLAUDE.md` / `CONNECTIONS.md` / `PLATFORM_OVERVIEW.html` | FleetOps live on prod; prod FleetNow/API otherwise unchanged; docs current |

---

## 8. Verification checklist

1. **Staging API up:** `curl -f https://fleetapi.fivetitude.com/health` → `{status: ok}`;
   resolve the container via `docker ps --filter name=dashboard_api_staging`.
2. **Read-only enforced:** a write attempt from the staging role fails; **no**
   `reporting.refresh_log` rows carry a staging source.
3. **Analytics:** hit each `/analytics/*` on staging, diff the JSON against the underlying view
   output via `docker exec $DB psql`; fuel returns "needs data" flags.
4. **CORS:** browser-load `fleetops.fivetitude.com` and `fleetnow.fivetitude.com`; XHRs to
   `fleetapi.fivetitude.com` succeed; prod `fleetops.rahamafresh.com` reaches the prod API.
5. **Webhook promotion:** push to `staging` → Forgejo webhook fires → **only** the
   `*.fivetitude.com` app redeploys (check Coolify deploy log + Forgejo webhook delivery);
   merge to `main` → only the `*.rahamafresh.com` app redeploys.
6. **Prod FleetNow untouched:** prod `fleetnow`/`fleetapi` containers not recreated except the
   intentional prod-CORS add.

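The first check in the list can be automated with the standard library alone. This is a sketch under one assumption: the health endpoint returns a JSON object whose `status` field is the string `ok`. The exact response shape is not specified in this document, so adjust `is_healthy` if it differs; both function names are illustrative.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """True when the body is JSON with status == "ok" (assumed shape)."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("status") == "ok"


def check_health(url: str = "https://fleetapi.fivetitude.com/health") -> bool:
    """Fetch the health endpoint and validate it; network errors count as down."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        return False
```

Keeping the parsing in `is_healthy` separate from the network call means the response check can be unit-tested without reaching the staging host.
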
---

## 9. Risks & open items

- **FleetNow API-base** parameterization assumes it's currently hardcoded; confirm in that repo.
- **Shared-DB load:** staging traffic is light, but watch the prod DB if staging analytics
  queries get heavy; the read-only role + indexed views are the guardrails.
- **Fuel analytics are data-blocked:** `devices.fuel_100km` is NULL fleet-wide and the
  `/pushoil` + `/pushobd` webhooks aren't registered, so FleetOps fuel views ship as scaffold
  until those Open Items (CLAUDE.md §10) are closed.
- **Naming trap:** `stage.rahamafresh.com` is the *production* host alias (a legacy name). Keep
  all real staging under `*.fivetitude.com` to avoid confusion.

---

*Related: [`CONNECTIONS.md`](CONNECTIONS.md) · [`PLATFORM_OVERVIEW.html`](PLATFORM_OVERVIEW.html) ·
[`DWH_PIPELINE.md`](DWH_PIPELINE.md) · root `CLAUDE.md` (§3 map dashboards, §7 fix history).*