docs: add deployment & operations runbook (Coolify, webhook, bucket cutover)

Capture the operational knowledge from the isptickets cutover: Coolify app/container,
env management (encrypted — UI or artisan tinker), cron, the Forgejo->Coolify auto-deploy
webhook (config + recreate/verify; it was missing), manual deploy trigger, the
source-bucket cutover procedure, and verification queries. Link it from README; refresh
stale tickets-bucket/ETag references in implementation.md to the isptickets CDC model.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
david kiania 2026-06-25 22:24:23 +03:00
parent f06c11fd11
commit 0787d3a185
3 changed files with 186 additions and 7 deletions

View file

@ -120,6 +120,11 @@ For a plain host/VM instead of Coolify, [`run_ingest.sh`](run_ingest.sh) loads `
and runs the ingest; schedule it with a crontab line
(`CRON_TZ=Africa/Nairobi` / `*/20 6-20 * * *`).
Full operational runbook — container, env management (encrypted; via the UI or
`artisan tinker`), the **Forgejo → Coolify auto-deploy webhook**, manual deploys, and the
source-bucket cutover procedure — is in
[`docs/deployment-and-operations.md`](docs/deployment-and-operations.md).
### Bucket cutover (one-time reseed)
When the source provider moves the feed to a new bucket (e.g. `tickets``isptickets`),

View file

@ -0,0 +1,167 @@
# Deployment & Operations — fleettickets
Operational runbook for the INC ingest pipeline as deployed on **Coolify**
(host `kianiadee@twala.rahamafresh.com`, key `~/.ssh/id_ed25519`). Covers the
container, environment, schedule, auto-deploy webhook, the source-bucket cutover
procedure, and verification. Secrets are referenced by **where to retrieve them**,
never by value.
## What's deployed
| Thing | Detail |
|---|---|
| Coolify app | **`fleettickets`** — id `15`, uuid `g14mwzo73q20g70vc6fzumya`, build pack `dockerfile`, git `main` |
| Container | built from this repo's `Dockerfile` (`python:3.12-slim`, `TZ=Africa/Nairobi`); kept alive with `tail -f /dev/null` (no web server) |
| Ingest | a Coolify **Scheduled Task** `inc_tickets` running `python import_tickets.py --from-bucket --apply` |
| DB | `tickets` schema in the shared `tracksolid_db` (internal host `timescale_db:5432`) |
| Source | **`isptickets`** S3 bucket, `automations/inc/changes/<EAT-ts>.csv` CDC stream (see `../n8n-s3-ticket-exports.md` and `../README.md`) |
Resolve the live container name (Coolify appends a random suffix):
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com \
'docker ps --filter name=g14mwzo73q20g70vc6fzumya --format "{{.Names}}" | head -1'
```
## Schedule (cron)
The Scheduled Task runs **`*/20 6-20 * * *`** — every 20 min, **06:0020:40 EAT**.
Coolify evaluates task cron in the server timezone (`server_settings.server_timezone`
= `Africa/Nairobi`), so **no UTC conversion** — write EAT directly. The `--from-bucket`
run is a cheap no-op when no new change file has arrived (watermark guard), so a dense
schedule is safe.
To change the frequency, edit the task in the Coolify UI, or in `coolify-db`:
```sql
UPDATE scheduled_tasks SET frequency = '*/20 6-20 * * *', updated_at = now()
WHERE name = 'inc_tickets'; -- id 3
```
Coolify's scheduler re-reads `scheduled_tasks` each minute, so the change is picked up
without a redeploy. Execution history: `scheduled_task_executions`.
> The repo's `Dockerfile`, `run_ingest.sh`, and `README.md` document this same cron for
> the plain-host/VM fallback (`CRON_TZ=Africa/Nairobi`).
## Environment variables
Set on the Coolify app (Environment Variables). Names only — values live in Coolify:
| Var | Purpose |
|---|---|
| `DATABASE_URL` | `tracksolid_db` (internal `timescale_db:5432`) |
| `RUSTFS_ENDPOINT` | `https://s3.rahamafresh.com` |
| `RUSTFS_ACCESS_KEY` / `RUSTFS_SECRET_KEY` | `isptickets` bucket credentials |
| `RUSTFS_REGION` | `us-east-1` |
| `TICKETS_BUCKET` | `isptickets` |
| `GEOCODER_PROVIDER` / `GEOCODER_API_KEY` | keyed geocoder (LocationIQ/OpenCage) |
**Env vars are Laravel-encrypted in `coolify-db` — never raw-`UPDATE` them.** Change them
in the Coolify UI, or via `artisan tinker` (which re-encrypts on save):
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com 'docker exec -i coolify php artisan tinker' <<'PHP'
$e = \App\Models\EnvironmentVariable::where('resourceable_type','App\\Models\\Application')
->where('resourceable_id',15)->where('key','TICKETS_BUCKET')->first();
$e->value = 'isptickets'; $e->save(); echo $e->value.PHP_EOL;
PHP
```
An env change only takes effect after the container is **recreated** (a redeploy — see below),
since Coolify injects env at container create time.
## Deploys
### Auto-deploy (Forgejo → Coolify webhook)
A push to `main` should auto-deploy. This needs **both** the Coolify per-app Auto-Deploy
toggle (Configuration → Advanced) **and** a webhook on the Forgejo repo. The webhook was
missing originally (the toggle alone is not enough); it now exists as hook id `3` on
`kianiadee/fleettickets`:
| Field | Value |
|---|---|
| URL | `https://stage.rahamafresh.com/webhooks/source/gitea/events/manual` |
| Type / content-type | `gitea` / `json` |
| Events / branch filter | `push` / `main` |
| Secret | the app's `manual_webhook_secret_gitea` (Coolify HMAC-validates `X-Hub-Signature-256`) |
Recreate / inspect it via the Forgejo API (auth: `git credential fill`, host
`repo.rahamafresh.com`, basic auth to `/api/v1` — no `tea`/`gh` needed). Get the secret by
decrypting it in Coolify:
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com \
"docker exec -i coolify php artisan tinker --execute=\"echo \\App\\Models\\Application::find(15)->manual_webhook_secret_gitea;\""
```
```bash
# list / test the webhook (USER:PASS from git credential fill)
curl -s -u "$USER:$PASS" https://repo.rahamafresh.com/api/v1/repos/kianiadee/fleettickets/hooks
curl -s -u "$USER:$PASS" -X POST https://repo.rahamafresh.com/api/v1/repos/kianiadee/fleettickets/hooks/3/tests
```
A successful test shows a webhook hit in `docker logs coolify` (no `invalid_signature`
audit) and a new row in `application_deployment_queues`.
### Manual deploy (no push)
Trigger the same action as Coolify's Deploy button via tinker:
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com 'docker exec -i coolify php artisan tinker' <<'PHP'
$app = \App\Models\Application::where('uuid','g14mwzo73q20g70vc6fzumya')->first();
$uuid = new \Visus\Cuid2\Cuid2;
echo json_encode(queue_application_deployment(
application: $app, deployment_uuid: $uuid, force_rebuild: false, is_api: true)).PHP_EOL;
echo $uuid.PHP_EOL;
PHP
```
Watch it: `SELECT id, status, created_at FROM application_deployment_queues WHERE
application_id = '15' ORDER BY created_at DESC LIMIT 3;` (note: `application_id` is the
**numeric id stored as text**).
## Source-bucket cutover (when the provider moves buckets)
If the provider moves the INC feed to a new bucket (as happened `tickets``isptickets`,
2026-06-25):
1. **Inspect** the new bucket (read-only) — confirm `automations/inc/changes/` layout,
timestamp range, schema parity. CRQ (`automations/crq/`) stays out of scope.
2. **Update env** (UI or tinker): `RUSTFS_ACCESS_KEY`, `RUSTFS_SECRET_KEY`,
`TICKETS_BUCKET` → the new bucket (endpoint usually unchanged).
3. **Reconcile the DB** to current. The loader drains every `changes/` file newer than the
watermark (`tickets.import_meta.metadata.source_max_key`), oldest→newest, upserting on
`ticket_id`:
- If the watermark **predates** the new bucket's first file, a normal
`--from-bucket --apply` drains the whole new stream — no reseed needed.
- Otherwise use **`--reseed`** (ignores the watermark, drains all `changes/` once):
`python import_tickets.py --from-bucket --reseed --apply` (see README "Bucket cutover").
The new stream's periodic full-state re-emissions make this converge even across the
cutover gap. Idempotent upserts + never-delete make it non-destructive.
- For a one-off, you can run it in the live container with the new creds inlined:
`docker exec -e TICKETS_BUCKET=… -e RUSTFS_ACCESS_KEY=… -e RUSTFS_SECRET_KEY=… <container>
sh -c "cd /app && python import_tickets.py --from-bucket --apply"`.
4. **Re-geocode** new clusters/locations: `--geocode-clusters --apply` then
`--geocode-locations --apply` (existing gazetteer persists; only new keys are looked up).
5. **Redeploy** so the Scheduled Task's container picks up the new env (push `main` → webhook,
or manual deploy). Old bucket is left untouched for rollback.
## Verification
```bash
DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1)
docker exec -i "$DB" psql -U postgres -d tracksolid_db <<'SQL'
-- watermark + freshness
SELECT export_type, records_ingested, ingested_at, metadata->>'source_max_key'
FROM tickets.import_meta WHERE dataset='inc';
-- counts
SELECT count(*) total_inc,
count(*) FILTER (WHERE (raw->>'is_actionable')::boolean) AS open
FROM tickets.inc;
-- map payload sanity
SELECT reporting.fn_tickets_for_map() -> 'summary' ->> 'ticket_count';
SQL
```
- New bucket `changes/` empties as files move to `automations/inc/processed/`.
- A plain `--from-bucket --apply` reports "nothing new" until the next change file lands.
- FleetOps Tickets map freshness reflects the new `ingested_at`.
## Rollback
- **Bucket:** revert the three env vars to the old bucket + creds and redeploy. The old
bucket and its `processed/` history are untouched; upserts are idempotent and rows are
never deleted, so re-running is safe.
- **Cron:** `UPDATE scheduled_tasks SET frequency = <old> WHERE name='inc_tickets';`

View file

@ -5,12 +5,14 @@ What is actually built and deployed, as of the Phase-1 completion. Companion to
## Pipeline (`import_tickets.py`)
- **Source:** newest `automations/inc/<EAT-timestamp>.csv` in the rustfs `tickets`
bucket (endpoint `https://s3.rahamafresh.com`, path-style, region `us-east-1`).
- **Source:** the incremental CDC stream `automations/inc/changes/<EAT-timestamp>.csv`
in the **`isptickets`** S3 bucket (endpoint `https://s3.rahamafresh.com`, path-style,
region `us-east-1`; was the `tickets` bucket before the 2026-06-25 cutover).
- **S3 access via boto3** (no aws-CLI dependency): `list_objects_v2` (paginator),
`get_object`, `copy_object` + `delete_object` for archiving.
- **Skip-if-unchanged:** newest S3 **ETag** vs `tickets.import_meta.metadata.source_etag`;
equal → skip the DB write (the export re-emits identical content most hours).
- **Watermark:** drains every `changes/` file newer than
`tickets.import_meta.metadata.source_max_key`, oldest→newest; reruns with no new file
are a cheap no-op. `--reseed` ignores the watermark for a one-time bucket cutover.
- **Cleaning:** drop `is_alarm=true` rows + the `EXPORT STOPPED…` sentinel; drop
`week_start`/`week_end`, `source_s3_bucket`/`source_s3_key`/`source_snapshot_id`,
`department`, `source_type`; normalize `region`→lowercase, `raw_status`→UPPERCASE.
@ -22,8 +24,9 @@ What is actually built and deployed, as of the Phase-1 completion. Companion to
- **History capture:** after each `--apply` run (ingest or skip), calls
`tickets.capture_history()` → appends new closures + upserts today's backlog
snapshot.
- CLI: `--from-bucket` (newest INC csv), `--inc-csv <file>` (local dev), `--apply`
(else dry-run), `--geocode-clusters`, `--geocode-locations`, `--capture-history`.
- CLI: `--from-bucket` (drain the INC change stream), `--reseed` (ignore the watermark;
one-time bucket cutover), `--inc-csv <file>` (local dev), `--apply` (else dry-run),
`--geocode-clusters`, `--geocode-locations`, `--capture-history`.
## Schema / migrations (`tracksolid_db`, applied via `run_migrations.py`)
@ -55,9 +58,13 @@ What is actually built and deployed, as of the Phase-1 completion. Companion to
web app (`fleet-ops-staging`).
- **Scheduled Task:** `python import_tickets.py --from-bucket --apply`, cron
`*/20 6-20 * * *` in **EAT** (Coolify runs tasks in EAT — no UTC conversion).
- **Env vars** (Coolify): `DATABASE_URL` (internal DB host), `RUSTFS_*`, `GEOCODER_*`.
- **Env vars** (Coolify): `DATABASE_URL` (internal DB host), `RUSTFS_*`
(`isptickets` bucket), `GEOCODER_*`.
- For a plain host/VM, `run_ingest.sh` + a crontab line is the alternative.
Full ops runbook (env management, the Forgejo → Coolify auto-deploy webhook, manual
deploys, bucket cutover, verification): **`docs/deployment-and-operations.md`**.
## State at hand-off
- `tickets.inc` ≈ 21,312 rows (current non-alarm INC + a few aged-out history rows);