docs: add deployment & operations runbook (Coolify, webhook, bucket cutover)
Capture the operational knowledge from the isptickets cutover: Coolify app/container, env management (encrypted — UI or artisan tinker), cron, the Forgejo->Coolify auto-deploy webhook (config + recreate/verify; it was missing), manual deploy trigger, the source-bucket cutover procedure, and verification queries. Link it from README; refresh stale tickets-bucket/ETag references in implementation.md to the isptickets CDC model. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
parent
f06c11fd11
commit
0787d3a185
3 changed files with 186 additions and 7 deletions
|
|
@ -120,6 +120,11 @@ For a plain host/VM instead of Coolify, [`run_ingest.sh`](run_ingest.sh) loads `
|
|||
and runs the ingest; schedule it with a crontab line
|
||||
(`CRON_TZ=Africa/Nairobi` / `*/20 6-20 * * *`).
|
||||
|
||||
Full operational runbook — container, env management (encrypted; via the UI or
|
||||
`artisan tinker`), the **Forgejo → Coolify auto-deploy webhook**, manual deploys, and the
|
||||
source-bucket cutover procedure — is in
|
||||
[`docs/deployment-and-operations.md`](docs/deployment-and-operations.md).
|
||||
|
||||
### Bucket cutover (one-time reseed)
|
||||
|
||||
When the source provider moves the feed to a new bucket (e.g. `tickets` → `isptickets`),
|
||||
|
|
|
|||
167
docs/deployment-and-operations.md
Normal file
|
|
@ -0,0 +1,167 @@
|
|||
# Deployment & Operations — fleettickets
|
||||
|
||||
Operational runbook for the INC ingest pipeline as deployed on **Coolify**
|
||||
(host `kianiadee@twala.rahamafresh.com`, key `~/.ssh/id_ed25519`). Covers the
|
||||
container, environment, schedule, auto-deploy webhook, the source-bucket cutover
|
||||
procedure, and verification. Secrets are referenced by **where to retrieve them**,
|
||||
never by value.
|
||||
|
||||
## What's deployed
|
||||
|
||||
| Thing | Detail |
|---|---|
| Coolify app | **`fleettickets`**: id `15`, uuid `g14mwzo73q20g70vc6fzumya`, build pack `dockerfile`, git `main` |
| Container | built from this repo's `Dockerfile` (`python:3.12-slim`, `TZ=Africa/Nairobi`); kept alive with `tail -f /dev/null` (no web server) |
| Ingest | a Coolify **Scheduled Task** `inc_tickets` running `python import_tickets.py --from-bucket --apply` |
| DB | `tickets` schema in the shared `tracksolid_db` (internal host `timescale_db:5432`) |
| Source | **`isptickets`** S3 bucket, `automations/inc/changes/<EAT-ts>.csv` CDC stream (see `../n8n-s3-ticket-exports.md` and `../README.md`) |
|
||||
|
||||
Resolve the live container name (Coolify appends a random suffix):
|
||||
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com \
  'docker ps --filter name=g14mwzo73q20g70vc6fzumya --format "{{.Names}}" | head -1'
```
|
||||
|
||||
## Schedule (cron)
|
||||
|
||||
The Scheduled Task runs **`*/20 6-20 * * *`** — every 20 min, **06:00–20:40 EAT**.
|
||||
Coolify evaluates task cron in the server timezone (`server_settings.server_timezone`
|
||||
= `Africa/Nairobi`), so **no UTC conversion** — write EAT directly. The `--from-bucket`
|
||||
run is a cheap no-op when no new change file has arrived (watermark guard), so a dense
|
||||
schedule is safe.
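As a quick sanity check of the window described above, the cron expression can be expanded by hand. This is a minimal sketch, not Coolify's scheduler: the function name `fire_times` and its parameters are illustrative, and it only models the minute and hour fields of `*/20 6-20 * * *`.

```python
from datetime import time


def fire_times(minute_step: int = 20, first_hour: int = 6, last_hour: int = 20) -> list[time]:
    """Expand the minute/hour fields of `*/20 6-20 * * *` into daily wall-clock times."""
    return [
        time(hour, minute)
        for hour in range(first_hour, last_hour + 1)  # hour range 6-20 is inclusive
        for minute in range(0, 60, minute_step)       # */20 -> minutes 0, 20, 40
    ]


times = fire_times()
# First run, last run, and number of runs per day in server-local (EAT) time.
print(times[0].strftime("%H:%M"), times[-1].strftime("%H:%M"), len(times))
```

Because the hour range is inclusive, the last run of the day is at 20:40 rather than 20:00.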
|
||||
|
||||
To change the frequency, edit the task in the Coolify UI, or in `coolify-db`:
|
||||
```sql
UPDATE scheduled_tasks SET frequency = '*/20 6-20 * * *', updated_at = now()
WHERE name = 'inc_tickets';   -- id 3
```
|
||||
Coolify's scheduler re-reads `scheduled_tasks` each minute, so the change is picked up
|
||||
without a redeploy. Execution history: `scheduled_task_executions`.
|
||||
|
||||
> The repo's `Dockerfile`, `run_ingest.sh`, and `README.md` document this same cron for
|
||||
> the plain-host/VM fallback (`CRON_TZ=Africa/Nairobi`).
|
||||
|
||||
## Environment variables
|
||||
|
||||
Set on the Coolify app (Environment Variables). Names only — values live in Coolify:
|
||||
|
||||
| Var | Purpose |
|---|---|
| `DATABASE_URL` | `tracksolid_db` (internal `timescale_db:5432`) |
| `RUSTFS_ENDPOINT` | `https://s3.rahamafresh.com` |
| `RUSTFS_ACCESS_KEY` / `RUSTFS_SECRET_KEY` | `isptickets` bucket credentials |
| `RUSTFS_REGION` | `us-east-1` |
| `TICKETS_BUCKET` | `isptickets` |
| `GEOCODER_PROVIDER` / `GEOCODER_API_KEY` | keyed geocoder (LocationIQ/OpenCage) |
|
||||
|
||||
**Env vars are Laravel-encrypted in `coolify-db` — never raw-`UPDATE` them.** Change them
|
||||
in the Coolify UI, or via `artisan tinker` (which re-encrypts on save):
|
||||
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com 'docker exec -i coolify php artisan tinker' <<'PHP'
$e = \App\Models\EnvironmentVariable::where('resourceable_type','App\\Models\\Application')
    ->where('resourceable_id',15)->where('key','TICKETS_BUCKET')->first();
$e->value = 'isptickets'; $e->save(); echo $e->value.PHP_EOL;
PHP
```
|
||||
An env change only takes effect after the container is **recreated** (a redeploy — see below),
|
||||
since Coolify injects env at container create time.
|
||||
|
||||
## Deploys
|
||||
|
||||
### Auto-deploy (Forgejo → Coolify webhook)
|
||||
|
||||
A push to `main` should auto-deploy. This needs **both** the Coolify per-app Auto-Deploy
|
||||
toggle (Configuration → Advanced) **and** a webhook on the Forgejo repo. The webhook was
|
||||
missing originally (the toggle alone is not enough); it now exists as hook id `3` on
|
||||
`kianiadee/fleettickets`:
|
||||
|
||||
| Field | Value |
|---|---|
| URL | `https://stage.rahamafresh.com/webhooks/source/gitea/events/manual` |
| Type / content-type | `gitea` / `json` |
| Events / branch filter | `push` / `main` |
| Secret | the app's `manual_webhook_secret_gitea` (Coolify HMAC-validates `X-Hub-Signature-256`) |
|
||||
|
||||
Recreate or inspect it via the Forgejo API. Authenticate with basic auth against `/api/v1`,
using the credentials that `git credential fill` returns for host `repo.rahamafresh.com`;
no `tea` or `gh` CLI is needed. Get the webhook secret by decrypting it in Coolify:
|
||||
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com \
  "docker exec -i coolify php artisan tinker --execute=\"echo \\App\\Models\\Application::find(15)->manual_webhook_secret_gitea;\""
```
|
||||
```bash
# list / test the webhook (FORGEJO_USER / FORGEJO_PASS from git credential fill;
# named this way so they do not clobber the shell's own $USER)
curl -s -u "$FORGEJO_USER:$FORGEJO_PASS" https://repo.rahamafresh.com/api/v1/repos/kianiadee/fleettickets/hooks
curl -s -u "$FORGEJO_USER:$FORGEJO_PASS" -X POST https://repo.rahamafresh.com/api/v1/repos/kianiadee/fleettickets/hooks/3/tests
```
|
||||
A successful test shows a webhook hit in `docker logs coolify` (no `invalid_signature`
|
||||
audit) and a new row in `application_deployment_queues`.
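To reason about an `invalid_signature` failure, it can help to reproduce the signature locally. The sketch below is an illustration of HMAC-SHA256 request signing in general, not Coolify's actual validation code. It assumes the GitHub-style header format `sha256=<hex digest>` computed over the raw request body; the helper names `sign` and `is_valid` and the sample secret and payload are hypothetical.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Return a GitHub-style signature header value for the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"  # assumed header format; see the note above


def is_valid(secret: str, body: bytes, header_value: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign(secret, body), header_value)


# Hypothetical secret and payload, for illustration only.
body = b'{"ref": "refs/heads/main"}'
header = sign("example-secret", body)
print(is_valid("example-secret", body, header))        # same secret, same bytes
print(is_valid("example-secret", body + b" ", header)) # body altered in transit
```

The signature covers the exact bytes sent, so a mismatch usually means the secret on the Forgejo hook differs from the one stored on the Coolify app, or the body was re-serialized along the way.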
|
||||
|
||||
### Manual deploy (no push)
|
||||
|
||||
Trigger the same action as Coolify's Deploy button via tinker:
|
||||
```bash
ssh -i ~/.ssh/id_ed25519 kianiadee@twala.rahamafresh.com 'docker exec -i coolify php artisan tinker' <<'PHP'
$app = \App\Models\Application::where('uuid','g14mwzo73q20g70vc6fzumya')->first();
$uuid = new \Visus\Cuid2\Cuid2;
echo json_encode(queue_application_deployment(
    application: $app, deployment_uuid: $uuid, force_rebuild: false, is_api: true)).PHP_EOL;
echo $uuid.PHP_EOL;
PHP
```
|
||||
Watch it with
`SELECT id, status, created_at FROM application_deployment_queues WHERE application_id = '15' ORDER BY created_at DESC LIMIT 3;`
(note: `application_id` is the **numeric id stored as text**).
|
||||
|
||||
## Source-bucket cutover (when the provider moves buckets)
|
||||
|
||||
If the provider moves the INC feed to a new bucket (as happened with `tickets` → `isptickets`
on 2026-06-25):
|
||||
|
||||
1. **Inspect** the new bucket (read-only) — confirm `automations/inc/changes/` layout,
|
||||
timestamp range, schema parity. CRQ (`automations/crq/`) stays out of scope.
|
||||
2. **Update env** (UI or tinker): `RUSTFS_ACCESS_KEY`, `RUSTFS_SECRET_KEY`,
|
||||
`TICKETS_BUCKET` → the new bucket (endpoint usually unchanged).
|
||||
3. **Reconcile the DB** to current. The loader drains every `changes/` file newer than the
|
||||
watermark (`tickets.import_meta.metadata.source_max_key`), oldest→newest, upserting on
|
||||
`ticket_id`:
|
||||
- If the watermark **predates** the new bucket's first file, a normal
|
||||
`--from-bucket --apply` drains the whole new stream — no reseed needed.
|
||||
- Otherwise use **`--reseed`** (ignores the watermark, drains all `changes/` once):
|
||||
`python import_tickets.py --from-bucket --reseed --apply` (see README "Bucket cutover").
|
||||
The new stream's periodic full-state re-emissions make this converge even across the
|
||||
cutover gap. Idempotent upserts + never-delete make it non-destructive.
|
||||
- For a one-off, you can run it in the live container with the new creds inlined:
  `docker exec -e TICKETS_BUCKET=… -e RUSTFS_ACCESS_KEY=… -e RUSTFS_SECRET_KEY=… <container> sh -c "cd /app && python import_tickets.py --from-bucket --apply"`.
|
||||
4. **Re-geocode** new clusters/locations: `--geocode-clusters --apply` then
|
||||
`--geocode-locations --apply` (existing gazetteer persists; only new keys are looked up).
|
||||
5. **Redeploy** so the Scheduled Task's container picks up the new env (push `main` → webhook,
|
||||
or manual deploy). Old bucket is left untouched for rollback.
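The watermark-versus-reseed choice in step 3 can be sketched as follows. This is a simplified model, not the code in `import_tickets.py`: the function names, the in-memory `table`, and the example keys are hypothetical, and it assumes change-file keys are timestamp-named so that lexicographic order matches chronological order.

```python
from typing import Iterable, Optional


def keys_to_drain(keys: Iterable[str], watermark: Optional[str], reseed: bool = False) -> list[str]:
    """Pick the change files to process, oldest first."""
    ordered = sorted(k for k in keys if k.endswith(".csv"))
    if reseed or watermark is None:
        return ordered                            # --reseed: ignore the watermark
    return [k for k in ordered if k > watermark]  # only files newer than the watermark


def upsert(table: dict[str, dict], rows: Iterable[dict]) -> None:
    """Idempotent upsert on ticket_id; rows are never deleted."""
    for row in rows:
        table[row["ticket_id"]] = {**table.get(row["ticket_id"], {}), **row}


# Hypothetical keys; the real <EAT-ts> format is not specified in this runbook.
prefix = "automations/inc/changes/"
keys = [prefix + "2026-06-25T0640.csv", prefix + "2026-06-25T0600.csv", prefix + "2026-06-25T0620.csv"]
print(keys_to_drain(keys, watermark=prefix + "2026-06-25T0600.csv"))
print(keys_to_drain(keys, watermark=prefix + "2026-06-25T0640.csv", reseed=True))
```

In this model, applying the same file twice leaves `table` unchanged, which is why re-running a drain or a reseed is safe.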
|
||||
|
||||
## Verification
|
||||
|
||||
```bash
DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1)
docker exec -i "$DB" psql -U postgres -d tracksolid_db <<'SQL'
-- watermark + freshness
SELECT export_type, records_ingested, ingested_at, metadata->>'source_max_key'
FROM tickets.import_meta WHERE dataset='inc';
-- counts
SELECT count(*) total_inc,
       count(*) FILTER (WHERE (raw->>'is_actionable')::boolean) AS open
FROM tickets.inc;
-- map payload sanity
SELECT reporting.fn_tickets_for_map() -> 'summary' ->> 'ticket_count';
SQL
```
|
||||
- New bucket `changes/` empties as files move to `automations/inc/processed/`.
|
||||
- A plain `--from-bucket --apply` reports "nothing new" until the next change file lands.
|
||||
- FleetOps Tickets map freshness reflects the new `ingested_at`.
|
||||
|
||||
## Rollback
|
||||
|
||||
- **Bucket:** revert the three env vars to the old bucket + creds and redeploy. The old
|
||||
bucket and its `processed/` history are untouched; upserts are idempotent and rows are
|
||||
never deleted, so re-running is safe.
|
||||
- **Cron:** `UPDATE scheduled_tasks SET frequency = <old> WHERE name='inc_tickets';`
|
||||
|
|
@ -5,12 +5,14 @@ What is actually built and deployed, as of the Phase-1 completion. Companion to
|
|||
|
||||
## Pipeline (`import_tickets.py`)
|
||||
|
||||
-- **Source:** newest `automations/inc/<EAT-timestamp>.csv` in the rustfs `tickets`
-  bucket (endpoint `https://s3.rahamafresh.com`, path-style, region `us-east-1`).
+- **Source:** the incremental CDC stream `automations/inc/changes/<EAT-timestamp>.csv`
+  in the **`isptickets`** S3 bucket (endpoint `https://s3.rahamafresh.com`, path-style,
+  region `us-east-1`; was the `tickets` bucket before the 2026-06-25 cutover).
|
||||
- **S3 access via boto3** (no aws-CLI dependency): `list_objects_v2` (paginator),
|
||||
`get_object`, `copy_object` + `delete_object` for archiving.
|
||||
-- **Skip-if-unchanged:** newest S3 **ETag** vs `tickets.import_meta.metadata.source_etag`;
-  equal → skip the DB write (the export re-emits identical content most hours).
+- **Watermark:** drains every `changes/` file newer than
+  `tickets.import_meta.metadata.source_max_key`, oldest→newest; reruns with no new file
+  are a cheap no-op. `--reseed` ignores the watermark for a one-time bucket cutover.
|
||||
- **Cleaning:** drop `is_alarm=true` rows + the `EXPORT STOPPED…` sentinel; drop
|
||||
`week_start`/`week_end`, `source_s3_bucket`/`source_s3_key`/`source_snapshot_id`,
|
||||
`department`, `source_type`; normalize `region`→lowercase, `raw_status`→UPPERCASE.
|
||||
|
|
@ -22,8 +24,9 @@ What is actually built and deployed, as of the Phase-1 completion. Companion to
|
|||
- **History capture:** after each `--apply` run (ingest or skip), calls
|
||||
`tickets.capture_history()` → appends new closures + upserts today's backlog
|
||||
snapshot.
|
||||
-- CLI: `--from-bucket` (newest INC csv), `--inc-csv <file>` (local dev), `--apply`
-  (else dry-run), `--geocode-clusters`, `--geocode-locations`, `--capture-history`.
+- CLI: `--from-bucket` (drain the INC change stream), `--reseed` (ignore the watermark;
+  one-time bucket cutover), `--inc-csv <file>` (local dev), `--apply` (else dry-run),
+  `--geocode-clusters`, `--geocode-locations`, `--capture-history`.
|
||||
|
||||
## Schema / migrations (`tracksolid_db`, applied via `run_migrations.py`)
|
||||
|
||||
|
|
@ -55,9 +58,13 @@ What is actually built and deployed, as of the Phase-1 completion. Companion to
|
|||
web app (`fleet-ops-staging`).
|
||||
- **Scheduled Task:** `python import_tickets.py --from-bucket --apply`, cron
|
||||
`*/20 6-20 * * *` in **EAT** (Coolify runs tasks in EAT — no UTC conversion).
|
||||
-- **Env vars** (Coolify): `DATABASE_URL` (internal DB host), `RUSTFS_*`, `GEOCODER_*`.
+- **Env vars** (Coolify): `DATABASE_URL` (internal DB host), `RUSTFS_*`
+  (`isptickets` bucket), `GEOCODER_*`.
|
||||
- For a plain host/VM, `run_ingest.sh` + a crontab line is the alternative.
|
||||
|
||||
Full ops runbook (env management, the Forgejo → Coolify auto-deploy webhook, manual
|
||||
deploys, bucket cutover, verification): **`docs/deployment-and-operations.md`**.
|
||||
|
||||
## State at hand-off
|
||||
|
||||
- `tickets.inc` ≈ 21,312 rows (current non-alarm INC + a few aged-out history rows);
|
||||
|
|
|
|||