Initial: plan, log-proxy app, README, gitignore
commit
1bc6e57374
7 changed files with 476 additions and 0 deletions
25
.gitignore
vendored
Normal file
@@ -0,0 +1,25 @@
# Python
__pycache__/
*.py[cod]
*$py.class
.venv/
venv/
.env
.env.*
!.env.example

# n8n local exports / dumps
n8n/exports/
n8n/credentials.json

# Editor
.vscode/
.idea/
*.swp
.DS_Store

# Secrets: should never be committed
secrets/
*.pem
*.key
api-keys.txt
209
260517_docker_n8n_logging.md
Normal file
@@ -0,0 +1,209 @@
# n8n Docker-log Alerting (ntfy + WhatsApp)

## Context

The user runs a Coolify host at `twala.rahamafresh.com` with ~50 containers across ~15 logically distinct services (tracksolid telemetry pipeline, Coolify itself, n8n stacks, Supabase, Chatwoot, Evolution API, Dekart, Forgejo, Ente, Garage, etc.). They want **n8n to read Docker logs directly, segment by service, apply per-service thresholds, and notify via ntfy and WhatsApp**.

Dozzle is explicitly out of scope as an integration source; it stays as a human-facing log viewer. The integration design must not depend on it.

Why this matters: today, errors in any container are invisible until someone opens Dozzle. Critical issues (panics, OOMs, ingest failures on the tracksolid pipeline) can sit unnoticed for hours. The goal is per-service alerting with severity-aware routing, with thresholds tunable per service so that noisy services don't drown out quiet ones.

## Decisions (locked with the user)

| Choice | Decision |
| --- | --- |
| n8n instance | `n8n-o55elukmxacgp1s2xcwktyam` (queue mode: main + worker + task-runners + Postgres + Redis) |
| Docker log access | New **read-only log-proxy** container; n8n never touches `/var/run/docker.sock` |
| Service grouping | Auto-derive from each container's `COOLIFY_RESOURCE_UUID` env var |
| Channels | Self-hosted ntfy (new Coolify service) **+** existing Evolution API (WhatsApp) |
| Git | Workspace `/Users/kianiadee/Downloads/projects/03_dozzle_n8n` is **not** a git repo yet; the user creates a separate repo later |

## Architecture

```
                                          n8n queue-mode (o55elukmxacgp1s2xcwktyam)
                                          ┌────────────────────────────────────────┐
Docker Engine         log-proxy           │ Workflow: Poll & Evaluate (per group)  │
┌──────────────┐      (new svc)           │  1. GET /logs/<group>?since=<cursor>   │
│ /var/run/    │ ◄─── RO socket ────►     │  2. regex → severity                   │
│ docker.sock  │      HTTP API            │  3. threshold + cooldown via           │
└──────────────┘      (internal net)    ◄─┤     getWorkflowStaticData()            │
                                          │  4. emit Alert event                   │
                                          │                                        │
                                          │ Workflow: Notify (single, parametric)  │
                                          │   severity=critical → ntfy + WhatsApp  │
                                          │   severity=error    → ntfy             │
                                          │   severity=warn     → ntfy (low prio)  │
                                          └────────────────────┬───────────────────┘
                                                               │
                                        ┌──────────────────────┴──────────────────────┐
                                        ▼                                             ▼
                          ntfy (self-hosted via Coolify)                Evolution API (api-vc4ok...)
                          POST /<topic>                                 POST /message/sendText/<instance>
```

## Components

### 1. log-proxy (new container)

**Purpose**: the only component with `docker.sock` access. A dumb pipe with no alerting logic.

**Image**: small Python/FastAPI or Node/Fastify app (~50 lines). Build from source in this repo.

**Mount**: `/var/run/docker.sock` read-only. A read-only bind mount does not restrict which Docker API calls can be made over the socket, so the proxy's GET-only surface is what keeps access read-only in practice.

**Network**: joined to the n8n stack's Coolify network so n8n can reach it by hostname; **no Traefik route** (not publicly reachable).

**API** (no auth needed because it is internal only; optional bearer token for defence in depth):

- `GET /services` returns `[{ "group": "bo3no...", "name": "tracksolid", "containers": [...] }, ...]`
  - Groups containers by `COOLIFY_RESOURCE_UUID` env var.
  - Filtered to the allow-list in `/config/groups.yml`; UUIDs not listed are skipped entirely.
- `GET /logs/<group>?since=<unix_ts>&until=<unix_ts>&limit=2000`
  - Calls Docker Engine API `GET /containers/<id>/logs?stdout=1&stderr=1&since=...&until=...&timestamps=1` for every container in the group.
  - Returns NDJSON or JSON array of `{ container, ts, stream, line }`.
  - `since` defaults to "now − 60s" if absent; `until` defaults to "now".
- `GET /healthz`

**Why a proxy and not direct socket-into-n8n**: any n8n editor user becomes root-on-host if n8n has the socket. The proxy keeps the blast radius small and the API surface inspectable.

### 2. Self-hosted ntfy

Deploy via Coolify's one-click marketplace (or as a Docker Compose service).

- Suggested FQDN (matches your existing pattern): `ntfy.rahamafresh.com`
- Auth: enable `auth-default-access: deny-all`; create per-topic users (one publisher user for n8n, plus client users for each subscriber).
- Topics: one per service group, e.g. `tracksolid-alerts`, `coolify-alerts`, `evolution-api-alerts`. Subscribe on phones via the ntfy mobile app.

### 3. n8n workflows (in `n8n-o55elukmxacgp1s2xcwktyam`)

**A. Poll & Evaluate** (one workflow per service group, which is easiest to tune independently)

Nodes:

1. **Schedule Trigger**: every 30s (tunable per group).
2. **Static Data Read**: pull the last cursor from `$getWorkflowStaticData('global').cursor`.
3. **HTTP Request**: `GET http://log-proxy:8080/logs/<group>?since=<cursor>`.
4. **Function (Pattern Match)**: for each line, run severity regexes (from workflow Variables) and emit `{ severity, pattern, container, ts, line, fingerprint }` where `fingerprint = sha256(group:pattern:container)` (used for cooldown).
5. **Function (Threshold + Cooldown)**:
   - `critical`: emit immediately if not in cooldown.
   - `error`: count rolling matches per fingerprint over `window` minutes; emit when the threshold is crossed.
   - `warn`: same, but with a larger window / threshold.
   - Cooldown: `staticData.cooldowns[fingerprint] = now + cooldown_minutes`; skip while still hot.
6. **Static Data Write**: update `cursor = max(ts seen)` and `cooldowns`. n8n only persists static data for executions of an active workflow started by its trigger, so manual test runs will not advance the cursor.
7. **Execute Workflow**: call the **Notify** workflow once per emitted Alert.

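
Steps 4 and 5 above can be sketched as plain functions. This is a minimal illustration in Python (to match `log-proxy`), not the n8n node code itself, which would normally be JavaScript. The names `RULES`, `classify`, and `evaluate`, the state layout, and the choice to reset the counter after an alert fires are assumptions made for the example; the thresholds mirror the defaults in this plan.

```python
from __future__ import annotations

import hashlib
import re

# Hypothetical per-group rules; values mirror the plan's defaults.
RULES = {
    "critical": {"patterns": [r"panic", r"FATAL"], "threshold": 1, "window_s": 0, "cooldown_s": 1800},
    "error": {"patterns": [r"\bERROR\b"], "threshold": 10, "window_s": 300, "cooldown_s": 900},
}


def fingerprint(group: str, pattern: str, container: str) -> str:
    """sha256 over group:pattern:container, as described in step 4."""
    return hashlib.sha256(f"{group}:{pattern}:{container}".encode()).hexdigest()


def classify(group: str, record: dict) -> dict | None:
    """Return the record enriched with severity info for the first rule that hits."""
    for severity, rule in RULES.items():
        for pattern in rule["patterns"]:
            if re.search(pattern, record["line"]):
                return {
                    **record,
                    "group": group,
                    "severity": severity,
                    "pattern": pattern,
                    "fingerprint": fingerprint(group, pattern, record["container"]),
                }
    return None


def evaluate(matches: list[dict], state: dict, now: float) -> list[dict]:
    """Apply rolling-window thresholds and cooldowns; `state` is mutated in place."""
    hits = state.setdefault("hits", {})            # fingerprint -> recent match timestamps
    cooldowns = state.setdefault("cooldowns", {})  # fingerprint -> epoch seconds until unmuted
    alerts = []
    for m in matches:
        rule = RULES[m["severity"]]
        fp = m["fingerprint"]
        # Keep only timestamps inside the rolling window, then add this match.
        window = [t for t in hits.get(fp, []) if t > now - rule["window_s"]]
        window.append(m["ts"])
        hits[fp] = window
        if len(window) < rule["threshold"]:
            continue  # threshold not crossed yet
        if cooldowns.get(fp, 0) > now:
            continue  # still in cooldown, stay quiet
        cooldowns[fp] = now + rule["cooldown_s"]
        hits[fp] = []  # assumption: start counting afresh after an alert
        alerts.append(m)
    return alerts
```

With this shape, `state` is what would be stored in workflow static data between runs, and eleven `ERROR` lines in one window yield a single alert, which is the behaviour the threshold smoke test expects.
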
**B. Notify** (single parametric workflow; called by each Poll workflow)

Input: `{ group, severity, pattern, container, ts, line, fingerprint }`

Nodes:

1. **Switch** on `severity`.
2. **critical** branch:
   - **HTTP Request** → ntfy: `POST https://ntfy.rahamafresh.com/<group>-alerts` with priority=5, tags=`rotating_light`.
   - **HTTP Request** → Evolution API: `POST https://<evolution-api-fqdn>/message/sendText/<instance>` with `{ number, text }`. Credentials via n8n credentials store.
3. **error** branch: ntfy only, priority=4.
4. **warn** branch: ntfy only, priority=3.
5. **Append-row** (Postgres node, optional) → `alerts_audit` table for history.

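
As a rough sketch of what the ntfy branches send, the function below builds (but does not send) the publish request for an alert. It assumes ntfy's header-based publishing (`Title`, `Priority`, `Tags`) and HTTP basic auth for the publisher user; the function name `build_ntfy_request`, the title format, and the example host are illustrative only.

```python
import base64
import urllib.request

# ntfy priority per severity, mirroring the Notify branches.
PRIORITY = {"critical": "5", "error": "4", "warn": "3"}


def build_ntfy_request(base_url: str, alert: dict, user: str, password: str) -> urllib.request.Request:
    """Build a POST to <base_url>/<group>-alerts carrying the log line as the body."""
    topic = f"{alert['group']}-alerts"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Title": f"[{alert['severity']}] {alert['group']} / {alert['container']}",
        "Priority": PRIORITY[alert["severity"]],
        "Authorization": f"Basic {token}",
    }
    if alert["severity"] == "critical":
        headers["Tags"] = "rotating_light"  # emoji tag used on the critical branch
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{topic}",
        data=alert["line"].encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would publish it; in n8n the same fields map onto the HTTP Request node's URL, headers, and body.
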
### 4. Defaults (tunable per group via workflow Variables)

| Severity | Default patterns | Threshold | Cooldown | Routing |
| --- | --- | --- | --- | --- |
| critical | `panic`, `FATAL`, `OOMKilled`, `out of memory`, `segmentation fault` | immediate (1 match) | 30 min | ntfy + WhatsApp |
| error | `\bERROR\b`, `Exception`, `Traceback`, `5\d\d ` (HTTP 5xx) | 10 / 5 min | 15 min | ntfy |
| warn | `\bWARN(ING)?\b`, `deadlock`, `timeout` | 50 / 15 min | 30 min | ntfy (low prio) |

These live as a JSON object in each workflow's Variables, so per-group tuning is one edit.

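
One possible shape for that JSON object, with a small matcher to sanity-check the patterns before they go into a workflow. The key names (`patterns`, `threshold`, `window_min`, `cooldown_min`) and the helpers `compile_rules` and `severity_of` are assumptions for illustration; only the patterns and numbers come from the table above.

```python
from __future__ import annotations

import json
import re

# Raw string so the JSON escapes (\\b, \\d) survive untouched.
DEFAULTS_JSON = r"""
{
  "critical": {"patterns": ["panic", "FATAL", "OOMKilled", "out of memory", "segmentation fault"],
               "threshold": 1, "window_min": 0, "cooldown_min": 30},
  "error": {"patterns": ["\\bERROR\\b", "Exception", "Traceback", "5\\d\\d "],
            "threshold": 10, "window_min": 5, "cooldown_min": 15},
  "warn": {"patterns": ["\\bWARN(ING)?\\b", "deadlock", "timeout"],
           "threshold": 50, "window_min": 15, "cooldown_min": 30}
}
"""


def compile_rules(raw: str) -> dict:
    """Parse the JSON config and pre-compile every pattern."""
    rules = json.loads(raw)
    for rule in rules.values():
        rule["compiled"] = [re.compile(p) for p in rule["patterns"]]
    return rules


def severity_of(line: str, rules: dict) -> str | None:
    """Return the highest severity whose patterns match the line, else None."""
    for severity in ("critical", "error", "warn"):
        if any(rx.search(line) for rx in rules[severity]["compiled"]):
            return severity
    return None
```

Running sample lines through `severity_of` is a cheap way to find noisy patterns during tuning. For instance, `5\d\d ` also matches the tail of `1500 ` in a line such as `request took 1500 ms`; a word-boundary variant like `\b5\d\d\b` is one tighter alternative to consider.
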
### 5. Group naming

Friendly names are mapped from the Coolify resource UUID and sourced from `groups.yml`, which is mounted into log-proxy. **`groups.yml` is also the allow-list**: only UUIDs listed here are monitored. Anything else the proxy sees on the host is ignored, so non-mission-critical apps don't generate noise or burn polling cycles.

```yaml
bo3nov2ija7g8wn9b1g2paxs: tracksolid
o55elukmxacgp1s2xcwktyam: n8n-prod
usoksgg8o40044g0cw08s8wc: n8n-simple
vc4ok84gw4s0kcgwwg8gooco: evolution-api
ks4sc8k4804swk0c0c4kk44c: chatwoot
foo048cw4skg8kswwsowwo0c: forgejo
u7rj0du43d33ncurig2t6ni1: dekart
e11bva63bu7swlq6zyfckxm3: rustfs
now8k08wcs044scwggos0wos: dozzle
# Coolify core, Supabase, shutterdiplomacy → handled as their own groups
#
# Explicitly NOT monitored (non-mission-critical, per user 2026-05-17):
#   dy82njm7qgb5f2m573d1u3rh  garage
#   r77s24tgmfifmpfqe86xyqsp  ente
#   vw0wk0cg8gkwgwogsg4k0gsg  excalidraw
```

Implication on the proxy: `GET /services` returns only allow-listed groups; `GET /logs/<group>` 404s for non-allow-listed UUIDs. To start monitoring a service later, add a single line to `groups.yml` and clone a Poll workflow.

## Workspace layout

```
/Users/kianiadee/Downloads/projects/03_dozzle_n8n/    ← no git yet
├── log-proxy/
│   ├── Dockerfile
│   ├── app.py                  (FastAPI: /services, /logs/<group>, /healthz)
│   ├── requirements.txt
│   └── groups.yml              (UUID → friendly-name map)
├── ntfy/
│   └── README.md               (Coolify deploy notes + topic / user setup)
├── n8n/
│   └── workflows/
│       ├── poll-tracksolid.json
│       ├── poll-coolify.json
│       ├── poll-evolution.json
│       ├── poll-<group>.json   ← one per group, derived from a template
│       └── notify.json         ← parametric fan-out
├── coolify/
│   └── log-proxy.compose.yml   (for Coolify "Docker Compose" service)
└── README.md                   (operating runbook: how to add a group, tune thresholds, rotate ntfy creds)
```

## Implementation steps (ordered)

1. **Build log-proxy** locally (`log-proxy/`). Test against the remote docker socket via `docker context` or just deploy and iterate.
2. **Deploy log-proxy via Coolify** as a Docker Compose service. Attach to the same network as `n8n-o55...`. No Traefik route. Verify `GET /services` and `GET /logs/<group>` from inside the n8n container (`docker exec n8n-o55... wget -qO- http://log-proxy:8080/services`).
3. **Deploy self-hosted ntfy via Coolify** at `ntfy.rahamafresh.com`. Configure deny-all default and one publisher user. Subscribe phones to a test topic.
4. **Build the parametric Notify workflow** in n8n. Add credentials: `ntfy_publisher` (HTTP basic), `evolution_api` (header auth). Test by manually firing each branch.
5. **Build the Poll & Evaluate workflow** for **one group first** (suggest `tracksolid`, which has the highest business value). Validate thresholds with a synthetic log line. Output from a plain `docker exec ... echo` goes to the exec session rather than the container's log stream, so redirect it to PID 1's stdout, e.g. `docker exec ingest_events-bo3no... sh -c 'echo FATAL test > /proc/1/fd/1'` or similar.
6. **Clone the Poll workflow per remaining group**. Tune patterns / thresholds in Variables.
7. **Tune & quiet**: run for 24h, capture false positives, adjust regex / thresholds.
8. **Document** in `README.md` how to add a new group when Coolify spins up a new service.

## Critical files

- `log-proxy/app.py`: the only thing with docker.sock access. Treat as security-sensitive; no write endpoints, no shell-out.
- `log-proxy/groups.yml`: single source of truth for UUID → friendly name. Keep in sync as Coolify services are added.
- `n8n/workflows/notify.json`: fan-out logic; any new channel (Slack, email) is added here, not in each poll workflow.
- `n8n/workflows/poll-<group>.json`: per-group thresholds. The Variables block at the top is the only thing operators normally edit.
- `coolify/log-proxy.compose.yml`: controls log-proxy deployment + network attachment. Misconfiguring the network means n8n can't reach the proxy.

## Reused / existing infrastructure

- **n8n queue mode** `n8n-o55elukmxacgp1s2xcwktyam`: runs the workflows; its built-in Postgres + Redis cover persistence and queueing. No new DB needed.
- **Evolution API** `api-vc4ok84gw4s0kcgwwg8gooco`: already deployed; we only consume its REST API.
- **Coolify Sentinel** `coolify-sentinel`: left untouched; could later feed container-down events into the same Notify workflow if desired.
- **Coolify networks + Traefik**: handle internal service discovery and TLS for ntfy.
- **All Coolify-managed containers already carry `COOLIFY_RESOURCE_UUID`**: confirmed via `docker inspect` on the Dozzle container in the previous session. This is what makes auto-grouping possible without a hand-written container list.

## Open items to gather at implementation time

- `ntfy.rahamafresh.com` DNS record (or chosen FQDN).
- Evolution API: instance name, API key, target WhatsApp number(s).
- Confirmation of which Coolify network `n8n-o55...` runs on (read from `docker inspect` at implementation start).
- Optional: bearer token value for log-proxy if defence-in-depth is wanted.

## Verification

1. **log-proxy unit checks**: from inside the n8n container, `curl http://log-proxy:8080/services` returns all allow-listed groups; `curl "http://log-proxy:8080/logs/tracksolid?since=$(( $(date +%s) - 300 ))"` returns recent lines from all tracksolid containers. (The arithmetic form avoids relying on GNU `date -d`, which BusyBox-based images may not support.)
2. **End-to-end critical alert**: write a synthetic line into the log stream of a running tracksolid container, e.g. `docker exec ingest_events-bo3no... sh -c 'echo "FATAL synthetic test from $(date)" > /proc/1/fd/1'`. A throwaway `docker run --rm alpine ...` container would not work here because it carries no `COOLIFY_RESOURCE_UUID` and so belongs to no group. Within 30s, the ntfy topic `tracksolid-alerts` receives a high-priority message AND the WhatsApp number receives the same.
3. **Threshold smoke test**: emit 11 lines containing `ERROR` to a single container over 30s; expect exactly one ntfy notification, not eleven.
4. **Cooldown smoke test**: trigger the same critical alert twice within the cooldown window; expect only one notification.
5. **Cursor durability**: restart the n8n worker; confirm the cursor in `getWorkflowStaticData` persisted in Postgres and that no logs were re-processed or skipped.
6. **Per-group isolation**: deliberately spam errors in one group; confirm other groups' workflows are unaffected (separate workflow = separate static data, separate schedule).
7. **Read-only safety**: from inside n8n, attempt `POST http://log-proxy:8080/anything` and expect 404/405. Confirm `docker.sock` is not mounted inside n8n.
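
The cursor durability check (item 5) depends on how the cursor is advanced between polls. The sketch below is one way to do it, under two assumptions that should be verified: that `since` takes whole seconds (as the proxy's `int` query parameter suggests) and that Docker treats `since` as inclusive, so lines in the cursor's own second can come back on the next poll. The helper name `advance_cursor` and the `seen` set are hypothetical.

```python
from __future__ import annotations

import math

Key = tuple[str, float, str]  # (container, ts, line)


def advance_cursor(records: list[dict], cursor: int, seen: set[Key]) -> tuple[list[dict], int, set[Key]]:
    """Return (fresh_records, new_cursor, new_seen) for one poll result."""
    # Drop records already handled in the previous poll's boundary second.
    fresh = [r for r in records if (r["container"], r["ts"], r["line"]) not in seen]
    if not records:
        return fresh, cursor, seen
    # The cursor is whole seconds and never moves backwards.
    new_cursor = max(cursor, math.floor(max(r["ts"] for r in records)))
    # Only records inside the new boundary second can reappear next time.
    new_seen = {
        (r["container"], r["ts"], r["line"])
        for r in records
        if math.floor(r["ts"]) == new_cursor
    }
    return fresh, new_cursor, new_seen
```

If this were ported into the workflow, `seen` would need to be stored in static data as a list rather than a set, since static data is serialized as JSON.
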
53
README.md
Normal file
@@ -0,0 +1,53 @@
# dozzle_n8n_logging

n8n-driven Docker log alerting for the Coolify host at `twala.rahamafresh.com`. Critical errors fan out to **ntfy** (self-hosted) and **WhatsApp** (via Evolution API); lower-severity events go to ntfy only.

Dozzle stays as the human-facing log viewer. This project does **not** integrate with Dozzle; it reads Docker logs independently via a small read-only proxy.

## Layout

```
log-proxy/   FastAPI app, only thing with docker.sock access. /services + /logs/<group> + /healthz.
coolify/     Coolify Docker Compose file for log-proxy.
ntfy/        Deploy notes for self-hosted ntfy.
n8n/         Exported workflow JSON (poll-<group>.json + notify.json).
```

## Architecture

```
Docker Engine     log-proxy (RO sock)        n8n queue-mode
   socket  ────►  HTTP /logs/<group>  ────►  poll → match → threshold → notify
                                                             │
                                         ┌───────────────────┴───────────────────┐
                                         ▼                                       ▼
                               ntfy.rahamafresh.com                  Evolution API (WhatsApp)
```

Service groups are auto-derived from each container's `COOLIFY_RESOURCE_UUID`. The allow-list (and friendly names) live in `log-proxy/groups.yml`.

Severity defaults:

| Severity | Threshold | Cooldown | Channels |
| --- | --- | --- | --- |
| critical | 1 match (immediate) | 30 min | ntfy + WhatsApp |
| error | 10 / 5 min | 15 min | ntfy |
| warn | 50 / 15 min | 30 min | ntfy (low prio) |

See `260517_docker_n8n_logging.md` for the full design rationale.

## Adding a new service group

1. Append a line to `log-proxy/groups.yml`: `<coolify-uuid>: <friendly-name>`
2. Restart the `log-proxy` Coolify service
3. In n8n: duplicate the `poll-tracksolid` workflow, retarget its `group` Variable, tune severity patterns/thresholds, activate

## Operating

- Tune thresholds: edit the `severity` Variables block at the top of each `poll-<group>` workflow.
- Silence during maintenance: deactivate the workflow in n8n (or set a global `silenced=true` flag in the Notify workflow's Variables).
- Rotate ntfy publisher credential: update the credential in n8n; restart workflows.

## Status

Bootstrap (2026-05-17): log-proxy code complete; n8n workflows, ntfy deploy, and Coolify deploy still pending.
18
log-proxy/Dockerfile
Normal file
@@ -0,0 +1,18 @@
FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

ENV GROUPS_PATH=/config/groups.yml \
    PORT=8080

EXPOSE 8080

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python -c "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://localhost:8080/healthz', timeout=3).status==200 else 1)"

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080", "--log-level", "info"]
148
log-proxy/app.py
Normal file
@@ -0,0 +1,148 @@
"""log-proxy: read-only Docker logs API for n8n.

Only endpoint surface:
  GET /healthz          liveness
  GET /services         list allow-listed Coolify service groups + their containers
  GET /logs/<group>     pull recent log lines from every container in the group

No write endpoints. No shell-out. Docker socket is RO-mounted at /var/run/docker.sock.
"""

from __future__ import annotations

import os
import re
import time
from datetime import datetime
from typing import Iterable

import docker
import yaml
from fastapi import FastAPI, HTTPException, Query
from fastapi.responses import JSONResponse

GROUPS_PATH = os.getenv("GROUPS_PATH", "/config/groups.yml")
DOCKER_SOCK = os.getenv("DOCKER_SOCK", "unix:///var/run/docker.sock")
COOLIFY_UUID_ENV = "COOLIFY_RESOURCE_UUID"

app = FastAPI(title="log-proxy", version="0.1.0")
docker_client = docker.DockerClient(base_url=DOCKER_SOCK, timeout=30)

# ISO timestamp prefix Docker emits when timestamps=True
TS_NANO_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z) (.*)$")


def load_groups() -> dict[str, str]:
    """Read the UUID -> friendly-name allow-list. Empty if file missing."""
    try:
        with open(GROUPS_PATH) as fh:
            data = yaml.safe_load(fh) or {}
    except FileNotFoundError:
        return {}
    return {k: v for k, v in data.items() if isinstance(k, str) and isinstance(v, str)}


def container_uuid(container) -> str | None:
    env_list = container.attrs.get("Config", {}).get("Env") or []
    for item in env_list:
        if item.startswith(f"{COOLIFY_UUID_ENV}="):
            return item.split("=", 1)[1]
    return None


def resolve_group(name_or_uuid: str, allowed: dict[str, str]) -> str | None:
    """Accept either the UUID or the friendly name. Return UUID, or None if unknown."""
    if name_or_uuid in allowed:
        return name_or_uuid
    for uuid, friendly in allowed.items():
        if friendly == name_or_uuid:
            return uuid
    return None


def monitored_containers(allowed: dict[str, str]) -> Iterable[tuple[object, str]]:
    """Yield (container, uuid) for every running container whose UUID is allow-listed."""
    for c in docker_client.containers.list(all=False):
        uuid = container_uuid(c)
        if uuid and uuid in allowed:
            yield c, uuid


def parse_ts(prefix: str) -> float | None:
    """Docker emits nanosecond precision; Python only takes microseconds. Truncate."""
    iso = prefix
    if iso.endswith("Z"):
        iso = iso[:-1] + "+00:00"
    iso = re.sub(r"(\.\d{6})\d+", r"\1", iso)
    try:
        return datetime.fromisoformat(iso).timestamp()
    except ValueError:
        return None


@app.get("/healthz")
def healthz():
    try:
        docker_client.ping()
    except Exception as exc:
        raise HTTPException(status_code=503, detail=f"docker unreachable: {exc}") from exc
    return {"ok": True}


@app.get("/services")
def services():
    allowed = load_groups()
    by_uuid: dict[str, dict] = {}
    for c, uuid in monitored_containers(allowed):
        entry = by_uuid.setdefault(uuid, {"group": uuid, "name": allowed[uuid], "containers": []})
        entry["containers"].append(c.name)
    return JSONResponse([by_uuid[u] for u in sorted(by_uuid)])


@app.get("/logs/{group}")
def logs(
    group: str,
    since: int | None = Query(None, description="Unix seconds; default now-60"),
    until: int | None = Query(None, description="Unix seconds; default now"),
    limit: int = Query(2000, ge=1, le=10000),
):
    allowed = load_groups()
    target_uuid = resolve_group(group, allowed)
    if target_uuid is None:
        raise HTTPException(status_code=404, detail=f"Unknown group: {group}")

    now = int(time.time())
    since_ts = since if since is not None else now - 60
    until_ts = until if until is not None else now

    out: list[dict] = []
    for c in docker_client.containers.list(all=False):
        if container_uuid(c) != target_uuid:
            continue
        try:
            raw = c.logs(
                stdout=True,
                stderr=True,
                since=since_ts,
                until=until_ts,
                timestamps=True,
                tail=limit,
            )
        except Exception:
            continue
        if not raw:
            continue
        for raw_line in raw.decode("utf-8", errors="replace").splitlines():
            if not raw_line.strip():
                continue
            match = TS_NANO_RE.match(raw_line)
            if match:
                ts_val = parse_ts(match.group(1)) or float(since_ts)
                msg = match.group(2)
            else:
                ts_val = float(since_ts)
                msg = raw_line
            out.append({"container": c.name, "ts": ts_val, "line": msg})

    out.sort(key=lambda m: m["ts"])
    return JSONResponse(out[:limit])
19
log-proxy/groups.yml
Normal file
@@ -0,0 +1,19 @@
# UUID -> friendly name. This file is also the allow-list:
# containers whose COOLIFY_RESOURCE_UUID is not listed here are ignored entirely.
#
# Add a line, restart log-proxy, clone a Poll workflow in n8n -> new group is live.

bo3nov2ija7g8wn9b1g2paxs: tracksolid
o55elukmxacgp1s2xcwktyam: n8n-prod
usoksgg8o40044g0cw08s8wc: n8n-simple
vc4ok84gw4s0kcgwwg8gooco: evolution-api
ks4sc8k4804swk0c0c4kk44c: chatwoot
foo048cw4skg8kswwsowwo0c: forgejo
u7rj0du43d33ncurig2t6ni1: dekart
e11bva63bu7swlq6zyfckxm3: rustfs
now8k08wcs044scwggos0wos: dozzle

# Explicitly NOT monitored (non-mission-critical, per user 2026-05-17):
#   dy82njm7qgb5f2m573d1u3rh  garage
#   r77s24tgmfifmpfqe86xyqsp  ente
#   vw0wk0cg8gkwgwogsg4k0gsg  excalidraw
4
log-proxy/requirements.txt
Normal file
@@ -0,0 +1,4 @@
fastapi>=0.115,<1
uvicorn[standard]>=0.32,<1
docker>=7.1,<8
PyYAML>=6.0,<7