commit 1eda59fe0646ab9eb469f67088bc8c0b1a5c4dfe Author: david kiania Date: Tue Jun 16 23:43:24 2026 +0300 feat: read-only Fleet Analytics MCP server Standalone, hosted MCP server that lets the decision & analytics team query the fleet database (reporting.* / tracksolid.*) from Claude — read-only, for reporting and decisions, never edit/delete. - analytics_mcp.py: FastMCP streamable-HTTP server. Tools: query (guarded single SELECT/WITH, auto-LIMIT, write/DDL blocked), list_schemas, list_tables, describe_table, list_functions, sample_table. Per-analyst Bearer auth; /healthz exempt. No ts_shared_rev import (carries no ingestion secrets). - Read-only enforced at four layers: analytics_ro GRANTs, default_transaction_read_only=on, rolled-back txn, SQL keyword guard. - scripts/: analytics_ro_role.sql + bootstrap_analytics_ro.sh (dedicated least-privilege role, password in host-only ~/.analytics_ro.pw). - Dockerfile + pyproject (uv, package=false) for Coolify build; deploy.sh manual host fallback (standalone Traefik bridge on the tracksolid_db host). - docs/ANALYTICS_MCP.{md,html} + README: architecture, deploy runbook, add-to-Claude, verification, security notes. Co-Authored-By: Claude Opus 4.8 diff --git a/.dockerignore b/.dockerignore new file mode 100644 index 0000000..eed53de --- /dev/null +++ b/.dockerignore @@ -0,0 +1,13 @@ +.git +.venv +__pycache__ +*.pyc +.env +*.env +.analytics_ro.pw +docs +scripts +deploy.sh +.ruff_cache +.mypy_cache +README.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..43847f8 --- /dev/null +++ b/.gitignore @@ -0,0 +1,10 @@ +.venv/ +__pycache__/ +*.pyc +.env +*.env +.analytics_ro.pw +.DS_Store +.ruff_cache/ +.mypy_cache/ +uv.lock diff --git a/Dockerfile b/Dockerfile new file mode 100644 index 0000000..7e069e8 --- /dev/null +++ b/Dockerfile @@ -0,0 +1,25 @@ +# fleetanalytics-mcp — read-only Fleet Analytics MCP server. +# Coolify auto-detects this Dockerfile: set the app port to 8892, attach the +# domain (e.g. 
fleetmcp.rahamafresh.com) in the Coolify UI, set DATABASE_URL +# (analytics_ro DSN) + MCP_AUTH_TOKENS as secrets, and connect the app to the +# network that can reach timescale_db. See README.md / docs/ANALYTICS_MCP.md. +FROM python:3.12-slim + +# uv for fast, reproducible dependency installs. +COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/ + +WORKDIR /app + +# Install ONLY dependencies (flat module — the project itself is not a package). +COPY pyproject.toml ./ +RUN uv sync --no-dev --no-install-project +ENV PATH="/app/.venv/bin:$PATH" + +COPY analytics_mcp.py ./ + +EXPOSE 8892 + +HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \ + CMD python -c "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://localhost:8892/healthz').status==200 else 1)" || exit 1 + +CMD ["uvicorn", "analytics_mcp:app", "--host", "0.0.0.0", "--port", "8892", "--workers", "2"] diff --git a/README.md b/README.md new file mode 100644 index 0000000..3db12f8 --- /dev/null +++ b/README.md @@ -0,0 +1,93 @@ +# Fleet Analytics MCP (read-only) + +A **read-only MCP server** that lets the decision & analytics team query the Fireside +fleet database (`tracksolid_db` — PostgreSQL 16 + TimescaleDB + PostGIS) directly from +**Claude** — for reporting and decisions, **never edit/delete**. + +It exposes a guarded general `SELECT` tool plus schema-introspection tools over the +`reporting.*` (curated analytics layer) and `tracksolid.*` (raw telemetry) schemas, +connecting as a dedicated least-privilege **`analytics_ro`** role. It is hosted on the +same Coolify host as the database (the DB is internal-only and not reachable from a +laptop), and authed with per-analyst Bearer tokens. + +> Sibling of the `tracksolid_timescale_grafana_prod` backend (the DB/ingestion stack) +> and the `dashboard_api` read bridge. This repo owns only the analytics MCP server and +> its `analytics_ro` role. + +## Read-only is enforced at four layers + +1. 
**Role GRANTs** — `analytics_ro` has only `USAGE`+`SELECT` on `reporting`/`tracksolid` + and `EXECUTE` on `reporting` functions; no INSERT/UPDATE/DELETE, not the matview owner. +2. **`default_transaction_read_only = on`** — set on the role and on every connection. +3. **Rolled-back transactions** — every query runs in a txn that is rolled back, never committed. +4. **SQL guard** — the `query` tool accepts a single `SELECT`/`WITH` statement only and + rejects write/DDL keywords (clean errors instead of DB faults). + +It deliberately does **not** import the backend's `ts_shared_rev`, so it carries none of +the Tracksolid ingestion secrets — it needs only `DATABASE_URL` + `MCP_AUTH_TOKENS`. + +## Tools + +| Tool | Purpose | +|---|---| +| `query(sql, max_rows=1000)` | guarded read-only SELECT/WITH; auto-LIMIT; returns `{row_count, truncated, rows}` | +| `list_schemas()` | readable schemas + object counts | +| `list_tables(schema)` | tables + views in a schema | +| `describe_table(schema, table)` | columns, types, nullability, defaults | +| `list_functions(schema='reporting')` | `reporting.fn_*` signatures | +| `sample_table(schema, table, n=20)` | first `n` rows (wrapper over `query`) | + +## Layout + +``` +analytics_mcp.py # the MCP server (FastMCP streamable-HTTP; uvicorn target analytics_mcp:app) +Dockerfile # Coolify-buildable image (port 8892) +deploy.sh # manual host deploy (standalone Traefik bridge) — fallback to Coolify +scripts/analytics_ro_role.sql # the read-only role DDL +scripts/bootstrap_analytics_ro.sh# host bootstrap: generate pw → apply role SQL +docs/ANALYTICS_MCP.md / .html # full implementation guide + runbook +``` + +## Deploy + +The DB is internal-only, so the server runs on the **same Coolify host as `timescale_db`**. + +**0. 
Create the read-only role (once, on the host):** +```bash +scp scripts/analytics_ro_role.sql scripts/bootstrap_analytics_ro.sh kianiadee@twala.rahamafresh.com:~/ +ssh kianiadee@twala.rahamafresh.com 'bash ~/bootstrap_analytics_ro.sh' # writes ~/.analytics_ro.pw (0600) +``` + +**1a. Coolify-managed app (recommended):** create a Coolify application from this repo +(Forgejo `repo.rahamafresh.com/kianiadee/fleetanalytics_mcp.git`), Dockerfile build, app +port `8892`, attach the domain `fleetmcp.rahamafresh.com` (prod) / `fleetmcp.fivetitude.com` +(staging). Set as **secrets**: +- `DATABASE_URL=postgresql://analytics_ro:<pw>@timescale_db:5432/tracksolid_db` +- `MCP_AUTH_TOKENS=alice:<tok>,bob:<tok>` (per-analyst) + +Then **connect the app to the network that can reach `timescale_db`** (the tracksolid +stack's network) so the `timescale_db` hostname resolves. Coolify manages the Traefik +labels + TLS from the domain you set. + +**1b. Manual host deploy (fallback):** check this repo out on the host and run `deploy.sh` +— it builds the image, derives the read-only `DATABASE_URL` from the running stack, and +runs a standalone Traefik bridge. See the script header. + +## Add to Claude (for analysts) + +```bash +claude mcp add --transport http fireside-analytics https://fleetmcp.rahamafresh.com \ + --header "Authorization: Bearer <your-token>" +claude mcp list # → "fireside-analytics: connected" +``` +Claude Desktop / claude.ai: add a custom connector with the same URL and an +`Authorization: Bearer <your-token>` header. + +## Local dev + +```bash +uv sync +DATABASE_URL=postgresql://... MCP_AUTH_TOKENS=me:dev uv run uvicorn analytics_mcp:app --port 8892 +``` + +Full reference, security notes, and verification checklist: [`docs/ANALYTICS_MCP.md`](docs/ANALYTICS_MCP.md). 
diff --git a/analytics_mcp.py b/analytics_mcp.py new file mode 100644 index 0000000..b6b74d4 --- /dev/null +++ b/analytics_mcp.py @@ -0,0 +1,279 @@ +""" +analytics_mcp.py — Fireside Communications · Read-only Analytics MCP Server +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +Hosted MCP server for the decision & analytics team. Exposes the fleet reporting +data (reporting.* + tracksolid.*) to Claude as READ-ONLY query + introspection +tools — for reporting and decisions, never edit/delete. + +It runs from its own image (see Dockerfile), either as a Coolify-managed app or, +via deploy.sh, as a standalone Traefik-labelled bridge in the same shape as the +dashboard_api bridges. It joins the network that can reach timescale_db and +connects to the internal DB over psycopg2 as the dedicated read-only `analytics_ro` +role (DATABASE_URL carries that DSN). Served over streamable HTTP with Bearer-token auth. + +READ-ONLY is enforced at FOUR layers: + 1. the analytics_ro GRANTs (no INSERT/UPDATE/DELETE; not the matview owner) + 2. role + connection default_transaction_read_only = on + 3. every query runs in a transaction that is ROLLED BACK (never committed) + 4. 
the `query` tool's single-statement / keyword guard (clean errors, not DB faults) + +Env: + DATABASE_URL analytics_ro DSN (set by the deploy script) + MCP_AUTH_TOKENS "alice:tok1,bob:tok2" — per-analyst Bearer tokens (revocable + audited) + MCP_MAX_ROWS hard ceiling on rows returned (default 10000) + MCP_POOL_MAX max read-only pool connections (default 8) +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +""" +from __future__ import annotations + +import logging +import os +import re +import time +from contextlib import contextmanager + +import psycopg2 +import psycopg2.extras +import psycopg2.pool +from mcp.server.fastmcp import FastMCP +from starlette.middleware.base import BaseHTTPMiddleware +from starlette.responses import JSONResponse + + +def _get_logger(name: str) -> logging.Logger: + """Standalone logger mirroring ts_shared_rev's format. Intentionally NOT + importing ts_shared_rev: that module eagerly requires the Tracksolid ingestion + secrets (APP_KEY/SECRET/PWD), which this read-only analytics server has no + business holding.""" + root = logging.getLogger("analytics_mcp") + if not root.handlers: + handler = logging.StreamHandler() + handler.setFormatter( + logging.Formatter( + "%(asctime)s [%(levelname)s] %(name)s — %(message)s", + datefmt="%Y-%m-%d %H:%M:%S", + ) + ) + root.addHandler(handler) + root.setLevel(logging.INFO) + return root.getChild(name) + + +log = _get_logger("server") + +DATABASE_URL = os.environ["DATABASE_URL"] # analytics_ro DSN (set by deploy) +MAX_ROWS_CEIL = int(os.getenv("MCP_MAX_ROWS", "10000")) +READABLE_SCHEMAS = ("reporting", "tracksolid") + +# ── Read-only connection pool ──────────────────────────────────────────────── +# Force read-only + a statement timeout at the connection level (belt + braces; +# the analytics_ro role already sets these, but a self-contained server is safer +# in case it is ever pointed at a less-restricted DSN). 
+_pool = psycopg2.pool.ThreadedConnectionPool( + 1, + int(os.getenv("MCP_POOL_MAX", "8")), + DATABASE_URL, + options="-c default_transaction_read_only=on -c statement_timeout=30000 -c client_encoding=UTF8", +) + + +@contextmanager +def _ro_conn(): + """Read-only connection; the transaction is ALWAYS rolled back (never commits).""" + conn = _pool.getconn() + try: + conn.set_session(readonly=True, autocommit=False) + yield conn + finally: + try: + conn.rollback() + finally: + _pool.putconn(conn) + + +def _rows(cur) -> list[dict]: + """Materialise the cursor as a list of JSON-safe dicts.""" + if cur.description is None: + return [] + cols = [d[0] for d in cur.description] + out = [] + for row in cur.fetchall(): + out.append({c: _jsonable(v) for c, v in zip(cols, row)}) + return out + + +def _jsonable(v): + """Coerce non-JSON-native values (dates, Decimal, etc.) to str.""" + if v is None or isinstance(v, (bool, int, float, str)): + return v + return str(v) + + +# ── SQL guard for the general query tool ───────────────────────────────────── +# The analytics_ro role + read-only txn already make writes impossible; this guard +# exists to return CLEAN errors (and block multi-statements / SET that could relax +# read-only) instead of letting the DB raise. 
+_FORBIDDEN = re.compile( + r"\b(insert|update|delete|drop|alter|create|grant|revoke|truncate|copy|call|do|merge|" + r"vacuum|reindex|refresh|comment|lock|set|reset)\b", + re.IGNORECASE, +) + + +def _strip_comments(sql: str) -> str: + sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL) # block comments + sql = re.sub(r"--[^\n]*", " ", sql) # line comments + return sql.strip() + + +def _guard(sql: str) -> str: + """Validate a single read-only statement; return the cleaned statement.""" + stripped = _strip_comments(sql) + if not stripped: + raise ValueError("Empty query.") + parts = [p for p in stripped.split(";") if p.strip()] # allow one trailing ; + if len(parts) != 1: + raise ValueError("Only a single statement is allowed.") + stmt = parts[0].strip() + if not re.match(r"^(select|with)\b", stmt, re.IGNORECASE): + raise ValueError("Only SELECT / WITH queries are allowed.") + if _FORBIDDEN.search(stmt): + raise ValueError("Query contains a forbidden (write/DDL) keyword.") + return stmt + + +# ── MCP server + tools ─────────────────────────────────────────────────────── +mcp = FastMCP("fireside-analytics", stateless_http=True) + + +@mcp.tool() +def query(sql: str, max_rows: int = 1000) -> dict: + """Run a read-only SELECT/WITH query against the fleet database. + + Only the reporting.* and tracksolid.* schemas are readable. Single statement + only; write/DDL is rejected. Returns up to `max_rows` rows (default 1000, hard + cap 10000). A LIMIT is auto-applied when absent. Result: {row_count, truncated, rows}. 
+ """ + stmt = _guard(sql) + cap = max(1, min(int(max_rows), MAX_ROWS_CEIL)) + if not re.search(r"\blimit\b", stmt, re.IGNORECASE): + stmt = f"{stmt}\nLIMIT {cap + 1}" # +1 row to detect truncation + t0 = time.monotonic() + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute(stmt) + rows = _rows(cur) + truncated = len(rows) > cap + rows = rows[:cap] + dur_ms = int((time.monotonic() - t0) * 1000) + log.info("query rows=%d trunc=%s %dms :: %s", len(rows), truncated, dur_ms, sql[:200]) + return {"row_count": len(rows), "truncated": truncated, "rows": rows} + + +@mcp.tool() +def list_schemas() -> list[dict]: + """List the readable schemas (reporting, tracksolid) with their object counts.""" + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT table_schema AS schema, count(*) AS objects " + "FROM information_schema.tables WHERE table_schema = ANY(%s) " + "GROUP BY 1 ORDER BY 1", + (list(READABLE_SCHEMAS),), + ) + return _rows(cur) + + +@mcp.tool() +def list_tables(schema: str) -> list[dict]: + """List tables + views in a schema (must be reporting or tracksolid).""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT table_name AS name, table_type AS kind " + "FROM information_schema.tables WHERE table_schema = %s " + "ORDER BY 1", + (schema,), + ) + return _rows(cur) + + +@mcp.tool() +def describe_table(schema: str, table: str) -> list[dict]: + """Describe a table/view: columns, types, nullability, defaults.""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT column_name AS column, data_type AS type, " + "is_nullable AS nullable, column_default AS default " + "FROM information_schema.columns " + "WHERE table_schema = %s AND table_name = %s ORDER BY ordinal_position", + (schema, table), + ) + 
return _rows(cur) + + +@mcp.tool() +def list_functions(schema: str = "reporting") -> list[dict]: + """List callable functions (e.g. reporting.fn_*) with their argument signatures.""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT p.proname AS name, pg_get_function_arguments(p.oid) AS args " + "FROM pg_proc p JOIN pg_namespace n ON n.oid = p.pronamespace " + "WHERE n.nspname = %s ORDER BY 1", + (schema,), + ) + return _rows(cur) + + +_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$", re.IGNORECASE) + + +@mcp.tool() +def sample_table(schema: str, table: str, n: int = 20) -> dict: + """Return the first `n` rows of a table/view (convenience over query).""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + if not _IDENT.match(table): + raise ValueError("table must be a simple identifier") + return query(f'SELECT * FROM "{schema}"."{table}"', max_rows=n) + + +# ── Bearer-token auth ───────────────────────────────────────────────────────── +# MCP_AUTH_TOKENS = "alice:tok1,bob:tok2" → {token: name}. Per-analyst tokens make +# access revocable (edit the env + redeploy) and attributable in the logs. 
+_TOKENS = { + t.split(":", 1)[1]: t.split(":", 1)[0] + for t in os.getenv("MCP_AUTH_TOKENS", "").split(",") + if ":" in t +} + + +class BearerAuth(BaseHTTPMiddleware): + async def dispatch(self, request, call_next): + if request.url.path == "/healthz": + return await call_next(request) + auth = request.headers.get("authorization", "") + token = auth[7:] if auth.lower().startswith("bearer ") else "" + caller = _TOKENS.get(token) + if caller is None: + return JSONResponse({"error": "unauthorized"}, status_code=401) + request.state.caller = caller + return await call_next(request) + + +async def healthz(_request): + return JSONResponse({"ok": True, "tokens": len(_TOKENS)}) + + +app = mcp.streamable_http_app() +app.add_middleware(BearerAuth) +# Starlette exposes add_route (not a Flask-style @app.route decorator). +app.add_route("/healthz", healthz, methods=["GET"]) + + +if not _TOKENS: + log.warning("MCP_AUTH_TOKENS is empty — every request will be rejected with 401.") +log.info("Analytics MCP starting. Tokens loaded=%d. Readable schemas=%s.", len(_TOKENS), READABLE_SCHEMAS) diff --git a/deploy.sh b/deploy.sh new file mode 100755 index 0000000..9047818 --- /dev/null +++ b/deploy.sh @@ -0,0 +1,77 @@ +#!/usr/bin/env bash +# deploy.sh — manual host deploy for the read-only Fleet Analytics MCP server. +# ───────────────────────────────────────────────────────────────────────────── +# Use this if you are NOT letting Coolify build the Dockerfile (see README §Deploy +# for the Coolify-managed path, which is the recommended default). This script +# builds the image from this repo ON THE HOST and runs it as a standalone +# Traefik-labelled bridge — the same proven pattern as the dashboard_api bridges: +# it joins the network that can reach timescale_db, derives a READ-ONLY DATABASE_URL +# for the analytics_ro role, and exposes the MCP over HTTPS with Bearer auth. 
+# +# Prereqs ON THE HOST: +# * the analytics_ro role exists -> scripts/bootstrap_analytics_ro.sh (writes ~/.analytics_ro.pw) +# * this repo is checked out (e.g. ~/fleetanalytics_mcp) — run this script from inside it +# +# Run: +# cd ~/fleetanalytics_mcp && git pull +# MCP_AUTH_TOKENS="alice:$(openssl rand -hex 16)" bash deploy.sh +# +# A token/env change needs a container RECREATE — this script does that. Record +# each analyst's token securely; it is only shown once (when you generate it). +# ───────────────────────────────────────────────────────────────────────────── +set -euo pipefail + +NAME=analytics_mcp +PORT=8892 +HOST_DOMAIN="${HOST_DOMAIN:-fleetmcp.fivetitude.com}" # prod: fleetmcp.rahamafresh.com +IMAGE="fleetanalytics-mcp:latest" +ENV_FILE="$(pwd)/.deploy.env" +: "${MCP_AUTH_TOKENS:?set MCP_AUTH_TOKENS=name:token[,name:token...] before running (per-analyst Bearer tokens)}" + +# Resolve the network + DB DSN from the running webhook_receiver (it sits on the +# same internal network as timescale_db and holds a DATABASE_URL we can reuse the +# host:port/dbname from). This avoids hardcoding the internal DB hostname. +WH=$(docker ps --filter name=webhook_receiver --format "{{.Names}}" | head -1) +[ -n "$WH" ] || { echo "ERROR: webhook_receiver container not found (need the tracksolid stack running)"; exit 1; } +APPNET=$(docker inspect "$WH" --format '{{range $k,$v := .NetworkSettings.Networks}}{{$k}}{{end}}') +SRC_DB_URL=$(docker inspect "$WH" --format '{{range .Config.Env}}{{println .}}{{end}}' | sed -n 's/^DATABASE_URL=//p' | head -1) +[ -n "$SRC_DB_URL" ] || { echo "ERROR: DATABASE_URL not found in $WH env"; exit 1; } +echo "Reusing network $APPNET (from $WH)" + +# Build a READ-ONLY DATABASE_URL: app DB host:port/dbname + analytics_ro creds. 
+RO_PW=$(cat "${ANALYTICS_RO_PW_FILE:-$HOME/.analytics_ro.pw}" 2>/dev/null || true) +[ -n "$RO_PW" ] || { echo "ERROR: ~/.analytics_ro.pw missing — run scripts/bootstrap_analytics_ro.sh first"; exit 1; } +HOSTPART="${SRC_DB_URL#*@}" # host:port/dbname[?params] +RO_DB_URL="postgresql://analytics_ro:${RO_PW}@${HOSTPART}" + +# Build the image from this repo. +echo "Building $IMAGE ..." +docker build -t "$IMAGE" . + +# Minimal env (read-only DSN + auth only — no Tracksolid ingestion secrets). +{ echo "DATABASE_URL=${RO_DB_URL}"; echo "MCP_AUTH_TOKENS=${MCP_AUTH_TOKENS}"; } > "$ENV_FILE" +chmod 600 "$ENV_FILE" + +docker rm -f "$NAME" 2>/dev/null || true +docker run -d --name "$NAME" --restart unless-stopped \ + --network "$APPNET" \ + --env-file "$ENV_FILE" \ + --label 'traefik.enable=true' \ + --label 'traefik.docker.network=coolify' \ + --label 'traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https' \ + --label "traefik.http.routers.http-0-fleetmcp.entryPoints=http" \ + --label "traefik.http.routers.http-0-fleetmcp.middlewares=redirect-to-https" \ + --label "traefik.http.routers.http-0-fleetmcp.rule=Host(\`${HOST_DOMAIN}\`)" \ + --label "traefik.http.routers.https-0-fleetmcp.entryPoints=https" \ + --label "traefik.http.routers.https-0-fleetmcp.rule=Host(\`${HOST_DOMAIN}\`)" \ + --label "traefik.http.routers.https-0-fleetmcp.tls=true" \ + --label "traefik.http.routers.https-0-fleetmcp.tls.certresolver=letsencrypt" \ + --label "traefik.http.services.fleetmcp.loadbalancer.server.port=${PORT}" \ + "$IMAGE" + +docker network connect coolify "$NAME" 2>/dev/null || true +rm -f "$ENV_FILE" +sleep 5 +echo "== container =="; docker ps --filter name="$NAME" --format "{{.Names}} | {{.Status}}" +echo "== DB role (expect analytics_ro) =="; docker exec "$NAME" sh -lc 'printenv DATABASE_URL | sed -E "s#://([^:]+):[^@]+@#://\1:@#"' +echo "== health =="; docker exec "$NAME" python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:${PORT}/healthz').read().decode())" 2>&1 | head # python:3.12-slim ships no curl, so probe with urllib as the Dockerfile HEALTHCHECK does diff --git
a/docs/ANALYTICS_MCP.html b/docs/ANALYTICS_MCP.html new file mode 100644 index 0000000..939be71 --- /dev/null +++ b/docs/ANALYTICS_MCP.html @@ -0,0 +1,223 @@ + + + +Read-only Analytics MCP Server — Implementation Guide +
+ +

Read-only Analytics MCP Server

+

Implementation guide · standalone repo fleetanalytics_mcp, hosted on the tracksolid_db Coolify host · 2026-06-16 · built — pending deploy

+ +

1. Purpose & context

+

The decision & analytics team needs to pull fleet reporting data (fuel, utilisation, +driver behaviour, INC tickets, raw telemetry) from tracksolid_db to make +decisions — read-only, never edit/delete. The only programmatic surface today is the +dashboard_api FastAPI bridge with a fixed set of /analytics/* / +/webhook/* endpoints — too rigid for ad-hoc analysis.

+

This adds a hosted, read-only MCP server that lets analysts query the database +directly from Claude: a guarded general SELECT tool plus schema-introspection +tools, pointed at the existing PostgreSQL 16 + TimescaleDB + PostGIS database through a +new least-privilege analytics_ro role.

+

The DB is internal-only (DATABASE_URL → timescale_db:5432 on the +Docker network, not reachable from a laptop), so the server is hosted on the same Coolify +host as the DB. It ships as its own repo with its own Dockerfile +(Coolify-buildable) and joins the network that can reach timescale_db; a +deploy.sh manual fallback mirrors the proven dashboard_api bridge pattern.

+
Read-only is enforced at four layers: the analytics_ro +GRANTs (no INSERT/UPDATE/DELETE) · a session default_transaction_read_only = on +· a transaction that is rolled back (never committed) · a single-statement / keyword +SQL guard in the query tool.
+ +

Where this sits

+
+
Analyst's Claude (Code / Desktop / claude.ai)
+
+
Traefik (fleetmcp.fivetitude.com · HTTPS + Bearer)
+
+
analytics_mcp (uvicorn :8892 · coolify net)
role = analytics_ro · READ ONLY
+
+
timescale_db:5432 (tracksolid_db)
reporting.* · tracksolid.*
+
+

Ports in use: 8890 prod dashboard_api · 8891 staging dashboard_api · 8892 analytics_mcp.

+ +

2. Repo contents

+ + + + + + + + + +
FileWhat
analytics_mcp.pythe MCP server (FastMCP streamable-HTTP; uvicorn target analytics_mcp:app)
DockerfileCoolify-buildable image (port 8892)
pyproject.tomldeps (mcp[cli], psycopg2-binary, uvicorn)
deploy.shmanual host deploy (standalone Traefik bridge) — fallback to Coolify
scripts/analytics_ro_role.sqlread-only role DDL (modelled on the backend's dashboard_ro_role.sql + hardening)
scripts/bootstrap_analytics_ro.shhost bootstrap: generate pw → apply role SQL
docs/ANALYTICS_MCP.md / .htmlthis guide
+ +

3. Step 1 — the analytics_ro role

+

Modelled on scripts/dashboard_ro_role.sql. Run as the postgres +superuser (it does CREATE ROLE), with the password supplied as psql var +:'ro_pw'no secret in the repo.

+

scripts/analytics_ro_role.sql

+
-- read-only LOGIN role for the analytics MCP server. Apply via bootstrap_analytics_ro.sh.
+\set ON_ERROR_STOP on
+DO $role$ BEGIN
+  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'analytics_ro') THEN
+    CREATE ROLE analytics_ro LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE;
+  END IF; END $role$;
+ALTER ROLE analytics_ro WITH LOGIN PASSWORD :'ro_pw';
+
+GRANT CONNECT ON DATABASE tracksolid_db TO analytics_ro;
+GRANT USAGE   ON SCHEMA reporting, tracksolid TO analytics_ro;
+GRANT SELECT  ON ALL TABLES IN SCHEMA reporting  TO analytics_ro;  -- tables + views
+GRANT SELECT  ON ALL TABLES IN SCHEMA tracksolid TO analytics_ro;
+GRANT SELECT  ON reporting.v_trips TO analytics_ro;            -- matview (not in ALL TABLES)
+GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA reporting TO analytics_ro;
+-- future objects auto-grant
+ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting  GRANT SELECT  ON TABLES TO analytics_ro;
+ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA tracksolid GRANT SELECT  ON TABLES TO analytics_ro;
+ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting  GRANT EXECUTE ON FUNCTIONS TO analytics_ro;
+-- extra hardening over dashboard_ro: this role serves ad-hoc HUMAN queries
+ALTER ROLE analytics_ro SET default_transaction_read_only = on;
+ALTER ROLE analytics_ro SET statement_timeout = '30s';
+ALTER ROLE analytics_ro SET idle_in_transaction_session_timeout = '60s';
+

scripts/bootstrap_analytics_ro.sh

+

Clone of bootstrap_dashboard_ro.sh — generates ~/.analytics_ro.pw +(0600) on first run, applies the SQL via docker exec … psql -v ro_pw=…. The +password is never printed and never leaves the host.
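
The first-run behaviour described here (create the password file once, with 0600 permissions, and reuse it afterwards) can be sketched in Python. This is only an illustration of the idea, not the contents of bootstrap_analytics_ro.sh, which is a shell script not reproduced in this guide. The function name `ensure_password_file` and the use of `secrets.token_urlsafe` are assumptions.

```python
import os
import secrets


def ensure_password_file(path: str) -> str:
    """Return the password stored at `path`, generating it on first run.

    Hypothetical helper: the real bootstrap script may generate the
    password differently.
    """
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return fh.read().strip()
    password = secrets.token_urlsafe(32)
    # O_EXCL refuses to overwrite an existing file; 0o600 = owner read/write only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(password)
    return password
```

Calling the helper twice with the same path returns the same password, so a re-run of the bootstrap does not rotate the credential by accident.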

+ +

4. The MCP server (analytics_mcp.py)

+

FastMCP streamable-HTTP server, served by uvicorn (target analytics_mcp:app). +It uses its own read-only psycopg2 pool and a small local logger — it deliberately does +not import the backend's ts_shared_rev (that module eagerly requires the +Tracksolid ingestion secrets, which this read-only server has no business holding). Tools exposed:

+ + + + + + + + +
ToolPurpose
query(sql, max_rows=1000)guarded read-only SELECT/WITH; single statement, keyword-blocked, auto-LIMIT; returns rows + truncated flag
list_schemas()readable schemas (reporting, tracksolid) + object counts
list_tables(schema)tables + views in a schema
describe_table(schema, table)columns, types, nullability, defaults
list_functions(schema='reporting')reporting.fn_* signatures
sample_table(schema, table, n=20)first n rows (thin wrapper over query)
+
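
The auto-LIMIT and the `truncated` flag of the `query` tool can be sketched as small pure functions. This mirrors the logic in analytics_mcp.py under the assumption that the statement has already passed the guard; the helper names `clamp_cap`, `apply_limit` and `trim` are hypothetical and do not exist in the repo.

```python
import re

MAX_ROWS_CEIL = 10_000  # mirrors the MCP_MAX_ROWS default


def clamp_cap(max_rows: int, ceiling: int = MAX_ROWS_CEIL) -> int:
    """Clamp the caller's max_rows into the range [1, ceiling]."""
    return max(1, min(int(max_rows), ceiling))


def apply_limit(stmt: str, cap: int) -> str:
    """Append LIMIT cap+1 when the statement contains no LIMIT keyword."""
    if re.search(r"\blimit\b", stmt, re.IGNORECASE):
        return stmt
    return f"{stmt}\nLIMIT {cap + 1}"  # one extra row reveals truncation


def trim(rows: list, cap: int) -> tuple[list, bool]:
    """Return at most `cap` rows and whether more rows were available."""
    return rows[:cap], len(rows) > cap


if __name__ == "__main__":
    cap = clamp_cap(3)
    print(apply_limit("SELECT * FROM reporting.v_trips", cap))
    print(trim([1, 2, 3, 4], cap))
```

Because the check looks for the word LIMIT anywhere in the statement, a LIMIT inside a subquery also suppresses the outer auto-LIMIT. In that case the row cap is still applied, but only after all rows have been fetched.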

The core guard + connection logic:

+
# read-only pool: force read-only + statement timeout at connection level (belt + braces)
+_pool = psycopg2.pool.ThreadedConnectionPool(1, 8, DATABASE_URL,
+    options="-c default_transaction_read_only=on -c statement_timeout=30000")
+
+@contextmanager
+def _ro_conn():                              # txn is ALWAYS rolled back — never commits
+    conn = _pool.getconn()
+    try:
+        conn.set_session(readonly=True, autocommit=False)
+        yield conn
+    finally:
+        conn.rollback(); _pool.putconn(conn)
+
+def _guard(sql):                              # single SELECT/WITH, no write/DDL keywords
+    stmt = _strip_comments(sql)
+    parts = [p for p in stmt.split(";") if p.strip()]
+    if len(parts) != 1: raise ValueError("Only a single statement is allowed.")
+    stmt = parts[0].strip()
+    if not re.match(r"^(select|with)\b", stmt, re.I): raise ValueError("Only SELECT/WITH allowed.")
+    if _FORBIDDEN.search(stmt): raise ValueError("Forbidden (write/DDL) keyword.")
+    return stmt
+

Auth is a Starlette BaseHTTPMiddleware that requires +Authorization: Bearer <token>. Tokens come from env +MCP_AUTH_TOKENS="alice:tok1,bob:tok2" (per-analyst → revocable + attributable in +logs); /healthz is exempt. The app is mounted via +app = mcp.streamable_http_app(), then app.add_middleware(BearerAuth) +and app.add_route("/healthz", …) (Starlette exposes add_route, not a +Flask-style @app.route decorator — verified against the installed mcp).
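
The token handling behind that middleware can be sketched without Starlette. The functions `parse_tokens` and `caller_for` are hypothetical names that follow the parsing in analytics_mcp.py. One deliberate difference, labelled as an assumption: this sketch drops entries with an empty token (such as `alice:`), which the original dict comprehension would keep and which would then match a request that sends no Authorization header.

```python
from typing import Optional


def parse_tokens(raw: str) -> dict[str, str]:
    """Turn "alice:tok1,bob:tok2" into {token: analyst_name}."""
    tokens: dict[str, str] = {}
    for entry in raw.split(","):
        if ":" not in entry:
            continue  # ignore malformed entries, as the original does
        name, token = entry.split(":", 1)
        if token:  # added precaution (not in the original): skip empty tokens
            tokens[token] = name
    return tokens


def caller_for(authorization: str, tokens: dict[str, str]) -> Optional[str]:
    """Return the analyst name for an Authorization header, or None (-> 401)."""
    if not authorization.lower().startswith("bearer "):
        return None
    return tokens.get(authorization[7:])


if __name__ == "__main__":
    tokens = parse_tokens("alice:tok1,bob:tok2")
    print(caller_for("Bearer tok1", tokens))   # known token
    print(caller_for("Bearer nope", tokens))   # unknown token
```

Like the original, the sketch does not strip whitespace, so `MCP_AUTH_TOKENS` should be written without spaces around the commas.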

+
Full, current source is the repo's analytics_mcp.py; the excerpt +above is abridged.
+ +

5. Packaging — Dockerfile + pyproject.toml

+

Self-contained: pyproject.toml declares the deps (mcp[cli], +psycopg2-binary, uvicorn[standard]) and the Dockerfile +builds a slim image running uvicorn analytics_mcp:app on port 8892. The project is +a flat single module, so [tool.uv] package = false and the Dockerfile installs +dependencies only (uv sync --no-dev --no-install-project) — no dependency on +the backend image.

+ +

6. Deploy

+

The DB is internal-only, so the server runs on the same Coolify host as +timescale_db.

+

Recommended — Coolify-managed app. Create a Coolify app from this repo, Dockerfile +build, app port 8892, domain fleetmcp.rahamafresh.com (prod) / +fleetmcp.fivetitude.com (staging). Set secrets +DATABASE_URL=postgresql://analytics_ro:<pw>@timescale_db:5432/tracksolid_db and +MCP_AUTH_TOKENS=alice:<tok>,bob:<tok>, then connect the app to the network +that can reach timescale_db so the hostname resolves. Coolify manages Traefik + +TLS from the domain; auto-deploys on push via the Forgejo webhook.

+

Fallback — deploy.sh. Check the repo out on the host and run it: it builds +the image, resolves the DB network + DSN from the running stack, swaps in the +analytics_ro credentials, and runs a standalone Traefik bridge.

+
cd ~/fleetanalytics_mcp && git pull
+MCP_AUTH_TOKENS="alice:$(openssl rand -hex 16)" bash deploy.sh
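
The credential swap that deploy.sh performs (keep the source DSN's host:port/dbname, replace the user and password) can be sketched in Python. The function names `read_only_dsn` and `mask_dsn` are hypothetical. Percent-encoding the password is an added precaution in this sketch; the shell script interpolates it as-is.

```python
import re
from urllib.parse import quote


def read_only_dsn(src_db_url: str, ro_password: str, role: str = "analytics_ro") -> str:
    """Reuse host:port/dbname from src_db_url with the read-only role's credentials."""
    # Equivalent of the shell expansion ${SRC_DB_URL#*@}: drop everything up to the first "@".
    _, sep, hostpart = src_db_url.partition("@")
    if not sep:
        raise ValueError("source DSN has no '@' separating credentials from host")
    # Assumption: encode the password so characters like "@" or "/" cannot break the URL.
    return f"postgresql://{role}:{quote(ro_password, safe='')}@{hostpart}"


def mask_dsn(dsn: str) -> str:
    """Blank out the password, like the sed expression in deploy.sh."""
    return re.sub(r"://([^:]+):[^@]+@", r"://\1:@", dsn)


if __name__ == "__main__":
    dsn = read_only_dsn("postgresql://app:secret@timescale_db:5432/tracksolid_db", "pw")
    print(mask_dsn(dsn))
```

Printing only the masked form keeps the password out of terminal scrollback and logs while still confirming which role the container connects as.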
+ +

7. Deploy runbook (ordered)

+
    +
  1. Role (once): scp the role SQL + bootstrap to twala.rahamafresh.com, run bootstrap_analytics_ro.sh (writes ~/.analytics_ro.pw).
  2. +
  3. App: point Coolify at this repo (§6) or run deploy.sh on the host. Record each analyst's token (shown once).
  4. +
  5. Network: ensure the MCP container shares a Docker network with timescale_db so the DSN host resolves.
  6. +
  7. DNS/Traefik: ensure fleetmcp.* resolves to the host; Coolify/Traefik issues the cert.
  8. +
+ +

8. Add to Claude (for analysts)

+
# Claude Code
+claude mcp add --transport http fireside-analytics https://fleetmcp.fivetitude.com \
+  --header "Authorization: Bearer <your-token>"
+claude mcp list      # → "fireside-analytics: connected"
+

Claude Desktop / claude.ai: add a custom connector with the same URL and an +Authorization: Bearer <your-token> header. Example prompts: "list the +schemas", "describe reporting.v_daily_summary", "top 10 cost centres by distance +in the last 30 days".

+ +

9. Verification checklist

+
    +
  • psql -U analytics_ro … "SELECT count(*) FROM reporting.v_daily_summary" succeeds.
  • +
psql -U analytics_ro … "CREATE TABLE x(i int)" fails (typically "cannot execute CREATE TABLE in a read-only transaction", otherwise permission denied), which proves read-only.
  • +
  • the image builds (docker build . or Coolify build); analytics_mcp is Up; the container can reach timescale_db.
  • +
  • DATABASE_URL shows analytics_ro (pw masked); curl localhost:8892/healthz returns {"ok":true,…}.
  • +
  • claude mcp list shows connected; list_schemas / describe_table / a real query return data.
  • +
  • query("UPDATE reporting.refresh_log …") is rejected by the guard.
  • +
  • A request with a missing/bad bearer token returns 401.
  • +
  • docker logs analytics_mcp shows one audit line per query (caller, SQL, rows, ms).
  • +
+ +

10. Security notes

  • Four read-only layers: role GRANTs · default_transaction_read_only=on (role + connection) · rolled-back txn · SQL keyword guard.
  • Least privilege: analytics_ro only has USAGE+SELECT on reporting/tracksolid and EXECUTE on reporting functions.
  • Per-analyst tokens make access revocable and queries attributable; rotate via MCP_AUTH_TOKENS + redeploy (recreate).
  • Resource guards: statement_timeout=30s, idle-txn timeout, row cap (1000 default / 10000 ceiling).
  • Future: swap static Bearer for OAuth if the team scales; add a column deny-list if PII lives in tracksolid.*.
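The row cap in the resource-guards item works by clamping `max_rows` and fetching one extra row to detect truncation. The sketch below isolates that logic from the `query` tool in the abridged listing; `apply_row_cap` and `trim` are hypothetical names introduced here, and the ceiling is hard-coded to the documented default instead of being read from `MCP_MAX_ROWS`.

```python
import re

MAX_ROWS_CEIL = 10_000  # documented ceiling (the server reads MCP_MAX_ROWS)


def apply_row_cap(stmt: str, max_rows: int = 1000) -> tuple[str, int]:
    """Clamp max_rows to [1, ceiling] and append LIMIT cap+1 if no LIMIT is present."""
    cap = max(1, min(int(max_rows), MAX_ROWS_CEIL))
    if not re.search(r"\blimit\b", stmt, re.IGNORECASE):
        stmt = f"{stmt}\nLIMIT {cap + 1}"  # +1 row reveals whether more data existed
    return stmt, cap


def trim(rows: list, cap: int) -> tuple[list, bool]:
    """Return at most `cap` rows plus a flag telling whether rows were dropped."""
    truncated = len(rows) > cap
    return rows[:cap], truncated


stmt, cap = apply_row_cap("SELECT * FROM reporting.v_daily_summary", max_rows=50)
print(stmt)  # the original statement followed by "LIMIT 51"
print(cap)   # 50
```

One consequence of the `\blimit\b` test, as written in the listing, is that a query carrying its own `LIMIT` anywhere (including in a subquery) gets no outer `LIMIT`; the result is still sliced to the cap after fetching, and `statement_timeout` bounds the work.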

Companion file: docs/ANALYTICS_MCP.md (full source for all four new files).

diff --git a/docs/ANALYTICS_MCP.md b/docs/ANALYTICS_MCP.md new file mode 100644 index 0000000..4ac713f --- /dev/null +++ b/docs/ANALYTICS_MCP.md @@ -0,0 +1,425 @@ +# Read-only Analytics MCP Server — Implementation Guide + +> **Audience:** engineer deploying/maintaining the server. **Status:** built — pending deploy. +> **Repo:** `fleetanalytics_mcp` (standalone; `repo.rahamafresh.com/kianiadee/fleetanalytics_mcp.git`). +> Hosted on the same Coolify host as `tracksolid_db`. **Date:** 2026-06-16. + +## 1. Purpose & context + +The decision & analytics team needs to pull fleet reporting data (fuel, utilisation, driver +behaviour, INC tickets, raw telemetry) from `tracksolid_db` to make decisions — **read-only, +never edit/delete**. The only programmatic surface today is the `dashboard_api` FastAPI +bridge with a fixed set of `/analytics/*` / `/webhook/*` endpoints — too rigid for ad-hoc +analysis. + +This adds a **hosted, read-only MCP server** that lets analysts query the database directly +from Claude: a guarded general `SELECT` tool plus schema-introspection tools, pointed at the +existing PostgreSQL 16 + TimescaleDB + PostGIS database through a **new least-privilege +`analytics_ro` role**. + +The DB is internal-only (`DATABASE_URL` → `timescale_db:5432` on the Docker network, not +reachable from a laptop), so the server is **hosted on the same Coolify host as the DB**. +It ships as its **own repo with its own Dockerfile** (Coolify-buildable), and joins the +network that can reach `timescale_db`. A `deploy.sh` is included as a manual host-deploy +fallback that mirrors the proven `dashboard_api` bridge pattern. + +**Read-only is enforced at four layers:** the `analytics_ro` GRANTs (no +INSERT/UPDATE/DELETE), a session `default_transaction_read_only = on`, a transaction that is +**rolled back** (never committed), and a single-statement / keyword SQL guard in the `query` +tool. 
+ +### Where this sits + +``` +analyst's Claude ──HTTPS (Bearer)──► fleetmcp.fivetitude.com (Traefik) + │ + analytics_mcp container (uvicorn :8892, coolify net) + │ psycopg2, role = analytics_ro, READ ONLY + ▼ + timescale_db:5432 (tracksolid_db) + reporting.* · tracksolid.* +``` + +Ports in use: `8890` prod dashboard_api · `8891` staging dashboard_api · **`8892` analytics_mcp**. + +--- + +## 2. Repo contents + +| File | What | +|---|---| +| `analytics_mcp.py` | the MCP server (FastMCP streamable-HTTP; uvicorn target `analytics_mcp:app`) | +| `Dockerfile` | Coolify-buildable image (port 8892) | +| `pyproject.toml` | dependencies (`mcp[cli]`, `psycopg2-binary`, `uvicorn`) | +| `deploy.sh` | manual host deploy (standalone Traefik bridge) — fallback to Coolify | +| `scripts/analytics_ro_role.sql` | read-only role DDL (modelled on the backend's `dashboard_ro_role.sql` + hardening) | +| `scripts/bootstrap_analytics_ro.sh` | host bootstrap: generate pw → apply role SQL | +| `docs/ANALYTICS_MCP.md` / `.html` | this guide | + +> The backend repo (`tracksolid_timescale_grafana_prod`) keeps only a pointer note in its +> `CLAUDE.md` recording that `analytics_ro` exists and is owned by this repo. + +--- + +## 3. Step 1 — the `analytics_ro` role + +### `scripts/analytics_ro_role.sql` + +Modelled on `scripts/dashboard_ro_role.sql`. Run as the **postgres superuser** (it does +`CREATE ROLE`), supplied a password as psql var `:'ro_pw'` — **no secret in the repo**. + +```sql +-- analytics_ro_role.sql — dedicated read-only LOGIN role for the analytics MCP server. +-- Run as postgres SUPERUSER via scripts/bootstrap_analytics_ro.sh (NOT run_migrations.py). +-- Grants exactly the read surface: SELECT on reporting.* + tracksolid.*, the v_trips +-- matview, and EXECUTE on reporting.fn_*. No INSERT/UPDATE/DELETE, not the matview owner, +-- so analytics_ro can never write or REFRESH. Idempotent — safe to re-apply (rotates pw). 
+\set ON_ERROR_STOP on + +DO $role$ +BEGIN + IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'analytics_ro') THEN + CREATE ROLE analytics_ro LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE; + END IF; +END $role$; + +ALTER ROLE analytics_ro WITH LOGIN PASSWORD :'ro_pw'; + +GRANT CONNECT ON DATABASE tracksolid_db TO analytics_ro; +GRANT USAGE ON SCHEMA reporting, tracksolid TO analytics_ro; + +GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO analytics_ro; -- tables + views +GRANT SELECT ON ALL TABLES IN SCHEMA tracksolid TO analytics_ro; -- tables + views +GRANT SELECT ON reporting.v_trips TO analytics_ro; -- MATERIALIZED VIEW +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA reporting TO analytics_ro; + +-- Future objects created by the migration role auto-grant (matviews still need explicit GRANT). +ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting GRANT SELECT ON TABLES TO analytics_ro; +ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA tracksolid GRANT SELECT ON TABLES TO analytics_ro; +ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting GRANT EXECUTE ON FUNCTIONS TO analytics_ro; + +-- Extra hardening over dashboard_ro: this role serves ad-hoc HUMAN queries. +ALTER ROLE analytics_ro SET default_transaction_read_only = on; +ALTER ROLE analytics_ro SET statement_timeout = '30s'; +ALTER ROLE analytics_ro SET idle_in_transaction_session_timeout = '60s'; +``` + +### `scripts/bootstrap_analytics_ro.sh` + +Clone of `scripts/bootstrap_dashboard_ro.sh` — generates `~/.analytics_ro.pw` (0600) on +first run, applies the SQL via `docker exec ... psql`. + +```bash +#!/usr/bin/env bash +# bootstrap_analytics_ro.sh — create/refresh the analytics_ro read-only role. +# Run ON THE HOST. Generates a strong pw into ~/.analytics_ro.pw (0600) on first +# run, then applies scripts/analytics_ro_role.sql as the postgres superuser. 
+# Deploy: +# scp scripts/analytics_ro_role.sql scripts/bootstrap_analytics_ro.sh kianiadee@twala.rahamafresh.com:~/ +# ssh kianiadee@twala.rahamafresh.com 'bash ~/bootstrap_analytics_ro.sh' +set -euo pipefail + +PW_FILE="${ANALYTICS_RO_PW_FILE:-$HOME/.analytics_ro.pw}" +SQL_FILE="${1:-$HOME/analytics_ro_role.sql}" +test -f "$SQL_FILE" || { echo "ERROR: role SQL not found at $SQL_FILE"; exit 1; } + +if [ ! -s "$PW_FILE" ]; then + ( umask 077; openssl rand -hex 24 > "$PW_FILE" ); chmod 600 "$PW_FILE" + echo "Generated new analytics_ro password -> $PW_FILE (0600)" +else + echo "Reusing existing analytics_ro password from $PW_FILE" +fi +PW=$(cat "$PW_FILE") + +DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1) +[ -n "$DB" ] || { echo "ERROR: timescale_db container not found"; exit 1; } + +echo "Applying analytics_ro role DDL to $DB as postgres ..." +docker exec -i "$DB" psql -U postgres -d tracksolid_db -v ON_ERROR_STOP=1 -v ro_pw="$PW" < "$SQL_FILE" +echo "analytics_ro ready (password not printed). Now (re)run deploy_analytics_mcp.sh." +``` + +--- + +## 4. The MCP server (`analytics_mcp.py`) + +FastMCP streamable-HTTP server, served by uvicorn (target `analytics_mcp:app`). It uses its +**own** read-only psycopg2 pool and a small local logger — it deliberately does **not** import +the backend's `ts_shared_rev` (that module eagerly requires the Tracksolid ingestion secrets, +which this read-only server has no business holding). The canonical source is +[`../analytics_mcp.py`](../analytics_mcp.py); the abridged version below shows the guard, the +tools, and the auth: + +```python +""" +analytics_mcp.py — Fireside Communications · Read-only Analytics MCP Server +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +Hosted MCP server for the decision & analytics team. Exposes the fleet reporting +data (reporting.* + tracksolid.*) to Claude as READ-ONLY query + introspection +tools. 
Connects as the analytics_ro role; every query runs in a read-only txn +that is rolled back. Served over streamable HTTP behind Traefik with Bearer auth. +""" +from __future__ import annotations + +import os +import re +import time +from contextlib import contextmanager + +import psycopg2 +import psycopg2.extras +import psycopg2.pool +from mcp.server.fastmcp import FastMCP +from starlette.middleware.base import BaseHTTPMiddleware +from starlette.responses import JSONResponse + +# NOTE: a small local logger is defined here (see analytics_mcp.py); we do NOT import +# ts_shared_rev, so this read-only server carries none of the ingestion secrets. +log = _get_logger("server") + +DATABASE_URL = os.environ["DATABASE_URL"] # set to the analytics_ro DSN by deploy +MAX_ROWS_CEIL = int(os.getenv("MCP_MAX_ROWS", "10000")) +READABLE_SCHEMAS = ("reporting", "tracksolid") + +# ── read-only connection pool ──────────────────────────────────────────────── +# Force read-only + a statement timeout at the connection level (belt + braces; +# the analytics_ro role already sets these, but a self-contained server is safer). 
+_pool = psycopg2.pool.ThreadedConnectionPool( + 1, int(os.getenv("MCP_POOL_MAX", "8")), DATABASE_URL, + options="-c default_transaction_read_only=on -c statement_timeout=30000 -c client_encoding=UTF8", +) + +@contextmanager +def _ro_conn(): + """Read-only connection; the transaction is ALWAYS rolled back (never commits).""" + conn = _pool.getconn() + try: + conn.set_session(readonly=True, autocommit=False) + yield conn + finally: + conn.rollback() + _pool.putconn(conn) + +def _rows(cur): + cols = [d[0] for d in cur.description] + return [dict(zip(cols, r)) for r in cur.fetchall()] + +# ── SQL guard for the general query tool ───────────────────────────────────── +_FORBIDDEN = re.compile( + r"\b(insert|update|delete|drop|alter|create|grant|revoke|truncate|copy|call|do|merge|" + r"vacuum|reindex|refresh|comment|lock|set|reset)\b", re.IGNORECASE) + +def _strip_comments(sql: str) -> str: + sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL) + sql = re.sub(r"--[^\n]*", " ", sql) + return sql.strip() + +def _guard(sql: str) -> str: + stripped = _strip_comments(sql) + if not stripped: + raise ValueError("Empty query.") + # exactly one statement (allow a single trailing ;) + parts = [p for p in stripped.split(";") if p.strip()] + if len(parts) != 1: + raise ValueError("Only a single statement is allowed.") + stmt = parts[0].strip() + if not re.match(r"^(select|with)\b", stmt, re.IGNORECASE): + raise ValueError("Only SELECT / WITH queries are allowed.") + if _FORBIDDEN.search(stmt): + raise ValueError("Query contains a forbidden (write/DDL) keyword.") + return stmt + +mcp = FastMCP("fireside-analytics", stateless_http=True) + +@mcp.tool() +def query(sql: str, max_rows: int = 1000) -> dict: + """Run a read-only SELECT/WITH query against the fleet database. + Only the reporting.* and tracksolid.* schemas are readable. Returns up to + `max_rows` rows (default 1000, hard cap 10000). 
Auto-applies LIMIT if absent.""" + stmt = _guard(sql) + cap = max(1, min(int(max_rows), MAX_ROWS_CEIL)) + if not re.search(r"\blimit\b", stmt, re.IGNORECASE): + stmt = f"{stmt}\nLIMIT {cap + 1}" # +1 to detect truncation + t0 = time.monotonic() + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute(stmt) + rows = _rows(cur) + truncated = len(rows) > cap + rows = rows[:cap] + dur_ms = int((time.monotonic() - t0) * 1000) + log.info("query rows=%d trunc=%s %dms :: %s", len(rows), truncated, dur_ms, sql[:200]) + return {"row_count": len(rows), "truncated": truncated, "rows": rows} + +@mcp.tool() +def list_schemas() -> list[dict]: + """List the readable schemas and their table/view counts.""" + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT table_schema AS schema, count(*) AS objects " + "FROM information_schema.tables WHERE table_schema = ANY(%s) " + "GROUP BY 1 ORDER BY 1", (list(READABLE_SCHEMAS),)) + return _rows(cur) + +@mcp.tool() +def list_tables(schema: str) -> list[dict]: + """List tables + views in a schema (reporting or tracksolid).""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT table_name AS name, table_type AS kind " + "FROM information_schema.tables WHERE table_schema = %s " + "ORDER BY 1", (schema,)) + return _rows(cur) + +@mcp.tool() +def describe_table(schema: str, table: str) -> list[dict]: + """Describe a table/view: columns, types, nullability, defaults.""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT column_name AS column, data_type AS type, is_nullable AS nullable, " + "column_default AS default FROM information_schema.columns " + "WHERE table_schema = %s AND table_name = %s ORDER BY ordinal_position", + (schema, table)) + return _rows(cur) + +@mcp.tool() +def 
list_functions(schema: str = "reporting") -> list[dict]: + """List callable functions (e.g. reporting.fn_*) with their signatures.""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + with _ro_conn() as conn, conn.cursor() as cur: + cur.execute( + "SELECT p.proname AS name, pg_get_function_arguments(p.oid) AS args " + "FROM pg_proc p JOIN pg_namespace n ON n.oid = p.pronamespace " + "WHERE n.nspname = %s ORDER BY 1", (schema,)) + return _rows(cur) + +@mcp.tool() +def sample_table(schema: str, table: str, n: int = 20) -> dict: + """Return the first `n` rows of a table/view (convenience over query).""" + if schema not in READABLE_SCHEMAS: + raise ValueError(f"schema must be one of {READABLE_SCHEMAS}") + # quote_ident via format() on validated identifiers + return query(f'SELECT * FROM "{schema}"."{table}"', max_rows=n) + +# ── Bearer-token auth middleware ────────────────────────────────────────────── +# MCP_AUTH_TOKENS = "alice:tok1,bob:tok2" (per-analyst → revocable + attributable) +_TOKENS = { + t.split(":", 1)[1]: t.split(":", 1)[0] + for t in os.getenv("MCP_AUTH_TOKENS", "").split(",") + if ":" in t +} + +class BearerAuth(BaseHTTPMiddleware): + async def dispatch(self, request, call_next): + if request.url.path == "/healthz": + return await call_next(request) + auth = request.headers.get("authorization", "") + token = auth[7:] if auth.lower().startswith("bearer ") else "" + if token not in _TOKENS: + return JSONResponse({"error": "unauthorized"}, status_code=401) + request.state.caller = _TOKENS[token] + return await call_next(request) + +async def healthz(_request): + return JSONResponse({"ok": True, "tokens": len(_TOKENS)}) + +app = mcp.streamable_http_app() +app.add_middleware(BearerAuth) +app.add_route("/healthz", healthz, methods=["GET"]) # Starlette: add_route, not @app.route +``` + +> **Notes.** `stateless_http=True` suits a behind-proxy / multi-worker deploy. 
The token map
+> is `token → name` so each query is attributable in the logs. The `_FORBIDDEN` guard also
+> blocks `SET`/`RESET` so a query can't relax `default_transaction_read_only`. Validate the
+> exact FastMCP app-factory method name against the installed `mcp` version
+> (`streamable_http_app()` vs `http_app()`); the deploy command's `app` target must match.
+
+---
+
+## 5. Packaging: `Dockerfile` + `pyproject.toml`
+
+This repo is self-contained: its `pyproject.toml` declares the deps (`mcp[cli]`,
+`psycopg2-binary`, `uvicorn[standard]`, `starlette`) and the `Dockerfile` builds a slim image that runs
+`uvicorn analytics_mcp:app` on port 8892. The project is a flat single module, so
+`[tool.uv] package = false` and the Dockerfile installs **dependencies only**
+(`uv sync --no-dev --no-install-project`), so it never tries to build the module as a package.
+No dependency on the backend image.
+
+---
+
+## 6. Deploy
+
+The DB is internal-only, so the server runs on the **same Coolify host as `timescale_db`**.
+
+**Recommended: Coolify-managed app.** Create a Coolify application from this repo
+(`repo.rahamafresh.com/kianiadee/fleetanalytics_mcp.git`), Dockerfile build, app port `8892`,
+domain `fleetmcp.rahamafresh.com` (prod) / `fleetmcp.fivetitude.com` (staging). Set as
+secrets `DATABASE_URL=postgresql://analytics_ro:<pw>@timescale_db:5432/tracksolid_db` and
+`MCP_AUTH_TOKENS=alice:<tok>,bob:<tok>`, then **connect the app to the network that can reach
+`timescale_db`** (the tracksolid stack's network) so the hostname resolves. Coolify manages
+the Traefik labels + TLS from the domain. Auto-deploys on push via the Forgejo webhook.
+
+**Fallback: manual host deploy (`deploy.sh`).** If not using the Coolify UI, check the repo
+out on the host and run `deploy.sh`: it builds the image, resolves the DB network + DSN from
+the running stack, swaps in the `analytics_ro` credentials, and runs a standalone
+Traefik-labelled bridge (the proven `dashboard_api` pattern). See the script header.
+
+```bash
+cd ~/fleetanalytics_mcp && git pull
+MCP_AUTH_TOKENS="alice:$(openssl rand -hex 16)" bash deploy.sh
+```
+
+---
+
+## 7. Deploy runbook (ordered)
+
+1. **Role (once):** `scp scripts/analytics_ro_role.sql scripts/bootstrap_analytics_ro.sh kianiadee@twala.rahamafresh.com:~/` then `ssh ... 'bash ~/bootstrap_analytics_ro.sh'` (writes `~/.analytics_ro.pw`).
+2. **App:** either point Coolify at this repo (§6 recommended) or run `deploy.sh` on the host. Record each analyst's token securely (it is shown once when generated).
+3. **Network:** ensure the MCP container shares a Docker network with `timescale_db` so the DSN host resolves (Coolify network setting, or `deploy.sh` reuses the stack network automatically).
+4. **DNS/Traefik:** ensure `fleetmcp.*` resolves to the host; Coolify/Traefik issues the cert.
+
+---
+
+## 8. Add to Claude (for analysts)
+
+**Claude Code**
+```bash
+claude mcp add --transport http fireside-analytics https://fleetmcp.fivetitude.com \
+  --header "Authorization: Bearer <your-token>"
+claude mcp list      # should show "fireside-analytics: connected"
+```
+
+**Claude Desktop / claude.ai:** add a custom connector with the same URL and an
+`Authorization: Bearer <your-token>` header.
+
+Example session prompts: *"list the schemas"*, *"describe reporting.v_daily_summary"*,
+*"top 10 cost centres by distance in the last 30 days"* (the model writes the SELECT and
+calls `query`).
+
+---
+
+## 9. Verification checklist
+
+- [ ] `psql -U analytics_ro -d tracksolid_db -c "SELECT count(*) FROM reporting.v_daily_summary"` **succeeds**.
+- [ ] `psql -U analytics_ro ... -c "CREATE TABLE x(i int)"` **fails** (permission denied, or a read-only-transaction error because the role sets `default_transaction_read_only=on`), which proves read-only.
+- [ ] the image builds (`docker build .` or Coolify build succeeds); `analytics_mcp` container is `Up`.
+- [ ] `DATABASE_URL` shows `analytics_ro` (pw masked); `curl localhost:8892/healthz` returns `{"ok":true,...}`.
+- [ ] the container can resolve/reach `timescale_db` (shares its network).
+- [ ] `claude mcp list` shows the server connected; `list_schemas` / `describe_table` / a real `query` return data.
+- [ ] `query("UPDATE reporting.refresh_log SET notes='x'")` is **rejected** by the guard.
+- [ ] A request with a missing/bad bearer token returns **401**.
+- [ ] `docker logs analytics_mcp` shows one audit line per query (caller name, SQL, rows, ms).
+
+---
+
+## 10. Security notes
+
+- **Four read-only layers:** role GRANTs · `default_transaction_read_only=on` (role + connection) · rolled-back txn · SQL keyword guard.
+- **Least privilege:** `analytics_ro` only has `USAGE`+`SELECT` on `reporting`/`tracksolid` and `EXECUTE` on `reporting` functions: no other schema, no write of any kind.
+- **Per-analyst tokens** make access revocable and queries attributable; rotate by editing `MCP_AUTH_TOKENS` and re-running the deploy (a recreate).
+- **Resource guards:** `statement_timeout=30s`, idle-txn timeout, row cap (default 1000 / ceiling 10000) protect the DB from runaway analyst queries.
+- **Future:** swap static Bearer tokens for OAuth (MCP supports it) if/when the team scales; consider a column-level deny-list if any PII lives in `tracksolid.*`.
diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 0000000..7ab77ff --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,35 @@ +[project] +name = "fleetanalytics-mcp" +version = "1.0.0" +description = "Fireside Communications — read-only Fleet Analytics MCP server (decision & analytics team)" +readme = "README.md" +requires-python = ">=3.12" +authors = [ + { name = "Fireside DevOps", email = "devops@firesideafrica.cloud" } +] +dependencies = [ + "mcp[cli]>=1.2", # MCP server SDK (FastMCP, streamable HTTP) + "psycopg2-binary>=2.9.9", # Postgres driver (binary wheels — easy in Docker) + "uvicorn[standard]>=0.30.0", # ASGI server + "starlette>=0.37", # Bearer-auth middleware + /healthz route (pulled in by mcp, pinned for clarity) +] + +[project.optional-dependencies] +dev = [ + "ruff>=0.4", + "mypy>=1.10", +] + +[tool.uv] +# Flat single-module project (analytics_mcp.py) — don't try to build/install it as +# a package; just manage the dependency venv. +package = false + +[tool.ruff] +target-version = "py312" +line-length = 100 +select = ["E", "W", "F", "B", "UP", "SIM"] + +[tool.mypy] +python_version = "3.12" +ignore_missing_imports = true diff --git a/scripts/analytics_ro_role.sql b/scripts/analytics_ro_role.sql new file mode 100644 index 0000000..35c0ba4 --- /dev/null +++ b/scripts/analytics_ro_role.sql @@ -0,0 +1,55 @@ +-- analytics_ro_role.sql — dedicated read-only LOGIN role for the analytics MCP server. +-- +-- Sibling of dashboard_ro_role.sql, but for the decision & analytics team's MCP +-- server (analytics_mcp_rev.py) rather than the dashboard bridge. A separate role +-- keeps the two access paths independently revocable and lets us apply tighter, +-- human-ad-hoc-query guards (statement_timeout, idle-txn timeout) without touching +-- the dashboard bridge's credential. +-- +-- Run as the postgres SUPERUSER (CREATE ROLE), NOT via run_migrations.py (which +-- connects as the app role and may lack CREATEROLE). 
Apply with +-- scripts/bootstrap_analytics_ro.sh, which supplies the password as the psql +-- variable :ro_pw from a host-only 0600 file — so no secret lives in this repo. +-- +-- It grants exactly the read surface the MCP server needs: +-- * SELECT on reporting.* and tracksolid.* (tables + views) +-- * SELECT on the reporting.v_trips MATERIALIZED VIEW — matviews are NOT +-- covered by GRANT ... ON ALL TABLES, so it must be named explicitly +-- * EXECUTE on the reporting.fn_* functions (so analysts can SELECT reporting.fn_...) +-- * DEFAULT PRIVILEGES so future objects created by the migration role are +-- auto-readable (no re-grant when we add views) +-- Read-only: no INSERT/UPDATE/DELETE and not the matview owner, so analytics_ro +-- can never write or REFRESH. Idempotent -> safe to re-apply (also rotates pw). + +\set ON_ERROR_STOP on + +DO $role$ +BEGIN + IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'analytics_ro') THEN + CREATE ROLE analytics_ro LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE; + END IF; +END $role$; + +ALTER ROLE analytics_ro WITH LOGIN PASSWORD :'ro_pw'; + +GRANT CONNECT ON DATABASE tracksolid_db TO analytics_ro; +GRANT USAGE ON SCHEMA reporting, tracksolid TO analytics_ro; + +GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO analytics_ro; -- tables + views +GRANT SELECT ON ALL TABLES IN SCHEMA tracksolid TO analytics_ro; -- tables + views +GRANT SELECT ON reporting.v_trips TO analytics_ro; -- MATERIALIZED VIEW (not in ALL TABLES) +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA reporting TO analytics_ro; + +-- "dynamic": future objects created by the migration role (tracksolid_owner) +-- are auto-granted. NOTE: matviews are still never covered — a new matview needs +-- its own explicit GRANT SELECT (as above for v_trips). 
+ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting GRANT SELECT ON TABLES TO analytics_ro; +ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA tracksolid GRANT SELECT ON TABLES TO analytics_ro; +ALTER DEFAULT PRIVILEGES FOR ROLE tracksolid_owner IN SCHEMA reporting GRANT EXECUTE ON FUNCTIONS TO analytics_ro; + +-- Extra hardening over dashboard_ro: this role serves ad-hoc HUMAN queries via the +-- MCP server, so pin read-only at the role level and cap runaway work. These are +-- belt-and-braces alongside the read-only txn the server itself uses. +ALTER ROLE analytics_ro SET default_transaction_read_only = on; +ALTER ROLE analytics_ro SET statement_timeout = '30s'; +ALTER ROLE analytics_ro SET idle_in_transaction_session_timeout = '60s'; diff --git a/scripts/bootstrap_analytics_ro.sh b/scripts/bootstrap_analytics_ro.sh new file mode 100755 index 0000000..86d4239 --- /dev/null +++ b/scripts/bootstrap_analytics_ro.sh @@ -0,0 +1,39 @@ +#!/usr/bin/env bash +# bootstrap_analytics_ro.sh — create/refresh the analytics_ro read-only role. +# ───────────────────────────────────────────────────────────────────────────── +# Run ON THE HOST. Generates a strong password into ~/.analytics_ro.pw (0600) on +# first run (reused thereafter), then applies scripts/analytics_ro_role.sql to the +# prod DB as the postgres superuser. The password is NEVER printed and never +# leaves the host — the MCP deploy script (deploy_analytics_mcp.sh) reads the same +# ~/.analytics_ro.pw. +# +# Deploy: +# scp scripts/analytics_ro_role.sql scripts/bootstrap_analytics_ro.sh \ +# kianiadee@twala.rahamafresh.com:~/ +# ssh kianiadee@twala.rahamafresh.com 'bash ~/bootstrap_analytics_ro.sh' +# +# Idempotent: re-running rotates nothing unless ~/.analytics_ro.pw is deleted +# first (then it generates + sets a fresh password and you must redeploy the MCP). 
+# ───────────────────────────────────────────────────────────────────────────── +set -euo pipefail + +PW_FILE="${ANALYTICS_RO_PW_FILE:-$HOME/.analytics_ro.pw}" +SQL_FILE="${1:-$HOME/analytics_ro_role.sql}" + +test -f "$SQL_FILE" || { echo "ERROR: role SQL not found at $SQL_FILE (scp scripts/analytics_ro_role.sql to ~ first)"; exit 1; } + +if [ ! -s "$PW_FILE" ]; then + ( umask 077; openssl rand -hex 24 > "$PW_FILE" ) + chmod 600 "$PW_FILE" + echo "Generated new analytics_ro password -> $PW_FILE (0600)" +else + echo "Reusing existing analytics_ro password from $PW_FILE" +fi +PW=$(cat "$PW_FILE") + +DB=$(docker ps --filter name=timescale_db --format "{{.Names}}" | head -1) +[ -n "$DB" ] || { echo "ERROR: timescale_db container not found"; exit 1; } + +echo "Applying analytics_ro role DDL to $DB as postgres ..." +docker exec -i "$DB" psql -U postgres -d tracksolid_db -v ON_ERROR_STOP=1 -v ro_pw="$PW" < "$SQL_FILE" +echo "analytics_ro ready (password not printed). Now (re)run deploy_analytics_mcp.sh."