<!doctype html>
<html lang="en"><head><meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>Read-only Analytics MCP Server — Implementation Guide</title>
<style>
:root{--bg:#0d1117;--panel:#161b22;--bd:#30363d;--fg:#e6edf3;--mut:#9da7b3;--acc:#3b82f6;--grn:#3fb950;--amb:#d29922;--red:#f85149;--mono:'SF Mono',ui-monospace,Menlo,Consolas,monospace}
*{box-sizing:border-box}
body{margin:0;background:var(--bg);color:var(--fg);font:15px/1.6 -apple-system,Segoe UI,Roboto,sans-serif}
.wrap{max-width:980px;margin:0 auto;padding:34px 22px 90px}
h1{font-size:30px;margin:0 0 4px} h2{font-size:22px;margin:42px 0 12px;padding-bottom:6px;border-bottom:1px solid var(--bd)}
h3{font-size:17px;margin:26px 0 8px;color:#c9d4e0}
.sub{color:var(--mut)} a{color:var(--acc);text-decoration:none} a:hover{text-decoration:underline}
code,.mono{font-family:var(--mono);font-size:13px}
p code,li code,td code{background:#1c232c;border:1px solid var(--bd);border-radius:5px;padding:.5px 5px;font-size:12.5px}
pre{background:#10151c;border:1px solid var(--bd);border-radius:8px;padding:14px 16px;overflow:auto;font-family:var(--mono);font-size:12.5px;line-height:1.55}
pre .c{color:var(--mut)} pre .k{color:#ff7b72} pre .s{color:#a5d6ff} pre .f{color:#d2a8ff}
table{width:100%;border-collapse:collapse;margin:8px 0 18px;background:var(--panel);border:1px solid var(--bd);border-radius:8px;overflow:hidden}
th,td{text-align:left;padding:8px 11px;border-bottom:1px solid var(--bd);vertical-align:top;font-size:13.5px}
th{background:#1c232c;color:#c9d4e0;font-weight:600}
.flow{display:flex;flex-wrap:wrap;gap:10px;align-items:stretch;margin:14px 0}
.node{background:var(--panel);border:1px solid var(--bd);border-radius:10px;padding:12px 14px;min-width:150px}
.node b{display:block;color:#fff} .node small{color:var(--mut)}
.arrow{display:flex;align-items:center;color:var(--acc);font-weight:700;font-size:20px}
.note{background:#13202b;border-left:3px solid var(--acc);padding:10px 14px;border-radius:6px;margin:12px 0}
.warn{background:#241d10;border-left:3px solid var(--amb);padding:10px 14px;border-radius:6px;margin:12px 0}
.pill{display:inline-block;font-size:11px;padding:1px 8px;border-radius:20px;border:1px solid var(--bd);color:var(--mut)}
.pill.new{color:var(--grn);border-color:#27512f} .pill.edit{color:var(--amb);border-color:#5a4a1f}
ul.chk{list-style:none;padding-left:0} ul.chk li{padding:3px 0 3px 26px;position:relative}
ul.chk li:before{content:"☐";position:absolute;left:0;color:var(--mut)}
.muted{color:var(--mut);font-size:13px}
.lh{display:flex;gap:8px;align-items:baseline;flex-wrap:wrap}
</style></head><body><div class="wrap">
<h1>Read-only Analytics MCP Server</h1>
<p class="sub">Implementation guide · standalone repo <code>fleetanalytics_mcp</code>, hosted on the <code>tracksolid_db</code> Coolify host · 2026-06-16 · <span class="pill">built — pending deploy</span></p>
<h2>1. Purpose &amp; context</h2>
<p>The decision &amp; analytics team needs to pull fleet reporting data (fuel, utilisation,
driver behaviour, INC tickets, raw telemetry) from <code>tracksolid_db</code> to make
decisions — <b>read-only, never edit/delete</b>. The only programmatic surface today is the
<code>dashboard_api</code> FastAPI bridge with a fixed set of <code>/analytics/*</code> /
<code>/webhook/*</code> endpoints — too rigid for ad-hoc analysis.</p>
<p>This adds a <b>hosted, read-only MCP server</b> that lets analysts query the database
directly from Claude: a guarded general <code>SELECT</code> tool plus schema-introspection
tools, pointed at the existing PostgreSQL 16 + TimescaleDB + PostGIS database through a
<b>new least-privilege <code>analytics_ro</code> role</b>.</p>
<p>The DB is internal-only (<code>DATABASE_URL</code> → <code>timescale_db:5432</code> on the
Docker network, not reachable from a laptop), so the server is <b>hosted on the same Coolify
host as the DB</b>. It ships as its own repo with its own <code>Dockerfile</code>
(Coolify-buildable) and joins the network that can reach <code>timescale_db</code>; a
<code>deploy.sh</code> manual fallback mirrors the proven <code>dashboard_api</code> bridge pattern.</p>
<div class="note"><b>Read-only is enforced at four layers:</b> the <code>analytics_ro</code>
GRANTs (no INSERT/UPDATE/DELETE) · a session <code>default_transaction_read_only = on</code>
· a transaction that is <b>rolled back</b> (never committed) · a single-statement / keyword
SQL guard in the <code>query</code> tool.</div>
<h3>Where this sits</h3>
<div class="flow">
<div class="node"><b>Analyst's Claude</b><small>Code / Desktop / claude.ai</small></div>
<div class="arrow">→</div>
<div class="node"><b>Traefik</b><small>fleetmcp.fivetitude.com · HTTPS + Bearer</small></div>
<div class="arrow">→</div>
<div class="node"><b>analytics_mcp</b><small>uvicorn :8892 · coolify net<br>role = analytics_ro · READ ONLY</small></div>
<div class="arrow">→</div>
<div class="node"><b>timescale_db:5432</b><small>tracksolid_db<br>reporting.* · tracksolid.*</small></div>
</div>
<p class="muted">Ports in use: <code>8890</code> prod dashboard_api · <code>8891</code> staging dashboard_api · <b><code>8892</code> analytics_mcp</b>.</p>
<h2>2. Repo contents</h2>
<table>
<tr><th>File</th><th>What</th></tr>
<tr><td><code>analytics_mcp.py</code></td><td>the MCP server (FastMCP streamable-HTTP; uvicorn target <code>analytics_mcp:app</code>)</td></tr>
<tr><td><code>Dockerfile</code></td><td>Coolify-buildable image (port 8892)</td></tr>
<tr><td><code>pyproject.toml</code></td><td>deps (<code>mcp[cli]</code>, <code>psycopg2-binary</code>, <code>uvicorn[standard]</code>)</td></tr>
<tr><td><code>deploy.sh</code></td><td>manual host deploy (standalone Traefik bridge) — fallback to Coolify</td></tr>
<tr><td><code>scripts/analytics_ro_role.sql</code></td><td>read-only role DDL (modelled on the backend's <code>dashboard_ro_role.sql</code> + hardening)</td></tr>
<tr><td><code>scripts/bootstrap_analytics_ro.sh</code></td><td>host bootstrap: generate pw → apply role SQL</td></tr>
<tr><td><code>docs/ANALYTICS_MCP.md</code> / <code>.html</code></td><td>this guide</td></tr>
</table>
<h2>3. Step 1 — the <code>analytics_ro</code> role</h2>
<p>Modelled on <code>scripts/dashboard_ro_role.sql</code>. Run as the <b>postgres
superuser</b> (it does <code>CREATE ROLE</code>), with the password supplied as psql var
<code>:'ro_pw'</code>, so <b>no secret is stored in the repo</b>.</p>
<h3><code>scripts/analytics_ro_role.sql</code></h3>
<pre><span class="c">-- read-only LOGIN role for the analytics MCP server. Apply via bootstrap_analytics_ro.sh.</span>
\set ON_ERROR_STOP on
<span class="k">DO</span> $role$ <span class="k">BEGIN</span>
<span class="k">IF NOT EXISTS</span> (<span class="k">SELECT</span> 1 <span class="k">FROM</span> pg_roles <span class="k">WHERE</span> rolname = <span class="s">'analytics_ro'</span>) <span class="k">THEN</span>
<span class="k">CREATE ROLE</span> analytics_ro LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE;
<span class="k">END IF</span>; <span class="k">END</span> $role$;
<span class="k">ALTER ROLE</span> analytics_ro <span class="k">WITH</span> LOGIN PASSWORD :'ro_pw';
<span class="k">GRANT</span> CONNECT <span class="k">ON DATABASE</span> tracksolid_db <span class="k">TO</span> analytics_ro;
<span class="k">GRANT</span> USAGE <span class="k">ON SCHEMA</span> reporting, tracksolid <span class="k">TO</span> analytics_ro;
<span class="k">GRANT</span> SELECT <span class="k">ON ALL TABLES IN SCHEMA</span> reporting <span class="k">TO</span> analytics_ro; <span class="c">-- tables + views</span>
<span class="k">GRANT</span> SELECT <span class="k">ON ALL TABLES IN SCHEMA</span> tracksolid <span class="k">TO</span> analytics_ro;
<span class="k">GRANT</span> SELECT <span class="k">ON</span> reporting.v_trips <span class="k">TO</span> analytics_ro; <span class="c">-- matview: explicit grant for clarity (ALL TABLES already includes matviews)</span>
<span class="k">GRANT</span> EXECUTE <span class="k">ON ALL FUNCTIONS IN SCHEMA</span> reporting <span class="k">TO</span> analytics_ro;
<span class="c">-- future objects auto-grant</span>
<span class="k">ALTER DEFAULT PRIVILEGES FOR ROLE</span> tracksolid_owner <span class="k">IN SCHEMA</span> reporting <span class="k">GRANT</span> SELECT <span class="k">ON TABLES TO</span> analytics_ro;
<span class="k">ALTER DEFAULT PRIVILEGES FOR ROLE</span> tracksolid_owner <span class="k">IN SCHEMA</span> tracksolid <span class="k">GRANT</span> SELECT <span class="k">ON TABLES TO</span> analytics_ro;
<span class="k">ALTER DEFAULT PRIVILEGES FOR ROLE</span> tracksolid_owner <span class="k">IN SCHEMA</span> reporting <span class="k">GRANT</span> EXECUTE <span class="k">ON FUNCTIONS TO</span> analytics_ro;
<span class="c">-- extra hardening over dashboard_ro: this role serves ad-hoc HUMAN queries</span>
<span class="k">ALTER ROLE</span> analytics_ro <span class="k">SET</span> default_transaction_read_only = on;
<span class="k">ALTER ROLE</span> analytics_ro <span class="k">SET</span> statement_timeout = <span class="s">'30s'</span>;
<span class="k">ALTER ROLE</span> analytics_ro <span class="k">SET</span> idle_in_transaction_session_timeout = <span class="s">'60s'</span>;</pre>
<h3><code>scripts/bootstrap_analytics_ro.sh</code></h3>
<p>Clone of <code>bootstrap_dashboard_ro.sh</code> — generates <code>~/.analytics_ro.pw</code>
(0600) on first run, applies the SQL via <code>docker exec … psql -v ro_pw=…</code>. The
password is never printed and never leaves the host.</p>
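<p>The "generate once, mode 0600, never print" behaviour can be illustrated with a minimal Python sketch. The real bootstrap is a shell script; the function name <code>ensure_password_file</code> and the 32-byte token length are assumptions made for illustration only.</p>

```python
import os
import secrets
from pathlib import Path


def ensure_password_file(path: Path) -> str:
    """Return the stored password, creating the file on first run only."""
    if path.exists():
        # Later runs reuse the existing secret instead of rotating it.
        return path.read_text().strip()
    password = secrets.token_urlsafe(32)
    # O_EXCL refuses to overwrite an existing file; the 0o600 mode is
    # applied at creation, so the secret is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(password + "\n")
    return password
```

<p>Calling the function twice with the same path returns the same password, which mirrors the "first run only" wording above. The value is returned to the caller and not printed.</p>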
<h2>4. The MCP server (<code>analytics_mcp.py</code>)</h2>
<p>FastMCP streamable-HTTP server, served by uvicorn (target <code>analytics_mcp:app</code>).
It uses its <b>own</b> read-only psycopg2 pool and a small local logger — it deliberately does
<b>not</b> import the backend's <code>ts_shared_rev</code> (that module eagerly requires the
Tracksolid ingestion secrets, which this read-only server has no business holding). Tools exposed:</p>
<table>
<tr><th>Tool</th><th>Purpose</th></tr>
<tr><td><code>query(sql, max_rows=1000)</code></td><td>guarded read-only SELECT/WITH; single statement, keyword-blocked, auto-LIMIT; returns rows + <code>truncated</code> flag</td></tr>
<tr><td><code>list_schemas()</code></td><td>readable schemas (<code>reporting</code>, <code>tracksolid</code>) + object counts</td></tr>
<tr><td><code>list_tables(schema)</code></td><td>tables + views in a schema</td></tr>
<tr><td><code>describe_table(schema, table)</code></td><td>columns, types, nullability, defaults</td></tr>
<tr><td><code>list_functions(schema='reporting')</code></td><td><code>reporting.fn_*</code> signatures</td></tr>
<tr><td><code>sample_table(schema, table, n=20)</code></td><td>first <code>n</code> rows (thin wrapper over <code>query</code>)</td></tr>
</table>
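<p>The auto-LIMIT and <code>truncated</code> flag of the <code>query</code> tool could work roughly as sketched below: fetch one row more than the cap, then report whether that extra row arrived. This is an assumption about the approach, not the repo's code. The helper names are hypothetical; the 1000 default and 10000 ceiling are the values quoted in the security notes.</p>

```python
DEFAULT_MAX_ROWS = 1000  # default from the query() signature
ROW_CEILING = 10_000     # ceiling quoted in the security notes


def clamp_max_rows(max_rows: int = DEFAULT_MAX_ROWS) -> int:
    """Keep the caller's row cap within 1..ROW_CEILING."""
    return max(1, min(int(max_rows), ROW_CEILING))


def with_limit(stmt: str, max_rows: int) -> str:
    """Wrap an already-guarded statement so at most cap + 1 rows come back."""
    cap = clamp_max_rows(max_rows)
    return f"SELECT * FROM ({stmt}) AS _q LIMIT {cap + 1}"


def trim(rows: list, max_rows: int) -> tuple:
    """Drop the sentinel row and report whether the result was truncated."""
    cap = clamp_max_rows(max_rows)
    return rows[:cap], len(rows) > cap
```

<p>Asking for one extra row lets the server tell "exactly <code>max_rows</code> rows" apart from "more rows exist" without running a separate <code>count(*)</code>.</p>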
<p>The core guard + connection logic:</p>
<pre><span class="c"># read-only pool: force read-only + statement timeout at connection level (belt + braces)</span>
_pool = psycopg2.pool.ThreadedConnectionPool(1, 8, DATABASE_URL,
    options=<span class="s">"-c default_transaction_read_only=on -c statement_timeout=30000"</span>)

<span class="f">@contextmanager</span>
<span class="k">def</span> _ro_conn():  <span class="c"># txn is ALWAYS rolled back, never committed</span>
    conn = _pool.getconn()
    <span class="k">try</span>:
        conn.set_session(readonly=<span class="k">True</span>, autocommit=<span class="k">False</span>)
        <span class="k">yield</span> conn
    <span class="k">finally</span>:
        conn.rollback()
        _pool.putconn(conn)

<span class="k">def</span> _guard(sql):  <span class="c"># single SELECT/WITH, no write/DDL keywords</span>
    stmt = _strip_comments(sql)
    parts = [p <span class="k">for</span> p <span class="k">in</span> stmt.split(<span class="s">";"</span>) <span class="k">if</span> p.strip()]
    <span class="k">if</span> len(parts) != 1:
        <span class="k">raise</span> ValueError(<span class="s">"Only a single statement is allowed."</span>)
    stmt = parts[0].strip()
    <span class="k">if not</span> re.match(<span class="s">r"^(select|with)\b"</span>, stmt, re.I):
        <span class="k">raise</span> ValueError(<span class="s">"Only SELECT/WITH allowed."</span>)
    <span class="k">if</span> _FORBIDDEN.search(stmt):
        <span class="k">raise</span> ValueError(<span class="s">"Forbidden (write/DDL) keyword."</span>)
    <span class="k">return</span> stmt</pre>
<p>Auth is a Starlette <code>BaseHTTPMiddleware</code> that requires
<code>Authorization: Bearer &lt;token&gt;</code>. Tokens come from env
<code>MCP_AUTH_TOKENS="alice:tok1,bob:tok2"</code> (per-analyst → revocable + attributable in
logs); <code>/healthz</code> is exempt. The app is mounted via
<code>app = mcp.streamable_http_app()</code>, then <code>app.add_middleware(BearerAuth)</code>
and <code>app.add_route("/healthz", …)</code> (Starlette exposes <code>add_route</code>, not a
Flask-style <code>@app.route</code> decorator — verified against the installed <code>mcp</code>).</p>
<div class="note">Full, current source is the repo's <code>analytics_mcp.py</code>; the excerpt
above is abridged.</div>
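<p>The per-analyst Bearer check can be sketched independently of Starlette. The functions below are hypothetical helpers, not the repo's middleware: they parse the <code>MCP_AUTH_TOKENS</code> format shown above and resolve an <code>Authorization</code> header to an analyst name, which a middleware could then use for the 401 decision and for log attribution.</p>

```python
import hmac
from typing import Dict, Optional


def parse_tokens(raw: str) -> Dict[str, str]:
    """Map token to analyst name from a string like 'alice:tok1,bob:tok2'."""
    tokens = {}
    for entry in raw.split(","):
        name, sep, token = entry.strip().partition(":")
        if sep and name and token:  # skip malformed or empty entries
            tokens[token] = name
    return tokens


def authenticate(header: Optional[str], tokens: Dict[str, str]) -> Optional[str]:
    """Return the analyst name for a valid 'Bearer TOKEN' header, else None."""
    if not header:
        return None
    scheme, _, presented = header.partition(" ")
    if scheme.lower() != "bearer":
        return None
    presented = presented.strip().encode()
    for token, name in tokens.items():
        # compare_digest performs a constant-time comparison of the two values.
        if hmac.compare_digest(presented, token.encode()):
            return name
    return None
```

<p>Returning the analyst name, not a boolean, is what makes each request attributable: the caller can be written into the log next to the SQL it ran.</p>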
<h2>5. Packaging — <code>Dockerfile</code> + <code>pyproject.toml</code></h2>
<p>Self-contained: <code>pyproject.toml</code> declares the deps (<code>mcp[cli]</code>,
<code>psycopg2-binary</code>, <code>uvicorn[standard]</code>) and the <code>Dockerfile</code>
builds a slim image running <code>uvicorn analytics_mcp:app</code> on port 8892. The project is
a flat single module, so <code>[tool.uv] package = false</code> and the Dockerfile installs
<b>dependencies only</b> (<code>uv sync --no-dev --no-install-project</code>) — no dependency on
the backend image.</p>
<h2>6. Deploy</h2>
<p>The DB is internal-only, so the server runs on the <b>same Coolify host as
<code>timescale_db</code></b>.</p>
<p><b>Recommended — Coolify-managed app.</b> Create a Coolify app from this repo, Dockerfile
build, app port <code>8892</code>, domain <code>fleetmcp.rahamafresh.com</code> (prod) /
<code>fleetmcp.fivetitude.com</code> (staging). Set secrets
<code>DATABASE_URL=postgresql://analytics_ro:&lt;pw&gt;@timescale_db:5432/tracksolid_db</code> and
<code>MCP_AUTH_TOKENS=alice:&lt;tok&gt;,bob:&lt;tok&gt;</code>, then <b>connect the app to the network
that can reach <code>timescale_db</code></b> so the hostname resolves. Coolify manages Traefik +
TLS from the domain; auto-deploys on push via the Forgejo webhook.</p>
<p><b>Fallback — <code>deploy.sh</code>.</b> Check the repo out on the host and run it: it builds
the image, resolves the DB network + DSN from the running stack, swaps in the
<code>analytics_ro</code> credentials, and runs a standalone Traefik bridge.</p>
<pre>cd ~/fleetanalytics_mcp &amp;&amp; git pull
MCP_AUTH_TOKENS=<span class="s">"alice:$(openssl rand -hex 16)"</span> bash deploy.sh</pre>
<h2>7. Deploy runbook (ordered)</h2>
<ol>
<li><b>Role (once):</b> <code>scp</code> the role SQL + bootstrap to <code>twala.rahamafresh.com</code>, run <code>bootstrap_analytics_ro.sh</code> (writes <code>~/.analytics_ro.pw</code>).</li>
<li><b>App:</b> point Coolify at this repo (§6) or run <code>deploy.sh</code> on the host. Record each analyst's token (shown once).</li>
<li><b>Network:</b> ensure the MCP container shares a Docker network with <code>timescale_db</code> so the DSN host resolves.</li>
<li><b>DNS/Traefik:</b> ensure <code>fleetmcp.*</code> resolves to the host; Coolify/Traefik issues the cert.</li>
</ol>
<h2>8. Add to Claude (for analysts)</h2>
<pre><span class="c"># Claude Code</span>
claude mcp add --transport http fireside-analytics https://fleetmcp.fivetitude.com \
--header <span class="s">"Authorization: Bearer &lt;your-token&gt;"</span>
claude mcp list <span class="c"># → "fireside-analytics: connected"</span></pre>
<p><b>Claude Desktop / claude.ai:</b> add a custom connector with the same URL and an
<code>Authorization: Bearer &lt;your-token&gt;</code> header. Example prompts: <i>"list the
schemas"</i>, <i>"describe reporting.v_daily_summary"</i>, <i>"top 10 cost centres by distance
in the last 30 days"</i>.</p>
<h2>9. Verification checklist</h2>
<ul class="chk">
<li><code>psql -U analytics_ro … "SELECT count(*) FROM reporting.v_daily_summary"</code> <b>succeeds</b>.</li>
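<li><code>psql -U analytics_ro … "CREATE TABLE x(i int)"</code> <b>fails</b> (typically <code>cannot execute CREATE TABLE in a read-only transaction</code>, otherwise permission denied), which shows the role cannot write.</li>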
<li>the image builds (<code>docker build .</code> or Coolify build); <code>analytics_mcp</code> is <code>Up</code>; the container can reach <code>timescale_db</code>.</li>
<li><code>DATABASE_URL</code> shows <code>analytics_ro</code> (pw masked); <code>curl localhost:8892/healthz</code> returns <code>{"ok":true,…}</code>.</li>
<li><code>claude mcp list</code> shows connected; <code>list_schemas</code> / <code>describe_table</code> / a real <code>query</code> return data.</li>
<li><code>query("UPDATE reporting.refresh_log …")</code> is <b>rejected</b> by the guard.</li>
<li>A request with a missing/bad bearer token returns <b>401</b>.</li>
<li><code>docker logs analytics_mcp</code> shows one audit line per query (caller, SQL, rows, ms).</li>
</ul>
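<p>The checklist expects <code>DATABASE_URL</code> to be shown with the password masked. One way to produce such a value is sketched below; <code>mask_dsn</code> is a hypothetical helper, and how the repo's scripts actually mask the DSN is not shown in this guide.</p>

```python
from urllib.parse import urlsplit, urlunsplit


def mask_dsn(dsn: str) -> str:
    """Replace the password in a postgresql:// URL with '***'."""
    parts = urlsplit(dsn)
    if parts.password is None:
        return dsn  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def dsn_user(dsn: str):
    """Return the user name in the DSN, to confirm it is analytics_ro."""
    return urlsplit(dsn).username
```

<p>The masked form is safe to print in deploy output, and <code>dsn_user</code> gives a quick check that the container is not running with a more privileged role.</p>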
<h2>10. Security notes</h2>
<ul>
<li><b>Four read-only layers:</b> role GRANTs · <code>default_transaction_read_only=on</code> (role + connection) · rolled-back txn · SQL keyword guard.</li>
<li><b>Least privilege:</b> <code>analytics_ro</code> only has <code>USAGE</code>+<code>SELECT</code> on <code>reporting</code>/<code>tracksolid</code> and <code>EXECUTE</code> on <code>reporting</code> functions.</li>
<li><b>Per-analyst tokens</b> make access revocable and queries attributable; rotate via <code>MCP_AUTH_TOKENS</code> + redeploy (recreate).</li>
<li><b>Resource guards:</b> <code>statement_timeout=30s</code>, idle-txn timeout, row cap (1000 default / 10000 ceiling).</li>
<li><b>Future:</b> swap static Bearer for OAuth if the team scales; add a column deny-list if PII lives in <code>tracksolid.*</code>.</li>
</ul>
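<p>Attribution depends on the per-query audit line (caller, SQL, rows, ms) that the verification checklist looks for in the container logs. A minimal sketch of building such a line follows. The JSON layout and the function name <code>audit_line</code> are assumptions; the repo's logger may format it differently.</p>

```python
import json
import time


def audit_line(caller, sql, rows, started):
    """Build one JSON audit line: caller, SQL, row count, elapsed milliseconds.

    `started` is a time.monotonic() reading taken before the query ran.
    """
    elapsed_ms = round((time.monotonic() - started) * 1000, 1)
    record = {
        "caller": caller,
        "sql": " ".join(sql.split()),  # collapse whitespace so the entry stays on one line
        "rows": rows,
        "ms": elapsed_ms,
    }
    return json.dumps(record)
```

<p>Emitting one self-contained line per query keeps <code>docker logs</code> output easy to filter by analyst or by slow statements.</p>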
<p class="muted" style="margin-top:40px">Companion file: <code>docs/ANALYTICS_MCP.md</code> (full source for all four new files).</p>
</div></body></html>