Threat Model
Single-operator homelab scope. Calibrate every "risk" below to a single trusted operator on their own host — not a hostile-tenant SaaS. Multi-tenant deployment is out of scope and not addressed by this design.
Trust boundaries
flowchart TB
subgraph host [operator host]
subgraph trusted [trusted: operator + root]
OP[operator]
KEY[/signing.key<br/>Ed25519 audit key · 0440/]
CAK[/ca.key<br/>Name-Constrained CA key · 0440/]
end
subgraph svc [waterwall service · unprivileged user, hardened]
PX[mitmproxy addon<br/>127.0.0.1:8888]
AD[admin/healthz<br/>127.0.0.1:8889 loopback only]
LOG[(audit chain + receipts)]
end
AG[AI agents<br/>semi-trusted clients]
end
NET[[upstream provider APIs<br/>UNTRUSTED: must never see plaintext]]
OP --> svc
AG -- loopback TLS --> PX
PX -- tokenized TLS --> NET
KEY -. signs audit artifacts .-> LOG
CAK -. mints leaf certs .-> PX
The single hard boundary Waterwall enforces is agent → upstream: plaintext secrets must not cross it. Everything inside the host is semi-trusted under the single-operator model; the audit layer makes operator-side tampering evident, not impossible.
In scope (mitigated)
| Threat | Mitigation |
|---|---|
| Plaintext credential leaving the host | Request bodies walk a JSON path-allowlist; secret-shaped strings become <pl:TYPE:HMAC8> placeholders before forwarding, across all permitted hosts via per-host SSE dispatch. |
| Config error silently disabling redaction | A missing/unparseable host config is fail-closed: every request returns 502 rather than forwarding plaintext, and the kill-switch check runs before the host gate. |
| Audit-log tampering | Hash-chained JSONL: each line carries prev_hash. verify-chain reports the first seq where continuity breaks. The chain resumes across restarts, so legitimate restarts don't look like tampering. |
| Forgery via replayed signature | Periodic Ed25519 checkpoints; verify-chain recomputes the root from the line's own content before checking the signature, so a genuine (root, signature) replayed onto a fabricated chain fails. |
| Evidence-bundle tampering / omission | export-evidence signs the MANIFEST; verify-evidence checks it, cross-checks chain stats against the actual verify result, and cross-references every receipt to a real redaction line. |
| Chain-append failure | Fail-closed on both request and response paths → 502 on the in-flight request; checkpoints fsync. |
| Mid-flight policy change unnoticed | A policy_hash is stamped on every redaction line. A hot reload emits a policy_change event, and a refused reload returns 500 instead of a false success. |
| Operator panic / runaway errors | Four-source kill switch (config / SIGUSR1 / sentinel / HTTP), OR-composed, fail-closed. |
| CA misuse beyond permitted hosts | The CA is X.509 Name-Constrained (critical NameConstraints) to the exact host set; verify-install validates it against the live list and rejects an expired CA or non-critical constraints. |
| Admin-endpoint exposure | /healthz and /admin/* bind 127.0.0.1 only; loopback-only is enforced in code, not user-configurable. |
| Client header steering artifact paths | Request-id / session-id headers are sanitized before use in receipt/manifest filenames — a ../ value cannot escape the output directory. |
| systemd privilege escalation | Hardened unit: NoNewPrivileges, ProtectSystem=strict, ProtectHome, empty CapabilityBoundingSet, a SystemCallFilter, memory/CPU caps, and read-only config/code paths. |
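The hash-chain row above can be illustrated with a minimal sketch. It assumes SHA-256 over a canonical JSON encoding of each line and an all-zero genesis value. The document does not state the hash function, the canonicalization, or the field layout beyond `prev_hash` and `seq`, so the names `line_hash`, `append`, and `first_break` are hypothetical and this is not Waterwall's actual implementation.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed prev_hash for the very first line


def line_hash(entry: dict) -> str:
    """Hash a chain entry over its canonical JSON form (assumed scheme)."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append an entry that commits to the hash of the previous entry."""
    prev = line_hash(chain[-1]) if chain else GENESIS
    chain.append({"seq": len(chain), "prev_hash": prev, **payload})


def first_break(chain: list) -> Optional[int]:
    """Return the first seq whose prev_hash does not match, or None if intact."""
    expected = GENESIS
    for entry in chain:
        if entry["prev_hash"] != expected:
            return entry["seq"]
        expected = line_hash(entry)
    return None


chain: list = []
for event in ("redaction", "detokenization", "redaction"):
    append(chain, {"line_type": event})

assert first_break(chain) is None
chain[1]["line_type"] = "killswitch"  # tamper with an earlier line
assert first_break(chain) == 2        # the following line no longer links
```

In this sketch, editing line 1 is reported at seq 2, because the break only becomes visible where the next line's `prev_hash` stops matching. The final line has no successor to commit to it, which is one reason a chain like this is paired with signed checkpoints.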
Out of scope (not mitigated, by design)
- Root attacker on the host. A root user can read the signing key and forge signatures with the live key. Waterwall is tamper-evident, not tamper-proof. A separate signer process is a future enhancement.
- Novel credential formats not in the pattern set. A new key shape isn't redacted until you add it. The model is "pattern-set as published policy" — an unknown format is honest data, not a redaction failure.
- Encoded payloads. A secret base64-encoded inside a JSON string is not scanned; matching is at the literal-string level.
- Cert-pinning bypass. A client with baked-in cert pinning bypasses TLS interception entirely. Re-verify your client respects NODE_EXTRA_CA_CERTS before any upgrade.
- Upstream package compromise. Dependencies are trusted; pinned versions are the mitigation.
- DoS / resource exhaustion. Memory/CPU caps are blunt instruments; a determined local attacker can still saturate the proxy. Out of scope for a single operator.
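The pattern-set and encoded-payload limits above can be seen in a small sketch of literal-string matching. Everything here is an assumption for illustration: the two patterns, the `HMAC_KEY` value, and the reading of `HMAC8` as the first 8 hex characters of an HMAC-SHA256 are hypothetical, since the document does not publish the real pattern set or placeholder derivation.

```python
import base64
import hashlib
import hmac
import re

# Hypothetical pattern set; the real published patterns are not shown here.
PATTERNS = {
    "AWS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GH_TOKEN": re.compile(r"ghp_[A-Za-z0-9]{36}"),
}
HMAC_KEY = b"demo-only-key"  # assumption: some per-install secret


def placeholder(kind: str, secret: str) -> str:
    """Build <pl:TYPE:HMAC8>, taking HMAC8 as 8 hex chars of HMAC-SHA256 (assumed)."""
    digest = hmac.new(HMAC_KEY, secret.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<pl:{kind}:{digest[:8]}>"


def redact(text: str) -> str:
    """Replace every literal match of a known pattern with its placeholder."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: placeholder(k, m.group(0)), text)
    return text


secret = "AKIA" + "A" * 16
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

print(redact(f"key={secret}"))     # the literal secret becomes a placeholder
print(redact(encoded) == encoded)  # True: the base64 form matches no pattern
```

The second call shows the limitation directly: the same secret, once base64-encoded, no longer matches any literal pattern and passes through unchanged.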
Honest limitations
- Tamper-evidence ≠ non-repudiation — the signer key lives in the addon process.
- No entropy fallback — the pattern set is regex-only; a high-entropy token in an unfamiliar format passes through. Operator-tunable entropy gating is a candidate enhancement.
- SSE is buffer-then-restore, not true per-chunk streaming — long-running streams block until completion. True per-chunk streaming is planned.
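For the entropy-gating enhancement mentioned above, one possible shape is a Shannon-entropy check on candidate strings. This is only a sketch of the idea, not a planned design: the function names, the 20-character minimum, and the 4.0 bits-per-character threshold are all hypothetical values an operator would need to tune.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, from the string's own character frequencies."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_high_entropy(s: str, min_len: int = 20, threshold: float = 4.0) -> bool:
    """Flag long strings whose per-character entropy meets the (assumed) threshold."""
    return len(s) >= min_len and shannon_entropy(s) >= threshold


assert not looks_high_entropy("a" * 40)                         # repetitive
assert looks_high_entropy("abcdefghijklmnopqrstuvwxyz012345")   # 32 distinct chars
```

A gate like this trades missed secrets for false positives: long identifiers and hashes that are not secrets also score high, which is why a tunable threshold matters.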
Compliance framework mapping
Every chain line carries a frameworks tag list mapping the operation to recognized control
families. Representative tags:
| line_type | Framework tags |
|---|---|
| redaction | SOC2-CC7.2, SOC2-CC9.2, OWASP-LLM-02, OWASP-LLM-06, EU-AI-Act-Art-12, EU-AI-Act-Art-13, MITRE-ATLAS-T0048, NIST-800-53-AC-4 |
| detokenization | SOC2-CC7.2, OWASP-LLM-02 |
| killswitch | SOC2-CC7.3, EU-AI-Act-Art-15 |
| policy_change | SOC2-CC8.1 |
| manifest | SOC2-CC4.1, EU-AI-Act-Art-12 |
Families: SOC 2 (monitoring, system ops, change management, risk mitigation), OWASP-LLM (insecure output handling, sensitive-information disclosure), EU AI Act (record-keeping, transparency, accuracy/robustness), MITRE ATLAS (sensitive-data exposure), NIST 800-53 (information-flow enforcement).
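As a sketch of how the tag list might be attached and queried, the mapping above can be expressed as a lookup keyed by `line_type`. The field name `frameworks` follows the wording above. The helper names `with_frameworks` and `lines_for` are hypothetical, and because the table is described as representative, the mapping here may not be exhaustive.

```python
# Mapping copied from the representative table above; not necessarily exhaustive.
FRAMEWORK_TAGS = {
    "redaction": [
        "SOC2-CC7.2", "SOC2-CC9.2", "OWASP-LLM-02", "OWASP-LLM-06",
        "EU-AI-Act-Art-12", "EU-AI-Act-Art-13", "MITRE-ATLAS-T0048",
        "NIST-800-53-AC-4",
    ],
    "detokenization": ["SOC2-CC7.2", "OWASP-LLM-02"],
    "killswitch": ["SOC2-CC7.3", "EU-AI-Act-Art-15"],
    "policy_change": ["SOC2-CC8.1"],
    "manifest": ["SOC2-CC4.1", "EU-AI-Act-Art-12"],
}


def with_frameworks(line: dict) -> dict:
    """Return a copy of a chain line with its tag list attached."""
    tags = FRAMEWORK_TAGS.get(line["line_type"], [])
    return {**line, "frameworks": list(tags)}


def lines_for(chain: list, tag_prefix: str) -> list:
    """Select lines carrying at least one tag from a given framework family."""
    return [
        line for line in chain
        if any(tag.startswith(tag_prefix) for tag in line["frameworks"])
    ]


chain = [
    with_frameworks({"seq": 0, "line_type": "redaction"}),
    with_frameworks({"seq": 1, "line_type": "policy_change"}),
]
print([line["seq"] for line in lines_for(chain, "EU-AI-Act")])  # [0]
```

Filtering by prefix gives an auditor a per-framework view of the chain without a second index.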