Date: 2026-03-06
Scope: All 6 mental model building blocks + Data Protection, Threat Defense, Sandbox (8 total)
Method: Codebase deep dive covering code, tests, integration points, and gaps
Code: 2,644 LOC across tappass/identity/ + tappass/registry/ (1,103 LOC)
Tests: 854 LOC (5 test files: auth, identity, saml, middleware, onboarding)
- Authentication: 4 methods: SPIFFE mTLS, API key, JWT bearer, and SAML/SSO
- RBAC: 5 roles (viewer, auditor, operator, admin, superadmin) with middleware enforcement
- Agent registration: tappass/registry/agents.py registers agents with metadata
- Pacts: 662 LOC; behavioral contracts declaring intended tool usage, data access patterns, and allowed models
- SAML/SSO: Full SAML 2.0 implementation (559 LOC) with SP-initiated flow
- Identity store: Pluggable (in-memory default, Redis/Postgres in prod)
- Brute force protection: Login attempt tracking with lockout
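The lockout behavior can be sketched as follows. This is an illustrative reconstruction, not the actual `tappass/identity/` code; the class name, threshold, and window values are assumptions.

```python
import time

class LoginAttemptTracker:
    """Sketch of login-attempt tracking with lockout (hypothetical names/values)."""

    def __init__(self, max_attempts=5, lockout_seconds=300):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self._failures = {}  # username -> list of failure timestamps
        self._locked = {}    # username -> lockout expiry timestamp

    def is_locked(self, username, now=None):
        now = now if now is not None else time.time()
        expiry = self._locked.get(username)
        return expiry is not None and now < expiry

    def record_failure(self, username, now=None):
        now = now if now is not None else time.time()
        # Only count failures inside the sliding window.
        window_start = now - self.lockout_seconds
        attempts = [t for t in self._failures.get(username, []) if t > window_start]
        attempts.append(now)
        self._failures[username] = attempts
        if len(attempts) >= self.max_attempts:
            self._locked[username] = now + self.lockout_seconds

    def record_success(self, username):
        # A successful login clears failure history and any lockout.
        self._failures.pop(username, None)
        self._locked.pop(username, None)
```

The sliding window means stale failures age out rather than accumulating forever.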
| Strengths | Weaknesses |
|---|---|
| 4 auth methods cover machine + human | Trust tier lives in sandbox/ not identity/: conceptual split |
| SPIFFE mTLS is rare in this space: real zero-trust | Pacts are checked only in tappass assess and token strategy: not enforced at pipeline runtime |
| SAML is enterprise-ready (many competitors skip this) | No agent grouping / team concept (agents are flat per org) |
| Brute force protection built in | Health score computed in observability/: identity doesn’t own “is this agent healthy?” |
| Opportunities | Threats |
|---|---|
| Pact enforcement in pipeline: block if agent violates declared intent | Competitors with simpler auth (API key only) ship faster |
| Trust tier should influence pipeline step selection (e.g., observer tier → stricter steps) | SPIFFE complexity may deter small teams |
| Agent groups with shared policies (e.g., “all customer-support agents”) | |
| Identity federation: agent identity portable across TapPass instances | |
| Integration | Status | Gap |
|---|---|---|
| Identity → Pipeline | ✅ agent_id and org_id flow into PipelineContext | Trust tier does NOT flow: pipeline doesn’t know the agent’s tier |
| Identity → Policy (OPA) | ✅ Agent identity included in OPA input | |
| Identity → Capability Tokens | ✅ Health score embedded in token claims | |
| Identity → Observability | ✅ Health + drift computed per agent_id | |
| Pacts → Pipeline | ❌ Missing: pacts are never checked during pipeline execution | Critical gap |
| Trust Tier → Pipeline | ❌ Missing: tier is sandbox-only, pipeline ignores it | Significant gap |
- Good: Auth flows (signup, login, lockout), SAML, middleware
- Missing: No tests for pact violation detection at runtime (because it doesn’t exist); no integration test proving the auth → pipeline → token flow end-to-end
Code: 16,696 LOC (runner, builder, 51 step files, scanner, flags, etc.)
Tests: ~955 LOC direct pipeline tests + 3,355 LOC step tests + 1,346 LOC ClawMoat tests = ~5,656 LOC
- Architecture: Register-decorated steps, StepConfig, StepResult, PipelineContext: clean plugin model
- 44 registered steps across 3 phases (before_call, call, after_call)
- 3 presets: Starter (11 steps), Standard (37), Regulated (43)
- 4 modes: Observe, Warn, Enforce, Lockdown
- 7 governance flags: one-header overrides (mode, pii, email, budget, files, tools, routing)
- Circuit breaker: Prevents cascading failures when LLM providers are down
- OPA integration: Pipeline queries OPA for dynamic step configuration per classification level
- Audit integration: Every step result written to audit trail (crash-safe, on-disk before next step)
- Gateway: Full OpenAI-compatible proxy (751 LOC) + Anthropic proxy (483 LOC) + streaming support
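The plugin model above can be sketched in a few lines. Identifiers here (`STEP_REGISTRY`, `register_step`, `PipelineContext`, positions) are illustrative, not the actual tappass names; the point is that adding a step really is one function plus one decorator.

```python
from dataclasses import dataclass, field

STEP_REGISTRY = {}  # phase -> sorted list of (position, name, fn)

def register_step(name, phase, position):
    """Decorator: registers a step into a phase at a given position."""
    def decorator(fn):
        STEP_REGISTRY.setdefault(phase, []).append((position, name, fn))
        STEP_REGISTRY[phase].sort()
        return fn
    return decorator

@dataclass
class PipelineContext:
    agent_id: str
    org_id: str
    findings: list = field(default_factory=list)

@register_step("detect_pii", phase="before_call", position=20)
def detect_pii(ctx):
    # A real step would inspect the request; here we just record a finding.
    ctx.findings.append({"step": "detect_pii", "action": "allow"})
    return ctx

def run_phase(phase, ctx):
    """Run every registered step in a phase, in position order."""
    for _, name, fn in STEP_REGISTRY.get(phase, []):
        ctx = fn(ctx)
    return ctx
```

A preset is then just a subset of registered step names to include per phase.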
| Strengths | Weaknesses |
|---|---|
| Plugin architecture is excellent: adding a step is 1 file + 1 decorator | No step dependency graph: steps can only rely on position ordering |
| Presets are customer-friendly: “just pick regulated” | No step-level metrics (latency per step, hit rate per step) |
| Flags enable one-header governance: huge DX win | 51 step files = discoverability challenge for new contributors |
| 3-phase model (before/call/after) is clean and intuitive | No conditional step execution (e.g., “only run this step if classification > INTERNAL”) |
| Gateway is OpenAI-compatible: zero migration friction | Step results are flat: no structured “this step found X, downstream step should react to X” beyond ctx.findings |
| Opportunities | Threats |
|---|---|
| Step metrics dashboard: “which steps block most?” | Pipeline latency could become a bottleneck at scale (44 steps sequential) |
| Conditional steps based on classification/trust tier | Competitors with simpler pipelines (5-10 steps) may have lower latency |
| Step marketplace: community-contributed steps | |
| Parallel step execution for independent before_call steps | |
| Integration | Status | Gap |
|---|---|---|
| Pipeline → OPA | ✅ Dynamic step config per classification | |
| Pipeline → Audit | ✅ Every step result persisted | |
| Pipeline → Gateway | ✅ Full OpenAI/Anthropic proxy | |
| Pipeline → SDK | ✅ SDK chat() calls proxy, gets governed response | |
| Pipeline ← Identity | ⚠️ Partial: agent_id/org_id flow in, but trust tier and pacts do not | |
| Pipeline → Health Score | ❌ Pipeline doesn’t update health score incrementally: it’s computed on read from audit trail | Latency for real-time health |
- Excellent: Individual steps well tested (3,355 LOC across ~17 test files)
- Good: Pipeline builder, flags, presets tested
- Missing: No benchmark tests for pipeline latency, no test that runs all 44 steps end-to-end in a single request, no step-ordering integration tests
Code: 1,806 LOC Python (tappass/policy/) + 652 LOC Rego + 3,156 LOC data.json
Tests: 1,436 LOC (7 test files)
- OPA/Rego: 5 policy modules (authz, pipeline, routing, trust, breakglass) running as sidecar
- Capability tokens: ES256-signed JWTs, 60s TTL, scoped to tools + operations, offline verify in 27μs
- Token strategy: Ed25519/CBOR option with trust scoring and delegation chains
- Policy replay: Record and replay policy decisions for debugging
- Breakglass: Emergency override mechanism with full audit trail
- data.json: 3,156 LOC of preset configuration defining which steps run in each preset and per-step thresholds
- require_approval: Human-in-the-loop gate (in-memory store, no API endpoints yet)
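The capability-token flow can be sketched as below. The real tokens are ES256-signed JWTs; to keep this example self-contained (stdlib only), it uses HMAC-SHA256 as a stand-in for the signature, and the claim names are assumptions. The shape of the check is the same: offline verification is just signature + expiry + scope, no network call.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-signing-key"  # stand-in for the ES256 key pair

def mint_token(agent_id, tools, ttl=60, now=None):
    """Mint a short-TTL token scoped to a list of tools."""
    now = now if now is not None else time.time()
    claims = {"sub": agent_id, "tools": tools, "exp": now + ttl}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

def verify_token(token, tool, now=None):
    """Offline verification: signature, then expiry, then tool scope."""
    now = now if now is not None else time.time()
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return now < claims["exp"] and tool in claims["tools"]
```

The 60s TTL keeps the blast radius of a leaked token small without requiring a revocation list.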
| Strengths | Weaknesses |
|---|---|
| OPA is industry-standard: CISOs trust it | OPA is a runtime dependency: adds latency + failure mode |
| Capability tokens are cryptographically sound (ES256 + Ed25519) | require_approval has no API endpoints: approval can’t actually happen |
| Breakglass is enterprise-critical and well-implemented | No policy versioning: can’t diff “what changed between Tuesday and today” |
| Policy replay is unique: powerful debugging tool | Token TTL is fixed at 60s: no per-operation TTL |
| Fail-closed: OPA down = all denied | data.json at 3,156 LOC is hard to maintain manually |
| Opportunities | Threats |
|---|---|
| Approval API endpoints (GET /approvals, POST /approvals/{id}/decide) | OPA cold-start latency could impact first request |
| Policy versioning with git-style diff | Competitors using simpler allow/deny lists ship faster |
| Visual policy editor in TUI/web | |
| Per-tool TTL on capability tokens | |
| Integration | Status | Gap |
|---|---|---|
| Policy → Pipeline | ✅ OPA bridge drives step configuration | |
| Policy → Tokens | ✅ Health score + trust score embedded | |
| Policy → Audit | ✅ Every decision logged with reason | |
| Approval → API | ❌ Missing: require_approval step exists but no REST endpoints to act on approvals | Blocking gap for production |
| Policy → SDK | ✅ SDK receives tokens, can verify offline | |
| Policy → Breakglass | ✅ Full audit trail on override | |
- Good: OPA client/bridge mocked and tested, policy store, token generation, replay
- Missing: No end-to-end test that proves OPA → pipeline → token → verify flow, no test for approval workflow (because API doesn’t exist)
Code: 601 LOC (3 step files) + 1,936 LOC gateway
Tests: 1,005 LOC (7 test files including circuit breaker)
- Classification-based routing: classify_data determines sensitivity → model_routing queries OPA → routes to the appropriate model
- Data classification: 4 levels (PUBLIC, INTERNAL, CONFIDENTIAL, RESTRICTED)
- Complexity scoring: Heuristic input to OPA (token count, code markers, tool count, conversation depth)
- OPA-driven decisions: Routing logic lives in routing.rego, not hardcoded
- EU data residency: Flag-driven; restricted data routes to an EU-hosted model
- Gateway: OpenAI-compatible proxy (751 LOC) + Anthropic (483 LOC) + streaming (195 LOC)
- Circuit breaker: Provider failure → automatic fallback with backoff
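The provider circuit breaker follows the standard pattern: trip open after repeated failures, then allow a probe after a cooldown. This is a minimal sketch with assumed thresholds, not the shipped implementation (the real one is 376 LOC and includes backoff and fallback routing).

```python
import time

class CircuitBreaker:
    """Sketch: open after N consecutive failures, half-open after a cooldown."""

    def __init__(self, failure_threshold=3, cooldown=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # None means circuit is closed

    def allow_request(self, now=None):
        now = now if now is not None else time.time()
        if self.opened_at is None:
            return True
        # Half-open: after the cooldown, let a probe request through.
        return now - self.opened_at >= self.cooldown

    def record_failure(self, now=None):
        now = now if now is not None else time.time()
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # trip the circuit

    def record_success(self):
        # Any success fully closes the circuit.
        self.failures = 0
        self.opened_at = None
```

While the circuit is open, the gateway would route to a fallback provider instead of failing the request outright.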
| Strengths | Weaknesses |
|---|---|
| Classification-driven routing is the right architecture | classify_data uses LLM for classification: adds latency + cost for every request |
| OPA-driven = routing rules are externalized and auditable | Only 2 providers (OpenAI, Anthropic): no Gemini, Mistral, local models |
| EU residency flag is compliance-ready | Complexity scoring is heuristic: no learned model |
| Circuit breaker prevents cascading failures | No cost-aware routing (pick cheapest model that can handle the task) |
| Gateway is OpenAI-compatible: zero SDK changes for users | No request caching / semantic dedup for identical queries |
| Opportunities | Threats |
|---|---|
| Add Gemini, Mistral, Azure OpenAI, local (Ollama) providers | Competitors with native multi-provider support (LiteLLM) |
| Cost-aware routing: cheap model for simple queries, expensive for complex | Classification latency may be unacceptable for real-time use |
| Classification caching: same tool description → same classification | |
| User-defined routing rules in the UI | |
| Integration | Status | Gap |
|---|---|---|
| Routing → OPA | ✅ routing.rego makes the decision | |
| Routing → Classification | ✅ classify_data feeds model_routing | |
| Routing → Gateway | ✅ Gateway uses the routed model | |
| Routing → Cost Tracking | ⚠️ Cost tracked after the fact, not used for routing decisions | |
| Routing → Health | ⚠️ Model failures counted in health but don’t trigger re-routing | |
- Good: Model routing logic, circuit breaker (376 LOC), classify_data tested
- Missing: No integration test proving classification → routing → gateway end-to-end, no test for EU residency routing
Code: 1,814 LOC pipeline tool steps + 3,676 LOC sandbox = 5,490 LOC
Tests: 777 LOC direct + ClawMoat tests covering forbidden zones, trust tiers, exfil = ~1,500 LOC
- Tool permissions: tool_permissions step allows/denies per tool name
- Tool constraints: tool_constraints step enforces argument-level rules (e.g., “read_file only in /workspace”)
- scan_tool_calls: 813 LOC, the deepest step; scans tool-call arguments for dangerous patterns, path traversal, and forbidden zones
- Tool integrity: Hash-based verification that tool definitions haven’t been tampered with
- Forbidden zones: 74 protected paths across 7 categories with Unicode bypass protection
- Dangerous commands: 28 blocked command patterns
- Verify tool governance: Pipeline step that checks tool registration
- filter_tools: Remove tools the agent shouldn’t see based on policy
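The forbidden-zone check with bypass protection can be sketched as follows. The two zones and helper names here are illustrative (the real list has 74 paths across 7 categories); the core idea is to defuse null bytes, Unicode confusables, and `../` traversal before comparing paths.

```python
import os
import unicodedata

# Illustrative subset; the shipped list has 74 protected paths.
FORBIDDEN_ZONES = ["/etc", "/root/.ssh"]

def normalize_path(raw):
    """Neutralize common bypass tricks before the zone comparison:
    null bytes, Unicode compatibility characters, and ../ traversal."""
    cleaned = raw.replace("\x00", "")
    cleaned = unicodedata.normalize("NFKC", cleaned)
    return os.path.normpath(cleaned)

def is_forbidden(raw_path):
    path = normalize_path(raw_path)
    return any(
        path == zone or path.startswith(zone + os.sep)
        for zone in FORBIDDEN_ZONES
    )
```

Without the normalization step, `/workspace/../etc/passwd` or a null-byte-spliced path would slip past a naive prefix check.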
| Strengths | Weaknesses |
|---|---|
| 3-layer defense (permissions → constraints → scan) is thorough | Tool constraints are defined in data.json: no UI for configuration |
| Forbidden zones with Unicode bypass protection is unique | No tool usage analytics (which tools are used most, by which agents) |
| Tool integrity hash checking prevents silent tampering | Tool permission changes require config reload: no hot-reload |
| 813 LOC scan_tool_calls is extremely comprehensive | No tool-level rate limiting (agent can call same tool 1000x/min) |
| Opportunities | Threats |
|---|---|
| Tool usage dashboard: “Agent X called read_file 500 times today” | Over-restriction may cause legitimate tool calls to be blocked |
| Tool-level rate limits and quotas | Competitors with simpler tool allow-lists may be easier to configure |
| Auto-learn tool constraints from pact declarations | |
| Dynamic tool restriction based on session risk accumulation | |
| Integration | Status | Gap |
|---|---|---|
| Tools → Pipeline | ✅ 6 tool-related steps in pipeline | |
| Tools → OPA | ✅ Tool permissions in OPA policy | |
| Tools → Forbidden Zones | ✅ scan_tool_calls checks forbidden zones | |
| Tools → Audit | ✅ Every tool call logged | |
| Tools → Trust Tier | ❌ Missing: tool restrictions don’t vary by trust tier at pipeline level | TrustGuardian exists but isn’t called from pipeline |
| Tools → SDK govern() | ✅ SDK wraps tools for audit reporting | |
- Good: scan_tool_calls, tool_integrity, forbidden zones (in ClawMoat tests)
- Missing: No test for “tool X is allowed for standard tier but blocked for observer tier” (because the integration doesn’t exist in pipeline)
Code: 1,511 LOC (tappass/sandbox/)
Tests: 159 LOC (test_sandbox_supervisor) + ClawMoat phase 2 tests (~300 LOC)
- Trust tiers: 4 levels (Observer → Worker → Standard → Full) with progressive permissions
- TrustGuardian: 665 LOC; checks read/write/exec/network against tier permissions
- Forbidden zones: 74 paths, 7 categories, Unicode/null-byte sanitization
- Credential monitor: Watches for agent access to credential files (329 LOC)
- Exfil blocklist: 60 domains across 4 categories (paste, webhook, cloud, DNS tunnel)
- Supervisor: nono integration for kernel-level process isolation
- Auto-escalation: evaluate_auto_escalation() handles trust tier progression
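The tier-gated permission check can be sketched as below, in the spirit of TrustGuardian. The permission matrix here is an assumption for illustration, not the shipped mapping; the ordered-tier comparison is the key mechanism.

```python
from enum import IntEnum

class TrustTier(IntEnum):
    """Ordered tiers: a higher tier inherits everything below it."""
    OBSERVER = 0
    WORKER = 1
    STANDARD = 2
    FULL = 3

# Minimum tier required per operation class (assumed values).
MIN_TIER = {
    "read": TrustTier.OBSERVER,
    "write": TrustTier.WORKER,
    "exec": TrustTier.STANDARD,
    "network": TrustTier.FULL,
}

def check_permission(tier: TrustTier, operation: str) -> bool:
    """Allow the operation only if the agent's tier meets the minimum."""
    return tier >= MIN_TIER[operation]
```

Because IntEnum tiers are ordered, auto-escalation is just moving an agent to the next enum value once its behavior qualifies.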
| Strengths | Weaknesses |
|---|---|
| Trust tiers are a unique differentiator: no competitor has this | Sandbox is completely disconnected from the pipeline: it’s SDK-side only |
| 74 forbidden zones with bypass protection is comprehensive | Credential monitor uses atime: unreliable on noatime/relatime mounts |
| Auto-escalation based on behavior is forward-thinking | Exfil blocklist is NOT used by detect_exfiltration pipeline step |
| nono integration for kernel isolation is real security | Supervisor.py is thin (148 LOC): actual nono config is generated but enforcement is SDK-side |
| Opportunities | Threats |
|---|---|
| Connect sandbox to pipeline: pipeline should check trust tier | Sandbox being SDK-side means a malicious SDK can bypass it |
| Exfil blocklist should feed into detect_exfiltration step | Credential monitor atime limitation reduces value on modern Linux |
| Sandbox telemetry: report sandbox violations to TapPass server | Competitors with built-in container isolation (not library-level) |
| Sandbox profile generation from pact declarations | |
| Integration | Status | Gap |
|---|---|---|
| Sandbox → Pipeline | ❌ Missing: pipeline doesn’t know about trust tiers or sandbox | The biggest integration gap |
| Sandbox → SDK | ✅ agent.secure(tier="worker") | |
| Sandbox → Discovery | ✅ Governance recommends sandbox when tools reference forbidden zones | |
| Sandbox → Assess | ✅ tappass assess checks forbidden zones and recommends tier | |
| Exfil Blocklist → Pipeline | ❌ Missing: detect_exfiltration has its own patterns, doesn’t use blocklist | Duplicate logic |
| Credential Monitor → Anything | ❌ Missing: monitor exists but nothing consumes its events | Orphaned module |
- Weak: Only 159 LOC for supervisor, trust tiers tested mainly in ClawMoat phase 2
- Missing: No test for sandbox enforcement end-to-end, no test for credential monitor alerting, no test for exfil blocklist integration
Code: 664 LOC detect_pii + 496 LOC detect_secrets + 428 LOC pii_tokenize + 294 LOC pii_restore + 547 LOC scan_output + 285 LOC classify_data = 2,714 LOC
Tests: 146 + 215 + 53 + 57 + 458 + 164 = 1,093 LOC
- PII detection: Presidio-based with regex fallback, 664 LOC, supports 15+ entity types
- PII tokenization: Replace PII with deterministic tokens, restore after LLM call
- Secret detection: API keys, AWS keys, GCP keys, private keys, JWTs, connection strings
- Data classification: LLM-based classification into 4 levels (PUBLIC → RESTRICTED)
- Output scanning: DLP on LLM responses; catches PII/secrets in output
- Taint tracking: Cross-request data flow tracking (324 LOC)
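The tokenize → restore round-trip can be sketched as follows. Deterministic tokens mean the same PII value always maps to the same placeholder within a session, so the LLM sees consistent references. The token format and session-key handling here are assumptions, not the pii_tokenize implementation.

```python
import hashlib

def make_token(entity_type, value, session_key=b"demo-session-key"):
    """Deterministic placeholder: same (type, value, key) -> same token."""
    digest = hashlib.sha256(session_key + value.encode()).hexdigest()[:8]
    return f"<{entity_type}_{digest}>"

def tokenize(text, entities):
    """Replace detected PII with tokens.
    entities: list of (entity_type, value) pairs from PII detection."""
    mapping = {}
    for entity_type, value in entities:
        token = make_token(entity_type, value)
        mapping[token] = value
        text = text.replace(value, token)
    return text, mapping

def restore(text, mapping):
    """Reinstate original values in the LLM's response."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

The mapping never leaves the governance layer, so the upstream model only ever sees placeholders.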
| Strengths | Weaknesses |
|---|---|
| Presidio + regex is production-grade PII detection | PII tokenize → restore flow adds latency (2 extra steps) |
| Taint tracking across requests is unique and powerful | classify_data uses LLM call: adds cost to every request |
| Output scanning catches LLM hallucinating PII | No PII detection for images/multimodal (text-only) |
| Secret detection patterns are comprehensive | No custom PII entity types (can’t add “employee ID format”) |
| Opportunities | Threats |
|---|---|
| Custom PII recognizers per org (e.g., custom ID formats) | False positive PII detection may redact legitimate content |
| Multimodal PII: scan images for screenshots with PII | Competitors with on-device PII detection (lower latency) |
| Classification caching: same context = same classification | |
| PII detection confidence scores exposed to policy (block only high-confidence PII) | |
| Integration | Status | Gap |
|---|---|---|
| PII → Pipeline | ✅ 6 data protection steps in pipeline | |
| PII → Taint | ✅ PII findings feed into taint tracking | |
| PII → Output | ✅ scan_output catches PII in responses | |
| PII → Classification → Routing | ✅ Full chain works | |
| PII → Health Score | ✅ PII detection rate is a health dimension | |
| PII → Custom recognizers | ❌ Missing: no way to add custom entity types | |
- Good: detect_pii, detect_secrets, scan_output all tested
- Missing: No end-to-end test for PII tokenize → LLM call → PII restore round-trip
Code: 4,165 LOC observability + 1,112 LOC export + 1,448 LOC audit + 224 LOC guardrails = 6,949 LOC
Tests: 1,914 LOC (6 test files)
- Health score: 5 dimensions (compliance, data safety, security, stability, efficiency), 0–100, computed on read from audit trail
- Behavioral drift: 8 signals, KL/JS divergence, 4 severity levels
- Audit trail: Hash-chained (SHA-256), tamper-evident, per-org storage
- Evidence: Cryptographic evidence bundles for compliance (557 LOC)
- SIEM export: CEF, OCSF, JSON+HMAC formats with PII redaction
- Alerting: Webhook-based with configurable severity thresholds (461 LOC)
- Metering: Usage tracking per org/agent (307 LOC)
- Guardrail packs: 6 packs (EU AI Act, financial, healthcare, HR, NIS2, financial credentials)
- Assessment (tappass assess): CLI tool that generates markdown security reports
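The hash-chaining idea behind the tamper-evident audit trail can be sketched in a few lines: each record embeds the hash of its predecessor, so altering any entry breaks every hash after it. Field names and serialization here are illustrative, not the actual audit format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # anchor for the first record

def append_record(chain, event):
    """Append an event, chaining it to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {"event": event, "prev_hash": prev_hash}
    serialized = json.dumps(body, sort_keys=True).encode()
    body["hash"] = hashlib.sha256(serialized).hexdigest()
    chain.append(body)
    return chain

def verify_chain(chain):
    """Recompute every hash; any edit anywhere makes this return False."""
    prev_hash = GENESIS_HASH
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        body = {"event": record["event"], "prev_hash": record["prev_hash"]}
        serialized = json.dumps(body, sort_keys=True).encode()
        if record["hash"] != hashlib.sha256(serialized).hexdigest():
            return False
        prev_hash = record["hash"]
    return True
```

This is what makes the trail "computed on read" viable: health and drift can be rederived from the chain with confidence that no entry was silently edited.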
| Strengths | Weaknesses |
|---|---|
| Health score from audit trail = zero new instrumentation | Health computed on-read = stale for real-time dashboards |
| 8-signal drift detection is academically sound (JS divergence) | No alerting on drift: drift is computed but doesn’t trigger alerts |
| Hash-chained audit is tamper-proof: regulators love this | SIEM export has no live streaming (batch only) |
| 6 guardrail packs cover major regulations | Guardrail pack loader exists but packs aren’t enforced at runtime |
| SIEM export with PII redaction is compliance-ready | No Prometheus/Grafana metrics endpoint |
| Opportunities | Threats |
|---|---|
| Real-time health score (compute incrementally on each pipeline run) | Competitors with native Prometheus/Grafana integration |
| Drift → automatic alerts → automatic tier demotion | Batch SIEM export may not meet SOC 2 real-time requirements |
| Guardrail pack enforcement in pipeline (not just assessment) | |
| /metrics endpoint for Prometheus scraping | |
| OpenTelemetry integration for distributed tracing | |
| Integration | Status | Gap |
|---|---|---|
| Audit ← Pipeline | ✅ Every step result persisted | |
| Health ← Audit | ✅ Computed from audit data | |
| Drift ← Audit | ✅ Computed from audit data | |
| Health → Token | ✅ Embedded in capability token claims | |
| SIEM → Export | ✅ Multiple format support with redaction | |
| Drift → Alerts | ❌ Missing: drift detected but no automatic alerting | |
| Guardrail Packs → Pipeline | ❌ Missing: packs define rules but don’t enforce them | Assessment-only |
| Metrics → Prometheus | ❌ Missing: no /metrics endpoint | |
| Health → Pipeline | ❌ Missing: low health score doesn’t restrict agent | |
- Good: Health score, drift, SIEM export, assess all well tested
- Missing: No test for drift → alert flow (doesn’t exist), no test for guardrail pack enforcement (doesn’t exist)
| From ↓ / To → | Identity | Pipeline | Policy | LLM Routing | Tool Gov | Sandbox | Data Prot | Observability |
|---|---|---|---|---|---|---|---|---|
| Identity | : | ⚠️ id only | ✅ | ⚠️ | ⚠️ | ❌ | : | ✅ |
| Pipeline | : | : | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Policy | ✅ | ✅ | : | ✅ | ✅ | ❌ | : | ✅ |
| LLM Routing | : | ✅ | ✅ | : | : | : | ✅ | ✅ |
| Tool Gov | : | ✅ | ✅ | : | : | ❌ | : | ✅ |
| Sandbox | : | ❌ | ❌ | : | ❌ | : | : | ❌ |
| Data Prot | : | ✅ | : | ✅ | : | : | : | ✅ |
| Observability | ✅ | : | ✅ | : | : | : | : | : |
- ✅ = Well integrated
- ⚠️ = Partially integrated
- ❌ = Not integrated (gap)
- : = Not applicable
| # | Gap | Impact | Effort | Priority | Status |
|---|---|---|---|---|---|
| 1 | Sandbox disconnected from pipeline | High | Medium | P0 | ✅ FIXED: trust_tier in PipelineContext, steps adapt |
| 2 | Pacts not enforced at runtime | High | Medium | P0 | ✅ FIXED: check_pact step (pos 110) |
| 3 | Approval API missing | High | Low | P0 | ⏭️ Skipped (not needed now) |
| 4 | Exfil blocklist orphaned | Medium | Low | P1 | ✅ FIXED: merged into detect_exfiltration |
| 5 | Credential monitor orphaned: events go nowhere | Medium | Low | P1 | Open |
| 6 | Health score doesn’t restrict agents | Medium | Medium | P1 | Open |
| 7 | Drift doesn’t trigger alerts | Medium | Low | P1 | ✅ FIXED: fire_drift_alert + fire_health_alert |
| 8 | Guardrail packs are assessment-only | Medium | Medium | P2 | Open |
| 9 | No /metrics endpoint | Medium | Low | P2 | Open |
| 10 | Only 2 LLM providers | Medium | Medium | P2 | Open |
| Building Block | Code Quality | Test Coverage | Integration | Value Prop Alignment | Overall |
|---|---|---|---|---|---|
| Agent Identity | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | B+ |
| Governance Pipeline | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | A |
| Policy Engine | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | A− |
| LLM Routing | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | B |
| Tool Governance | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | B+ |
| Sandbox | ⭐⭐⭐⭐ | ⭐⭐ | ⭐ | ⭐⭐ | C+ |
| Data Protection | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | A |
| Observability | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | B+ |
Strongest: Pipeline architecture and Data Protection. Both are production-grade and well-integrated.
Weakest: Sandbox. Sophisticated code exists but is almost entirely disconnected from the server-side governance flow; it is a library, not part of the system.
The single highest-impact improvement: connect trust tiers to the pipeline. Once the pipeline knows the agent’s trust tier, every step can adapt: stricter thresholds for observer agents, relaxed thresholds for full-trust agents. This turns the sandbox from an optional SDK feature into a core governance layer.