Skip to content

MCPSecBench Compatibility Matrix

TapPass coverage against the MCPSecBench benchmark. 17 MCP attack categories from Yang, Wu & Chen (arXiv:2508.13220, 2025).

MCPSecBench tests 17 attack types across MCP providers (Claude Desktop, OpenAI, Cursor). TapPass sits as a governance proxy between the agent and MCP servers, catching attacks at runtime rather than relying on the LLM to resist them.

ResultCountPercentage
Detected & Blocked1482%
🟡 Detected (logged)212%
🔴 Not applicable16%
Total1794% effective
#Attack CategoryTapPass Step(s)CoverageNotes
1Prompt Injectiondetect_injection (5-category: structural, behavioral, obfuscation, payload, indirect)✅ Block100+ regex patterns, Unicode normalization, base64/ROT13 decode
2Tool/Service Misuse via Confused AItaint_check (session-scoped), scan_tool_calls✅ BlockCross-request taint tracks EXTERNAL data flowing into dangerous sinks (shell, network, email)
3Schema Inconsistenciestool_integrity (registration + session baselines)✅ BlockSHA-256 hash of tool definitions. Mid-session changes flagged as rug pull
4Slash Command Overlapdetect_tool_poison (shadowing patterns)✅ BlockDetects tool descriptions containing instructions to override other tools
5Vulnerable Clientscan_tool_calls, forbidden_zones, detect_code_exec✅ Block70+ protected paths, 28+ dangerous command patterns, path traversal detection
6MCP Rebindingtool_integrity (session-level hash comparison)🟡 DetectSession tool integrity detects definition changes between requests. DNS rebinding itself is network-layer (outside proxy scope)
7Man-in-the-MiddlemTLS (SPIFFE/SPIRE), tool_integrity🟡 DetectTapPass uses mTLS for agent ↔ proxy. MCP server connections depend on transport security. Tool hash changes from MITM are detected
8Tool Shadowing Attackdetect_tool_poison (_SHADOW_OVERRIDE_PATTERNS)✅ BlockDetects “before using any other tool”, “always call me first”, “override”, “instead of” patterns in tool descriptions
9Data Exfiltrationdetect_exfiltration, taint_check, scan_tool_calls, shell_bleed✅ BlockPaste services, DNS tunneling, file piping, covert channels. Taint tracks PII/secrets flowing into network sinks
10Package Name Squatting (Tool Name)detect_tool_poison (shadowing), tool_permissions✅ BlockTool allowlists per agent prevent unregistered tools. Shadow detection catches tools mimicking legitimate ones
11Indirect Prompt Injectionscan_tool_results (10 patterns), redact_tool_results✅ BlockScans role=tool messages for injection before LLM processes them. Redacts PII/secrets in tool output
12Package Name Squatting (Server Name)tool_permissions, verify_tool_governance✅ BlockPer-agent tool allowlists. Only approved tools are forwarded to LLM
13Configuration Drifttool_integrity (persistent hashes), tappass assess (rug-pull detection)✅ BlockRegistration baseline + session baseline + cross-run snapshot diffing
14Sandbox Escapesandbox/forbidden_zones.py, sandbox/trust_tiers.py, detect_code_exec✅ BlockKernel-level isolation (nono), 70+ forbidden paths, trust tier enforcement, dangerous command database
15Tool Poisoning Attackdetect_tool_poison (injection patterns in descriptions + parameters)✅ BlockScans function.description, parameter.description, and parameter.enum for injection patterns
16Vulnerable Serverscan_tool_results, taint_check, tool_integrity✅ BlockMalicious server output caught by indirect injection scan + taint tracking. Tool definition changes detected
17Rug Pull Attack_check_session_tool_integrity, tappass assess --discover-tools (history.py)✅ BlockSession-level: hash comparison per request. Assessment-level: SHA-256 snapshot diffing between runs
CategoryWhyMitigation
DNS Rebinding (partial)Network-layer attack: TapPass operates at application layerDetected indirectly via tool definition changes. Recommend network-level DNS pinning

MCPSecBench found that Claude Desktop and OpenAI are vulnerable to 15-16 of 17 attacks (only Prompt Injection is partially blocked). TapPass blocks 14 categories because it operates as a proxy: the LLM never sees malicious content.

ProviderAttacks BlockedArchitecture
Claude Desktop1-2/17Relies on LLM judgment
OpenAI2-3/17Relies on LLM judgment
Cursor1-2/17Relies on LLM judgment
TapPass14/17Deterministic proxy: scans before LLM sees content

MCPSecBench requires a running MCP server to test against. To validate TapPass coverage:

Terminal window
# 1. Run TapPass proxy
tappass serve --mode enforce
# 2. Point the MCPSecBench client at TapPass proxy
export TAPPASS_PROXY=http://localhost:8080
uv run client.py 1 # OpenAI mode
# 3. Run the 11 automated attack scenarios
uv run main.py 1 0 # mode=OpenAI, protection=none (TapPass handles protection)

The proxy intercepts all 11 scenarios, blocking tool poisoning, shadowing, exfiltration, injection, and rug pulls before they reach the LLM.