Incident Response Playbook

Goal: Investigate and resolve governance events.
When: Dashboard shows a violation, client reports a block, or alerting fires.
Audience: Solutions engineer, security team

Severity Levels

Level	Example	Response time	Escalation
P1 Critical	Data exfiltration detected, injection bypassed pipeline	Immediate	Founder + client CISO
P2 High	Repeated policy violations from one agent, PII in output	< 1 hour	Solutions engineer
P3 Medium	Unusual drift pattern, single violation, anomaly alert	< 4 hours	Review in next standup
P4 Low	False positive, configuration drift, cosmetic	Next business day	Backlog

Investigation Steps

1. Check the Audit Trail

# Recent audit events
tappass logs --limit 20

# Filter by action
tappass logs --limit 10 --verbose

# Via API for programmatic access
curl -s "http://localhost:9620/admin/audit?limit=10&decision=block" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

Key fields in each audit entry:

Field	What it tells you
`decision`	What action was taken: `allow`, `block`, `flag`, `redact`
`step_name`	Which pipeline step triggered (e.g., `detect_pii`, `detect_prompt_injection`)
`agent_id` / `agent_name`	Which agent was involved
`classification`	Data sensitivity level (`PUBLIC` through `TOP_SECRET`)
`session_id`	Full conversation context
`org_id`	Multi-tenant: which organization
`timestamp`	When it happened

2. Get Full Session Context

A single event is rarely enough. Get the full conversation:

# Via API
curl -s "http://localhost:9620/admin/sessions/<session_id>" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

Look at:

Was the trigger in the user input or the model output?
What was the full conversation leading up to the violation?
Is this a one-off or part of a pattern?

3. Check Agent Health and Drift

# Agent health score
curl -s "http://localhost:9620/agents/<agent_id>/health" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

# Behavioral drift detection
curl -s "http://localhost:9620/agents/<agent_id>/drift" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

# Pact adherence (if behavioral pact is configured)
curl -s "http://localhost:9620/agents/<agent_id>/pact/adherence" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

4. Determine: Real Threat or False Positive?

Indicator	Likely real threat	Likely false positive
Multiple triggers in same session	✅
Known injection pattern detected	✅
PII matches real person data	✅
Agent drift score increasing	✅
Pact violation	✅
Product name matches PII pattern		✅
Test data in staging environment		✅
Single trigger, normal conversation		✅
Unicode trick in legitimate multilingual text		✅

5. Respond

If real threat:

Contain: Was data actually exfiltrated? Check if the pipeline blocked it or only flagged it.

Disable the agent if actively exploited:

# Via API
curl -X POST "http://localhost:9620/admin/agents/<agent_id>/disable" \
  -H "Authorization: Bearer $ADMIN_KEY"

Collect evidence: Full audit trail for the session:
Terminal window
```
tappass logs --session <session_id> --verbose
```
Notify: Client CISO with timeline and scope.
Remediate: Update pipeline config to prevent recurrence (see Pipeline Tuning).

If false positive:

Document what triggered and why it’s not a real issue
Tune the pipeline: add to allowlist or adjust thresholds
Notify the agent’s developer that the block was a false positive
Monitor for recurrence after the fix

Common Incident Types

Prompt Injection Attempt

What happened: The detect_prompt_injection step blocked a request.

Check the raw input:

tappass logs --limit 1 --step detect_prompt_injection --verbose

Scenarios:

End user tried to jailbreak: Expected behavior. TapPass blocked it. Log and move on.
Another AI agent sent the injection: Investigate the upstream agent. It might be processing untrusted user input without its own governance. This is the “confused deputy” problem.
Bypassed TapPass: P1. Update injection patterns and enable all detection layers (pattern match + LLM judge + unicode tricks + shell bleed).

PII in Model Output

What happened: The output_scan or detect_pii (after phase) found PII in the LLM response.

Check: Was the PII from the training data (model leak) or from the conversation context?

Action:

Model leak (PII not in any input): Report to LLM provider. Enable output PII scanning if not already active.
User-provided PII echoed back: Review data classification. Should this sensitivity level require pii=redact instead of pii=mask?

Agent Behavioral Drift

What happened: The drift detection endpoint shows increasing divergence from the agent’s baseline behavior.

Check:

# Drift overview across all agents
curl -s "http://localhost:9620/drift/overview" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

Common causes:

LLM model was updated by the provider (GPT-4 version bump)
Agent code was changed by developers
New tool was added to the agent without updating capability tokens
Adversarial: someone is probing the agent

Action: Compare with the agent’s behavioral pact. If no pact exists, create one based on the agent’s intended behavior profile.

Secret Detected in Input

What happened: The detect_secrets step found an API key, password, or token in the prompt.

Action:

The secret was blocked/redacted before reaching the LLM. Good.
Check: was this a developer testing with real credentials? Educate them.
If the secret was already compromised (sent before TapPass was enabled), rotate it immediately.

Tool Poisoning

What happened: The tool_poisoning_detection step flagged a tool response.

What this means: An external tool (MCP server, API) returned data that contains hidden instructions attempting to manipulate the agent. This is a supply-chain attack vector.

Action:

Identify which tool returned the poisoned data
Check if the tool is compromised or if this is normal output being misinterpreted
If compromised: revoke the agent’s access to that tool via capability tokens
If false positive: add to tool allowlist

SIEM Integration

If SIEM export is configured, all incidents flow to the client’s SIEM automatically. Verify the integration:

# Check SIEM config
curl -s "http://localhost:9620/admin/siem/config" \
  -H "Authorization: Bearer $ADMIN_KEY" | python3 -m json.tool

# Check export health
curl -s "http://localhost:9620/admin/siem/health" \
  -H "Authorization: Bearer $ADMIN_KEY"

Supported formats: CEF (Splunk/ArcSight), OCSF (AWS Security Lake/CrowdStrike), JSON (webhook).

Post-Incident Checklist

After every P1 or P2:

Timeline written (when detected, when responded, when resolved)
Root cause identified
Pipeline config updated if needed

Canary test added to catch regression:

curl -X POST "http://localhost:9620/canary" \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -d '{"name": "injection-regression", "input": "<malicious payload>", "expect": "block"}'

Client notified with summary
Internal retrospective documented in Issues & Incidents