Behavioral Pacts: Practical Guide
One sentence: A pact declares what an agent is supposed to do, so TapPass can tell you when it stops doing that.
The problem without pacts
You register a customer support bot. It works fine for 3 weeks. Then a developer changes its system prompt to also handle internal HR queries. The pipeline doesn’t block anything, because the agent is allowed to access CONFIDENTIAL data per its pipeline config. But this was never the intention.
Without a pact, the health score says 85/100: “looks healthy.” The drift detector says “minor drift: classification distribution changed.” Nobody acts on it. Three months later, the auditor asks: “Why is your support bot processing employee salary data?”
With a pact, the health score drops to 62/100 the same day. The violation is specific: “23/150 calls (15%) exceeded expected classification INTERNAL; agent hit CONFIDENTIAL.” The CISO gets an alert. The issue is caught in hours, not months.
Where pacts show up
1. Agent registration (dashboard)
When you register an agent in the dashboard, the modal today asks for:
- Agent ID
- Description
- Framework
- Capabilities
- Pipeline
The pact adds a second step: “What is this agent for?”
┌─────────────────────────────────────────────┐
│  Register Agent                             │
│                                             │
│  Agent ID:     support-bot                  │
│  Description:  Handles customer questions   │
│  Framework:    LangChain                    │
│  Pipeline:     customer-facing              │
│                                             │
│  ─── Behavioral Pact (optional) ─────────── │
│                                             │
│  Purpose:      Answer customer questions    │
│                about orders and returns     │
│                                             │
│  Data scope:   ○ PUBLIC                     │
│                ● INTERNAL                   │
│                ○ CONFIDENTIAL               │
│                ○ RESTRICTED                 │
│                                             │
│  PII exposure: ○ None                       │
│                ● Incidental                 │
│                ○ Primary                    │
│                                             │
│  Expected block rate:  [2%]                 │
│  Expected cost/call:   [$0.03]              │
│                                             │
│  ─── EU Compliance ──────────────────────── │
│                                             │
│  AI Act risk:    ● Limited  ○ High          │
│  GDPR basis:     ● Legitimate interest      │
│  Data subjects:  Customers                  │
│                                             │
│              [Cancel]  [Register]           │
└─────────────────────────────────────────────┘
The pact is optional; agents without pacts still work fine. But the pact unlocks better health scoring and compliance evidence.
2. Agent detail page (dashboard)
The agent detail page today shows: identity, pipeline, tool permissions, recent calls. With a pact, it shows a new “Pact Adherence” panel:
┌─────────────────────────────────────────────┐
│  support-bot                                │
│  "Answer customer questions about orders"   │
│  ● active   LangChain   customer-facing     │
├─────────────────────────────────────────────┤
│                                             │
│  Health Score                               │
│   ┌─────────────────────────────────────┐   │
│   │  78 / 100           Grade: C        │   │
│   │  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░  ↓ from 91   │   │
│   └─────────────────────────────────────┘   │
│                                             │
│  Pact Adherence            ⚠ 2 violations   │
│   ┌─────────────────────────────────────┐   │
│   │  ⚠ Classification exceeded          │   │
│   │    23/150 calls (15%) hit           │   │
│   │    CONFIDENTIAL; pact says          │   │
│   │    INTERNAL max                     │   │
│   │                                     │   │
│   │  ⚠ Unexpected PII                   │   │
│   │    PII rate 12%; pact says          │   │
│   │    "incidental" (<30%) but          │   │
│   │    trending up from 3%              │   │
│   └─────────────────────────────────────┘   │
│                                             │
│  Pipeline: customer-facing (12 steps)       │
│  Recent Calls: 150 total, 3 blocked         │
│                                             │
└─────────────────────────────────────────────┘
3. Health overview (CISO dashboard)
The health overview endpoint already lists all agents with scores. With pacts, it also shows which agents have pact violations:
GET /health/overview
[
  {
    "agent_id": "support-bot",
    "score": 78,
    "grade": "C",
    "pact_violations": 2,
    "top_violation": "classification_exceeded"
  },
  {
    "agent_id": "code-reviewer",
    "score": 94,
    "grade": "A",
    "pact_violations": 0
  },
  {
    "agent_id": "data-pipeline",
    "score": 88,
    "grade": "B",
    "pact_violations": 1,
    "top_violation": "cost_exceeded"
  }
]
4. Alerts
When pact violations cross into “high” severity, alerts fire through the existing alerting system (webhook, Slack, email):
{
  "alert_type": "pact_violation",
  "severity": "warning",
  "agent_id": "support-bot",
  "dimension": "pact_adherence",
  "message": "23/150 calls (15%) exceeded expected classification INTERNAL",
  "pact_purpose": "Answer customer questions about orders and returns",
  "action_required": "Review recent CONFIDENTIAL calls; agent may be handling data outside its declared scope"
}
5. Compliance evidence (GDPR / EU AI Act)
This is where pacts pay off most. When the DPO needs a GDPR Article 30 Record of Processing Activities (ROPA), the pact provides:
| ROPA field | Source |
|---|---|
| Purpose of processing | pact.purpose |
| Categories of data subjects | pact.data_subjects |
| Legal basis | pact.gdpr_legal_basis |
| Categories of personal data | Derived from pact.expected_pii_exposure + actual PII types detected |
| Retention period | From pipeline config |
| Technical measures | Pipeline steps (PII redaction, encryption, etc.) |
For EU AI Act Article 9 (Risk Management):
| AI Act requirement | Source |
|---|---|
| Risk classification | pact.ai_act_risk_level |
| Intended purpose | pact.purpose |
| Foreseeable misuse | Pact violations = evidence of unintended use |
| Monitoring measures | Health score + drift detection + canary tests |
Without pacts, these fields are blanks the CISO fills in manually. With pacts, they’re auto-populated from structured data.
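The ROPA mapping in the first table can be sketched as a plain function. This is a minimal illustration, not TapPass's implementation: the function name `ropa_entry`, the shape of the returned record, and the `retention_days` and `pipeline_steps` inputs are assumptions. Only the `pact.*` field names come from the tables above.

```python
from typing import Any


def ropa_entry(pact: dict[str, Any], detected_pii_types: list[str],
               retention_days: int, pipeline_steps: list[str]) -> dict[str, Any]:
    """Build one GDPR Article 30 record from a pact plus pipeline config.

    Hypothetical helper: the output keys mirror the ROPA table, but the
    record layout itself is an assumption made for illustration.
    """
    return {
        "purpose_of_processing": pact["purpose"],
        "categories_of_data_subjects": pact["data_subjects"],
        "legal_basis": pact["gdpr_legal_basis"],
        # Declared exposure level plus the PII types actually observed.
        "categories_of_personal_data": {
            "declared_exposure": pact["expected_pii_exposure"],
            "detected_types": sorted(set(detected_pii_types)),
        },
        "retention_period_days": retention_days,     # from pipeline config
        "technical_measures": list(pipeline_steps),  # e.g. PII redaction
    }


pact = {
    "purpose": "Answer customer questions about orders and returns",
    "data_subjects": "customers",
    "gdpr_legal_basis": "legitimate_interest",
    "expected_pii_exposure": "incidental",
}
# The 90-day retention and the step name are made-up sample inputs.
entry = ropa_entry(pact, ["email", "name", "email"], 90, ["pii_redaction"])
print(entry["legal_basis"], entry["categories_of_personal_data"])
```

The point of the sketch is that every ROPA field has a structured source, so the record can be regenerated whenever the pact or the pipeline config changes.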
6. tappass assess → suggested pact
When you run tappass assess, TapPass already scans your codebase and discovers tools. It can now also suggest a pact based on what it finds:
$ tappass assess
TapPass Assessment: support-bot ...
Suggested Pact:
  ┌─────────────────────────────────────────────┐
  │ Based on your code and tool configuration:  │
  │                                             │
  │ Purpose:        (you fill this in)          │
  │ Data scope:     INTERNAL                    │
  │   → No CONFIDENTIAL/RESTRICTED tools found  │
  │ PII exposure:   incidental                  │
  │   → search_db can return customer names     │
  │ Expected tools: search_db, send_email       │
  │ Operations:     read, write                 │
  │ AI Act risk:    limited                     │
  │   → No high-risk indicators found           │
  │                                             │
  │ Accept? [Y/n]                               │
  └─────────────────────────────────────────────┘
✓ Pact written to tappass-pact.yaml
✓ Apply with: tappass pact set support-bot tappass-pact.yaml
Concrete scenarios
Scenario 1: Support bot scope creep
Setup: support-bot registered with pact: purpose = “customer order questions”, expected classification = INTERNAL, PII exposure = incidental.
Week 1-3: Agent handles 500 calls. Health score: 93. Pact adherence: 100. All calls are INTERNAL, PII rate 4%. Everything matches.
Week 4: A developer adds an HR knowledge base to the agent’s context. The agent starts answering questions about employee benefits and salaries.
What happens:
- Classification bumps to CONFIDENTIAL on 20% of calls (salary data)
- PII rate jumps to 35% (employee names, SSNs in HR documents)
- Pact adherence drops: classification_exceeded (20%), excessive_pii (35% > 30%)
- Health score drops from 93 → 71
- Alert fires: “support-bot: 45/220 calls exceeded expected classification INTERNAL”
- CISO sees it in the weekly digest
Without pact: Health score drops from 93 → 85 (stability dimension notices the change, but it’s a soft signal). No specific violation. The drift detector says “minor drift in classification distribution.” Nobody escalates.
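The two violations in this scenario can be reproduced with a small check over call records. This is a sketch under stated assumptions, not TapPass's evaluator: the call record shape (`classification`, `has_pii`), the function name `check_calls`, and the choice to flag any call above the expected level are hypothetical. The level ordering and the 30% ceiling for "incidental" PII come from the text above.

```python
# Classification levels, lowest to highest, as listed in the registration form.
LEVELS = ["PUBLIC", "INTERNAL", "CONFIDENTIAL", "RESTRICTED"]

# "incidental" PII exposure is documented as under 30% of calls.
INCIDENTAL_PII_MAX = 0.30


def check_calls(calls, expected_classification, expected_pii_exposure):
    """Return a list of violation dicts for a batch of call records.

    Each call is assumed to be a dict with "classification" (str) and
    "has_pii" (bool); that record shape is hypothetical.
    """
    violations = []
    if not calls:
        return violations
    limit = LEVELS.index(expected_classification)
    over = sum(1 for c in calls if LEVELS.index(c["classification"]) > limit)
    if over:  # assumption: any call above the expected level is flagged
        violations.append({
            "type": "classification_exceeded",
            "message": f"{over}/{len(calls)} calls exceeded expected "
                       f"classification {expected_classification}",
            "actual_rate": over / len(calls),
        })
    pii_rate = sum(1 for c in calls if c["has_pii"]) / len(calls)
    if expected_pii_exposure == "incidental" and pii_rate > INCIDENTAL_PII_MAX:
        violations.append({"type": "excessive_pii", "actual_rate": pii_rate})
    return violations


# Week 4 of the scenario: 220 calls, 45 of them CONFIDENTIAL, 77 (35%) with PII.
calls = (
    [{"classification": "CONFIDENTIAL", "has_pii": True}] * 45
    + [{"classification": "INTERNAL", "has_pii": True}] * 32
    + [{"classification": "INTERNAL", "has_pii": False}] * 143
)
found = check_calls(calls, "INTERNAL", "incidental")
for v in found:
    print(v["type"], round(v["actual_rate"], 3))
```

Without the pact there is no `expected_classification` to compare against, which is why the same traffic only registers as a soft drift signal.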
Scenario 2: Red-team bot that’s doing its job
Setup: red-team-scanner registered with pact: purpose = “test prompt injection defenses”, expected classification = RESTRICTED, PII exposure = none, injection_expected = true, expected_block_rate = 0.40.
What happens: The agent fires injection attempts all day. 38% block rate. Injection detected on 90% of calls. Classification hits RESTRICTED.
Health score: 91. Pact adherence: 100. This is expected behavior. The pact says injections are expected and the block rate should be ~40%.
Without pact: Health score would be 45. The security dimension would tank (90% injection rate). The CISO would get false alerts. Someone would investigate and waste 2 hours realizing “oh, that’s the security testing bot.”
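A rough sketch of why the pact removes the false alert: the same metrics are judged against declared expectations instead of generic defaults. The function name `security_findings`, the 5-point `tolerance` on block rate, and the rule that any injection counts as a finding when no pact declares it are all assumptions for illustration. Only `injection_expected` and `expected_block_rate` are pact fields from this guide.

```python
def security_findings(injection_rate, block_rate, pact=None, tolerance=0.05):
    """Return a list of finding names for one agent's observed rates.

    `tolerance` and the no-pact default behaviour are assumptions made
    for this sketch, not documented TapPass thresholds.
    """
    findings = []
    injection_expected = bool(pact and pact.get("injection_expected"))
    if injection_rate > 0 and not injection_expected:
        findings.append("unexpected_injection")
    if pact and "expected_block_rate" in pact:
        # Flag only when the observed block rate strays from the declared one.
        if abs(block_rate - pact["expected_block_rate"]) > tolerance:
            findings.append("block_rate_deviation")
    return findings


red_team_pact = {"injection_expected": True, "expected_block_rate": 0.40}

with_pact = security_findings(0.90, 0.38, red_team_pact)
without_pact = security_findings(0.90, 0.38)
print(with_pact, without_pact)
```

With the pact, a 38% block rate against a declared 40% produces no findings; without it, the 90% injection rate is indistinguishable from an attack.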
Scenario 3: Cost runaway
Setup: data-pipeline registered with pact: purpose = “summarize daily reports”, expected cost = $0.05/call, expected monthly budget = $150.
Week 1: Someone switches the model from gpt-4o-mini to claude-3.5-sonnet. Average cost jumps from $0.04 to $0.45, which is 9x the pact expectation.
What happens:
- Cost violation fires immediately: “Avg cost $0.45/call is 9.0x the expected $0.05”
- Health score drops 10 points
- At current rate, monthly spend would be $1,350 instead of $150
Without pact: The cost tracking pipeline step may or may not have a budget limit configured. If it does, it blocks calls. If it doesn’t (common during initial rollout), the cost just accumulates silently until someone notices the invoice.
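The numbers in this scenario follow from simple arithmetic, sketched below. The function name `cost_report` and the returned keys are hypothetical. The projection assumes call volume stays at the level implied by the pact (monthly budget divided by expected cost per call), which is an assumption of this sketch, not a stated TapPass rule.

```python
def cost_report(avg_cost, expected_cost, monthly_budget):
    """Compare observed cost per call with the pact's expectation."""
    ratio = avg_cost / expected_cost
    # Call volume implied by the pact: $150 / $0.05 per call = 3,000 calls.
    implied_calls = monthly_budget / expected_cost
    projected = implied_calls * avg_cost
    return {
        "ratio": round(ratio, 1),
        "projected_monthly_usd": round(projected, 2),
        "message": f"Avg cost ${avg_cost:.2f}/call is {ratio:.1f}x "
                   f"the expected ${expected_cost:.2f}",
    }


report = cost_report(avg_cost=0.45, expected_cost=0.05, monthly_budget=150)
print(report["message"])
print(report["projected_monthly_usd"])
```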
Scenario 4: Compliance audit
Auditor: “Show me every AI agent in your organization, what they do, what data they process, and on what legal basis.”
Without pacts:
Agent: support-bot
Description: "Handles customer questions"
Data processed: (unknown; check logs manually)
Legal basis: (ask the PM who deployed it)
Purpose: (read the code, maybe)
With pacts:
Agent: support-bot
Purpose: "Answer customer questions about orders and returns"
Data subjects: "Customers"
Expected data classification: INTERNAL
PII exposure: Incidental
Legal basis: Legitimate interest
AI Act risk level: Limited
Actual adherence: 95% (2 minor violations in 30 days)
Evidence: 2,400 calls audited, hash-chained trail available
The pact turns a manual, error-prone compliance exercise into a structured query.
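As a rough illustration of what "structured query" means here, the sketch below builds an audit inventory from a list of agent records. The `audit_inventory` function and the agent record shape (`agent_id` plus an optional `pact` dict) are hypothetical. The pact field names are the ones used in the API examples in this guide.

```python
# Pact fields an auditor typically asks about (names from the pact API body).
AUDIT_FIELDS = [
    "purpose",
    "data_subjects",
    "expected_classification",
    "expected_pii_exposure",
    "gdpr_legal_basis",
    "ai_act_risk_level",
]


def audit_inventory(agents):
    """Return one row per agent; agents without a pact get None fields."""
    rows = []
    for agent in agents:
        pact = agent.get("pact") or {}
        row = {"agent_id": agent["agent_id"], "has_pact": bool(pact)}
        row.update({field: pact.get(field) for field in AUDIT_FIELDS})
        rows.append(row)
    return rows


agents = [
    {
        "agent_id": "support-bot",
        "pact": {
            "purpose": "Answer customer questions about orders and returns",
            "data_subjects": "customers",
            "expected_classification": "INTERNAL",
            "expected_pii_exposure": "incidental",
            "gdpr_legal_basis": "legitimate_interest",
            "ai_act_risk_level": "limited",
        },
    },
    {"agent_id": "legacy-bot"},  # hypothetical agent registered without a pact
]

rows = audit_inventory(agents)
missing = [r["agent_id"] for r in rows if not r["has_pact"]]
print(missing)
```

Agents that come back with `has_pact` set to false are the ones that would still need the manual exercise.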
API examples
Set a pact
curl -X PUT https://tappass.internal/agents/support-bot/pact \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "purpose": "Answer customer questions about orders and returns",
    "expected_classification": "INTERNAL",
    "expected_pii_exposure": "incidental",
    "intended_tools": ["search_orders", "lookup_tracking"],
    "intended_operations": ["read"],
    "expected_cost_per_call_usd": 0.03,
    "expected_block_rate": 0.02,
    "injection_expected": false,
    "ai_act_risk_level": "limited",
    "gdpr_legal_basis": "legitimate_interest",
    "data_subjects": "customers"
  }'
Check adherence
# URL is quoted so the shell does not treat "?" as a glob character
curl "https://tappass.internal/agents/support-bot/pact/adherence?period_days=7" \
  -H "Authorization: Bearer $TOKEN"
{
  "agent_id": "support-bot",
  "adherence_score": 78.3,
  "violations": [
    {
      "type": "classification_exceeded",
      "severity": "high",
      "message": "23/150 calls (15%) exceeded expected classification INTERNAL",
      "expected": "INTERNAL",
      "actual_rate": 0.153
    },
    {
      "type": "excessive_pii",
      "severity": "medium",
      "message": "PII rate 35% exceeds 'incidental' expectation (53/150 calls)",
      "expected": "incidental (<30%)",
      "actual_rate": 0.353
    }
  ],
  "calls_evaluated": 150,
  "period_days": 7
}
Get the pact
curl https://tappass.internal/agents/support-bot/pact \
  -H "Authorization: Bearer $TOKEN"
Health score with pact adherence
curl https://tappass.internal/agents/support-bot/health \
  -H "Authorization: Bearer $TOKEN"
When a pact exists, the health response includes pact_adherence instead of stability:
{
  "agent_id": "support-bot",
  "score": 78,
  "grade": "C",
  "dimensions": {
    "compliance": { "score": 90, "weight": 0.30 },
    "data_safety": { "score": 85, "weight": 0.25 },
    "security": { "score": 92, "weight": 0.20 },
    "pact_adherence": {
      "score": 52,
      "weight": 0.15,
      "raw": {
        "pact_purpose": "Answer customer questions about orders and returns",
        "violations": [
          { "type": "classification_exceeded", "severity": "high", "actual_rate": 0.153 }
        ]
      }
    },
    "efficiency": { "score": 88, "weight": 0.10 }
  }
}
Pact vs. Pipeline: what’s the difference?
This is the most common question. The answer is clean:
| | Pipeline | Pact |
|---|---|---|
| What it is | What’s ALLOWED | What’s INTENDED |
| Who sets it | Security engineer | CISO / product owner |
| Enforcement | Hard: blocks calls | Soft: flags violations |
| Example | “This agent may access CONFIDENTIAL data” | “This agent should only need INTERNAL data” |
| When it fires | On every call (real-time) | On health score computation (periodic) |
| Output | Block / redact / log | Violation alert, health score impact |
A support bot’s pipeline might allow CONFIDENTIAL access (because sometimes customer data gets classified that way). But its pact says INTERNAL, because the intention is to handle order questions, not salary data. If the agent consistently hits CONFIDENTIAL, the pipeline allows it but the pact flags it.
The pipeline is the seatbelt. The pact is the speed limit.
Both matter. The seatbelt saves you in a crash. The speed limit tells you you’re driving too fast before the crash.
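The split between hard and soft enforcement can be sketched in a few lines. This is an illustration only: the function name `evaluate_call` is hypothetical, and the pact check is collapsed into a per-call comparison for brevity, whereas the table above says pact evaluation happens periodically during health score computation.

```python
# Classification levels, lowest to highest.
LEVELS = ["PUBLIC", "INTERNAL", "CONFIDENTIAL", "RESTRICTED"]


def evaluate_call(classification, pipeline_max, pact_expected):
    """Return (allowed, flagged) for one call.

    allowed: the pipeline's hard decision (False means the call is blocked).
    flagged: the pact's soft signal (True means it counts as a violation).
    """
    rank = LEVELS.index(classification)
    allowed = rank <= LEVELS.index(pipeline_max)
    flagged = rank > LEVELS.index(pact_expected)
    return allowed, flagged


# Support bot: pipeline permits CONFIDENTIAL, pact expects INTERNAL.
for level in LEVELS:
    allowed, flagged = evaluate_call(level, "CONFIDENTIAL", "INTERNAL")
    print(f"{level:<13} allowed={allowed} flagged={flagged}")
```

A CONFIDENTIAL call is the interesting case: it passes the pipeline and still counts against the pact, which is exactly the gap the support bot example falls into.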