# Agent Observability: Health, Drift, Canary, Pacts
One sentence: Four capabilities from a single data source (the audit trail) that tell you whether your agents are healthy, changing, and following their contract.
## Overview

| Feature | Question it answers | Endpoint |
|---|---|---|
| Health Score | “Is this agent OK right now?” | `GET /agents/{id}/health` |
| Drift Detection | “Is this agent’s behavior changing?” | `GET /agents/{id}/drift` |
| Canary Tests | “Does the pipeline still work as expected?” | `POST /canary/{id}/run` |
| Behavioral Pacts | “Is this agent doing what it’s supposed to?” | `GET /agents/{id}/pact/adherence` |
| Trust Attestation | “Can I prove this agent is governed?” | `GET /agents/{id}/attestation` |
All features derive from the existing audit trail. Zero new instrumentation required.
## 1. Agent Health Score

### How it works

A 0–100 composite score computed from 5 weighted dimensions:
| Dimension | Weight | What it measures |
|---|---|---|
| Compliance | 30% | Block rate (lower = better), classification distribution, policy violations |
| Data Safety | 25% | PII exposure rate, secret detection rate, output scanning |
| Security | 20% | Injection detection rate, escalation attempts, code execution attempts |
| Stability | 15% | Self-consistency (replaced by Pact Adherence if agent has a pact) |
| Efficiency | 10% | Cost per call, token usage |
Scoring uses logarithmic penalty curves: a few incidents don’t tank the score, but sustained problems do.
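The penalty shape can be sketched as follows. This is illustrative only: the `scale` parameter and the exact curve are assumptions, not the shipped formula.

```python
import math

def log_penalty(rate: float, scale: float = 0.05) -> float:
    """Map an incident rate in [0, 1] to a 0-100 penalty.

    Logarithmic shape: a few incidents cost little, sustained high
    rates approach the full penalty. `scale` (hypothetical) sets how
    quickly the curve saturates.
    """
    # log1p is ~linear near zero, flattening as the rate grows
    return 100.0 * math.log1p(rate / scale) / math.log1p(1.0 / scale)

def dimension_score(rate: float) -> float:
    """Dimension score = 100 minus the penalty, floored at 0."""
    return max(0.0, 100.0 - log_penalty(rate))
```

With this shape a 2% incident rate scores around 89, while a 10% rate drops below 65: isolated incidents barely register, sustained problems dominate.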
### Grades

| Grade | Score | Meaning |
|---|---|---|
| A | 90–95 | Excellent: minimal issues |
| B | 80–89 | Good: occasional detections, well-governed |
| C | 70–79 | Needs attention: elevated block or detection rates |
| D | 60–69 | Poor: frequent violations, investigate |
| F | 0–59 | Critical: immediate action required |
Score caps at 95 (perfection isn’t realistic). No data = null, not 100.
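A minimal sketch of the grade mapping implied by the table; the cap and the null rule come from above, but the function name is hypothetical.

```python
from typing import Optional

def to_grade(score: Optional[float]) -> Optional[str]:
    """Map a 0-100 health score to a letter grade; None means no data."""
    if score is None:
        return None           # no data != perfect health
    score = min(score, 95.0)  # cap at 95: perfection isn't realistic
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"
```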
```
# Single agent (last 30 days)
GET /agents/{agent_id}/health?period_days=30

# Fleet overview (all agents, worst-first)
GET /health/overview?period_days=30
```

Response:

```json
{
  "agent_id": "support-bot",
  "score": 88.9,
  "grade": "B",
  "trend": "improving",
  "trend_delta": 2.3,
  "dimensions": {
    "compliance": { "score": 95.0, "weight": 0.3, "explanation": "Low block rate (2.1%)" },
    "data_safety": { "score": 88.9, "weight": 0.25, "explanation": "PII detected in 11% of calls" },
    "security": { "score": 78.1, "weight": 0.2, "explanation": "3 injection detections" },
    "pact_adherence": { "score": 87.0, "weight": 0.15, "explanation": "Minor classification overshoot" },
    "efficiency": { "score": 95.0, "weight": 0.1, "explanation": "Avg $0.0023/call" }
  },
  "alerts": [
    { "severity": "warning", "dimension": "security", "message": "3 injection detections in period" }
  ],
  "calls_evaluated": 142,
  "period_days": 30,
  "evaluated_at": "2026-03-06T09:00:00Z"
}
```

### Trend detection

The health score computes the same dimensions for the previous period (the window immediately before the current one) and compares:
- Improving: current score > previous score + 2 points
- Declining: current score < previous score - 2 points
- Stable: within ±2 points
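The comparison above is a three-way threshold; a sketch (function name is an assumption):

```python
def trend(current: float, previous: float, threshold: float = 2.0) -> str:
    """Classify the score trend between two periods (±2-point band)."""
    delta = current - previous
    if delta > threshold:
        return "improving"
    if delta < -threshold:
        return "declining"
    return "stable"
```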
## 2. Behavioral Drift Detection

### How it works

Compares the agent’s current behavior window (last 7 days) against a baseline period (the previous 30 days) across 8 signals:
| Signal | Method | Weight |
|---|---|---|
| Classification | Jensen-Shannon divergence on distribution | 20% |
| Block rate | Absolute + relative change | 20% |
| PII rate | Absolute + relative change | 15% |
| Secret rate | Absolute + relative change | 10% |
| Injection rate | Absolute + relative change | 10% |
| Tool usage | Jaccard distance on feature set | 10% |
| Cost | Relative change in avg cost/call | 5% |
| Model distribution | Distribution distance + new model detection | 10% |
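For the two distance-based signals, a minimal sketch of Jensen-Shannon divergence and Jaccard distance. This is illustrative, not the shipped implementation:

```python
import math

def js_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence (base 2) between two categorical
    distributions, e.g. classification label frequencies.
    0 = identical, 1 = disjoint support."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: dict[str, float], b: dict[str, float]) -> float:
        return sum(a[k] * math.log2(a[k] / b[k])
                   for k in keys if a.get(k, 0.0) > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Drift in the set of tools used: 1 - |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)
```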
### Drift levels

| Level | Score | Meaning |
|---|---|---|
| Stable | 0–20 | Normal variation |
| Minor | 20–40 | Some signals shifted: monitor |
| Significant | 40–60 | Meaningful behavioral change: investigate |
| Major | 60–100 | Fundamentally different behavior: act now |
```
# Single agent
GET /agents/{agent_id}/drift?current_days=7&baseline_days=30

# Fleet overview
GET /drift/overview
```

Response:

```json
{
  "drift_score": 18.5,
  "drift_level": "stable",
  "signals": [
    { "signal": "classification", "drift": 0.12, "description": "JS distance: 0.120" },
    { "signal": "block_rate", "drift": 0.05, "description": "1.2% → 1.8%" },
    { "signal": "model_distribution", "drift": 0.0, "description": "Models: gpt-4o-mini" }
  ],
  "alerts": [],
  "baseline_period": "2026-02-04 to 2026-02-27",
  "current_period": "2026-02-27 to 2026-03-06",
  "baseline_calls": 523,
  "current_calls": 87
}
```

### What triggers drift alerts

- Classification shift > 5% in any category
- Block rate increase > 5 percentage points
- New model appearing (silent provider update)
- Cost increase > 50%
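These triggers reduce to simple rules over per-period aggregates. The field names below (`classification`, `block_rate`, `models`, `avg_cost`) are assumptions about the aggregate shape, not the real schema:

```python
def drift_alerts(baseline: dict, current: dict) -> list[str]:
    """Illustrative alert rules matching the four triggers above."""
    alerts = []
    # Classification shift > 5% in any category
    labels = set(baseline["classification"]) | set(current["classification"])
    for label in labels:
        shift = abs(current["classification"].get(label, 0.0)
                    - baseline["classification"].get(label, 0.0))
        if shift > 0.05:
            alerts.append(f"classification shift in {label}: {shift:.1%}")
    # Block rate increase > 5 percentage points
    if current["block_rate"] - baseline["block_rate"] > 0.05:
        alerts.append("block rate increased by more than 5pp")
    # New model appearing (silent provider update)
    for model in set(current["models"]) - set(baseline["models"]):
        alerts.append(f"new model: {model}")
    # Cost increase > 50%
    if baseline["avg_cost"] > 0 and current["avg_cost"] / baseline["avg_cost"] > 1.5:
        alerts.append("avg cost per call increased by more than 50%")
    return alerts
```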
## 3. Canary Tests

### How it works

Send known prompts through the full governance pipeline on a schedule. Compare results against a stored baseline to detect:
- Silent model weight updates from providers
- Policy configuration changes
- Detection threshold drift from adaptive tuning
- Pipeline step failures
### Workflow

```
# 1. Create a canary test
POST /canary
{
  "id": "clean-prompt",
  "agent_id": "support-bot",
  "prompt": "What are your business hours?",
  "expectations": {
    "classification": "PUBLIC",
    "no_block": true,
    "no_pii": true,
    "max_cost_usd": 0.01
  },
  "schedule_hours": 6
}

# 2. Run it and set the baseline
POST /canary/clean-prompt/baseline

# 3. Run again (checks against baseline + expectations)
POST /canary/clean-prompt/run

# 4. Run all due scheduled canaries
POST /canary/run-due
```

### Regression types

| Field | Severity | Example |
|---|---|---|
| `blocked` (was passing) | critical | Pipeline config change blocks legitimate traffic |
| `secrets_detected` (new) | critical | Model leaking secrets it didn’t before |
| `classification` changed | warning | Model classifying data differently |
| `pii_detected` (new) | warning | Model generating PII it didn’t before |
| `injection_detected` (new) | warning | Canary now triggers injection scanner |
| `model` changed | info | Provider updated model silently |
| `cost_usd` > 50% change | info | Price change or token usage shift |
### Expectations vs regressions

- Expectations are absolute: “this canary must not be blocked” → hard fail if violated
- Regressions are relative: “this changed from the baseline” → only critical regressions fail the test
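That split reduces to a short pass/fail rule; a hedged sketch (input shapes are assumptions):

```python
def canary_passes(expectation_violations: list[str],
                  regressions: list[dict]) -> bool:
    """Expectations are absolute: any violation is a hard fail.
    Regressions are relative: only critical severity fails the test;
    warning/info regressions are reported but don't fail."""
    if expectation_violations:
        return False
    return not any(r["severity"] == "critical" for r in regressions)
```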
## 4. Behavioral Pacts

See the Behavioral Pacts guide for the full walkthrough with scenarios.
### Quick start

```
# Set a pact
PUT /agents/support-bot/pact
{
  "purpose": "Answer customer questions about products and orders",
  "expected_classification": "INTERNAL",
  "expected_pii_exposure": "incidental",
  "intended_tools": ["search_products", "lookup_order"],
  "intended_operations": ["read"],
  "expected_cost_per_call_usd": 0.005,
  "expected_block_rate": 0.02,
  "ai_act_risk_level": "limited",
  "gdpr_legal_basis": "legitimate_interest",
  "data_subjects": "customers"
}

# Check adherence
GET /agents/support-bot/pact/adherence?period_days=30
```

### Key design decisions
Section titled “Key design decisions”- Pipeline = enforcement (seatbelt): what’s ALLOWED. Hard blocks.
- Pact = intent measurement (speed limit). what’s INTENDED. Soft measurement.
- No retroactive judgment. when a pact is tightened, old calls are scored against the old pact
- Pre-pact calls get a free pass. calls before
effective_fromare skipped entirely
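The last two rules amount to pairing each call with the pact version in force at call time. A sketch under assumed shapes (pact history sorted oldest-first, comparable timestamps):

```python
def pair_calls_with_pacts(calls: list[dict],
                          pact_history: list[dict]) -> list[tuple[dict, dict]]:
    """Pair each call with the pact version governing it.

    Calls before the first pact's effective_from are skipped entirely,
    and a later (tightened) pact never rescores older calls.
    Record shapes here are assumptions, not the real schema.
    """
    scored = []
    for call in calls:
        governing = None
        for pact in pact_history:  # oldest-first by effective_from
            if pact["effective_from"] <= call["timestamp"]:
                governing = pact   # latest pact already in force
        if governing is not None:
            scored.append((call, governing))  # pre-pact calls dropped
    return scored
```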
## 5. Trust Attestation

### How it works

A signed ES256 JWT containing the agent’s governance metrics, verifiable by anyone with the TapPass JWKS public key.
```
GET /agents/support-bot/attestation?period_days=30
```

Response:

```json
{
  "attestation": {
    "agent_id": "support-bot",
    "health_score": 88.9,
    "health_grade": "B",
    "compliance_level": "standard",
    "drift_level": "stable",
    "drift_score": 12.3,
    "pact_adherence": 87.0,
    "calls_evaluated": 142,
    "attested_at": "2026-03-06T09:00:00Z",
    "valid_for_seconds": 3600
  },
  "jwt": "eyJhbGciOiJFUzI1NiIs...",
  "verify_at": "/jwks"
}
```

### Compliance tiers

| Tier | Requirements |
|---|---|
| Starter | Any score |
| Standard | Health ≥ 80 + drift stable or minor |
| Regulated | Health ≥ 90 + drift stable + pact adherence ≥ 90 |
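The tier table reduces to a cascading check, evaluated highest-tier first. A sketch (the function name is an assumption; thresholds and level strings come from the table):

```python
from typing import Optional

def compliance_tier(health: float, drift_level: str,
                    pact_adherence: Optional[float]) -> str:
    """Return the highest compliance tier whose requirements are met."""
    if (health >= 90 and drift_level == "stable"
            and pact_adherence is not None and pact_adherence >= 90):
        return "regulated"
    if health >= 80 and drift_level in ("stable", "minor"):
        return "standard"
    return "starter"  # any score qualifies
```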
### Verification

Third parties verify the JWT using TapPass’s JWKS endpoint:
```sh
# Get the public key
curl localhost:9620/jwks
```

```python
# Verify in Python: PyJWT's PyJWKClient fetches and parses the JWKS,
# then selects the signing key matching the token's header
import jwt

jwks_client = jwt.PyJWKClient("https://tappass.example.com/jwks")
signing_key = jwks_client.get_signing_key_from_jwt(token)
claims = jwt.decode(token, signing_key.key, algorithms=["ES256"])
```

## 6. Discovery Issue Codes

All tool risks, toxic flows, and rug-pull detections carry referenceable issue codes.
### Tool risk codes

| Code | Severity | What it catches |
|---|---|---|
| E001 | critical | Tool poisoning: hidden instructions in tool description |
| T001 | medium+ | Destructive tool: can modify or destroy data |
| T002 | medium+ | Public sink: can send data externally |
| T003 | medium+ | Private data access: accesses credentials or personal data |
| T004 | medium | Untrusted content source: ingests external content |
| T005 | high | Forbidden zone access: tool references credential/crypto paths |
### Toxic flow codes

| Code | Severity | Pattern |
|---|---|---|
| TF001 | high | Private data → public sink (data exfiltration) |
| TF002 | high | Untrusted content → destructive action (confused deputy) |
| TF003 | medium | Untrusted content → public sink (proxy abuse) |
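The TF codes compose the T00x capabilities pairwise. A sketch assuming tools are annotated with capability tags (the tag names and input shape are hypothetical):

```python
def toxic_flows(tools: list[dict]) -> list[tuple[str, str, str]]:
    """Flag toxic (source tool, sink tool) pairs from capability tags."""
    private = [t for t in tools if "private_data" in t["tags"]]         # T003
    sinks = [t for t in tools if "public_sink" in t["tags"]]            # T002
    untrusted = [t for t in tools if "untrusted_content" in t["tags"]]  # T004
    destructive = [t for t in tools if "destructive" in t["tags"]]      # T001

    flows = []
    for a in private:
        for b in sinks:
            flows.append(("TF001", a["name"], b["name"]))  # exfiltration
    for a in untrusted:
        for b in destructive:
            flows.append(("TF002", a["name"], b["name"]))  # confused deputy
    for a in untrusted:
        for b in sinks:
            flows.append(("TF003", a["name"], b["name"]))  # proxy abuse
    return flows
```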
### Rug pull codes

| Code | Severity | What changed |
|---|---|---|
| RP001 | critical | Tool definition modified between assessments |
| RP002 | medium | New tool appeared on server |
| RP003 | high | Tool removed (possible cover-up) |
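Rug-pull detection amounts to diffing fingerprinted tool definitions between two assessments. A sketch with an assumed input shape (tool name → definition dict):

```python
import hashlib
import json

def diff_tools(baseline: dict[str, dict],
               current: dict[str, dict]) -> list[tuple[str, str]]:
    """Return (code, tool_name) findings using the RP00x codes above."""
    def fingerprint(defn: dict) -> str:
        # Canonical JSON so key order doesn't cause false positives
        return hashlib.sha256(
            json.dumps(defn, sort_keys=True).encode()).hexdigest()

    findings = []
    for name in baseline.keys() & current.keys():
        if fingerprint(baseline[name]) != fingerprint(current[name]):
            findings.append(("RP001", name))  # definition modified
    for name in current.keys() - baseline.keys():
        findings.append(("RP002", name))      # new tool appeared
    for name in baseline.keys() - current.keys():
        findings.append(("RP003", name))      # tool removed
    return findings
```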
## Dashboard

The TapPass dashboard visualizes all observability data:
### Agent list

- Health badge per agent: color-coded circle (0–100) + letter grade
- Green (A), dark green (B), amber (C), orange (D), red (F)
### Agent detail page

| Panel | Content |
|---|---|
| 🩺 Health Score | SVG score ring, 5-dimension bars, trend, alerts |
| 📡 Behavioral Drift | Radar chart (8 signals), drift level, signal bars, period info |
| 📋 Behavioral Pact | Purpose, classification, PII exposure, cost budget, AI Act risk, GDPR basis, tools |
| 📐 Pact Adherence | Score ring, violation list with penalties, bar chart, pre-pact skip count |
All panels load data from the API on page visit: no polling or websockets needed.
## Architecture

```
              ┌──────────────────────────────┐
              │         Audit Trail          │
              │ (hash-chained, tamper-proof) │
              └──────────────┬───────────────┘
                             │
            ┌────────────────┼────────────────┐
            │                │                │
   ┌────────▼──────┐   ┌─────▼──────┐  ┌──────▼──────┐
   │ Health Score  │   │   Drift    │  │   Canary    │
   │   (5 dims)    │   │  (8 sigs)  │  │  (runner)   │
   └────────┬──────┘   └─────┬──────┘  └──────┬──────┘
            │                │                │
   ┌────────▼──────┐   ┌─────▼──────┐  ┌──────▼──────┐
   │     Pact      │   │   Alerts   │  │ Regression  │
   │  (adherence)  │   │            │  │  Detection  │
   └────────┬──────┘   └────────────┘  └─────────────┘
            │
   ┌────────▼──────┐
   │  Attestation  │
   │ (signed JWT)  │
   └───────────────┘
```

All modules read from `tappass.observability.health._query_agent_events` and `_extract_calls`: a single data-extraction layer over the audit trail.
| File | Purpose |
|---|---|
| `tappass/observability/health.py` | Health score computation |
| `tappass/observability/drift.py` | Drift detection (8 signals, JSD) |
| `tappass/registry/pact.py` | Pact model, store, adherence scoring |
| `tappass/canary/store.py` | Canary test definitions + baseline storage |
| `tappass/canary/runner.py` | Canary execution + regression detection |
| `tappass/api/routes/agent_health.py` | All 15 API endpoints |
| `tappass/policy/tokens.py` | Trust attestation in capability tokens |
| `frontend/src/pages/Agents.tsx` | Dashboard visualizations |
| `docs/pacts-guide.md` | Behavioral pacts guide |
| `docs/observability-guide.md` | This file |