
Issues & Incidents: Data Model and Architecture

Agent
└── Session              ← conversation (30-min inactivity = new session)
    └── Turn             ← single LLM/tool call (turn_index in session)
        └── Event        ← auditable record (hash-chained)
            └── Detection[]  ← findings from pipeline steps
Detections → grouped into → Issues (fingerprinted by agent + step)
Issues → escalated to → Incidents (when severity warrants action)
| Concept | What it is | Lifecycle | Example |
|---|---|---|---|
| Session | A conversation between user and agent. 30-min inactivity = new session. | Created on first call, expires after idle | Support chat #4829 |
| Turn | One request → pipeline → response cycle. Identified by `turn_index`. | Immutable | Turn 3: user asks to export data |
| Event | An auditable thing that happened. Atomic unit of the hash chain. | Immutable, hash-chained | `llm_call`, `llm_call_blocked`, `config_change` |
| Detection | A finding from a pipeline step. Lives in `event.details.detections[]`. | Immutable (part of event) | `detect_pii` found 2 emails, `action=redact` |
| Issue | A pattern of similar detections, grouped by fingerprint. Accumulates events. | New → Ongoing → Resolved | "PII leaking in responses": 47 events |
| Incident | An escalated situation requiring human attention. GDPR Art. 33 trigger. | Open → Investigating → Contained → Resolved | "Data breach: PII exposed" |
  • Issue = the what. A recurring problem pattern. “This agent keeps leaking PII.” Can live for weeks. Has event count, trend, first/last seen.
  • Incident = the so what. An actionable escalation. “This PII leak constitutes a breach, notify DPO within 72h.” Has containment actions, timeline, responsible party.

Not every issue becomes an incident. An issue with severity: medium may just need a config fix. An issue with severity: critical auto-creates an incident.

Issues are grouped by fingerprint: SHA-256(agent_id + ":" + detection_step)[:16]

This means all PII detections from one agent form a single issue. All injection attempts from one agent form another. Different agents get separate issues even for the same step.
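The fingerprint formula above can be sketched directly (the function name is illustrative; the real implementation lives in `tappass/audit/issues.py`):

```python
import hashlib

def issue_fingerprint(agent_id: str, detection_step: str) -> str:
    """First 16 hex chars of SHA-256 over 'agent_id:detection_step'."""
    raw = f"{agent_id}:{detection_step}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]
```

Because only `agent_id` and `detection_step` feed the hash, every `detect_pii` finding from the same agent maps to one fingerprint, while a different agent yields a different one.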

| Condition | Severity |
|---|---|
| Detection action = `block` | Critical |
| Data classification = `RESTRICTED` or `CONFIDENTIAL` | High |
| Detection action = `redact` or `notify` | Medium |
| Otherwise | Low |

Severity only escalates, never downgrades. If a new event is more severe, the issue severity is upgraded.

An incident is auto-created when:

  • An issue reaches severity: critical AND
  • The issue does not already have an incident

The incident includes auto-generated containment actions (e.g., “Agent calls blocked by pipeline”).

```python
from datetime import datetime
from pydantic import BaseModel

class Issue(BaseModel):
    issue_id: str                  # "iss_..." auto-generated
    fingerprint: str               # SHA-256(agent_id:detection_step)[:16]
    org_id: str
    agent_id: str
    detection_step: str            # e.g. "detect_pii"
    title: str                     # human-readable
    severity: IssueSeverity        # critical/high/medium/low
    status: IssueStatus            # new/ongoing/resolved
    event_count: int
    blocked_count: int
    first_seen: datetime
    last_seen: datetime
    last_event_id: str
    incident_id: str | None        # linked incident if escalated

class Incident(BaseModel):
    incident_id: str               # "inc_..." auto-generated
    org_id: str
    agent_id: str
    issue_ids: list[str]           # linked issues
    severity: IssueSeverity
    lifecycle: IncidentLifecycle   # open/investigating/contained/resolved
    title: str
    description: str
    containment_actions: list[dict]
    affected_categories: list[str]     # e.g. ["Personal data", "Credentials"]
    gdpr_notified_at: datetime | None  # Art. 33 tracking
    detected_at: datetime
    resolved_at: datetime | None
```

Both issues and incidents use the same JSONL + in-memory pattern as audit events:

  • data/issues.jsonl: append-only, rewritten on update
  • data/incidents.jsonl: same pattern
  • In-memory indexed by ID and fingerprint for fast lookups
| Method | Path | Description |
|---|---|---|
| GET | `/issues` | List issues. Params: `agent_id`, `status`, `limit`, `offset` |
| GET | `/issues/{issue_id}` | Get single issue |
| PATCH | `/issues/{issue_id}` | Update status: `{"status": "resolved"}` |
| Method | Path | Description |
|---|---|---|
| GET | `/incidents-api` | List incidents. Params: `agent_id`, `lifecycle`, `limit`, `offset` |
| GET | `/incidents-api/{incident_id}` | Get single incident |
| PATCH | `/incidents-api/{incident_id}` | Update lifecycle, add containment actions, mark GDPR notified |

Incident PATCH body examples:

```json
{"lifecycle": "investigating"}
{"containment_action": "Agent API key revoked"}
{"gdpr_notified": true}
```

Issue creation is hooked into AuditTrail.record() in tappass/audit/trail.py. After each audit event is persisted:

  1. Check if event has detections or is llm_call_blocked
  2. For each detection step, compute fingerprint
  3. Find existing issue or create new one
  4. Bump event count, update severity, update last_seen
  5. If severity reaches critical → auto-create incident

This is best-effort (errors are logged but don’t block the audit trail).
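The five steps above can be sketched as follows. Field names, the in-memory `issues` dict, and the incident-ID scheme are assumptions for illustration; the real hook is `_process_detections()` in `tappass/audit/trail.py`:

```python
import hashlib

def _fingerprint(agent_id: str, step: str) -> str:
    return hashlib.sha256(f"{agent_id}:{step}".encode()).hexdigest()[:16]

def process_detections(event: dict, issues: dict) -> None:
    """Best-effort issue bookkeeping after an audit event is persisted."""
    detections = event.get("details", {}).get("detections", [])
    # 1. Only events with detections (or blocked calls) are of interest
    if not detections and event.get("event_type") != "llm_call_blocked":
        return
    try:
        for det in detections:
            # 2. Compute the fingerprint for this detection step
            fp = _fingerprint(event["agent_id"], det["step"])
            # 3. Find the existing issue or create a new one
            issue = issues.setdefault(fp, {
                "event_count": 0, "blocked_count": 0,
                "severity": "low", "incident_id": None, "last_seen": None,
            })
            # 4. Bump counts, upgrade severity, refresh last_seen
            issue["event_count"] += 1
            if det.get("action") == "block":
                issue["blocked_count"] += 1
                issue["severity"] = "critical"
            issue["last_seen"] = event["timestamp"]
            # 5. Critical severity auto-creates an incident (once)
            if issue["severity"] == "critical" and issue["incident_id"] is None:
                issue["incident_id"] = "inc_" + fp
    except Exception:
        pass  # best-effort: log in the real code, never block the trail
```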

Every Event gets hashed. The hash covers all event fields including session_id and turn_index (promoted to top-level). Issues and incidents are NOT in the hash chain: they are derived/mutable entities.

Hash input:

```
SHA-256(canonical_json(event_fields) + prev_hash)
```

Fields hashed:

```
event_id, timestamp, org_id, event_type, agent_id, user_id, task_id,
session_id, turn_index, resource, operation, details,
source_framework, source_sdk_version, _prev_hash
```

What is NOT hashed: _hash itself, issue/incident IDs (derived, mutable).
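A sketch of the hash computation under stated assumptions: `canonical_json` is taken to mean sorted keys with compact separators, and the previous hash is appended as a string per the formula above; the real canonicalization in the audit trail may differ in detail:

```python
import hashlib
import json

HASHED_FIELDS = (
    "event_id", "timestamp", "org_id", "event_type", "agent_id", "user_id",
    "task_id", "session_id", "turn_index", "resource", "operation", "details",
    "source_framework", "source_sdk_version",
)

def event_hash(event: dict, prev_hash: str) -> str:
    """Chain hash: canonical JSON of the event fields plus the previous hash."""
    payload = {k: event.get(k) for k in HASHED_FIELDS}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((canonical + prev_hash).encode()).hexdigest()
```

Changing any hashed field, or the previous hash, changes the digest, which is what makes tampering with an earlier event detectable downstream.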

The Activity tab on agent detail (frontend/src/pages/Agents.tsx) fetches from GET /issues?agent_id=... and renders a Sentry-inspired issues list with:

  • Severity color bar
  • Title + detection step
  • Event count, blocked count, last seen
  • Expandable event breadcrumbs
  • Filter tabs: All / New / Ongoing

Falls back to client-side grouping from audit events, then to mock data when no real data exists.

| File | Purpose |
|---|---|
| `tappass/models/_core.py` | `Issue`, `Incident`, enum definitions |
| `tappass/audit/issues.py` | `IssueRepo`, `IncidentRepo`, fingerprinting |
| `tappass/audit/trail.py` | `_process_detections()` hook |
| `tappass/api/routes/issues.py` | REST API endpoints |
| `tappass/gateway/service.py` | Passes `session_id`/`turn_index` to `AuditEvent` |
| `frontend/src/types.ts` | `ApiIssue`, `ApiIncident` TypeScript types |
| `frontend/src/pages/Agents.tsx` | Activity tab UI |