Client Onboarding Playbook

Goal: Take a new client from signed contract to governed AI agents in production.
Timeline: 1 week (pilot), 4 weeks total (full rollout in weeks 2-4)
Audience: Solutions engineer, platform team

Pre-engagement Checklist

Before the kickoff call, gather:

Client’s AI stack: which frameworks (LangChain, CrewAI, custom), which LLMs (OpenAI, Azure, Anthropic)
Number of agents in scope (start with 1-3 for pilot)
Compliance requirements: EU AI Act, DORA, NIS2, HIPAA, SOC 2
Network constraints: can agents reach TapPass? VPN? Air-gapped?
Existing observability: where do logs go? Splunk, Datadog, ELK?
Decision maker for “what should be blocked vs. flagged vs. allowed”

Week 1: Pilot

Day 1: Deploy TapPass

Option A: Client hosts (recommended for regulated industries)

The production stack runs via Docker Compose with 7 services: TapPass core, OPA, PostgreSQL, Redis, SPIRE (server + agent), and Cloudflare Tunnel.

cd deploy

# Create .env.prod from the template
cp ../.env.example .env.prod
# Fill in:
#   TAPPASS_ADMIN_API_KEY    (generate: python3 -c "import secrets; print('tp_' + secrets.token_urlsafe(32))")
#   TAPPASS_JWT_SECRET       (generate: python3 -c "import secrets; print(secrets.token_urlsafe(48))")
#   TAPPASS_VAULT_KEY        (generate: python3 -c "import secrets,base64; print(base64.b64encode(secrets.token_bytes(32)).decode())")
#   POSTGRES_PASSWORD        (generate: python3 -c "import secrets; print(secrets.token_urlsafe(24))")
#   SPIRE_JOIN_TOKEN         (generate: python3 -c "import secrets; print(secrets.token_urlsafe(32))")
#   TAPPASS_LICENSE          (from license server)
#   LLM provider keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)

docker compose --env-file .env.prod -f docker-compose.prod.yml up -d

# After first start, register SPIRE workload entries:
docker compose --env-file .env.prod -f docker-compose.prod.yml exec spire-server \
  bash /opt/spire/register-entries.sh
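
The generator one-liners listed in the template comments can be combined into a single script so that all five secrets are produced in one pass. This is a minimal sketch: the function names `generate_secrets` and `render_env` are illustrative and not part of TapPass, and the license and LLM provider keys still have to be filled in by hand.

```python
import base64
import secrets


def generate_secrets() -> dict[str, str]:
    """Return fresh values for the generated secrets in the .env.prod template."""
    return {
        "TAPPASS_ADMIN_API_KEY": "tp_" + secrets.token_urlsafe(32),
        "TAPPASS_JWT_SECRET": secrets.token_urlsafe(48),
        "TAPPASS_VAULT_KEY": base64.b64encode(secrets.token_bytes(32)).decode(),
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        "SPIRE_JOIN_TOKEN": secrets.token_urlsafe(32),
    }


def render_env(values: dict[str, str]) -> str:
    """Format the values as KEY=value lines for pasting into .env.prod."""
    return "\n".join(f"{key}={value}" for key, value in values.items())


# Prints five KEY=value lines; the values differ on every run.
print(render_env(generate_secrets()))
```

Because the output contains live secrets, it is safer to redirect it straight into `.env.prod` than to leave it in terminal scrollback.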

Verify:

curl -s http://localhost:9620/health | python3 -m json.tool
# Should return: {"status": "healthy", "version": "0.5.0", "storage": "local"}
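
For a scripted deploy, the same check can be done from Python instead of curl. The sketch below assumes only what the curl check above shows: the `/health` endpoint on port 9620 and a JSON body whose `status` field is `healthy`. The function names are illustrative, and the `fetch` parameter exists so the logic can be exercised without a running server.

```python
import json
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:9620/health"  # endpoint from the curl check above


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_healthy(payload: dict) -> bool:
    """True when the payload reports status 'healthy'."""
    return payload.get("status") == "healthy"


def check(fetch: Callable[[str], dict] = fetch_json, url: str = HEALTH_URL) -> bool:
    """Return True if the server answers and reports healthy, False on any failure."""
    try:
        return is_healthy(fetch(url))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a response body that is not valid JSON.
        return False
```

Calling `check()` with no arguments queries the local server; a deploy script could retry it in a loop with a short sleep until it returns `True`.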

What works immediately: Server starts on port 9620, health check passes, OPA loaded, PostgreSQL migrated, SPIRE issuing certs.

What’s rough:

HTTPS requires Caddy/nginx in front (Caddyfile included in deploy/)
Or use Cloudflare Tunnel (config in deploy/setup-tunnel.md)

Option B: Quick start (development/demo)

pip install tappass
tappass up --license <key>

This walks through storage selection (memory, local PostgreSQL, Supabase) and starts the server interactively.

Option C: Managed by us

Point their agents at https://<client>.tappass.ai. We handle infra.

Day 2: CISO Sets Up Organization

The CISO runs tappass init interactively, or does it through the dashboard:

tappass init

This walks through:

Sign in (email/password or Google SSO)
Create organization
Create first agent identity
Select pipeline preset
Generate .env with agent credentials

Or via the API:

# Create org + first agent
curl -X POST http://localhost:9620/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"email": "ciso@client.com", "password": "...", "org_name": "Client Corp"}'

Day 3: Configure Pipeline

Choose the starting preset based on industry:

Industry	Starting preset	Steps
Financial services	`regulated`	45 steps (full pipeline with DORA audit, PCI patterns)
Healthcare	`regulated`	45 steps (PHI detection, EU model routing)
Legal	`regulated`	45 steps (privilege detection, matter isolation)
Government	`regulated`	45 steps (classification levels, air-gap support)
SaaS	`standard`	38 steps (multi-tenant isolation)
Consulting	`standard`	38 steps (client data boundaries)
Startups	`starter`	11 steps (minimal, observe mode)
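
For intake scripts, the table above can be kept as a small lookup so the starting preset is chosen consistently across engagements. This is an illustrative sketch only: `starting_preset` is a hypothetical helper, not a TapPass function, and the data is copied from the table.

```python
# Industry -> (starting preset, number of pipeline steps), copied from the table.
PRESET_BY_INDUSTRY = {
    "financial services": ("regulated", 45),
    "healthcare": ("regulated", 45),
    "legal": ("regulated", 45),
    "government": ("regulated", 45),
    "saas": ("standard", 38),
    "consulting": ("standard", 38),
    "startups": ("starter", 11),
}


def starting_preset(industry: str) -> tuple[str, int]:
    """Look up the starting preset for an industry, ignoring case and padding."""
    key = industry.strip().lower()
    if key not in PRESET_BY_INDUSTRY:
        known = ", ".join(sorted(PRESET_BY_INDUSTRY))
        raise KeyError(f"No starting preset for {industry!r}; known: {known}")
    return PRESET_BY_INDUSTRY[key]


preset, steps = starting_preset("Healthcare")
print(f"{preset} ({steps} steps)")  # prints "regulated (45 steps)"
```

An industry that is not in the table raises an error rather than falling back to a default, so the choice gets made deliberately with the client.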

Pipeline is configured per-agent via the CLI or API:

# View current pipeline
tappass gov pipelines

# View a specific pipeline's steps
tappass gov pipeline default

# Assign a preset to an agent
tappass gov assign <agent-id> --pipeline regulated

Or with a YAML override file:

preset: regulated
overrides:
  classification:
    custom_patterns:
      - name: client_account_number
        pattern: "ACC-[0-9]{8}"
        level: CONFIDENTIAL
  policy:
    require_human_approval:
      - classification: RESTRICTED
      - classification: TOP_SECRET
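
Before committing a custom pattern, it helps to test it against sample strings from the client. The sketch below uses Python's `re` module and assumes TapPass applies the pattern as an unanchored regular-expression search with comparable syntax, which should be confirmed against the product. The `STRICT_PATTERN` variant is a suggestion, not something taken from the override file.

```python
import re

# Pattern copied from the override file above.
ACCOUNT_PATTERN = re.compile(r"ACC-[0-9]{8}")

# Suggested stricter variant: word boundaries reject longer digit runs.
STRICT_PATTERN = re.compile(r"\bACC-[0-9]{8}\b")


def find_account_numbers(text: str, pattern: re.Pattern = ACCOUNT_PATTERN) -> list[str]:
    """Return every substring of text that the pattern matches."""
    return pattern.findall(text)


print(find_account_numbers("Refund to ACC-12345678 today"))   # ['ACC-12345678']
print(find_account_numbers("ACC-1234567"))                    # [] (only 7 digits)
print(find_account_numbers("ACC-123456789"))                  # ['ACC-12345678']
print(find_account_numbers("ACC-123456789", STRICT_PATTERN))  # []
```

The third call shows that the unanchored pattern also fires on a nine-digit identifier by matching its first eight digits. If the client has other identifiers with the same prefix, the boundary-anchored variant avoids that class of false positive.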

Day 4: Integrate First Agent

Walk their developer through the SDK:

pip install tappass

from tappass import Agent

agent = Agent(
    "https://tappass.client.internal/v1",
    name="claims-processor",
    api_key="tp_...",
    flags={
        "pii": "mask",
        "mode": "observe",  # Start in observe, switch to enforce later
    }
)

# Their existing OpenAI code works unchanged
response = agent.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)

Day 4-5: Observe Mode

Keep mode=observe for the first week. This means:

All pipeline steps run on every call (45 for the regulated preset)
Detections are logged but not blocked
Audit trail fills up with real data
Dashboard shows what would have been blocked

Review meeting at end of week:

Show the dashboard: “Here’s what your agents did this week”
Show the audit trail: tappass logs --limit 50
Highlight risky calls that would have been blocked
Agree on what should be blocked vs. flagged vs. allowed
Set thresholds together with the CISO
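
To prepare for the review meeting, the would-be blocks can be tallied per agent and detection type so the discussion starts from counts rather than individual log lines. The record shape below is hypothetical: the keys `agent`, `detection`, and `would_block` are assumptions for illustration and need to be mapped to whatever the audit trail actually exports.

```python
from collections import Counter

# Hypothetical export: one dict per audited call. Field names are assumptions.
sample_records = [
    {"agent": "claims-processor", "detection": "pii", "would_block": True},
    {"agent": "claims-processor", "detection": "pii", "would_block": True},
    {"agent": "claims-processor", "detection": "pii", "would_block": False},
    {"agent": "research-bot", "detection": "secrets", "would_block": True},
]


def would_block_summary(records: list[dict]) -> list[tuple[tuple[str, str], int]]:
    """Count would-be blocks per (agent, detection) pair, most frequent first."""
    counts = Counter(
        (r["agent"], r["detection"]) for r in records if r["would_block"]
    )
    return counts.most_common()


for (agent, detection), count in would_block_summary(sample_records):
    print(f"{agent:20} {detection:10} {count}")
```

Pairs at the top of the list are the first candidates for the blocked vs. flagged vs. allowed decision.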

Week 2-4: Full Rollout

Switch to Enforce Mode

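agent = Agent(
    "https://tappass.client.internal/v1",
    name="claims-processor",
    api_key="tp_...",
    flags={"pii": "mask", "mode": "enforce"},  # Same agent as Day 4; now blocks violations
)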

Or via environment variable:

export TAPPASS_FLAGS="mode=enforce"

Add Remaining Agents

For each agent:

Register via CLI: tappass agents add --name "research-bot" --preset standard
Assign capability token scopes (which tools, which data levels)
Set pipeline overrides if needed
Deploy SDK integration
Verify in observe mode for 24h, then switch to enforce
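
The 24h observe window in the last step can be enforced in rollout tooling with a simple time check. This is a sketch under assumptions: `ready_for_enforce` is a hypothetical helper, and the observe start time has to be recorded by the rollout process itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

OBSERVE_WINDOW = timedelta(hours=24)  # minimum time in observe mode, per the steps above


def ready_for_enforce(observe_started_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the agent has spent at least the full window in observe mode."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - observe_started_at >= OBSERVE_WINDOW


started = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
print(ready_for_enforce(started, now=started + timedelta(hours=23)))  # False
print(ready_for_enforce(started, now=started + timedelta(hours=24)))  # True
```

Both timestamps should be timezone-aware; subtracting a naive datetime from an aware one raises a `TypeError`.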

Configure SIEM Export

TapPass supports three SIEM formats: CEF (Splunk/ArcSight), OCSF (AWS Security Lake/CrowdStrike), and JSON (generic webhook).

Configure via the API:

curl -X PUT http://localhost:9620/admin/siem/config \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "destination": {
      "type": "splunk_hec",
      "url": "https://client-splunk.internal:8088/services/collector",
      "token": "..."
    },
    "format": "cef",
    "severity_filter": "detection"
  }'

Severity filter options: all, detection (something detected), action (action taken), block (only blocks).
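
One way to reason about the filter is as a threshold on increasing severity. The sketch below models that reading; it assumes the four options are nested (each level also includes everything more severe), which should be confirmed against actual export behaviour. The event label `info` for a call with no detection is hypothetical.

```python
# Assumed severity ranking of events, least to most severe.
# "info" is a hypothetical label for a call where nothing was detected.
EVENT_RANK = {"info": 0, "detection": 1, "action": 2, "block": 3}

# Minimum event rank each filter option lets through (assumed nesting).
FILTER_THRESHOLD = {"all": 0, "detection": 1, "action": 2, "block": 3}


def passes_filter(event_level: str, severity_filter: str) -> bool:
    """True if an event at event_level would be exported under severity_filter."""
    return EVENT_RANK[event_level] >= FILTER_THRESHOLD[severity_filter]


print(passes_filter("block", "detection"))  # True
print(passes_filter("detection", "block"))  # False
print(passes_filter("info", "all"))         # True
```

Under this reading, `detection` is a reasonable default for a SIEM: it drops clean calls but keeps everything an analyst might need to correlate.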

Configure Alerting

curl -X POST http://localhost:9620/admin/webhooks \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://client-slack.webhook.url",
    "events": ["violation", "escalation", "anomaly"]
  }'
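
When webhooks are registered from provisioning scripts rather than by hand, the same request can be built in Python. This sketch mirrors the curl call above using only the standard library; `build_webhook_request` is an illustrative name, and the response format of the endpoint is not assumed. The `Content-Type` header is set explicitly because the body is JSON.

```python
import json
import urllib.request

WEBHOOK_ENDPOINT = "http://localhost:9620/admin/webhooks"  # from the curl call above


def build_webhook_request(
    admin_key: str, target_url: str, events: list[str]
) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a webhook."""
    body = json.dumps({"url": target_url, "events": events}).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the result to `urllib.request.urlopen`, reading the admin key from the environment (for example `os.environ["ADMIN_KEY"]`) rather than hard-coding it.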

Run Status Check

Before handover, run the full diagnostic:

tappass status

This checks: config, server connection, auth, agents, pipeline, database, OPA, SPIRE, license.

Handover

Deliver to the client:

Dashboard access for CISO + security team
Runbook: “What to do when TapPass blocks something”
Pipeline config (version controlled in their repo)
Monitoring: /health/ready endpoint added to their uptime tool
SIEM integration verified (events flowing)
Escalation path: who to call if governance needs tuning

Common Gotchas

Problem	Solution
Agent timeouts after adding TapPass	Check per-step latency via `/admin/metrics`. `starter` adds ~20ms, `regulated` ~100ms. If `llm_judge` is enabled, that’s an extra ~200ms per call.
Too many false positives on PII	Tune the PII allowlist. Product names that look like person names are a common cause (“Anna Router”, “Iris Scanner”). Add them to `pii.allowlist` in the pipeline config.
Developer pushback (“it’s slowing us down”)	Start with `mode=observe`. Show the dashboard. Let the data speak. No enforcement until the CISO sees what’s happening.
CISO wants everything blocked immediately	Push back. Observe first, enforce second. Blocking without data causes outages and developer revolt.
Streaming responses break	Use `agent.chat.completions.create(stream=True)`. The SDK handles governed streaming natively.
Multiple LLM providers	TapPass proxies to all. Configure providers via environment variables (see `.env.example`). One agent can use multiple models.
“tappass up” hangs	Check if port 9620 is already in use. Run `tappass status` for diagnostics.
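Database migration fails	Migrations live in `deploy/migrations/` (currently 12 files). From `deploy/`, check `docker compose --env-file .env.prod -f docker-compose.prod.yml logs postgres` for errors.
SPIRE not issuing certs	The usual cause is skipping `register-entries.sh` after the first deploy. From `deploy/`, list the registered entries with `docker compose --env-file .env.prod -f docker-compose.prod.yml exec spire-server /opt/spire/bin/spire-server entry show`.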
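
For the timeout row above, a quick budget calculation shows whether an agent's existing timeout still has headroom once governance is added. The figures are the approximate ones quoted in the table; no figure is given for `standard`, so it is left out rather than guessed. The helper names are illustrative.

```python
# Approximate added latency per call, from the table above (milliseconds).
PRESET_OVERHEAD_MS = {"starter": 20, "regulated": 100}
LLM_JUDGE_OVERHEAD_MS = 200


def governance_overhead_ms(preset: str, llm_judge: bool = False) -> int:
    """Estimated latency added per call for a preset, optionally with llm_judge."""
    extra = LLM_JUDGE_OVERHEAD_MS if llm_judge else 0
    return PRESET_OVERHEAD_MS[preset] + extra


def fits_timeout(llm_latency_ms: int, timeout_ms: int, preset: str, llm_judge: bool = False) -> bool:
    """True if the model latency plus governance overhead stays within the timeout."""
    return llm_latency_ms + governance_overhead_ms(preset, llm_judge) <= timeout_ms


print(governance_overhead_ms("regulated", llm_judge=True))        # 300
print(fits_timeout(2800, 3000, "regulated"))                      # True  (2900 <= 3000)
print(fits_timeout(2800, 3000, "regulated", llm_judge=True))      # False (3100 > 3000)
```

These are rough estimates; the per-step numbers from `/admin/metrics` should replace them once real traffic is flowing.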