
Deployment Playbook

Goal: Deploy TapPass on a fresh server or upgrade an existing instance.
Audience: Platform engineer, DevOps


The production stack consists of 7 services:

| Service | Image | Purpose |
| --- | --- | --- |
| tappass | Built from Dockerfile | Core governance server, port 9620 |
| opa | openpolicyagent/opa:1.14.0 | Policy decision point (Rego policies) |
| postgres | postgres:16-alpine | Persistent storage (agents, audit, sessions) |
| redis | redis:7-alpine | Rate limiting, session state, cache (64MB, LRU) |
| spire-server | ghcr.io/spiffe/spire-server:1.11.0 | SPIFFE certificate authority |
| spire-agent | ghcr.io/spiffe/spire-agent:1.11.0 | Workload attestation, cert issuance |
| tunnel | cloudflare/cloudflared:latest | Cloudflare Tunnel for external access |

Additionally, spiffe-helper runs as a sidecar for cert rotation.


See Server Prerequisites for full requirements.

Minimum: Ubuntu 24.04 LTS, 4 cores, 4 GB RAM, 40 GB disk, Docker + Docker Compose.

Recommended: 8+ cores, 8+ GB RAM, 100+ GB disk.

Option A: Full Production Stack (Docker Compose)

git clone git@github.com:tappass/tappass.git
cd tappass/deploy
# Create secrets
cp ../.env.example .env.prod

Required secrets in .env.prod:

# Security (all required for production)
TAPPASS_ADMIN_API_KEY="tp_..." # python3 -c "import secrets; print('tp_' + secrets.token_urlsafe(32))"
TAPPASS_JWT_SECRET="..." # python3 -c "import secrets; print(secrets.token_urlsafe(48))"
TAPPASS_VAULT_KEY="..." # python3 -c "import secrets,base64; print(base64.b64encode(secrets.token_bytes(32)).decode())"
TAPPASS_TOKEN_KEY_FILE="tappass-token.pem" # openssl ecparam -genkey -name prime256v1 -noout -out tappass-token.pem
POSTGRES_PASSWORD="..." # python3 -c "import secrets; print(secrets.token_urlsafe(24))"
SPIRE_JOIN_TOKEN="..." # python3 -c "import secrets; print(secrets.token_urlsafe(32))"
# License
TAPPASS_LICENSE="..." # From license server
# LLM providers (at least one)
OPENAI_API_KEY="sk-..."
# ANTHROPIC_API_KEY="sk-ant-..."
# AZURE_API_KEY="..."
# AZURE_API_BASE="https://your-resource.openai.azure.com"
# LLM judge (for semantic analysis steps)
TAPPASS_LLM_JUDGE_MODEL="gpt-4o-mini"
# Tunnel (for external access)
TUNNEL_TOKEN="..." # From Cloudflare Zero Trust dashboard
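
The random values above can also be produced in one pass. The following is a minimal sketch that mirrors the one-liners in the comments; the function names `generate_secrets` and `to_env_lines` are illustrative and not part of TapPass. The token key file, license, LLM provider keys, and tunnel token still have to be supplied separately.

```python
import base64
import secrets


def generate_secrets() -> dict:
    """Generate the random .env.prod values, mirroring the one-liners above."""
    return {
        "TAPPASS_ADMIN_API_KEY": "tp_" + secrets.token_urlsafe(32),
        "TAPPASS_JWT_SECRET": secrets.token_urlsafe(48),
        # 32 random bytes, base64-encoded
        "TAPPASS_VAULT_KEY": base64.b64encode(secrets.token_bytes(32)).decode(),
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        "SPIRE_JOIN_TOKEN": secrets.token_urlsafe(32),
    }


def to_env_lines(values: dict) -> str:
    """Render KEY="value" lines in the same style as .env.prod."""
    return "\n".join(f'{key}="{value}"' for key, value in values.items())


if __name__ == "__main__":
    print(to_env_lines(generate_secrets()))
```

Running the script prints five lines that can be pasted into .env.prod. Each run yields new values, so generate them once and store them safely.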

Deploy:

docker compose --env-file .env.prod -f docker-compose.prod.yml up -d
# Wait for all services to be healthy
docker compose --env-file .env.prod -f docker-compose.prod.yml ps
# Register SPIRE workload entries (one-time)
docker compose --env-file .env.prod -f docker-compose.prod.yml exec spire-server \
bash /opt/spire/register-entries.sh
# Verify
curl -s http://localhost:9620/health | python3 -m json.tool
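
Services can take a while to come up, so the verify step may need repeating. The sketch below polls the health endpoint until it reports healthy. It assumes the endpoint returns JSON with a `status` field equal to `"healthy"`, as in the expected response in this playbook; the function names, attempt count, and delay are illustrative choices.

```python
import json
import time
import urllib.error
import urllib.request


def is_healthy(payload) -> bool:
    """Return True when the /health payload reports status "healthy"."""
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def wait_until_healthy(url: str = "http://localhost:9620/health",
                       attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll the health endpoint until it reports healthy or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if is_healthy(json.load(response)):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # server not reachable yet, or the body was not valid JSON
        time.sleep(delay)
    return False


if __name__ == "__main__":
    raise SystemExit(0 if wait_until_healthy() else 1)
```

The exit code makes the script usable as a gate in a deploy pipeline: 0 once the server is healthy, 1 if it never became healthy within the allowed attempts.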

Expected health response:

{
  "status": "healthy",
  "version": "0.5.0",
  "storage": "local",
  "license": {"org": "Client Corp", "tier": "enterprise", "expires": "2027-01-01"}
}

pip install tappass
tappass up --license <key>

The interactive wizard walks through:

  1. License validation
  2. Storage selection (memory / local PostgreSQL / Supabase)
  3. Secret generation
  4. Server start

For a completely fresh Ubuntu 24.04 server, use the bootstrap script:

sudo ./deploy/bootstrap.sh

The script installs Docker, clones the repo, sets up systemd services, configures Cloudflare tunnels, installs MkDocs for docs, and creates the deploy user (gebruiker).


Option 1: Cloudflare Tunnel (recommended)

Cloudflare Tunnel is already included in the Docker Compose stack. Set TUNNEL_TOKEN and, in the Cloudflare Zero Trust dashboard, point the tunnel’s public hostname to http://tappass:9620.

Option 2: Caddy (self-hosted TLS)

# Caddyfile included in deploy/
caddy run --config deploy/Caddyfile

| Backend | Config | Use case |
| --- | --- | --- |
| Memory | No config needed | Dev/testing only. Data lost on restart. |
| Local PostgreSQL | DATABASE_URL=postgresql:/... | Self-hosted production. |
| Supabase | SUPABASE_URL + SUPABASE_KEY | Managed PostgreSQL. |

There are 12 migration files in deploy/migrations/. With Docker Compose they are mounted as init scripts and run automatically on first start.

For manual migration:

psql -h localhost -U tappass -d tappass -f deploy/migrations/001_schema.sql
# ... through 012_governance_policies.sql

# PostgreSQL dump
docker compose exec postgres pg_dump -U tappass tappass > backup_$(date +%Y%m%d).sql
# Restore
docker compose exec -T postgres psql -U tappass tappass < backup_20260315.sql
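
For scheduled backups, the dump command can be wrapped in a small script. This is a sketch under a few assumptions: it runs from the directory containing the Compose file, it adds `-T` to the dump command (as the restore command already does) so that no TTY is allocated when run non-interactively, and the function names are illustrative.

```python
import datetime
import subprocess
from typing import Optional


def backup_filename(day: datetime.date) -> str:
    """Build the dated file name used above, e.g. backup_20260315.sql."""
    return f"backup_{day:%Y%m%d}.sql"


def dump_command() -> list:
    """pg_dump invocation from the backup step, with -T to skip TTY allocation."""
    return ["docker", "compose", "exec", "-T", "postgres",
            "pg_dump", "-U", "tappass", "tappass"]


def run_backup(day: Optional[datetime.date] = None) -> str:
    """Write a plain SQL dump to a dated file and return the file name."""
    target = backup_filename(day or datetime.date.today())
    with open(target, "wb") as out:
        # check=True raises CalledProcessError if pg_dump exits non-zero
        subprocess.run(dump_command(), stdout=out, check=True)
    return target


if __name__ == "__main__":
    print(run_backup())
```

If pg_dump fails, the exception is raised after the target file has been created, so a failed run can leave an empty or partial file behind that should not be treated as a valid backup.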

cd tappass
git pull origin main
# Rebuild and restart
cd deploy
docker compose --env-file .env.prod -f docker-compose.prod.yml up -d --build
# Verify
curl -s http://localhost:9620/health

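If the migrations are mounted as standard PostgreSQL init scripts, they run only when the database volume is initialized for the first time, not on a later container restart. Migrations added by an upgrade then need to be applied manually, as in the manual migration commands above.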
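
When migrations have to be applied by hand, as in the manual migration commands earlier in this playbook, the order matters. The sketch below lists the files by numeric prefix and builds the corresponding psql calls. It assumes every file is named with a numeric prefix followed by an underscore (as in 001_schema.sql); the `ON_ERROR_STOP=1` setting is an addition so that psql stops at the first failing statement. Whether an already-applied file can safely be re-run is not stated here, so check before applying older migrations again.

```python
import subprocess
from pathlib import Path


def ordered_migrations(directory: str = "deploy/migrations") -> list:
    """Return the *.sql files sorted by numeric prefix (001_, 002_, ...)."""
    files = Path(directory).glob("*.sql")
    return sorted(files, key=lambda path: int(path.name.split("_", 1)[0]))


def psql_command(migration: Path) -> list:
    """Build the psql call from the manual migration step for one file."""
    return ["psql", "-h", "localhost", "-U", "tappass", "-d", "tappass",
            "-v", "ON_ERROR_STOP=1", "-f", str(migration)]


def apply_from(first_number: int, directory: str = "deploy/migrations") -> None:
    """Apply every migration whose prefix is >= first_number, in order."""
    for migration in ordered_migrations(directory):
        if int(migration.name.split("_", 1)[0]) >= first_number:
            subprocess.run(psql_command(migration), check=True)
```

For example, `apply_from(12)` would apply only 012_governance_policies.sql and anything numbered after it.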


| Endpoint | Purpose | Auth required |
| --- | --- | --- |
| GET /health | Readiness check (DB status, version) | No |
| GET /health/live | Liveness probe (process running) | No |
| GET /health/ready | Readiness probe (can serve traffic) | No |
| GET /health/startup | Startup probe (finished init) | No |
| GET /health/detailed | Full diagnostics (DB, Redis, OPA, SPIRE) | Yes (AUDITOR+) |
| GET /metrics | Prometheus metrics | Yes (AUDITOR+) |

livenessProbe:
  httpGet:
    path: /health/live
    port: 9620
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 9620
  periodSeconds: 10
startupProbe:
  httpGet:
    path: /health/startup
    port: 9620
  failureThreshold: 30
  periodSeconds: 5

| Signal | Alert if | Action |
| --- | --- | --- |
| /health returns non-200 | Any occurrence | Check DB, Redis, OPA |
| Pipeline latency P99 | > 500ms | Disable heavy steps or scale |
| Error rate on /v1/chat/completions | > 1% | Check logs, LLM provider status |
| Redis memory | > 90% of 64MB limit | Increase maxmemory or check for leaks |
| PostgreSQL connections | > 80% pool | Scale DB or optimize queries |
| Disk usage | > 80% | Rotate audit logs, clean Docker images |
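
The alert thresholds above can be expressed as a small rule table, which is handy for a quick check script before a full alerting setup exists. This is a sketch only: the metric names (`health_status`, `pipeline_p99_ms`, and so on) are hypothetical keys, not names TapPass exports, and ratios are expressed as fractions between 0 and 1.

```python
REDIS_LIMIT_BYTES = 64 * 1024 * 1024  # the 64MB Redis limit from this playbook

# (hypothetical metric key, breach test, action from the alert table)
RULES = [
    ("health_status", lambda v: v != 200, "Check DB, Redis, OPA"),
    ("pipeline_p99_ms", lambda v: v > 500, "Disable heavy steps or scale"),
    ("chat_error_rate", lambda v: v > 0.01, "Check logs, LLM provider status"),
    ("redis_used_bytes", lambda v: v > 0.9 * REDIS_LIMIT_BYTES,
     "Increase maxmemory or check for leaks"),
    ("pg_pool_usage", lambda v: v > 0.8, "Scale DB or optimize queries"),
    ("disk_usage", lambda v: v > 0.8, "Rotate audit logs, clean Docker images"),
]


def evaluate_alerts(metrics: dict) -> list:
    """Return the action for every rule whose threshold is exceeded."""
    return [action for key, breached, action in RULES
            if key in metrics and breached(metrics[key])]
```

Metrics missing from the input are skipped rather than treated as failures, so a partial sample only reports on what was actually measured.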

Set OTEL_EXPORTER_OTLP_ENDPOINT to push traces to your collector. Zero overhead when unset.

OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-collector:4317"
OTEL_SERVICE_NAME="tappass"

TapPass core is stateless. Scale by running multiple instances behind a load balancer:

services:
  tappass:
    deploy:
      replicas: 3

All instances must share the same PostgreSQL and Redis.

  • OPA sidecar: one per TapPass instance (stateless, low overhead)
  • SPIRE: single server, but agents can run per-node
  • The Docker Compose license-net network assumes a single-host license server
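
One reason the instances must share Redis is rate limiting: with per-instance counters, each replica would enforce the limit on its own and the effective limit would grow with the replica count. The sketch below illustrates this with a fixed-window counter. It is not TapPass's actual limiter; the class name is hypothetical and a plain dict stands in for Redis.

```python
class FixedWindowLimiter:
    """Fixed-window request counter; `store` stands in for a Redis instance."""

    def __init__(self, store: dict, limit: int, window_seconds: int = 60):
        self.store = store
        self.limit = limit
        self.window_seconds = window_seconds

    def allow(self, client: str, now: float) -> bool:
        """Count one request and report whether it is within the limit."""
        window = int(now // self.window_seconds)
        key = (client, window)
        count = self.store.get(key, 0) + 1
        self.store[key] = count
        return count <= self.limit


# Two instances sharing one store enforce the limit of 3 globally.
shared = {}
a = FixedWindowLimiter(shared, limit=3)
b = FixedWindowLimiter(shared, limit=3)
results = [a.allow("agent-1", 0), b.allow("agent-1", 1),
           a.allow("agent-1", 2), b.allow("agent-1", 3)]
print(results)  # [True, True, True, False]
```

If `a` and `b` were each given their own dict, all four requests would be allowed, because neither instance would see the other's count.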

| Scenario | RTO | RPO | Action |
| --- | --- | --- | --- |
| Single container crash | ~10s | 0 | Docker auto-restarts (unless-stopped) |
| Full server failure | ~30min | Last DB backup | Redeploy from git + restore backup |
| Database corruption | ~1h | Last backup | Restore the latest backup with psql |
| Redis data loss | 0 | 0 | Redis is cache-only, recovers automatically |
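
Because the RPO for server failure and database corruption is the last backup, the age of the newest backup file is the data-loss window at any given moment. A minimal sketch for checking it, assuming backups use the backup_YYYYMMDD.sql naming from the dump command in this playbook (the function names are illustrative):

```python
import datetime
import re

BACKUP_PATTERN = re.compile(r"^backup_(\d{8})\.sql$")


def latest_backup_date(filenames):
    """Return the newest date found in backup_YYYYMMDD.sql names, or None."""
    dates = []
    for name in filenames:
        match = BACKUP_PATTERN.match(name)
        if not match:
            continue
        try:
            dates.append(datetime.datetime.strptime(match.group(1), "%Y%m%d").date())
        except ValueError:
            continue  # eight digits that do not form a valid date
    return max(dates, default=None)


def backup_age_days(filenames, today: datetime.date):
    """Days since the newest backup, or None when no backup exists."""
    latest = latest_backup_date(filenames)
    return None if latest is None else (today - latest).days
```

For example, with backups dated 20260310 and 20260315 and a current date of 2026-03-17, `backup_age_days` returns 2.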

| Port | Service | Exposed externally |
| --- | --- | --- |
| 9620 | TapPass API | Via tunnel or reverse proxy only |
| 8181 | OPA | Internal only (container network) |
| 5432 | PostgreSQL | Internal only |
| 6379 | Redis | Internal only |
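
The exposure rules in the table can be spot-checked from a machine outside the server by attempting TCP connections to the server's public address. This is a rough sketch with illustrative function names; a refused or timed-out connection is treated as "not exposed", which does not replace a proper firewall review.

```python
import socket

# Ports from the table that should not accept direct external connections.
RESTRICTED_PORTS = {
    9620: "TapPass API (direct, bypassing tunnel or proxy)",
    8181: "OPA",
    5432: "PostgreSQL",
    6379: "Redis",
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def exposed_services(public_host: str) -> list:
    """List restricted services that accept connections on the public address."""
    return [name for port, name in RESTRICTED_PORTS.items()
            if is_port_open(public_host, port)]
```

An empty list from `exposed_services` for the server's public address is the expected result. If a reverse proxy on the same host forwards a different public port to 9620, that port is not covered by this check.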

Outbound only:

  • LLM providers (api.openai.com, api.anthropic.com, etc.)
  • Cloudflare (for tunnel)
  • GitHub (for updates)
  • License server (for license validation)