Deployment Guide
TapPass Operations Guide
Everything you need to deploy, operate, and troubleshoot TapPass.
Architecture Overview

    Internet → Cloudflare Tunnel → TapPass
                                     → PostgreSQL (data-net)
                                     → Redis (data-net)
                                     → OPA (identity-net)
                                     → SPIRE (identity-net)
                                     → Conjur (secrets-net)
                                     → Platform (license-net)

- 5 isolated Docker networks: each service reaches only what it needs
- TapPass is the only container that bridges all networks
- Cloudflare Tunnel handles TLS and DDoS protection, so no ports are exposed to the internet
- SPIRE issues auto-rotating certificates for workload identity
- Conjur stores all secrets; they are fetched at runtime and never written to disk
Fresh Install

    cd deploy
    ./install.sh    # Interactive wizard (8 steps)
                    # → generates .env.prod
                    # → calls deploy.sh
                    # → if Conjur: run conjur/setup.sh after

Install wizard steps

| Step | What | Default |
|---|---|---|
| 1 | Prerequisites (Docker, disk) | Auto-check |
| 2 | Secrets (admin key, JWT, vault) | Auto-generated |
| 3 | License key | Must paste |
| 4 | Cloudflare Tunnel | Must paste token |
| 5 | SSO | Google OIDC |
| 6 | SPIRE | Enabled |
| 7 | Secrets management | Conjur |
| 8 | LLM keys | Optional |
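All defaults are production-secure. Pressing Enter wherever a default is offered gives a hardened deployment; only the license key (step 3) and the Cloudflare Tunnel token (step 4) must be pasted in.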
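
Step 2 of the wizard auto-generates secrets. How install.sh does this is not documented here, so the following is only a sketch of one common approach: reading bytes from the kernel's random source and printing them as hex. The helper name gen_secret is hypothetical and not part of TapPass.

```shell
# Hypothetical helper: print N random bytes as lowercase hex (default: 32 bytes).
gen_secret() {
  nbytes="${1:-32}"
  # od renders each byte as two hex digits; tr strips the spaces and newlines od adds.
  head -c "$nbytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

For example, `gen_secret 32` prints a 64-character hex string that could serve as an admin key in an evaluation setup.
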
Containers

| Container | Image | Network | Purpose |
|---|---|---|---|
| tappass | ghcr.io/tappass/tappass | all | Core API + dashboard |
| postgres | postgres:16.8-alpine | data-net | Application database |
| redis | redis:7.4.8-alpine | data-net | Rate limiting, sessions |
| opa | openpolicyagent/opa:1.14.0 | identity-net | Policy engine |
| spire-server | ghcr.io/spiffe/spire-server:1.11.0 | identity-net | Identity CA |
| spire-agent | ghcr.io/spiffe/spire-agent:1.11.0 | identity-net | Workload attestation |
| spiffe-helper | ghcr.io/spiffe/spiffe-helper:0.8.0 | identity-net | Cert rotation to disk |
| conjur | cyberark/conjur:1.24.0 | secrets-net | Secrets vault |
| conjur-postgres | postgres:16.8-alpine | secrets-net | Conjur database |
| tunnel | cloudflare/cloudflared:2026.2.0 | frontend-net | HTTPS ingress |
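
The table implies that tappass is the only container attached to more than one network. A minimal sketch of how that could be checked is shown here. It assumes a listing with one line per container in the form "container network [network ...]"; both that input format and the helper name multi_network_containers are assumptions made for illustration.

```shell
# Hypothetical helper: read "container network [network ...]" lines on stdin
# and print every container that is attached to more than one network.
multi_network_containers() {
  awk 'NF > 2 { print $1 }'
}

# Sample input mirroring the table above; only tappass spans several networks.
printf '%s\n' \
  'tappass data-net identity-net secrets-net license-net frontend-net' \
  'postgres data-net' \
  'opa identity-net' | multi_network_containers
# prints: tappass
```

On a live host, lines in this shape could be produced from docker inspect output. Compose normally prefixes network names with the project name, which does not affect the count.
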
Security Properties

| Property | How |
|---|---|
| Non-root | UID 999 via setpriv in entrypoint |
| Read-only filesystem | read_only: true + tmpfs for /tmp |
| No core dumps | ulimit -c 0 in entrypoint |
| Capabilities dropped | --inh-caps=-all |
| Zero secrets in docker inspect | Secrets fetched from Conjur at runtime |
| TLS between containers | Internal CA certs in deploy/certs/ |
| Redis auth | requirepass + TLS |
| PostgreSQL SSL | ssl=on with internal CA |
| Network segmentation | 5 isolated networks |
| Audit trail | SHA-256 hash chain, tamper-evident |
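
The audit trail row mentions a SHA-256 hash chain. TapPass's actual record format is not described in this guide, so the following is a generic sketch of the idea: each entry's hash covers the previous hash plus the entry text, so changing any earlier entry breaks every later link. The "hash<TAB>entry" file layout, the "genesis" seed, and the function names are all assumptions. It relies on sha256sum from GNU coreutils.

```shell
# chain_hash PREV_HASH ENTRY: print the SHA-256 hex digest that links ENTRY to PREV_HASH.
chain_hash() {
  printf '%s\n%s' "$1" "$2" | sha256sum | cut -d' ' -f1
}

# verify_chain FILE: FILE holds "hash<TAB>entry" lines (assumed layout).
# Returns non-zero at the first entry whose hash does not match its predecessor.
verify_chain() {
  prev="genesis"   # assumed seed for the first link
  while IFS="$(printf '\t')" read -r hash entry; do
    [ "$(chain_hash "$prev" "$entry")" = "$hash" ] || return 1
    prev="$hash"
  done < "$1"
}
```

Editing the text of any line without recomputing all later hashes makes verify_chain fail, which is the tamper-evident property.
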
Secrets Management

Conjur mode (production default)

Secrets are stored in CyberArk Conjur and fetched at runtime via its REST API.

    # View secrets in Conjur
    # (-f2- keeps the whole value even if the key itself contains "=")
    ADMIN_KEY=$(grep CONJUR_ADMIN_API_KEY .env.prod | cut -d= -f2-)
    AUTH=$(curl -sf --data "$ADMIN_KEY" http://localhost:8080/authn/tappass/admin/authenticate | base64 -w0)
    curl -sf -H "Authorization: Token token=\"$AUTH\"" "http://localhost:8080/resources/tappass?kind=variable"

    # Rotate a secret
    curl -sf -X POST -H "Authorization: Token token=\"$AUTH\"" \
      --data "new-value" \
      http://localhost:8080/secrets/tappass/variable/tappass%2FANTHROPIC_API_KEY

    # TapPass picks up the new value within 5 minutes (cache TTL)
    # Or force immediate pickup (URL quoted so the shell does not treat "?" as a glob):
    curl -X POST "http://localhost:9620/admin/secrets/invalidate?name=ANTHROPIC_API_KEY" \
      -H "Authorization: Bearer <admin_key>"

Env mode (evaluation)

Secrets live in .env.prod (chmod 600). Simpler but less secure.
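
In Env mode the only protection on .env.prod is its file mode, so it may be worth checking that mode before each start. This is a sketch, not something the TapPass scripts are known to do. The helper name env_file_is_private is hypothetical, and it assumes GNU stat (BSD and macOS stat take different flags).

```shell
# Hypothetical check: succeed only if FILE's permission bits are exactly 600.
# Uses GNU stat (-c '%a' prints the octal mode).
env_file_is_private() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}
```

A possible use is `env_file_is_private .env.prod || chmod 600 .env.prod` ahead of a deploy.
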
Day-to-Day Operations

Health check

    # Quick
    docker exec tappass-tappass-1 python3 -c \
      "import urllib.request; print(urllib.request.urlopen('http://localhost:9620/health').read().decode())"

    # Full
    ./scripts/health-monitor.sh

View logs

    docker compose --env-file .env.prod -f docker-compose.prod.yml logs tappass -f --tail 50

Restart

    docker compose --env-file .env.prod -f docker-compose.prod.yml restart tappass

Upgrade

    bash scripts/upgrade.sh 0.6.0        # Upgrade to specific version
    bash scripts/upgrade.sh --rollback   # Rollback to previous

The upgrade script:
- Pulls new image
- Verifies signature (if cosign installed)
- Backs up database
- Rolling restart with health check (60s timeout)
- Auto-rollback if health check fails
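
The last two steps above (a health check with a timeout, then rollback on failure) follow a common pattern that can be sketched as below. This is not the contents of scripts/upgrade.sh. The function name wait_healthy and the placeholder commands check_health and rollback are assumptions.

```shell
# wait_healthy LIMIT_SECONDS CMD [ARGS...]
# Re-runs CMD about once per second until it succeeds or roughly LIMIT_SECONDS have passed.
# Returns 0 on the first success, 1 if the limit is reached.
wait_healthy() {
  limit="$1"; shift
  elapsed=0
  while [ "$elapsed" -lt "$limit" ]; do
    "$@" >/dev/null 2>&1 && return 0
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage sketch (check_health and rollback are placeholders the caller would define):
#   wait_healthy 60 check_health || rollback
```

The elapsed counter ignores the time the command itself takes, so the real wait can exceed the limit when the health command is slow.
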
Backups

| What | Schedule | Script | Retention |
|---|---|---|---|
| TapPass PostgreSQL | Daily 3:15 AM | scripts/backup-postgres.sh | 30 days |
| Conjur PostgreSQL | Daily 3:30 AM | scripts/backup-conjur.sh | 30 days |
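
The 30-day retention in the table could be enforced with a pruning step like the one sketched here. Whether the backup scripts prune this way is not stated in this guide, and the helper name prune_backups is hypothetical. The `.sql.gz` suffix matches the backup file names used in the restore commands.

```shell
# prune_backups DIR [DAYS]: delete *.sql.gz files directly inside DIR whose
# last modification is more than DAYS days old (default 30).
# find counts age in whole days, so "+30" matches files at least 31 days old.
prune_backups() {
  find "$1" -maxdepth 1 -type f -name '*.sql.gz' -mtime +"${2:-30}" -delete
}
```

For example, `prune_backups backups/postgres 30` would leave recent archives in place and remove older ones.
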
Manual backup

    bash scripts/backup-conjur.sh
    bash scripts/backup-postgres.sh

Test restore

    bash scripts/test-restore.sh

Spins up temp containers, restores backups, verifies data. No production impact.
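
A cheaper check that could run before any restore is to confirm the archive itself is intact. This sketch only tests the gzip container, not the SQL inside it, so it complements a full test restore and does not replace one. The helper name backup_is_intact is hypothetical.

```shell
# backup_is_intact FILE: succeed only if FILE exists, is non-empty,
# and passes gzip's built-in integrity test.
backup_is_intact() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

A truncated or partially written dump fails this check immediately, before anything is piped into psql.
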
Restore from backup

    # Stop TapPass
    docker compose --env-file .env.prod -f docker-compose.prod.yml stop tappass

    # Restore TapPass PG
    gunzip -c backups/postgres/tappass-YYYY-MM-DD.sql.gz | \
      docker exec -i tappass-postgres-1 psql -U tappass tappass

    # Restore Conjur
    gunzip -c backups/conjur/conjur-YYYY-MM-DD.sql.gz | \
      docker exec -i tappass-conjur-postgres-1 psql -U postgres postgres

    # Restart
    docker compose --env-file .env.prod -f docker-compose.prod.yml start tappass

Monitoring

Health monitor runs every 5 minutes via cron:

    */5 * * * * /path/to/deploy/scripts/health-monitor.sh >> logs/health-monitor.log 2>&1

Checks:
- TapPass health endpoint
- Platform health (licenses.tappass.ai)
- All container status
- Disk space (alerts >90%)
- Internal CA cert expiry (alerts <30 days)
Alerting: set HEALTH_SLACK_WEBHOOK env var for Slack notifications.
Status files: deploy/status/tappass.down created when down, removed on recovery.
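
Two of the behaviours described above, the disk-space threshold and the status file that appears while a service is down, can be sketched as follows. This is not the contents of health-monitor.sh, and the function names disk_usage_pct and update_status_file are assumptions.

```shell
# disk_usage_pct PATH: print the used-space percentage (digits only)
# of the filesystem holding PATH, taken from the Capacity column of df -P.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# update_status_file FILE CMD [ARGS...]:
# create FILE when CMD fails, remove FILE when CMD succeeds.
update_status_file() {
  file="$1"; shift
  if "$@" >/dev/null 2>&1; then
    rm -f "$file"
  else
    : > "$file"
  fi
}
```

A monitor loop could then use `[ "$(disk_usage_pct /)" -gt 90 ]` to decide whether to alert, and call update_status_file with deploy/status/tappass.down and whatever health command it runs.
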
SPIRE / SPIFFE

Re-register workload entries (after SPIRE restart)

    TOKEN=$(docker exec tappass-spire-server-1 /opt/spire/bin/spire-server token generate \
      -spiffeID spiffe://tappass.internal/agent/spire-agent | grep Token | awk '{print $2}')

    # Update .env.prod
    sed -i "s/SPIRE_JOIN_TOKEN=.*/SPIRE_JOIN_TOKEN=$TOKEN/" .env.prod

    # Restart agent
    docker compose --env-file .env.prod -f docker-compose.prod.yml up -d --force-recreate spire-agent

    # Re-register entries
    docker exec tappass-spire-server-1 bash /opt/spire/register-entries.sh

Check cert validity

    docker exec tappass-tappass-1 openssl x509 -in /run/spire/certs/svid.pem -noout -dates -checkend 0

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| storage: memory | PG not connected | Check docker logs tappass-postgres-1 |
| PRODUCTION UNSAFE | Secrets not loaded | Verify Conjur is healthy, check .conjur-authn-key |
| seat_limit_exceeded | Too many activations | Deactivate old seats in Platform admin |
| SPIRE certs expired | spiffe-helper stopped | Restart: docker restart tappass-spiffe-helper-1 |
| NOAUTH Redis error | Wrong password | Check REDIS_PASSWORD in .env.prod matches Redis config |
| OPA 403 on all requests | No authz policy loaded | Check docker logs tappass-opa-1, verify policies mounted |
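
For the symptoms that show up as literal strings in logs, the table can be turned into a small triage filter. This is a sketch only. The helper name suggest_fix is hypothetical, and it assumes the symptom text appears verbatim in the log output.

```shell
# suggest_fix: read log text on stdin and print the hint for the first
# line that contains a known symptom string; return 1 if nothing matches.
suggest_fix() {
  while IFS= read -r line; do
    case "$line" in
      *"storage: memory"*)   echo "PG not connected: check docker logs tappass-postgres-1"; return 0 ;;
      *"PRODUCTION UNSAFE"*) echo "Secrets not loaded: verify Conjur is healthy, check .conjur-authn-key"; return 0 ;;
      *seat_limit_exceeded*) echo "Too many activations: deactivate old seats in Platform admin"; return 0 ;;
      *NOAUTH*)              echo "Wrong Redis password: check REDIS_PASSWORD in .env.prod"; return 0 ;;
    esac
  done
  return 1
}
```

One way to use it is `docker logs tappass-tappass-1 2>&1 | suggest_fix`, which prints a hint only when one of the known strings is present.
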
Known Limitations (GitHub Issues)

| # | Issue | Status |
|---|---|---|
| 1 | Conjur on plain HTTP (internal) | Needs reverse proxy |
| 2 | SPIFFE JWT → Conjur ES256 incompatibility | Conjur OSS limitation |
| 3 | Docker socket in SPIRE agent | SPIRE attestor requirement |
| 4 | Static Conjur API key | Depends on #2 |
| 5 | PostgreSQL SSL client cert | HOME=/root issue |
| 6 | Platform HA for 50+ customers | Scaling milestone |
| 7 | Platform auth: short-lived tokens | Architecture improvement |