
Provisioning Protocol (Zero-AI)

This protocol defines the standard procedure for provisioning new production servers. It is designed to work without AI assistance.

Last Updated: January 2026


Key Lessons Learned

Critical Requirements

These lessons were learned from multiple failed provisioning attempts:

1. **Use Traefik v3.6+** on Ubuntu 24.04 with Docker 29.x (fixes Docker API version mismatch)
2. **Migrate Infisical LAST** - deploy services using existing Infisical URL first
3. **Pin image versions** in docker-compose.yml so that an upstream release cannot change what gets deployed
4. **Use proper server names** - never refer to servers as "old" or "new"

Server Reference

| Name | IP | Role | Status |
|---|---|---|---|
| bruno | 188.34.198.57 | Production | ✅ Active |
| development-vps | 46.224.125.1 | Development | ✅ Active |

Prerequisites

  • [ ] Infisical CLI installed: infisical --version
  • [ ] Terraform installed: terraform --version
  • [ ] Ansible installed: ansible --version
  • [ ] SSH keys loaded: ssh-add -l
  • [ ] Access to https://secrets.kua.cl (login works)
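
The checklist above can be run as one pass instead of command by command. The following is a minimal sketch, assuming a POSIX shell; the helper name `missing_tools` is hypothetical and not part of the coder-core repository.

```shell
#!/bin/sh
# Print every prerequisite command that is not on PATH, one per line.
# The exit status is 0 only when nothing is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Usage (run manually):
#   missing_tools infisical terraform ansible ssh-add || echo "install the tools listed above"
#   ssh-add -l >/dev/null || echo "no SSH keys are loaded in the agent"
```

This only confirms that the binaries exist; the version checks and the Infisical login still need to be done by hand.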

Phase 1: Verify Infisical Secrets (5 min)

1.1 Required Secrets

Log in to https://secrets.kua.cl and verify that these secrets exist in the Production environment:

SSH Keys:

  • SSH_KEYS_ACTIVE_DEVICES (e.g., MACMINI,MACBOOKPRO)
  • SSH_KEY_MACMINI_PUBLIC + SSH_KEY_MACMINI_STATUS
  • SSH_KEY_MACBOOKPRO_PUBLIC + SSH_KEY_MACBOOKPRO_STATUS

Infrastructure:

  • HETZNER_API_TOKEN

Storage:

  • S3_ACCESS_KEY, S3_SECRET_KEY, S3_ENDPOINT, S3_REGION, S3_BUCKET
  • STORAGE_BOX_SSH_PRIVATE_KEY

Services:

  • ENCRYPTION_KEY, AUTH_SECRET
  • POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
  • REDIS_PASSWORD
  • IMGPROXY_KEY, IMGPROXY_SALT
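
If the Infisical CLI is already logged in, the same check can be done from a terminal rather than the web UI. This is a sketch under assumptions: `missing_secrets` is a hypothetical helper, and it expects dotenv-style `KEY=value` lines on stdin, such as the output of the `infisical export` command shown in the Troubleshooting section.

```shell
#!/bin/sh
# Print every required secret name that has no KEY= line in the dotenv text on stdin.
missing_secrets() {
    env_text=$(cat)
    for key in "$@"; do
        printf '%s\n' "$env_text" | grep -q "^${key}=" || printf '%s\n' "$key"
    done
}

# Usage (run manually; <ID> is the project ID placeholder used elsewhere in this protocol):
#   infisical export --env=prod --projectId=<ID> --path=/ --domain=https://secrets.kua.cl \
#     | missing_secrets HETZNER_API_TOKEN S3_ACCESS_KEY S3_SECRET_KEY POSTGRES_PASSWORD REDIS_PASSWORD
```

An empty result means every listed name is present. It does not confirm that the values are correct.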

1.2 Get Machine Identity Credentials

  1. In Infisical, go to Project Settings → Machine Identities
  2. Note the Client ID and Client Secret

Phase 2: Terraform - Provision Server (5 min)

2.1 Update main.tf

cd ~/coder-core/terraform/hetzner
vim main.tf

Add or modify server resource:

resource "hcloud_server" "production" {
  name        = "my-new-server"  # Change this
  image       = "ubuntu-24.04"
  server_type = "cpx32"  # 4 vCPU, 8GB RAM, 160GB NVMe
  location    = "nbg1"   # Nuremberg
  # ... rest stays same
}

2.2 Apply Terraform

cd ~/coder-core
./bin/deploy-infra.sh plan    # Review
./bin/deploy-infra.sh apply   # Create
./bin/deploy-infra.sh output  # Get IP

2.3 Wait for Cloud-Init

sleep 120  # Wait 2 minutes
ssh root@<NEW_IP> "hostname && uptime"
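
A fixed two-minute sleep can be too short or needlessly long. One alternative is to poll until SSH answers and then let cloud-init report when it has finished. The sketch below assumes a POSIX shell; `retry` is a hypothetical helper, and the SSH options shown in the usage comment are a suggestion, not part of the original procedure.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# Arguments: maximum attempts, delay in seconds, then the command and its arguments.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage (run manually; <NEW_IP> is the placeholder used throughout this protocol):
#   retry 30 10 ssh -o BatchMode=yes -o ConnectTimeout=5 \
#       -o StrictHostKeyChecking=accept-new "root@<NEW_IP>" true
#   ssh "root@<NEW_IP>" "cloud-init status --wait && hostname && uptime"
```

`cloud-init status --wait` blocks until cloud-init completes, so the follow-up commands run only once the server is ready.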

Phase 3: Update Ansible Inventory (2 min)

3.1 Edit hosts.yml

vim ~/coder-core/ansible/inventory/hosts.yml

Add new server:

production:
  hosts:
    my-new-server:
      ansible_host: <NEW_IP>
      ansible_user: root
      server_role: production
      domain_suffix: kua.cl

Phase 4: Bootstrap with Ansible (15 min)

4.1 Run Site Playbook

cd ~/coder-core/ansible
ansible-playbook playbooks/site.yml --limit my-new-server

Enter credentials when prompted:

  • Machine Identity Client ID
  • Machine Identity Client Secret
  • S3 Access Key
  • S3 Secret Key
  • Storage Box SSH Key

Alternative: Use Bootstrap Bundle

If you have a bootstrap-secrets.age bundle, you can decrypt it on the server to populate the .env file directly:

age -d bootstrap-secrets.age > ~/coder-core/services/production/.env

4.2 Verify Installation

ssh root@<NEW_IP> "docker --version && rclone listremotes && ls /mnt/"

Expected:

Docker version 29.x.x
hetzner-s3:
storagebox:
s3  storagebox

Phase 5: Deploy Services (10 min)

5.1 Run Deploy Playbook

cd ~/coder-core/ansible
ansible-playbook playbooks/deploy-services.yml --limit my-new-server \
  -e "infisical_client_id=YOUR_CLIENT_ID" \
  -e "infisical_client_secret=YOUR_CLIENT_SECRET"

Secrets Source

This exports secrets from the EXISTING Infisical (at secrets.kua.cl) to the new server. The new server's Infisical stays empty until its database is migrated in Phase 6.

5.2 Verify Containers

ssh root@<NEW_IP> "docker ps --format 'table {{.Names}}\t{{.Status}}'"

All containers should be "Up".
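
Plain `docker ps` hides containers that have already exited, so a crashed service can go unnoticed in the listing above. The sketch below lists anything that is not "Up". It assumes a POSIX shell with awk, and `not_up` is a hypothetical helper name.

```shell
#!/bin/sh
# Read "name<TAB>status" lines on stdin and print the names whose status
# does not start with "Up".
not_up() {
    awk -F '\t' '$2 !~ /^Up/ { print $1 }'
}

# Usage (run manually; -a includes stopped containers, and omitting "table" drops the header row):
#   ssh "root@<NEW_IP>" "docker ps -a --format '{{.Names}}\t{{.Status}}'" | not_up
```

No output means every container reports "Up". A container that is up but marked unhealthy still passes this check.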

5.3 Check Traefik Logs (Critical!)

ssh root@<NEW_IP> "docker logs --tail 30 traefik"

Good signs:

  • No "client version 1.24 is too old" errors
  • ACME certificate errors are OK (DNS hasn't been updated yet)

Bad signs:

  • Docker API version errors → Check Traefik version (must be v3.6+)
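
The bad sign above can be searched for instead of read by eye. This is a minimal sketch, assuming the error text matches the wording quoted in this protocol; `has_docker_api_error` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed (exit 0) if the log text on stdin contains the Docker API version error.
has_docker_api_error() {
    grep -q 'client version [0-9.]* is too old'
}

# Usage (run manually; 2>&1 is needed because docker logs replays the container's
# stderr stream on stderr):
#   ssh "root@<NEW_IP>" "docker logs --tail 200 traefik 2>&1" | has_docker_api_error \
#     && echo "Docker API version error found: check the Traefik version"
```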

Phase 6: Migrate Infisical Database (10 min)

Only After Services Running

Do this AFTER all services are running on the new server.

6.1 Get Source Credentials

ssh root@<OLD_PROD_IP> "docker inspect infisical-postgres --format '{{.Config.Env}}' | tr ' ' '\n' | grep POSTGRES"

6.2 Dump Source Database

ssh root@<OLD_PROD_IP> "docker exec infisical-postgres pg_dump -U <USER> <DB> > /root/infisical_dump.sql"

6.3 Transfer and Restore

scp root@<OLD_PROD_IP>:/root/infisical_dump.sql /tmp/
scp /tmp/infisical_dump.sql root@<NEW_IP>:/root/
ssh root@<NEW_IP> "cat /root/infisical_dump.sql | docker exec -i postgres psql -U kavi main"
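
Restoring a truncated or empty dump is hard to spot afterwards, so it can help to check the file before piping it into psql. The sketch below relies on the completion comment that pg_dump writes at the end of a plain-text dump. `dump_looks_complete` is a hypothetical helper, and checking the last 20 lines is an arbitrary choice.

```shell
#!/bin/sh
# Succeed if the file ($1) is non-empty and its tail contains the completion
# marker that pg_dump writes at the end of a plain-text dump.
dump_looks_complete() {
    [ -s "$1" ] && tail -n 20 "$1" | grep -q 'PostgreSQL database dump complete'
}

# Usage (run manually on the local copy before the second scp):
#   dump_looks_complete /tmp/infisical_dump.sql || echo "dump is empty or truncated: do not restore"
```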

6.4 Verify Migration

ssh root@<NEW_IP> "docker exec postgres psql -U kavi main -c 'SELECT count(*) FROM users;'"

Should show at least 1 user.

6.5 Restart Infisical

ssh root@<NEW_IP> "docker restart infisical"

Phase 7: Update DNS (5 min)

7.1 Update Cloudflare

Update A records to point to the new server IP:

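| Record | Type | Value | Proxied |
|---|---|---|---|
| secrets | A | <NEW_IP> | Yes |
| git | A | <NEW_IP> | No |
| media | A | <NEW_IP> | Yes |
| cdn | A | <NEW_IP> | Yes |
| notes | A | <NEW_IP> | Yes |
| docs | A | <NEW_IP> | Yes |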

7.2 Purge Cache

Cloudflare Dashboard → Caching → Purge Everything

7.3 Verify DNS

dig +short secrets.kua.cl
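
Records proxied through Cloudflare resolve to Cloudflare addresses, not to the server, so only an unproxied record such as git.kua.cl is expected to return the new IP from `dig`. A small sketch of that comparison follows; `ip_in_answers` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the expected IP ($1) appears as a whole line in the resolver output on stdin.
ip_in_answers() {
    grep -Fxq "$1"
}

# Usage (run manually; <NEW_IP> is the placeholder used throughout this protocol):
#   dig +short git.kua.cl | ip_in_answers "<NEW_IP>" && echo "git.kua.cl points at the new server"
```

For the proxied records, the checks in Phase 8 are the better signal that traffic reaches the new server.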

Phase 8: Verify Everything (5 min)

8.1 Test Infisical

curl -s https://secrets.kua.cl/api/status | jq

Expected: a JSON response that includes "message": "Ok", for example {"message":"Ok"}

8.2 Login to Infisical

Go to https://secrets.kua.cl and log in with your credentials.

8.3 Test All Services

curl -sI https://docs.kua.cl | head -3
curl -sI https://media.kua.cl | head -3
curl -sI https://git.kua.cl | head -3
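
The same checks can be looped so that a failing host stands out. This is a sketch, assuming a POSIX shell with curl. Treating any 2xx or 3xx status as healthy is an assumption, and `is_healthy_code` and `check_hosts` are hypothetical helper names.

```shell
#!/bin/sh
# Succeed for HTTP status codes in the 2xx or 3xx range.
is_healthy_code() {
    case "$1" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}

# Request each host over HTTPS and print OK or FAIL with the status code.
# Returns non-zero if any host failed.
check_hosts() {
    failed=0
    for host in "$@"; do
        # curl reports 000 when no HTTP response was received
        code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "https://$host") || code=000
        if is_healthy_code "$code"; then
            printf 'OK   %s %s\n' "$code" "$host"
        else
            printf 'FAIL %s %s\n' "$code" "$host"
            failed=1
        fi
    done
    return "$failed"
}

# Usage (run manually):
#   check_hosts docs.kua.cl media.kua.cl git.kua.cl
```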

Phase 9: Cleanup Old Servers

9.1 Wait Period

Wait 7 days for stability verification.

9.2 Remove from Inventory

Edit ~/coder-core/ansible/inventory/hosts.yml and remove the entries for the decommissioned servers.

9.3 Delete via Hetzner Console

Go to https://console.hetzner.cloud and delete decommissioned servers.


Troubleshooting

"client version 1.24 is too old"

Cause: Traefik version < v3.6 with Docker 29.x

Fix: Update docker-compose.yml:

traefik:
  image: traefik:v3.6 # NOT v3.3 or earlier
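
To catch other images that are not pinned, the compose file can be scanned for `image:` lines that have no tag or use `latest`. The sketch below is a rough text scan with awk, not a YAML parser; `unpinned_images` is a hypothetical helper name.

```shell
#!/bin/sh
# Read a compose file on stdin and print image references that have no tag
# or that use the "latest" tag. References pinned by digest are skipped.
unpinned_images() {
    awk '
        /^[ \t]*image:/ {
            ref = $0
            sub(/^[ \t]*image:[ \t]*/, "", ref)   # drop the key
            sub(/[ \t]+#.*$/, "", ref)            # drop a trailing comment
            gsub("[\"\047]", "", ref)             # drop surrounding quotes
            if (ref ~ /@sha256:/) next            # pinned by digest
            name = ref
            sub(/^.*\//, "", name)                # keep the part after the last slash
            if (name !~ /:/ || name ~ /:latest$/) print ref
        }
    '
}

# Usage (run manually):
#   unpinned_images < docker-compose.yml
```

Only the part after the last slash is inspected for a tag, so a registry port such as `localhost:5000/app` is not mistaken for a version.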

ACME Certificate Errors

Cause: DNS not pointing to new server yet

Fix: Wait for DNS propagation or update Cloudflare

Services Have Empty Secrets

Cause: .env not generated from Infisical

Fix:

infisical export --env=prod --projectId=<ID> --path=/ --domain=https://secrets.kua.cl > /tmp/.env
scp /tmp/.env root@<NEW_IP>:/root/coder-core/services/production/.env
ssh root@<NEW_IP> "cd /root/coder-core/services/production && docker compose up -d"
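
Before restarting the services, it can be worth confirming that the exported file actually carries values. This is a minimal sketch, assuming dotenv-style `KEY=value` lines; `empty_keys` is a hypothetical helper name.

```shell
#!/bin/sh
# Read dotenv text on stdin and print every key whose value is empty,
# including values written as "" or ''.
empty_keys() {
    awk '
        /^[A-Za-z_][A-Za-z0-9_]*=/ {
            key = $0
            sub(/=.*$/, "", key)        # text before the first "="
            value = $0
            sub(/^[^=]*=/, "", value)   # text after the first "="
            if (value == "" || value == "\"\"" || value == "\047\047") print key
        }
    '
}

# Usage (run manually):
#   empty_keys < /tmp/.env
```

No output means every key in the file has a non-empty value.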

Infisical Database Empty After Restore

Cause: Permission mismatch during pg_dump

Fix: Use the correct user when dumping:

ssh root@<OLD_PROD_IP> "docker exec infisical-postgres pg_dump -U infisical infisical > /root/infisical_dump.sql"

Quick Reference

| Phase | Command | Time |
|---|---|---|
| Verify Secrets | Infisical Web UI | 5 min |
| Terraform | ./bin/deploy-infra.sh apply | 5 min |
| Ansible Inventory | Edit hosts.yml | 2 min |
| Ansible Bootstrap | ansible-playbook site.yml --limit server | 15 min |
| Deploy Services | ansible-playbook deploy-services.yml --limit server | 10 min |
| Migrate Infisical | pg_dump + restore | 10 min |
| DNS Update | Cloudflare Dashboard | 5 min |
| Verification | curl tests + login | 5 min |
| Total | | ~60 min |
