Terraform Infrastructure as Code

Why Terraform for this infrastructure - Complete infrastructure management through declarative configuration.


What is Terraform

Terraform is an open-source Infrastructure as Code (IaC) tool that lets you define and provision infrastructure using declarative configuration files.

Key Concept: You describe what you want (the desired state), and Terraform works out how to create it.

Traditional Approach (Manual)

# Create VPS manually via Hetzner console
1. Log in to console.hetzner.cloud
2. Click "Add Server"
3. Select CPX42, Ubuntu 22.04, FSN1
4. Add SSH keys manually
5. Create server
6. Wait for IP address
7. Update DNS records manually in Cloudflare
8. Configure firewall rules manually
9. Document IP address somewhere

# Result: No record of what you did, hard to replicate

Terraform Approach (Infrastructure as Code)

# vps.tf
resource "hcloud_server" "hetzner_vps" {
  name        = "hetzner-vps"
  server_type = "cpx42"
  image       = "ubuntu-22.04"
  location    = "fsn1"
  ssh_keys    = [hcloud_ssh_key.macbook.id]
}

# dns.tf
resource "cloudflare_record" "root" {
  zone_id = var.cloudflare_zone_id
  name    = "@"
  value   = hcloud_server.hetzner_vps.ipv4_address  # Automatic!
  type    = "A"
}

terraform apply
# Creates VPS, updates DNS automatically, everything in code

Result: Infrastructure documented as code, version controlled, reproducible.
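
The two resources above reference `hcloud_ssh_key.macbook` and `var.cloudflare_zone_id` without defining them. A minimal sketch of the supporting configuration might look like the following. The file name, the version constraints, and the variable names `hcloud_token`, `cloudflare_api_token`, and `macbook_public_key` are assumptions for illustration. The Cloudflare provider is pinned to 4.x here because `cloudflare_record` with a `value` argument is 4.x syntax; at the time of writing, version 5 of the provider replaces it with `cloudflare_dns_record`.

```hcl
# providers.tf (hypothetical file name)
terraform {
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = "~> 1.45" # assumed constraint; pin to a version you have tested
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0" # assumed; keeps the cloudflare_record resource name available
    }
  }
}

# API tokens are declared as sensitive variables, never hard-coded
variable "hcloud_token" {
  type      = string
  sensitive = true
}

variable "cloudflare_api_token" {
  type      = string
  sensitive = true
}

variable "cloudflare_zone_id" {
  type = string
}

variable "macbook_public_key" {
  type        = string
  description = "Contents of the MacBook's public key file"
}

provider "hcloud" {
  token = var.hcloud_token
}

provider "cloudflare" {
  api_token = var.cloudflare_api_token
}

# The SSH key that vps.tf refers to as hcloud_ssh_key.macbook
resource "hcloud_ssh_key" "macbook" {
  name       = "macbook"
  public_key = var.macbook_public_key
}
```

Variable values can be supplied through environment variables named `TF_VAR_<name>` (for example `TF_VAR_hcloud_token`), which keeps tokens out of the repository.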


Why Terraform for This Infrastructure

Your Infrastructure Profile

You have:

- ✅ 12+ resources across 3 cloud providers (Hetzner, Cloudflare, Storage Box)
- ✅ Multiple devices managing infrastructure (MacBook, iPad, PC)
- ✅ Existing production data that must be preserved
- ✅ Critical services requiring disaster recovery (Immich, Infisical, Coolify)
- ✅ Growing infrastructure (might add more servers in future)

This is a PERFECT fit for Terraform.

The Multi-Device Challenge

Problem: You manage infrastructure from 3 devices:

MacBook (home)
  ↓ Creates VPS, adds DNS

iPad (traveling)
  ↓ Needs to update firewall rule
  ↓ Question: What DNS records exist?
  ↓ Question: Which SSH keys are installed?

PC (office)
  ↓ Needs to add new service
  ↓ Question: What's the current VPS configuration?

Without Terraform:

- ❌ No shared state (each device has different knowledge)
- ❌ Manual documentation (becomes outdated)
- ❌ Risk of conflicts (iPad doesn't know MacBook made changes)
- ❌ No audit trail (who changed what when?)

With Terraform:

- ✅ Shared remote state (all devices see same infrastructure)
- ✅ Self-documenting (code IS the documentation)
- ✅ State locking (prevents concurrent changes)
- ✅ Git history (complete audit trail)
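
Shared state and locking come from the backend configuration; without one, Terraform keeps state in a local file on whichever device ran it. The sketch below assumes an S3-compatible bucket, since this page lists S3 storage as the home of the Terraform state. The bucket name, object key, and endpoint are placeholders, and the arguments shown assume a recent Terraform release (`use_lockfile` needs 1.10 or later). Whether lockfile-based locking works also depends on the storage provider supporting conditional writes, so treat this as a starting point to verify rather than a tested setup.

```hcl
# backend.tf (hypothetical file name)
terraform {
  backend "s3" {
    bucket = "example-terraform-state"          # placeholder bucket name
    key    = "infrastructure/terraform.tfstate" # placeholder object key
    region = "us-east-1"                        # required by the backend; S3-compatible stores often ignore it

    endpoints = {
      s3 = "https://s3.example.invalid" # placeholder endpoint
    }

    # Commonly needed for non-AWS, S3-compatible storage
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    use_path_style              = true

    # State locking via a lock object stored next to the state file
    use_lockfile = true
  }
}
```

Each device then runs `terraform init` once and supplies the bucket credentials through the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables.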

The Resource Complexity

Your infrastructure (manually managed):

| Provider | Resources | Interdependencies |
|---|---|---|
| Hetzner | VPS, Storage Box, SSH keys, Firewall | VPS IP → DNS records |
| Cloudflare | 5+ DNS records, Zone settings | DNS → VPS IP |
| Storage | S3 buckets, Terraform state | State → All resources |

Total: 12+ interdependent resources.

Example dependency chain:

SSH keys → VPS creation → VPS IP → DNS records → Service access

Manual management:

1. Create SSH keys in Hetzner console
2. Note down key IDs
3. Create VPS with those key IDs (type manually)
4. Wait for VPS to get IP
5. Copy IP address
6. Log into Cloudflare
7. Update DNS record with new IP
8. Hope you didn't make a typo

Terraform manages this automatically:

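# Terraform handles the dependency chain through references
# In hcloud_server.hetzner_vps:
ssh_keys = [hcloud_ssh_key.macbook.id]  # Reference, not a hand-typed ID
# In cloudflare_record.root:
value = hcloud_server.hetzner_vps.ipv4_address  # IP read from the server resource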


What Terraform Manages (vs Doesn't)

✅ Terraform SHOULD Manage

Infrastructure resources (long-lived, infrequent changes):

| Resource Type | Why Terraform | Example |
|---|---|---|
| VPS instances | Reproducible, disaster recovery | Hetzner CPX42 server |
| DNS records | Automatic IP updates, version control | kua.cl → VPS IP |
| SSH public keys | Multi-device access control, audit trail | MacBook, iPad, PC keys |
| Firewall rules | Security policy as code | Allow 443, 80, SSH |
| S3 buckets | Permissions as code | kua-images bucket |
| Networks | Infrastructure foundation | VPC, subnets (if applicable) |

Benefits:

- 🎯 Disaster recovery: terraform apply rebuilds the infrastructure layer (data still comes from backups)
- 🔍 Audit trail: Git log shows all changes
- 🔄 Consistency: Same config across all devices
- 📝 Documentation: Code IS the documentation
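
As an illustration of "security policy as code", the firewall row in the table above (allow 443, 80, SSH) could be expressed roughly as follows. The resource name `web`, the use of port 22 for SSH, and the open source ranges are assumptions; in practice SSH would usually be restricted to known addresses.

```hcl
locals {
  # Ports taken from the table above; 22 is assumed for SSH
  inbound_tcp_ports = ["22", "80", "443"]
}

resource "hcloud_firewall" "web" {
  name = "web"

  # Generates one inbound TCP rule per port in the list
  dynamic "rule" {
    for_each = local.inbound_tcp_ports
    content {
      direction  = "in"
      protocol   = "tcp"
      port       = rule.value
      source_ips = ["0.0.0.0/0", "::/0"] # open to all IPv4 and IPv6 sources
    }
  }
}
```

The firewall only takes effect once it is attached to a server, for example with `firewall_ids = [hcloud_firewall.web.id]` inside the `hcloud_server` resource.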

❌ Terraform SHOULD NOT Manage

Dynamic resources (change frequently, application-level):

| Resource Type | Why NOT Terraform | Better Tool |
|---|---|---|
| Docker containers | Too dynamic, change hourly/daily | docker-compose |
| Application secrets | Wrong tool, security model doesn't fit | Infisical |
| SSH private keys | Per-device, not infrastructure | Manual + secure backup |
| Application code | Not infrastructure | Git |
| Database data | Data, not infrastructure | Backups |
| Existing production servers | Import risk, data loss potential | Manual management → migrate |

Why docker-compose for containers:

# docker-compose.yml - Application layer (changes daily)
services:
  immich:
    image: ghcr.io/immich-app/immich-server:release  # Updates weekly
    restart: unless-stopped
    environment:
      DB_PASSWORD: ${DB_PASSWORD}  # From Infisical, changes rarely

Why Infisical for secrets:

# Infisical - Secret management (rotate frequently)
infisical secrets set ANTHROPIC_API_KEY=sk-...
# Secrets change monthly, environment-specific, NOT infrastructure

Why Terraform for infrastructure:

# terraform - Infrastructure layer (changes monthly/yearly)
resource "hcloud_server" "hetzner_vps" {
  server_type = "cpx42"  # Upgrade once per year
  location    = "fsn1"   # Never changes
}

The Stack:

┌─────────────────────────────────┐
│   Application (docker-compose)  │ ← Changes daily
├─────────────────────────────────┤
│   Secrets (Infisical)           │ ← Changes weekly
├─────────────────────────────────┤
│   Infrastructure (Terraform)    │ ← Changes monthly
└─────────────────────────────────┘


Benefits for Your Use Case

1. Multi-Device Infrastructure Management

Before Terraform:

MacBook creates VPS
  → Writes down IP in notes.txt
iPad needs to add DNS record
  → Can't find IP, SSHs to VPS to check
PC needs to know SSH key IDs
  → Logs into Hetzner console, writes down manually

After Terraform:

# On any device (MacBook, iPad, PC):
git pull  # Get latest infrastructure code
terraform plan  # See current state
terraform output vps_ip  # Get VPS IP automatically
# All devices see same state via remote backend
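
The `terraform output vps_ip` command above only works if an output with that name is declared. A minimal sketch, assuming the server resource is named `hcloud_server.hetzner_vps` as elsewhere on this page (the file name is hypothetical):

```hcl
# outputs.tf (hypothetical file name)
output "vps_ip" {
  description = "Public IPv4 address of the VPS"
  value       = hcloud_server.hetzner_vps.ipv4_address
}
```

For scripting, `terraform output -raw vps_ip` prints the bare address without quotes.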

2. Disaster Recovery

Scenario: Hetzner VPS destroyed (hardware failure, accidental deletion).

Before Terraform (Manual rebuild):

1. Create new VPS via console (15 min)
2. Find documentation for server type, location (5 min)
3. Add SSH keys manually (10 min)
4. Update DNS records (5 min)
5. Configure firewall rules (10 min)
6. Mount Storage Box (10 min)
7. Restore databases from backups (30 min)
8. Debug issues (30-120 min)

Total: 2-4 hours + potential data loss

After Terraform (Automated rebuild):

1. terraform apply (10 min - creates VPS, DNS, firewall)
2. cloud-init restores from backups (15 min - automatic)
3. Verify services (5 min)

Total: about 30 minutes, with no data loss provided the backups are current

The code:

# Everything defined in code, reproducible
resource "hcloud_server" "hetzner_vps" {
  name        = "hetzner-vps"
  server_type = "cpx42"
  image       = "ubuntu-22.04"
  location    = "fsn1"
  ssh_keys    = [
    hcloud_ssh_key.macbook.id,
    hcloud_ssh_key.ipad.id,
    hcloud_ssh_key.pc.id
  ]
  user_data   = file("cloud-init.yml")  # Automated restore
}

# One command rebuilds everything
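
The three per-device key resources referenced above could also be generated from a single map, so that adding a device becomes a one-line change. This is an alternative sketch rather than the layout used on this page: the variable name `device_public_keys` and the resource label `device` are assumptions, and the resource addresses change to `hcloud_ssh_key.device["macbook"]` and so on.

```hcl
variable "device_public_keys" {
  type        = map(string)
  description = "Device name => public key contents, e.g. keys macbook, ipad, pc"
}

# One hcloud_ssh_key per entry in the map
resource "hcloud_ssh_key" "device" {
  for_each   = var.device_public_keys
  name       = each.key
  public_key = each.value
}

resource "hcloud_server" "hetzner_vps" {
  name        = "hetzner-vps"
  server_type = "cpx42"
  image       = "ubuntu-22.04"
  location    = "fsn1"
  user_data   = file("cloud-init.yml")

  # Collect the ID of every key created above
  ssh_keys = [for key in hcloud_ssh_key.device : key.id]
}
```

Changing the key list on an existing server may lead Terraform to plan a replacement of that server, so check `terraform plan` before applying a key change to a machine that holds data.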

3. Version Control & Audit Trail

Git log shows infrastructure history:

git log --oneline

abc123f (HEAD) feat(vps): upgrade to CPX42 for performance
def456a feat(dns): add api.kua.cl for new service
789abc1 fix(firewall): allow port 8080 for Infisical
456def2 feat(ssh): add iPad SSH key for mobile access
123abc3 feat(vps): initial Hetzner VPS configuration

See exactly what changed:

git diff HEAD~1 vps.tf

- server_type = "cpx31"
+ server_type = "cpx42"

Who made the change:

git blame vps.tf

abc123f (Kavi MacBook 2025-01-10) server_type = "cpx42"

Rollback if needed:

git revert abc123f
terraform apply  # Reverts to CPX31

4. Automatic Dependency Management

Example: VPS IP changes → DNS must update

Manual approach:

1. VPS created with new IP (46.224.XXX.XXX)
2. Log into Cloudflare
3. Find all DNS records pointing to old IP
4. Update each record manually
5. Miss one record → service down

Terraform approach:

# VPS resource
resource "hcloud_server" "hetzner_vps" {
  # ...
}

# DNS automatically uses VPS IP (dynamic reference)
resource "cloudflare_record" "root" {
  value = hcloud_server.hetzner_vps.ipv4_address
}

resource "cloudflare_record" "secrets" {
  value = "kua.cl"  # CNAME to root
}

# terraform apply
# → VPS created with new IP
# → DNS updated automatically
# → All services continue working

Result: Zero manual intervention, zero missed records.
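
The records above are abbreviated. Filled in with the arguments the provider requires, they might look like the sketch below. The subdomain name `secrets` is inferred from the resource label and is an assumption, and `value` follows the 4.x Cloudflare provider syntax used elsewhere on this page.

```hcl
# A record for the zone apex, fed by the server's current address
resource "cloudflare_record" "root" {
  zone_id = var.cloudflare_zone_id
  name    = "@"
  type    = "A"
  value   = hcloud_server.hetzner_vps.ipv4_address
}

# CNAME that follows the apex, so it never needs an IP of its own
resource "cloudflare_record" "secrets" {
  zone_id = var.cloudflare_zone_id
  name    = "secrets" # assumed subdomain
  type    = "CNAME"
  value   = "kua.cl"
}
```

Because only the apex record holds the IP, a rebuilt server changes one record and every CNAME keeps resolving through it.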

5. Infrastructure as Documentation

Before Terraform (scattered documentation):

notes.txt:
  VPS IP: 46.224.146.107 (is this current?)
  SSH keys: Added macbook key on Jan 5 (which ID?)
  DNS: kua.cl points to VPS (which records exist?)

# Documentation becomes outdated

After Terraform (self-documenting):

# vps.tf - SINGLE SOURCE OF TRUTH
resource "hcloud_server" "hetzner_vps" {
  name        = "hetzner-vps"
  server_type = "cpx42"          # Current server type
  location    = "fsn1"           # Current location
  ssh_keys    = [                # Current SSH keys
    hcloud_ssh_key.macbook.id,
    hcloud_ssh_key.ipad.id,
    hcloud_ssh_key.pc.id
  ]
}

# dns.tf - ALL DNS RECORDS
resource "cloudflare_record" "root" { ... }
resource "cloudflare_record" "secrets" { ... }
resource "cloudflare_record" "plex" { ... }
# If it's not in code, it doesn't exist

Query current state:

# What's my VPS IP?
terraform output vps_ip

# What DNS records exist?
terraform state list | grep cloudflare_record

# What SSH keys are installed?
terraform state list | grep hcloud_ssh_key

Documentation stays current as long as every change goes through Terraform (code IS the truth); terraform plan reveals any change made outside it.


Common Concerns Addressed

"I only have one server, is Terraform overkill?"

Short answer: No, you have 12+ resources across 3 providers, not "one server."

Your infrastructure:

- 3 SSH keys (multi-device access)
- 1 VPS instance
- 5+ DNS records
- 1 Storage Box
- 1+ S3 buckets
- Firewall rules
- Network configuration

Total: 12+ interdependent resources.

Terraform value:

- Manages dependencies automatically (IP → DNS)
- Version control for all resources
- Multi-device access (MacBook/iPad/PC)
- Disaster recovery automation

"Won't it be risky to import existing infrastructure?"

Yes, importing existing production IS risky.

That's why the migration guide uses parallel infrastructure:

1. Keep old VPS running (zero risk)
2. Create NEW VPS via Terraform
3. Test on new VPS
4. Switch DNS when verified
5. Destroy old VPS after 1 week

No import required for the VPS (safest approach).

Safe imports:

- DNS records (no data, easy to verify)
- SSH keys (additive, can't break existing)

See: Migration Guide for details.
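
For the low-risk imports listed above, Terraform 1.5 and later supports declarative `import` blocks, which let the import show up in `terraform plan` before anything is written to state. A sketch under those assumptions follows; the IDs are placeholders to be looked up in each provider, and the matching `resource` blocks must already exist in the configuration.

```hcl
# imports.tf (hypothetical file name); requires Terraform 1.5 or later
import {
  to = hcloud_ssh_key.macbook
  id = "REPLACE_WITH_SSH_KEY_ID" # placeholder: the key's numeric ID in Hetzner
}

import {
  to = cloudflare_record.secrets
  id = "REPLACE_WITH_ZONE_ID/REPLACE_WITH_RECORD_ID" # placeholder; format is <zone_id>/<record_id>
}
```

If the resource blocks match what already exists, the plan should list the imports and no other changes. The `import` blocks can be deleted once the apply has succeeded.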

"What if I make a mistake?"

Terraform's safety features:

  1. Plan before apply:

    terraform plan
    # Shows what will change before anything is applied, ending with a summary such as:
    # Plan: 1 to add, 2 to change, 0 to destroy.
    # Check the destroy count before applying ← Important!
    

  2. Git version control:

    git log  # See all changes
    git revert <commit>  # Undo a change
    git diff  # Review before committing
    

  3. State backups:

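    # With local state, Terraform writes terraform.tfstate.backup before each change
    # With a remote backend, take a snapshot yourself:
    terraform state pull > backup.tfstate
    # Restore if needed with: terraform state push backup.tfstate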
    

  4. prevent_destroy lifecycle:

    resource "hcloud_server" "hetzner_vps" {
      lifecycle {
        prevent_destroy = true  # Can't accidentally destroy
      }
    }
    
    # terraform destroy
    # Error: Instance cannot be destroyed
    

  5. Dry-run testing:

    terraform plan -out=plan.tfplan  # Save plan
    # Review carefully
    terraform show plan.tfplan  # Review again
    terraform apply plan.tfplan  # Apply if safe
    

"How much time will this take?"

Initial setup: 4-5 weeks (following migration guide)

Time saved:

- Disaster recovery: 2-4 hours → 30 minutes (75-88% less time)
- Adding DNS record: 5 minutes → 2 minutes (60% less time)
- Finding infrastructure info: 10 minutes → 10 seconds (about 98% less time)
- Multi-device coordination: 30 minutes → 0 (eliminated)

ROI: Break-even after first disaster recovery event.

Long-term: Terraform becomes faster than manual as infrastructure grows.


Summary

Terraform is the right tool for your infrastructure because:

  1. 12+ resources across 3 providers (perfect scale for Terraform)
  2. Multi-device access (MacBook, iPad, PC need shared state)
  3. Disaster recovery requirement (30 min rebuild vs 2-4 hours)
  4. Version control need (audit trail, rollback capability)
  5. Growing infrastructure (will add more resources over time)

What Terraform manages:

- Infrastructure layer: VPS, DNS, SSH keys, firewall
- Not: Applications (docker-compose), Secrets (Infisical), Data (backups)

Migration approach:

- Safe parallel infrastructure (not risky imports)
- 4-week phased approach
- Zero data loss, zero downtime

Time investment:

- Initial: 4-5 weeks (one-time)
- Saved: 2-4 hours per disaster recovery
- ROI: Break-even after first recovery event

Next steps:

1. Review Migration Guide - Safe migration strategy
2. Review Hybrid Storage - Performance architecture
3. Complete Backup Procedures - Pre-migration backups
4. Start Phase 1 (Week 2) - Import DNS and SSH keys

Infrastructure as Code = Infrastructure as Documentation = Infrastructure as Disaster Recovery