Terraform Workflow & Best Practices

Daily operations and best practices - Complete guide to working with Terraform safely (planned implementation).

Planned Architecture

This documentation describes the planned Terraform workflow. Not yet implemented.

Table of Contents

The Golden Rules
Daily Workflow
Always terraform plan Before apply
Reading terraform plan Output
terraform destroy Requires Extreme Caution
Git Workflow
Common Mistakes
Emergency Procedures

The Golden Rules

Rule #1: Always Plan Before Apply

# ❌ NEVER do this:
terraform apply -auto-approve

# ✅ ALWAYS do this:
terraform plan  # Review output carefully
terraform apply # Confirm after reviewing

Why: The 30 seconds "saved" can cost hours fixing mistakes.

Rule #2: Read EVERY Line of Plan Output

terraform plan

# READ:
# - What's being ADDED (+)
# - What's being CHANGED (~)
# - What's being DESTROYED (-)
# - Resource counts at the end

Why: Catch mistakes before they destroy infrastructure.
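The resource counts at the end of the plan can also be checked mechanically, as a supplement to reading the plan, not a replacement for it. The following is a minimal sketch, assuming the plan text was produced with `-no-color` and ends with the usual `Plan: X to add, Y to change, Z to destroy.` line. The function name `plan_destroy_count` is made up for this example.

```shell
# Hypothetical helper: print how many resources a plan would destroy.
# Reads human-readable `terraform plan -no-color` output on stdin.
# Prints 0 when no "Plan:" summary line is present (e.g. "No changes."),
# so it must not be used to detect a failed plan.
plan_destroy_count() {
  local count
  count="$(sed -n 's/^Plan: .*[^0-9]\([0-9][0-9]*\) to destroy\..*$/\1/p' | tail -n 1)"
  echo "${count:-0}"
}
```

One possible use is to save the plan text with `terraform plan -no-color > plan.txt`, read it, and then run `plan_destroy_count < plan.txt` as a last sanity check before applying.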

Rule #3: Backup Before Destructive Operations

# Before major changes
terraform state pull > backup-$(date +%Y%m%d-%H%M%S).json

# Then proceed
terraform apply

Why: Recovering from state corruption requires a backup.
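A backup is only useful if it was actually written. The sketch below wraps `terraform state pull` so that a failed or empty pull does not leave a misleading file behind. The function name `backup_state` and the `.partial` suffix are assumptions for illustration.

```shell
# Hypothetical helper: pull the current state into a timestamped file
# and report failure if the pull fails or returns nothing.
# Usage: backup_state [directory]   (directory defaults to ".")
backup_state() {
  local dir file
  dir="${1:-.}"
  file="$dir/backup-$(date +%Y%m%d-%H%M%S).json"
  # Write to a temporary name first so an existing backup is never truncated.
  if ! terraform state pull > "$file.partial" || [ ! -s "$file.partial" ]; then
    rm -f "$file.partial"
    echo "state pull failed or returned nothing; no backup written" >&2
    return 1
  fi
  mv "$file.partial" "$file"
  echo "$file"
}
```

Chaining it as `backup_state && terraform apply` stops before the apply when no backup could be written.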

Rule #4: Never Edit State Manually

# ❌ DON'T:
vim terraform.tfstate

# ✅ DO:
terraform state mv <old> <new>
terraform state rm <resource>
terraform import <resource> <id>

Why: Manual edits cause state corruption.

Rule #5: Commit Code, Not State

# .gitignore should have:
*.tfstate
*.tfstate.*
*.tfvars

Why: State files contain secrets in plain text, and committed state causes conflicts when several people work on the same infrastructure.
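Ignore rules do not affect files that were committed before the rule existed. As a rough check, the sketch below filters a list of paths down to those matching the three patterns above. The function name `tracked_sensitive_files` is hypothetical.

```shell
# Hypothetical helper: filter a list of paths (one per line on stdin)
# down to those matching the ignore patterns above:
# *.tfstate, *.tfstate.*, and *.tfvars.
tracked_sensitive_files() {
  grep -E '(^|/)[^/]*\.(tfstate(\.[^/]*)?|tfvars)$' || true
}
```

Fed with `git ls-files | tracked_sensitive_files`, any output means a state or variables file is already tracked and needs to be removed from the index.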

Daily Workflow

Standard Change Workflow

Step 1: Update Code

# Edit .tf files
vim hetzner-vps.tf

# Example change:
# server_type = "cpx42"  # Was: cpx31

Step 2: Format

# Format Terraform files
terraform fmt

# Check formatting
terraform fmt -check

Step 3: Validate

# Validate configuration syntax
terraform validate

# Expected:
# Success! The configuration is valid.

Step 4: Plan

# Generate execution plan
terraform plan

# Review output carefully
# Check: What resources change?
# Check: Cost impact?
# Check: Downtime required?
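Steps 2 to 4 can be chained so that a later step never runs after an earlier one has failed. This is only a sketch; the wrapper name `preflight` is an assumption, and the plan it prints still has to be read by a person.

```shell
# Hypothetical wrapper for Steps 2-4: stop at the first failing check.
preflight() {
  terraform fmt -check || { echo "fmt: files need formatting" >&2; return 1; }
  terraform validate   || { echo "validate: configuration invalid" >&2; return 1; }
  terraform plan       || { echo "plan: failed" >&2; return 1; }
}
```

Running `preflight` from the directory that holds the .tf files prints the plan only when formatting and validation pass.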

Step 5: Apply

# Apply changes
terraform apply

# Read plan again (shown in apply)
# Type: yes
# Wait for completion

Step 6: Verify

# Check infrastructure
ssh kavi@100.80.53.55

# Verify changes applied correctly
# Example: Check new server type
grep -c "model name" /proc/cpuinfo  # Should print 8 (CPX42)

Step 7: Commit

git add hetzner-vps.tf
git commit -m "feat(vps): upgrade to CPX42 for increased performance"
git push

Always terraform plan Before apply

Why Plan Is Critical

Example scenario:

# You THINK you're changing:
resource "hcloud_server" "hetzner_vps" {
  server_type = "cpx42"  # Upgrading
}

# But you ACCIDENTALLY changed (example values):
resource "hcloud_server" "hetzner_vps" {
  image = "ubuntu-24.04"  # Changing the image DESTROYS AND RECREATES!
}

Without plan:

terraform apply -auto-approve
# Server destroyed and recreated
# All data on the server's disk LOST
# Downtime: 5+ minutes
# Recovery: Hours

With plan:

terraform plan
# Output (abridged):
#   # hcloud_server.hetzner_vps must be replaced
#   -/+ resource "hcloud_server" "hetzner_vps" {
#         ~ image = "ubuntu-22.04" -> "ubuntu-24.04" # forces replacement
#       }
#
# ⚠️ "must be replaced" means DESTROY and RECREATE!

# You see the replacement before anything happens.
# terraform plan changes nothing, so fix the mistake and plan again.

Saving Plans

For review or automation:

# Save plan to file
terraform plan -out=tfplan

# Review plan file (if needed)
terraform show tfplan

# Apply saved plan (no confirmation needed)
terraform apply tfplan

# Clean up
rm tfplan
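For automation it helps to know whether a saved plan contains any changes at all. With the `-detailed-exitcode` flag, `terraform plan` exits 0 for an empty plan, 1 on error, and 2 when changes are pending. The sketch below maps those codes to a status line; the function name and the status strings are made up for this example.

```shell
# Hypothetical helper: save a plan and report whether it contains changes.
# The plan itself is still printed; the status is the last line of output.
plan_and_report() {
  local rc
  rc=0
  terraform plan -detailed-exitcode -out=tfplan || rc=$?
  case "$rc" in
    0) echo "no-changes" ;;
    2) echo "changes-pending" ;;
    *) echo "plan-failed"; return 1 ;;
  esac
}
```

A script can then skip the review and apply steps when the last line reads `no-changes`.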

Reading terraform plan Output

Symbols

Symbol	Meaning	Action
`+`	Create	New resource will be created
`-`	Destroy	Resource will be destroyed
`~`	Update	Resource will be updated in-place
`-/+`	Replace	Destroy then create (forces replacement)
`<=`	Read	Data source will be read

Example Plan Output

Terraform will perform the following actions:

  # hcloud_server.hetzner_vps will be updated in-place
  ~ resource "hcloud_server" "hetzner_vps" {
      ~ server_type     = "cpx31" -> "cpx42"
        name            = "hetzner-vps"
        # (10 unchanged attributes hidden)
    }

  # cloudflare_record.api will be created
  + resource "cloudflare_record" "api" {
      + hostname = (known after apply)
      + id       = (known after apply)
      + name     = "api"
      + proxied  = false
      + ttl      = 3600
      + type     = "A"
      + value    = "46.224.146.107"
      + zone_id  = "abc123"
    }

Plan: 1 to add, 1 to change, 0 to destroy.

Read:
- ✅ ~ Hetzner VPS will be updated (CPX31 → CPX42)
- ✅ + New DNS record api.kua.cl will be created
- ✅ Total: 1 add, 1 change, 0 destroy

Confirm: Changes match intention

Warning Signs

🚨 Resource will be DESTROYED:

-/+ hcloud_server.hetzner_vps must be replaced

Questions to ask:
- Why is replacement needed?
- Will this cause downtime?
- Is data backed up?
- Is this intentional?

If unexpected: Cancel (Ctrl+C), investigate
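In a long plan the destroy and replace headers are easy to scroll past. The sketch below pulls out just those header lines, assuming plan text produced with `-no-color` and the header wording shown in the examples above. The function name `risky_changes` is hypothetical.

```shell
# Hypothetical helper: list resources a plan would destroy or replace.
# Reads `terraform plan -no-color` output on stdin and prints the
# matching "# <address> ..." header lines without the leading "# ".
risky_changes() {
  grep -E '^[[:space:]]*# .+ (will be destroyed|must be replaced)' \
    | sed -E 's/^[[:space:]]*# //'
}
```

Empty output means no header matched; it does not prove the plan is safe, so the full plan still needs to be read.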

terraform destroy Requires Extreme Caution

The Danger

One command destroys EVERYTHING:

terraform destroy
# Destroys:
# - Hetzner VPS (server DELETED)
# - DNS records (site OFFLINE)
# - SSH keys (access LOST)
# - Firewall rules (security GONE)

# Confirmation:
# yes

# Everything GONE

Before Running destroy

Checklist:
1. ✅ Verify workspace: terraform workspace show
2. ✅ Verify backend: Check terraform.tf (prod vs staging?)
3. ✅ Check cloud console: Am I in the right account?
4. ✅ Backup state: terraform state pull > backup.json
5. ✅ Read plan output: terraform plan -destroy (or the preview terraform destroy prints before asking for confirmation) shows what will be deleted
6. ✅ Confirm intention: Do I really want to delete EVERYTHING?
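The first checklist item can be turned into a guard that refuses to continue in the wrong workspace. This is a minimal sketch; the function name `confirm_workspace` is an assumption, and it covers only the workspace check, not the backend or account checks.

```shell
# Hypothetical guard: succeed only if the current workspace matches
# the name passed as the first argument.
confirm_workspace() {
  local expected actual
  expected="$1"
  actual="$(terraform workspace show)" || return 1
  if [ "$actual" != "$expected" ]; then
    echo "Refusing: workspace is '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

Used as `confirm_workspace staging && terraform plan -destroy`, the destroy plan is only produced when the workspace name matches.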

Safer Alternatives

Destroy specific resource:

# Instead of destroying everything with a bare:
#   terraform destroy

# Plan the targeted destroy first:
terraform plan -destroy -target=cloudflare_record.test_api

# Then destroy only that resource:
terraform destroy -target=cloudflare_record.test_api

Use prevent_destroy:

# Prevent accidental destruction
resource "hcloud_server" "hetzner_vps" {
  # ...

  lifecycle {
    prevent_destroy = true
  }
}

Result:

terraform destroy
# Error: Instance cannot be destroyed
#
# Resource hcloud_server.hetzner_vps has lifecycle.prevent_destroy

Git Workflow

Branch Strategy

Feature branch workflow:

# Create feature branch
git checkout -b terraform/upgrade-vps-cpx42

# Make changes
vim hetzner-vps.tf

# Test with plan
terraform plan

# Commit
git add hetzner-vps.tf
git commit -m "feat(vps): upgrade to CPX42"

# Push
git push origin terraform/upgrade-vps-cpx42

# Create PR (GitHub/GitLab)
# Review terraform plan output in PR
# Merge after approval
# Apply from main branch

Commit Message Format

Use conventional commits:

type(scope): description

# Types:
feat     - New infrastructure resource
fix      - Fix configuration issue
refactor - Restructure without changing infrastructure
docs     - Documentation only
chore    - Maintenance (update provider versions)

# Examples:
feat(vps): add Hetzner VPS server
feat(dns): add api.kua.cl DNS record
fix(firewall): allow port 8080 for Infisical
refactor(vps): move SSH keys to separate file
chore(deps): update hcloud provider to 1.45.1
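The format above can be checked before a commit is accepted, for example from a commit-msg hook. The sketch below validates only the first line of a message; the function name and the characters allowed in the scope are assumptions, since the format above does not define them.

```shell
# Hypothetical check: succeed only if the first line of the message
# matches "type(scope): description" with one of the types listed above.
# Assumption: scopes consist of lowercase letters, digits, ".", "_" and "-".
valid_commit_subject() {
  printf '%s\n' "$1" | head -n 1 \
    | grep -Eq '^(feat|fix|refactor|docs|chore)\([a-z0-9._-]+\): .+'
}
```

For instance, `valid_commit_subject "feat(dns): add api.kua.cl DNS record"` succeeds, while a subject without a type and scope fails.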

Common Mistakes

Mistake #1: Not Reading Plan Output

Scenario:

# Developer runs:
terraform apply
# Quickly types: yes
# Without reading output

# Result: Accidentally destroyed production server

Fix: ALWAYS read plan output line-by-line

Mistake #2: Working in Wrong Directory

Scenario:

# Developer thinks they're in staging/
cd ~/Coding/terraform/production  # Actually in production!

terraform destroy
# Destroys PRODUCTION (oops!)

Fix:

# Always verify location
pwd
# /Users/kavi/Coding/terraform/production

# Check backend
grep "key" terraform.tf
# key = "prod/terraform.tfstate"  # ← Confirms production!
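A plain `grep "key"` also matches lines such as `access_key`, so the check is easy to misread. The sketch below extracts only the backend `key` value and compares it with an expected prefix. Both function names are hypothetical, and the sketch assumes the key is written as a quoted literal on a single line.

```shell
# Hypothetical helper: print the backend "key" value from a Terraform file.
# Matches only lines of the form `key = "..."`, so access_key and
# secret_key lines are not picked up.
backend_key() {
  sed -n 's/^[[:space:]]*key[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' \
    "${1:-terraform.tf}" | head -n 1
}

# Succeed only if the backend key starts with the expected prefix.
# Usage: assert_backend_prefix staging/ [file]
assert_backend_prefix() {
  local key
  key="$(backend_key "${2:-terraform.tf}")"
  case "$key" in
    "$1"*) return 0 ;;
    *) echo "Backend key is '$key', expected prefix '$1'" >&2; return 1 ;;
  esac
}
```

Running `assert_backend_prefix staging/ && terraform plan -destroy` would stop with an error in the production directory shown above.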

Mistake #3: Concurrent Runs

Scenario:

# Terminal A
terraform apply &

# Terminal B (simultaneously)
terraform apply

# Result: State corruption

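Fix: Use a backend that supports state locking, so Terraform refuses a second run while the first one holds the lock. If the backend cannot lock, coordinate runs manually.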
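Where backend locking is not available, a local lock can at least prevent two runs from the same machine. The sketch below uses a directory as the lock because `mkdir` either creates it or fails. The function name `with_lock` and the exit code 75 are assumptions, and this does nothing to protect against runs from other machines.

```shell
# Hypothetical local mutex: run a command only if the lock directory
# can be created, and remove the lock afterwards.
# Usage: with_lock <lockdir> <command> [args...]
with_lock() {
  local lockdir rc
  lockdir="$1"
  shift
  if ! mkdir "$lockdir" 2>/dev/null; then
    echo "Another run holds $lockdir" >&2
    return 75
  fi
  rc=0
  "$@" || rc=$?
  rmdir "$lockdir"
  return "$rc"
}
```

A call such as `with_lock .terraform-run.lock terraform apply` fails fast when a second terminal tries the same thing. With a locking backend, `terraform apply -lock-timeout=60s` waits for the lock instead of failing immediately.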

Mistake #4: Ignoring Warnings

Scenario:

terraform plan
# Warning: Argument is deprecated
#   on hetzner-vps.tf line 10:
#   ...

# Developer ignores warning
terraform apply

Fix: Address warnings before applying

Mistake #5: Not Backing Up Before Major Changes

Scenario:

# Major VPS upgrade
terraform apply
# Something goes wrong, state corrupted
# No backup = can't recover

Fix: terraform state pull > backup.json before major changes

Emergency Procedures

State Corruption

Symptoms:
- Terraform shows unexpected plan
- Resources "drift" constantly
- Errors about mismatched state

Recovery:

# 1. Stop all terraform operations
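# 2. Restore from backup (use the file name of your own backup)
#    If Terraform rejects the push because the backup has an older serial
#    than the current state, double-check the file and re-run with -force.
terraform state push backup-20250115.json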

# 3. Verify
terraform plan
# Should show "No changes"

# 4. If still broken, rebuild state via imports
terraform import hcloud_server.hetzner_vps 12345678
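During an incident it is easy to push the wrong backup. The sketch below picks the newest backup by name, assuming every file follows the `backup-YYYYMMDD-HHMMSS.json` pattern from Rule #3 so that names sort chronologically. The function name `latest_backup` is hypothetical.

```shell
# Hypothetical helper: print the newest backup-*.json file in a directory.
# Relies on the timestamped names sorting chronologically.
# Usage: latest_backup [directory]   (directory defaults to ".")
latest_backup() {
  local dir newest
  dir="${1:-.}"
  newest="$(ls "$dir"/backup-*.json 2>/dev/null | sort | tail -n 1)"
  [ -n "$newest" ] || { echo "no backup found in $dir" >&2; return 1; }
  echo "$newest"
}
```

The printed path can be inspected first and then passed to `terraform state push`.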

Lost State File

If using remote backend:

# Reinitialize (removes local working files only; the remote state is untouched)
rm -rf .terraform terraform.tfstate*
terraform init \
  -backend-config="access_key=$STORAGEBOX_ACCESS_KEY" \
  -backend-config="secret_key=$STORAGEBOX_SECRET_KEY"

# Terraform now reads the state stored in the Storage Box again

If local state only:

# Restore from backup
cp terraform.tfstate.backup terraform.tfstate

# Or rebuild via imports (tedious)

Accidentally Destroyed Resource

If caught immediately:

# Recreate with Terraform
terraform apply
# Recreates destroyed resource

If state is out of sync:

# Import existing resource
terraform import hcloud_server.hetzner_vps <server-id>

Summary

Terraform Workflow Best Practices:
- ✅ ALWAYS terraform plan before apply
- ✅ READ every line of plan output
- ✅ BACKUP state before major changes
- ✅ VERIFY workspace/backend before destroy
- ✅ COMMIT code to Git, never state
- ✅ FORMAT with terraform fmt
- ✅ VALIDATE with terraform validate

Common Mistakes to Avoid:
- ❌ terraform apply -auto-approve (never use)
- ❌ Editing state file manually
- ❌ Running terraform destroy without checklist
- ❌ Concurrent terraform runs
- ❌ Committing *.tfstate to Git

Emergency Quick Reference:
- State corruption: Restore from backup
- Lost state: Reinitialize from remote backend
- Destroyed resource: Recreate with terraform apply

What's Next:
- Review State Management: critical state concepts
- Review VPS Management: provision infrastructure
- Review DNS Management: manage DNS records