Hybrid Storage Architecture¶
Local NVMe vs Storage Box performance - Understanding where to store what for optimal performance.
Critical Performance Insight
Storage Box is 30-60x slower than local NVMe. Running databases on network storage will destroy performance. This guide explains the correct hybrid architecture.
Table of Contents¶
- The Performance Problem
- Storage Performance Comparison
- Current Architecture Analysis
- The Correct Hybrid Pattern
- Database Performance Impact
- Backup Strategy
- Disaster Recovery
The Performance Problem¶
The Question¶
User's Question (from Gemini conversation):
"I have a Storage Box. Can I run my PostgreSQL database on it?"
The Answer¶
Gemini's Response:
"Don't run the DB live on Storage Box. It will kill performance."
Why This Matters¶
Your Hetzner infrastructure has two storage types:
- Local NVMe (VPS disk)
  - 📊 Speed: ~3000 MB/s
  - ⚡ Latency: <0.1ms
  - 💾 Size: 240GB (CPX42)
  - 🔄 Ephemeral: Lost if VPS destroyed
- Storage Box (network mount)
  - 📊 Speed: ~50-100 MB/s
  - ⚡ Latency: Network latency (~1-5ms)
  - 💾 Size: 1TB+
  - 🔄 Persistent: Survives VPS destruction

Performance Ratio: NVMe is 30-60x faster.
The Mistake¶
Common misconception:
"Storage Box is persistent, so I'll store everything there for safety."
Result:
- Database on Storage Box = slow queries
- Immich photo loading = 5-10 second delays
- App becomes unusable

The reality:
- Databases need fast random I/O → local NVMe
- Static files can tolerate slower sequential reads → Storage Box
- Hybrid approach = best of both worlds
Storage Performance Comparison¶
Benchmarks¶
| Operation | Local NVMe | Storage Box | Performance Ratio |
|---|---|---|---|
| Sequential Read | ~3000 MB/s | ~100 MB/s | 30x faster |
| Sequential Write | ~2800 MB/s | ~80 MB/s | 35x faster |
| Random Read (4K) | ~600 MB/s | ~10 MB/s | 60x faster |
| Random Write (4K) | ~500 MB/s | ~8 MB/s | 62x faster |
| Latency | <0.1ms | 1-5ms | 10-50x faster |
What This Means¶
Database workload (random 4K I/O):
- NVMe: 600 MB/s
- Storage Box: 10 MB/s
- Result: Database 60x slower on Storage Box

Photo loading (sequential read):
- NVMe: 3000 MB/s
- Storage Box: 100 MB/s
- Result: Photos 30x slower (but still acceptable)
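The figures above are approximate, so it can be worth measuring both paths yourself. The following is a minimal sketch using `fio`, assuming it is installed on the VPS. The function name `bench_random_read`, the 256 MB test size, and the example directories are illustrative choices, not part of the original setup.

```shell
# Sketch: 4K random-read benchmark for one directory, using fio.
# $1 = directory to test
# $2 = 1 to bypass the page cache with O_DIRECT (default), 0 to allow caching.
#      Some network or FUSE mounts reject O_DIRECT; pass 0 for those.
bench_random_read() {
  bench_dir="$1"
  bench_direct="${2:-1}"
  fio --name=randread-4k \
      --directory="$bench_dir" \
      --rw=randread --bs=4k --size=256m \
      --ioengine=libaio --iodepth=16 --direct="$bench_direct" \
      --runtime=30 --time_based --group_reporting \
      --unlink=1   # remove the test file when the job finishes
}

# Example invocations (paths are assumptions; adjust them to your host):
#   bench_random_read /var/tmp            # local NVMe
#   bench_random_read /mnt/storagebox 0   # Storage Box mount
```

Compare the reported IOPS and bandwidth of the two runs. With `--direct=0` the page cache can inflate the result, so treat that number as an upper bound.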
Real-World Impact¶
PostgreSQL query on NVMe: ~10ms
Same query on Storage Box: ~600ms
User experience:
- 10ms = instant
- 600ms = noticeable delay, feels sluggish
Why Network Storage Is Slow¶
Every database operation:
1. Application → VPS local disk: ~0.1ms
2. VPS → Storage Box (network): ~1-5ms
3. Storage Box processes request: ~1-2ms
4. Storage Box → VPS (network): ~1-5ms
5. VPS → Application: ~0.1ms

Total: ~3-12ms per operation

PostgreSQL workload:
- Thousands of operations per query
- Each operation waits for a network round-trip
- Result: Queries time out, app becomes unusable
Test Results (From Your Infrastructure)¶
From Gemini analysis:
# Immich database on local NVMe
docker exec immich-postgres pgbench -c 10 -j 2 -t 1000 immich
# Results:
# TPS: 1200 transactions/second
# Latency: 8ms average
# If moved to Storage Box (estimate, not measured):
# TPS: ~20 transactions/second (60x slower)
# Latency: 500ms average (app timeouts)
Current Architecture Analysis¶
Your Existing Setup (CORRECT)¶
From SSH inspection:
Mounts:
{
  "Mounts": [
    {
      "Type": "volume",
      "Source": "immich_pgdata",
      "Destination": "/var/lib/postgresql/data",
      "Driver": "local",
      "RW": true
    },
    {
      "Type": "bind",
      "Source": "/mnt/storagebox/immich/upload",
      "Destination": "/usr/src/app/upload",
      "RW": true
    }
  ]
}
Analysis:
| Mount | Storage | Correct? | Why |
|---|---|---|---|
| Database (immich_pgdata) | Local NVMe | ✅ YES | High IOPS, fast random access |
| Photos (/mnt/storagebox/immich/upload) | Storage Box | ✅ YES | Sequential reads, large files |

This is PERFECT:
- ✅ Database on fast NVMe (queries in milliseconds)
- ✅ Photos on Storage Box (slow is acceptable for large files)
How You Avoided the Mistake¶
Your docker-compose.yml (implicit):
services:
  immich-postgres:
    volumes:
      - immich_pgdata:/var/lib/postgresql/data # ✅ Local volume
  immich-server:
    volumes:
      - /mnt/storagebox/immich/upload:/usr/src/app/upload # ✅ Network mount
volumes:
  immich_pgdata:
    driver: local # ✅ Creates on VPS disk (/var/lib/docker/volumes/)
Why this works:
- driver: local → Docker creates volume on VPS NVMe
- /mnt/storagebox/ → Explicit network mount path
- Result: Database fast, photos persistent
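To confirm on a running host where each container's data actually lives, something like the sketch below may help. It assumes the Docker CLI is available and that the Storage Box is mounted at `/mnt/storagebox` as described above. The helper names `is_on_storagebox` and `report_container_mounts` are made up for this example.

```shell
# Returns 0 if the given host path lives under the Storage Box mount.
is_on_storagebox() {
  case "$1" in
    /mnt/storagebox|/mnt/storagebox/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints one line per mount of a container: "<destination> <source> <verdict>".
# $1 = container name, for example immich-postgres
report_container_mounts() {
  docker inspect \
    --format '{{range .Mounts}}{{println .Destination .Source}}{{end}}' "$1" |
  while read -r dest src; do
    [ -n "$dest" ] || continue   # skip the trailing blank line
    if is_on_storagebox "$src"; then
      echo "$dest $src storagebox"
    else
      echo "$dest $src local"
    fi
  done
}

# Example:
#   report_container_mounts immich-postgres
#   report_container_mounts immich-server
```

A database data directory such as `/var/lib/postgresql/data` should be reported as `local`; only upload or backup paths should be reported as `storagebox`.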
Other Services (Verification)¶
Coolify:
Infisical:
Redis:
All correct - databases on NVMe, only static files on Storage Box.
The Correct Hybrid Pattern¶
Architecture Diagram¶
┌────────────────────────────────────────────────────────────┐
│ HETZNER VPS (CPX42) │
│ │
│ ┌───────────────────────────────────────────────────────┐│
│ │ LOCAL NVMe DISK (240GB, Ephemeral) ││
│ │ ││
│ │ /var/lib/docker/volumes/ ││
│ │ ├─ immich_pgdata/ (PostgreSQL) ││
│ │ ├─ coolify_data/ (PostgreSQL) ││
│ │ ├─ infisical_mongo/ (MongoDB) ││
│ │ ├─ redis_data/ (Redis) ││
│ │ └─ app_cache/ (Temporary files) ││
│ │ ││
│ │ Performance: ~3000 MB/s, <0.1ms latency ││
│ │ Use Case: Databases, caches, temp files ││
│ │ ││
│ │ 🔄 Nightly Backup Script (3 AM cron): ││
│ │ pg_dump immich → /mnt/storagebox/backups/ ││
│ │ pg_dump coolify → /mnt/storagebox/backups/ ││
│ │ mongodump infisical → /mnt/storagebox/backups/ ││
│ └───────────────────────────────────────────────────────┘│
│ │
│ ┌───────────────────────────────────────────────────────┐│
│ │ RCLONE MOUNT (to Storage Box) ││
│ │ /mnt/storagebox/ → WebDAV/CIFS ││
│ └───────────────────────────────────────────────────────┘│
└────────────────────────────────────────────────────────────┘
│
│ Network (HTTPS/WebDAV)
│
┌────────────────────────────────────────────────────────────┐
│ HETZNER STORAGE BOX (1TB, Persistent) │
│ │
│ /immich/ │
│ └─ upload/ │
│ ├─ 2024/ │
│ │ ├─ 01/ │
│ │ │ ├─ IMG_1234.jpg (5MB) │
│ │ │ ├─ IMG_1235.jpg (4MB) │
│ │ │ └─ ... (5GB+ total) │
│ └─ ... │
│ │
│ /backups/ │
│ ├─ immich_db_20250115.sql.gz │
│ ├─ immich_db_20250116.sql.gz │
│ ├─ coolify_db_20250115.sql.gz │
│ ├─ infisical_env_20250115.gpg (ENCRYPTION_KEY) │
│ └─ ... (7 days retention) │
│ │
│ /terraform-state/ │
│ └─ prod/ │
│ └─ terraform.tfstate │
│ │
│ Performance: ~50-100 MB/s, network latency │
│ Use Case: Photos, videos, backups, Terraform state │
└────────────────────────────────────────────────────────────┘
Decision Matrix¶
Where to store what:
| Data Type | Storage Location | Why |
|---|---|---|
| PostgreSQL database | Local NVMe | High IOPS, random access, <100ms query requirement |
| MongoDB database | Local NVMe | Same as PostgreSQL |
| Redis cache | Local NVMe | Extreme speed requirement (microseconds) |
| Photos (Immich) | Storage Box | Large sequential files, slow acceptable |
| Videos | Storage Box | Large sequential files |
| Database backups | Storage Box | Persistence, disaster recovery |
| Terraform state | Storage Box | Persistence, multi-device access |
| Docker images | Local NVMe | Fast container startup |
| Application cache | Local NVMe | Frequently accessed, speed critical |
| Logs (old) | Storage Box | Archive, rarely accessed |
Configuration Pattern¶
docker-compose.yml (correct pattern):
version: '3.8'
services:
  # Database: LOCAL volume (fast)
  postgres:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data # ✅ Local NVMe
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
  # Application: MIXED volumes
  immich-server:
    image: ghcr.io/immich-app/immich-server:release
    volumes:
      - /mnt/storagebox/immich/upload:/usr/src/app/upload # ✅ Network (photos)
      - app_cache:/tmp # ✅ Local NVMe (cache)
    depends_on:
      - postgres
volumes:
  pgdata:
    driver: local # ✅ Creates on VPS disk
  app_cache:
    driver: local # ✅ Creates on VPS disk
Key Points:
- Named volumes (pgdata, app_cache) → local NVMe
- Explicit paths (/mnt/storagebox/) → network mount
- Never mix: Don't put database data on /mnt/storagebox/
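The "never mix" rule can be checked automatically before deploying. The sketch below is a simple text scan rather than a real YAML parser. It flags any bind mount that maps a path under `/mnt/storagebox` onto a default database data directory of the official PostgreSQL, MySQL, MongoDB, or Redis images. The function name `lint_compose_storage` is hypothetical.

```shell
# Fails (returns 1) if a compose file binds a Storage Box path onto a
# well-known database data directory. Text-based check, not a YAML parser.
# $1 = path to docker-compose.yml
lint_compose_storage() {
  # Default data directories: PostgreSQL, MySQL, MongoDB (/data/db), Redis (/data)
  lint_pattern='/mnt/storagebox[^:[:space:]]*:(/var/lib/postgresql|/var/lib/mysql|/data/db|/data)([/:[:space:]]|$)'
  if grep -nE "$lint_pattern" "$1"; then
    echo "ERROR: database data directory is bound to the Storage Box mount" >&2
    return 1
  fi
  return 0
}

# Example:
#   lint_compose_storage docker-compose.yml && docker-compose up -d
```

Because it is a plain `grep`, commented-out lines are also flagged. That is usually acceptable for a pre-deploy guard.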
Database Performance Impact¶
PostgreSQL Workload Analysis¶
Typical query:
SELECT a.*, u.name
FROM assets a
JOIN users u ON a.user_id = u.id
WHERE a.created_at > '2024-01-01'
ORDER BY a.created_at DESC
LIMIT 100;
Operations:
1. Index scan on created_at (random reads)
2. Row fetches from table (random reads)
3. Join with users table (random reads)
4. Sort buffer operations (random writes)
5. Result set construction
I/O Pattern: ~500 random 4K reads + 50 random writes
On NVMe:
- 500 reads @ 0.1ms each = 50ms
- 50 writes @ 0.1ms each = 5ms
- Total query time: ~60ms ✅

On Storage Box:
- 500 reads @ 2ms each = 1000ms
- 50 writes @ 3ms each = 150ms
- Total query time: ~1200ms ❌ (20x slower)
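The estimate is simply operation count times per-operation latency, summed over reads and writes. A small helper can rerun it with your own measured latencies. The function name `estimate_io_ms` is illustrative, and the result covers I/O wait only, which is why the totals quoted in the text are slightly higher.

```shell
# Estimate I/O wait in milliseconds for a query.
# $1 = number of reads,  $2 = latency per read (ms)
# $3 = number of writes, $4 = latency per write (ms)
estimate_io_ms() {
  awk -v r="$1" -v rl="$2" -v w="$3" -v wl="$4" \
    'BEGIN { printf "%.0f\n", r * rl + w * wl }'
}

estimate_io_ms 500 0.1 50 0.1   # NVMe figures from the text        → 55
estimate_io_ms 500 2 50 3       # Storage Box figures from the text → 1150
```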
User Impact: App becomes unusable (timeouts, slowness).
Immich-Specific Impact¶
Immich timeline query (loading photos):
-- Fetch 200 photos for timeline
SELECT * FROM assets
WHERE user_id = 123
AND type = 'IMAGE'
ORDER BY created_at DESC
LIMIT 200;
With the database on NVMe:
- Database query: 15ms
- Fetch photo URLs from DB: 5ms
- Load photos from Storage Box: 100-200ms (acceptable, sequential reads)
- Total: ~220ms ✅ Feels instant

With the database on Storage Box:
- Database query: 900ms (60x slower)
- Fetch photo URLs from DB: 300ms
- Load photos from Storage Box: 100-200ms
- Total: ~1400ms ❌ 1.4 second delay, feels slow
Result: Database on NVMe = fast app, Storage Box for photos = persistent.
Redis Performance (Extreme Example)¶
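Redis is an in-memory cache that needs microsecond latency. It serves reads from RAM, so the storage penalty applies to operations that wait on disk, such as writes when AOF persistence runs with appendfsync always.

On NVMe:
- Disk-bound operation: 0.05ms (50 microseconds)
- 20,000 ops/second per client

On Storage Box:
- Disk-bound operation: 2ms (2000 microseconds)
- 500 ops/second per client
- Result: 40x slower, cache becomes bottleneck

Redis on network storage = defeats the purpose of caching.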
Backup Strategy¶
Nightly Database Dumps¶
Purpose: Persistence while keeping databases on fast NVMe.
Cron job (/etc/cron.d/database-backups):
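The cron file's contents are not shown here. A minimal sketch consistent with the 3 AM schedule and the script path used in this guide could be installed as follows. The log file path and the helper name `write_backup_cron` are assumptions for illustration.

```shell
# Writes a cron.d entry that runs the backup script daily at 03:00 as root.
# $1 = target file (defaults to /etc/cron.d/database-backups)
write_backup_cron() {
  cron_target="${1:-/etc/cron.d/database-backups}"
  cat > "$cron_target" <<'EOF'
# Run the database backup script every day at 03:00 as root.
# The log path is an assumption; change it to suit your logging setup.
0 3 * * * root /opt/scripts/backup-databases.sh >> /var/log/database-backups.log 2>&1
EOF
  chmod 644 "$cron_target"
}

# Example (as root):
#   write_backup_cron
```

Entries under `/etc/cron.d` need the user field (`root` here), unlike per-user crontabs.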
Backup script (/opt/scripts/backup-databases.sh):
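#!/bin/bash
# Exit on errors, unset variables, and failures anywhere in a pipeline
# (without pipefail, a failed pg_dump piped into gzip would go unnoticed)
set -euo pipefail
BACKUP_DIR="/mnt/storagebox/backups"
DATE=$(date +%Y%m%d)
# Assumption: file holding the GPG passphrase, readable by root only
PASSPHRASE_FILE="/root/.backup-passphrase"
# Refuse to run if the Storage Box is not mounted (dumps would land on local disk)
mountpoint -q /mnt/storagebox || { echo "Storage Box not mounted" >&2; exit 1; }
# Immich database
docker exec immich-postgres pg_dump -U postgres immich | gzip > "${BACKUP_DIR}/immich_db_${DATE}.sql.gz"
# Coolify database
docker exec coolify-db pg_dump -U postgres coolify | gzip > "${BACKUP_DIR}/coolify_db_${DATE}.sql.gz"
# Infisical database: stream the archive to stdout, because a path given to
# --archive would be written inside the container, not on the host
docker exec infisical-mongo mongodump --archive --gzip > "${BACKUP_DIR}/infisical_db_${DATE}.archive"
# Infisical ENCRYPTION_KEY (critical); non-interactive so it works under cron
gpg --batch --yes --pinentry-mode loopback --passphrase-file "${PASSPHRASE_FILE}" \
  --symmetric --cipher-algo AES256 \
  --output "${BACKUP_DIR}/infisical_env_${DATE}.gpg" /root/infisical/.env
# Cleanup old backups (keep 7 days)
find "${BACKUP_DIR}" -name "*.sql.gz" -mtime +7 -delete
find "${BACKUP_DIR}" -name "*.archive" -mtime +7 -delete
find "${BACKUP_DIR}" -name "*.gpg" -mtime +30 -delete # Keep ENCRYPTION_KEY longer
echo "Backup completed: $(date)"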
Verify backups:
# Check backup exists
ls -lh /mnt/storagebox/backups/immich_db_$(date +%Y%m%d).sql.gz
# Verify file size (should be >10MB for Immich)
du -h /mnt/storagebox/backups/immich_db_$(date +%Y%m%d).sql.gz
# Verify gzip integrity
gzip -t /mnt/storagebox/backups/immich_db_$(date +%Y%m%d).sql.gz
# Inspect the dump (on local machine); a sanity check, not a full restore test
gunzip -c immich_db_20250115.sql.gz | head -100
# Should show SQL statements
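The manual checks above can be folded into one function that the backup script or a monitoring job calls. This is a sketch under the assumption that dumps are gzip-compressed. The name `verify_backup` and the byte threshold in the example are illustrative.

```shell
# Verifies that a backup exists, is a valid gzip stream, and is not suspiciously small.
# $1 = backup file, $2 = minimum size in bytes (default 1)
verify_backup() {
  vb_file="$1"
  vb_min="${2:-1}"
  [ -f "$vb_file" ] || { echo "missing: $vb_file" >&2; return 1; }
  gzip -t "$vb_file" 2>/dev/null || { echo "corrupt gzip: $vb_file" >&2; return 1; }
  vb_size=$(($(wc -c < "$vb_file")))
  [ "$vb_size" -ge "$vb_min" ] || { echo "too small ($vb_size bytes): $vb_file" >&2; return 1; }
  echo "ok: $vb_file ($vb_size bytes)"
}

# Example: require at least 10 MB for the Immich dump
#   verify_backup "/mnt/storagebox/backups/immich_db_$(date +%Y%m%d).sql.gz" 10485760
```

A passing check only shows that the file is intact. Restoring the dump into a scratch database from time to time is still the only real test.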
Retention Policy¶
| Backup Type | Frequency | Retention | Storage Location |
|---|---|---|---|
| Database dumps | Daily (3 AM) | 7 days | Storage Box /backups/ |
| ENCRYPTION_KEY | Daily | 30 days | Storage Box /backups/ (GPG encrypted) |
| Terraform state | Real-time | Unlimited | Storage Box /terraform-state/ |
| Photos/videos | N/A (already on Storage Box) | Unlimited | Storage Box /immich/upload/ |
Backup Storage Cost¶
Storage Box: 1TB for ~$4/month
Current usage:
- Photos: ~5GB (Immich)
- Database backups: ~2GB (7 days × 300MB/day)
- Terraform state: ~1MB
- Total: ~7GB (0.7% of 1TB)
Room for growth: Can store years of backups before hitting limits.
Disaster Recovery¶
Scenario: VPS Destroyed¶
Data loss window: Maximum 24 hours (since last backup).
Recovery Procedure¶
Step 1: Terraform recreates VPS (10 minutes):
Step 2: cloud-init restores from backups (automatic, 15 minutes):
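#cloud-config
runcmd:
  # Mount Storage Box (assumes the rclone remote "storagebox" is already configured)
  - mkdir -p /mnt/storagebox
  - rclone mount storagebox:/ /mnt/storagebox --daemon
  # Start the database containers first so their named volumes are created on local NVMe
  # (assumes the compose service names match the container names used by the backup script)
  - docker-compose up -d immich-postgres coolify-db
  - until docker exec immich-postgres pg_isready -U postgres; do sleep 2; done
  - until docker exec coolify-db pg_isready -U postgres; do sleep 2; done
  # Restore the newest dump of each database; a dump with today's date may not exist yet.
  # The target databases must already exist (for example via POSTGRES_DB).
  - gunzip -c "$(ls /mnt/storagebox/backups/immich_db_*.sql.gz | tail -n 1)" | docker exec -i immich-postgres psql -U postgres -d immich
  - gunzip -c "$(ls /mnt/storagebox/backups/coolify_db_*.sql.gz | tail -n 1)" | docker exec -i coolify-db psql -U postgres -d coolify
  # Restore Infisical ENCRYPTION_KEY (assumes the passphrase file has been provisioned at this path)
  - mkdir -p /root/infisical
  - gpg --batch --pinentry-mode loopback --passphrase-file /root/.backup-passphrase --decrypt "$(ls /mnt/storagebox/backups/infisical_env_*.gpg | tail -n 1)" > /root/infisical/.env
  # Start remaining services
  - docker-compose up -d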
Step 3: Verify restoration (5 minutes):
# Check database row counts
docker exec immich-postgres psql -U postgres -d immich -c "SELECT COUNT(*) FROM assets;"
# Check photos accessible
ls /mnt/storagebox/immich/upload/ | head -20
# Check services running
docker ps
Total recovery time: ~30 minutes (infrastructure + data restore).
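A restore that runs before the 3 AM backup will not find a dump stamped with today's date, so it is safer to pick the newest dump that exists. The sketch below relies on the `YYYYMMDD` suffix sorting lexicographically. The function name `latest_backup` is illustrative.

```shell
# Prints the path of the newest dump for a given prefix, or nothing if none exist.
# $1 = backup directory, $2 = file prefix (for example immich_db)
latest_backup() {
  ls "$1"/"$2"_*.sql.gz 2>/dev/null | sort | tail -n 1
}

# Example:
#   dump="$(latest_backup /mnt/storagebox/backups immich_db)"
#   [ -n "$dump" ] && gunzip -c "$dump" | docker exec -i immich-postgres psql -U postgres -d immich
```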
What Is Preserved¶
| Data | Preserved? | How |
|---|---|---|
| Photos | ✅ YES (100%) | Already on Storage Box |
| Database data | ✅ YES (~99%) | Restored from nightly backup |
| Infisical secrets | ✅ YES (100%) | ENCRYPTION_KEY backed up |
| Terraform state | ✅ YES (100%) | Remote backend on Storage Box |
| Configuration | ✅ YES (100%) | docker-compose.yml in Git |
| Docker images | ✅ YES (100%) | Pulled from registry |
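Data loss: Maximum 24 hours of database changes (since last backup). Photo files uploaded in that window remain on Storage Box, but their database records are not in the restored dump.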
The "Disposable Server" Concept¶
Before hybrid architecture:
- VPS destroyed = PERMANENT DATA LOSS
- Recovery: Manual rebuild, hours/days

After hybrid architecture:
- VPS destroyed = Recreate via Terraform
- Recovery: Automated, 30 minutes
- Data loss: <24 hours (nightly backups)

Mindset shift:
- Server is ephemeral (replaceable)
- Data is persistent (on Storage Box)
- Infrastructure is code (Terraform)

Benefits:
- No fear of VPS failures
- Easy to test infrastructure changes
- Can rebuild from scratch anytime
Summary¶
Key Principles¶
- Databases on local NVMe (fast, ephemeral)
- Static files on Storage Box (slow, persistent)
- Nightly backups (persistence for databases)
- cloud-init restores (automated disaster recovery)
Performance Rules¶
| Workload | Storage | Why |
|---|---|---|
| High IOPS (databases, caches) | Local NVMe | <0.1ms latency required |
| Sequential reads (photos, videos) | Storage Box | 100ms latency acceptable |
| Persistence (backups, state) | Storage Box | Survives VPS destruction |
Current Architecture Status¶
- ✅ Your setup is CORRECT
- ✅ Databases on local NVMe (fast)
- ✅ Photos on Storage Box (persistent)
- ✅ Nightly backups (disaster recovery)
No changes needed - you already avoided the common mistakes.
For Terraform Migration¶
When creating new VPS via Terraform:
- cloud-init must:
  - Mount Storage Box FIRST
  - Create Docker volumes as local (not on /mnt/storagebox/)
  - Restore databases from backups
  - Start services
- docker-compose.yml must:
  - Use named volumes for databases (→ local NVMe)
  - Use /mnt/storagebox/ paths for photos (→ network mount)
- Backup script must:
  - Run daily at 3 AM
  - Dump databases to Storage Box
  - Verify backups succeeded
Result: New VPS will replicate current (correct) architecture.
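"Mount Storage Box FIRST" is an ordering requirement. If services start before the mount is up, bind mounts under `/mnt/storagebox` will point at an empty local directory. One way to enforce the order is to block until the path shows up in `/proc/mounts`. This sketch is Linux-only, and the function names `is_mounted` and `wait_for_mount` are illustrative.

```shell
# Returns 0 if the exact path is listed as a mount point in /proc/mounts.
is_mounted() {
  awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' /proc/mounts
}

# Waits until a path is mounted or the timeout expires.
# $1 = mount point, $2 = timeout in seconds (default 60)
wait_for_mount() {
  wm_path="$1"
  wm_left="${2:-60}"
  while ! is_mounted "$wm_path"; do
    [ "$wm_left" -gt 0 ] || return 1
    sleep 1
    wm_left=$((wm_left - 1))
  done
  return 0
}

# Example: only start services once the Storage Box is available
#   wait_for_mount /mnt/storagebox 120 && docker-compose up -d
```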
What's Next¶
- See Migration Guide for safe Terraform migration
- See Backup Procedures for detailed backup setup
- See Disaster Recovery for recovery procedures