Storage Architecture
Documentation of all storage pools, datasets, shares, and capacity planning across the homelab.
Overview
Storage Distribution
| Location | Type | Capacity | Purpose |
|---|---|---|---|
| PVE | NVMe + SSD mirrors | ~9 TB usable | VM storage, fast IO |
| PVE2 | NVMe + HDD mirrors | ~6+ TB usable | VM storage, bulk data |
| TrueNAS | ZFS pool + EMC enclosure | ~12+ TB usable | Central file storage, NFS/SMB |
PVE (10.10.10.120) Storage Pools
nvme-mirror1 (Primary Fast Storage)
- Type: ZFS mirror
- Devices: 2x Sabrent Rocket Q NVMe
- Capacity: 3.6 TB usable
- Purpose: High-performance VM storage
- Used By:
- Critical VMs requiring fast IO
- Database workloads
- Development environments
Check status:
ssh pve 'zpool status nvme-mirror1'
ssh pve 'zpool list nvme-mirror1'
nvme-mirror2 (Secondary Fast Storage)
- Type: ZFS mirror
- Devices: 2x Kingston SFYRD 2TB NVMe
- Capacity: 1.8 TB usable
- Purpose: Additional fast VM storage
- Used By: TBD
Check status:
ssh pve 'zpool status nvme-mirror2'
ssh pve 'zpool list nvme-mirror2'
rpool (Root Pool)
- Type: ZFS mirror
- Devices: 2x Samsung 870 QVO 4TB SSD
- Capacity: 3.6 TB usable
- Purpose: Proxmox OS, container storage, VM backups
- Used By:
- Proxmox root filesystem
- LXC containers
- Local VM backups
Check status:
ssh pve 'zpool status rpool'
ssh pve 'df -h /var/lib/vz'
Storage Pool Usage Summary (PVE)
Get current usage:
ssh pve 'zpool list'
ssh pve 'pvesm status'
PVE2 (10.10.10.102) Storage Pools
nvme-mirror3 (Fast Storage)
- Type: ZFS mirror
- Devices: 2x NVMe (model unknown)
- Capacity: Unknown (needs investigation)
- Purpose: High-performance VM storage
- Used By: Trading VM (301), other VMs
Check status:
ssh pve2 'zpool status nvme-mirror3'
ssh pve2 'zpool list nvme-mirror3'
local-zfs2 (Bulk Storage)
- Type: ZFS mirror
- Devices: 2x WD Red 6TB HDD
- Capacity: ~6 TB usable
- Purpose: Bulk/archival storage
- Power Management: 30-minute spindown configured
  - Saves ~10-16W when idle
  - Udev rule: /etc/udev/rules.d/69-hdd-spindown.rules
  - Command: hdparm -S 241 (30-minute standby timeout)
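The rule file itself is not reproduced here; a minimal sketch of what /etc/udev/rules.d/69-hdd-spindown.rules likely contains is shown below. The match keys (KERNEL, ATTRS{model}) are assumptions and would need to be adjusted to the actual WD Red device names:
# /etc/udev/rules.d/69-hdd-spindown.rules (sketch; match rule is hypothetical)
# Apply a 30-minute standby timeout (-S 241) to the HDDs when they appear
ACTION=="add|change", KERNEL=="sd[a-z]", ATTRS{model}=="WDC*", RUN+="/usr/sbin/hdparm -S 241 /dev/%k"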
Notes:
- Pool had only 768 KB used as of 2024-12-16
- Drives configured to spin down after 30 min idle
- Good for archival, NOT for active workloads
Check status:
ssh pve2 'zpool status local-zfs2'
ssh pve2 'zpool list local-zfs2'
# Check if drives are spun down
ssh pve2 'hdparm -C /dev/sdX' # Shows active/standby
TrueNAS (VM 100 @ 10.10.10.200) - Central Storage
ZFS Pool: vault
Primary storage pool for all shared data.
Devices: ❓ Needs investigation
- EMC storage enclosure with multiple drives
- SAS connection via LSI SAS2308 HBA (passed through to VM)
Capacity: ❓ Needs investigation
Check pool status:
ssh truenas 'zpool status vault'
ssh truenas 'zpool list vault'
# Get detailed capacity
ssh truenas 'zfs list -o name,used,avail,refer,mountpoint'
Datasets (Known)
Based on Syncthing configuration, likely datasets:
| Dataset | Purpose | Synced Devices | Notes |
|---|---|---|---|
| vault/documents | Personal documents | Mac Mini, MacBook, Windows PC, Phone | ~11 GB |
| vault/downloads | Downloads folder | Mac Mini, TrueNAS | ~38 GB |
| vault/pictures | Photos | Mac Mini, MacBook, Phone | Unknown size |
| vault/notes | Note files | Mac Mini, MacBook, Phone | Unknown size |
| vault/desktop | Desktop sync | Unknown | 7.2 GB |
| vault/movies | Movie library | Unknown | Unknown size |
| vault/config | Config files | Mac Mini, MacBook | Unknown size |
Get complete dataset list:
ssh truenas 'zfs list -r vault'
NFS/SMB Shares
Status: ❓ Not documented
Needs investigation:
# List NFS exports
ssh truenas 'showmount -e localhost'
# List SMB shares
ssh truenas 'smbclient -L localhost -N'
# Via TrueNAS API/UI
# Sharing → Unix Shares (NFS)
# Sharing → Windows Shares (SMB)
Expected shares:
- Media libraries for Plex (on Saltbox VM)
- Document storage
- VM backups?
- ISO storage?
EMC Storage Enclosure
- Model: EMC KTN-STL4 (or similar)
- Connection: SAS via LSI SAS2308 HBA (passthrough to TrueNAS VM)
- Drives: ❓ Unknown count and capacity
See EMC-ENCLOSURE.md for:
- SES commands
- Fan control
- LCC (Link Control Card) troubleshooting
- Maintenance procedures
Check enclosure status:
ssh truenas 'sg_ses --page=0x02 /dev/sgX' # Enclosure status page
ssh truenas 'smartctl --scan' # List all drives
Storage Network Architecture
Internal Storage Network (10.10.20.0/24 ❓)
Purpose: Dedicated network for NFS/iSCSI traffic to reduce congestion on the main network.
- Bridge: vmbr3 on PVE (virtual bridge, no physical NIC)
- Subnet: 10.10.20.0/24 (❓ verify)
- DHCP: No
- Gateway: No (internal only, no internet access)
Connected VMs:
- TrueNAS VM (secondary NIC)
- Saltbox VM (secondary NIC) - for NFS mounts
- Other VMs needing storage access
Configuration:
# On TrueNAS VM - check second NIC
ssh truenas 'ip addr show enp6s19'
# On Saltbox - check NFS mounts
ssh saltbox 'mount | grep nfs'
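To put another VM on the storage bridge, a secondary virtio NIC can be added on vmbr3. A sketch; the VMID is a placeholder:
# Attach a second NIC on vmbr3 to a VM (replace <vmid> with the actual VM ID)
ssh pve 'qm set <vmid> --net1 virtio,bridge=vmbr3'
# Then assign a static address on the storage subnet inside the guest (no DHCP on this network)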
Benefits:
- Separates storage traffic from general network
- Prevents NFS/SMB from saturating main network
- Better performance for storage-heavy workloads
Storage Capacity Planning
Current Usage (Estimate)
Needs actual audit:
# PVE pools
ssh pve 'zpool list -o name,size,alloc,free'
# PVE2 pools
ssh pve2 'zpool list -o name,size,alloc,free'
# TrueNAS vault pool
ssh truenas 'zpool list vault'
# Get detailed breakdown
ssh truenas 'zfs list -r vault -o name,used,avail'
Growth Rate
Needs tracking - recommend monthly snapshots of capacity:
#!/bin/bash
# Save as ~/bin/storage-capacity-report.sh
DATE=$(date +%Y-%m-%d)
REPORT=~/Backups/storage-reports/capacity-$DATE.txt
mkdir -p ~/Backups/storage-reports
echo "Storage Capacity Report - $DATE" > $REPORT
echo "================================" >> $REPORT
echo "" >> $REPORT
echo "PVE Pools:" >> $REPORT
ssh pve 'zpool list' >> $REPORT
echo "" >> $REPORT
echo "PVE2 Pools:" >> $REPORT
ssh pve2 'zpool list' >> $REPORT
echo "" >> $REPORT
echo "TrueNAS Pools:" >> $REPORT
ssh truenas 'zpool list' >> $REPORT
echo "" >> $REPORT
echo "TrueNAS Datasets:" >> $REPORT
ssh truenas 'zfs list -r vault -o name,used,avail' >> $REPORT
echo "Report saved to $REPORT"
Run monthly via cron:
0 9 1 * * ~/bin/storage-capacity-report.sh
Expansion Planning
When to expand:
- Pool reaches 80% capacity
- Performance degrades
- New workloads require more space
Expansion options:
- Add drives to existing pools (for mirrored pools, add another mirror vdev; see the sketch after this list)
- Add new NVMe drives to PVE/PVE2
- Expand EMC enclosure (add more drives)
- Add second EMC enclosure
Cost estimates: TBD
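A sketch of the first option, adding a mirror vdev to an existing pool. The pool name and device paths are placeholders; use /dev/disk/by-id names for the real devices:
# Grow a mirrored pool by adding a second mirror vdev (placeholders)
ssh pve 'zpool add nvme-mirror2 mirror /dev/disk/by-id/<new-nvme-1> /dev/disk/by-id/<new-nvme-2>'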
ZFS Health Monitoring
Daily Health Checks
# Check for errors on all pools
ssh pve 'zpool status -x' # Shows only unhealthy pools
ssh pve2 'zpool status -x'
ssh truenas 'zpool status -x'
# Check scrub status
ssh pve 'zpool status | grep scrub'
ssh pve2 'zpool status | grep scrub'
ssh truenas 'zpool status | grep scrub'
Scrub Schedule
Recommended: Monthly scrub on all pools
Configure scrub:
# Via Proxmox UI: Node → Disks → ZFS → Select pool → Scrub
# Or via cron:
# On PVE (repeat on PVE2 for nvme-mirror3 and local-zfs2)
0 2 1 * * /sbin/zpool scrub nvme-mirror1
0 2 1 * * /sbin/zpool scrub nvme-mirror2
0 2 1 * * /sbin/zpool scrub rpool
On TrueNAS:
- Configure via UI: Storage → Pools → Scrub Tasks
- Recommended: 1st of every month at 2 AM
SMART Monitoring
Check drive health:
# PVE
ssh pve 'smartctl -a /dev/nvme0'
ssh pve 'smartctl -a /dev/sda'
# TrueNAS
ssh truenas 'smartctl --scan'
ssh truenas 'smartctl -a /dev/sdX' # For each drive
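To sweep every detected drive in one pass, a sketch that only combines the commands already shown above:
# Print overall SMART health for each drive smartctl can see
ssh truenas 'for d in $(smartctl --scan | cut -d" " -f1); do echo "== $d =="; smartctl -H "$d"; done'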
Configure SMART tests:
- TrueNAS UI: Tasks → S.M.A.R.T. Tests
- Recommended: Weekly short test, monthly long test
Alerts
Set up email alerts for:
- ZFS pool errors
- SMART test failures
- Pool capacity > 80%
- Scrub failures
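Until proper alerting is in place, a minimal cron-able sketch for the capacity check (host aliases match those used elsewhere in this doc; wiring the output to email or a notifier is left out):
#!/bin/bash
# Warn when any ZFS pool crosses 80% capacity (sketch)
THRESHOLD=80
for host in pve pve2 truenas; do
  ssh "$host" 'zpool list -H -o name,capacity' | while read -r pool cap; do
    pct=${cap%\%}
    if [ "$pct" -ge "$THRESHOLD" ]; then
      echo "WARNING: $host/$pool at $cap used"
    fi
  done
done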
Storage Performance Tuning
ZFS ARC (Cache)
Check ARC usage:
ssh pve 'arc_summary'
ssh truenas 'arc_summary'
Tuning (if needed):
- PVE/PVE2: Set max ARC in /etc/modprobe.d/zfs.conf
- TrueNAS: Configure via UI (System → Advanced → Tunables)
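For example, capping ARC at 16 GiB on a Proxmox host (the value is in bytes; pick a limit that leaves enough room for VM memory):
# /etc/modprobe.d/zfs.conf: limit ARC to 16 GiB (16 * 1024^3 bytes)
options zfs zfs_arc_max=17179869184
# Rebuild initramfs so the option is picked up at boot
ssh pve 'update-initramfs -u -k all'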
NFS Performance
Mount options (on clients like Saltbox):
rsize=131072,wsize=131072,hard,timeo=600,retrans=2,vers=3
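As an illustration, an /etc/fstab entry on the client using those options. The share path and mount point are hypothetical; over the dedicated storage network the server address would be the TrueNAS IP on vmbr3 rather than 10.10.10.200:
# /etc/fstab sketch on the NFS client (paths are placeholders)
10.10.10.200:/mnt/vault/movies  /mnt/movies  nfs  rsize=131072,wsize=131072,hard,timeo=600,retrans=2,vers=3  0  0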
Verify NFS mounts:
ssh saltbox 'mount | grep nfs'
Record Size Optimization
Different workloads need different record sizes:
- VMs: 64K (good for VM disks; note the ZFS dataset default recordsize is 128K)
- Databases: 8K or 16K
- Media files: 1M (large sequential reads)
Set record size (on TrueNAS datasets):
ssh truenas 'zfs set recordsize=1M vault/movies'
Disaster Recovery
Pool Recovery
If a pool fails to import:
# Force import under a different name (-N skips mounting datasets)
zpool import -f -N poolname newpoolname
# Check pool with readonly
zpool import -f -o readonly=on poolname
# Force import (last resort)
zpool import -f -F poolname
Drive Replacement
When a drive fails:
# Identify failed drive
zpool status poolname
# Replace drive
zpool replace poolname old-device new-device
# Monitor resilver
watch zpool status poolname
Data Recovery
If pool is completely lost:
- Restore from offsite backup (see BACKUP-STRATEGY.md)
- Recreate pool structure
- Restore data
Critical: This is why we need offsite backups!
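If replicated snapshots exist offsite, the receive side would look roughly like this. Host, pool, and snapshot names are all hypothetical; BACKUP-STRATEGY.md defines the real backup layout:
# Restore a replicated dataset tree from an offsite snapshot (placeholders throughout)
ssh <backup-host> 'zfs send -R <backup-pool>/vault@<snapshot>' | ssh truenas 'zfs receive -F vault'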
Quick Reference
Common Commands
# Pool status
zpool status [poolname]
zpool list
# Dataset usage
zfs list
zfs list -r vault
# Check pool health (only unhealthy)
zpool status -x
# Scrub pool
zpool scrub poolname
# Get pool IO stats
zpool iostat -v 1
# Snapshot management
zfs snapshot poolname/dataset@snapname
zfs list -t snapshot
zfs rollback poolname/dataset@snapname
zfs destroy poolname/dataset@snapname
Storage Locations by Use Case
| Use Case | Recommended Storage | Why |
|---|---|---|
| VM OS disk | nvme-mirror1 (PVE) | Fastest IO |
| Database | nvme-mirror1/2 | Low latency |
| Media files | TrueNAS vault | Large capacity |
| Development | nvme-mirror2 | Fast, mid-tier |
| Containers | rpool | Good performance |
| Backups | TrueNAS or rpool | Large capacity |
| Archive | local-zfs2 (PVE2) | Cheap, can spin down |
Investigation Needed
- Get complete TrueNAS dataset list
- Document NFS/SMB share configuration
- Inventory EMC enclosure drives (count, capacity, model)
- Document current pool usage percentages
- Set up monthly capacity reports
- Configure ZFS scrub schedules
- Set up storage health alerts
Related Documentation
- BACKUP-STRATEGY.md - Backup and snapshot strategy
- EMC-ENCLOSURE.md - Storage enclosure maintenance
- VMS.md - VM storage assignments
- NETWORK.md - Storage network configuration
Last Updated: 2025-12-22