# Storage Architecture

Documentation of all storage pools, datasets, shares, and capacity planning across the homelab.

## Overview

### Storage Distribution

| Location | Type | Capacity | Purpose |
|----------|------|----------|---------|
| **PVE** | NVMe + SSD mirrors | ~9 TB usable | VM storage, fast IO |
| **PVE2** | NVMe + HDD mirrors | ~6+ TB usable | VM storage, bulk data |
| **TrueNAS** | ZFS pool + EMC enclosure | ~12+ TB usable | Central file storage, NFS/SMB |

---

## PVE (10.10.10.120) Storage Pools

### nvme-mirror1 (Primary Fast Storage)

- **Type**: ZFS mirror
- **Devices**: 2x Sabrent Rocket Q NVMe
- **Capacity**: 3.6 TB usable
- **Purpose**: High-performance VM storage
- **Used By**:
  - Critical VMs requiring fast IO
  - Database workloads
  - Development environments

**Check status**:
```bash
ssh pve 'zpool status nvme-mirror1'
ssh pve 'zpool list nvme-mirror1'
```

### nvme-mirror2 (Secondary Fast Storage)

- **Type**: ZFS mirror
- **Devices**: 2x Kingston SFYRD 2TB NVMe
- **Capacity**: 1.8 TB usable
- **Purpose**: Additional fast VM storage
- **Used By**: TBD

**Check status**:
```bash
ssh pve 'zpool status nvme-mirror2'
ssh pve 'zpool list nvme-mirror2'
```

### rpool (Root Pool)

- **Type**: ZFS mirror
- **Devices**: 2x Samsung 870 QVO 4TB SSD
- **Capacity**: 3.6 TB usable
- **Purpose**: Proxmox OS, container storage, VM backups
- **Used By**:
  - Proxmox root filesystem
  - LXC containers
  - Local VM backups

**Check status**:
```bash
ssh pve 'zpool status rpool'
ssh pve 'df -h /var/lib/vz'
```

### Storage Pool Usage Summary (PVE)

**Get current usage**:
```bash
ssh pve 'zpool list'
ssh pve 'pvesm status'
```

---

## PVE2 (10.10.10.102) Storage Pools

### nvme-mirror3 (Fast Storage)

- **Type**: ZFS mirror
- **Devices**: 2x NVMe (model unknown)
- **Capacity**: Unknown (needs investigation)
- **Purpose**: High-performance VM storage
- **Used By**: Trading VM (301), other VMs

**Check status**:
```bash
ssh pve2 'zpool status nvme-mirror3'
ssh pve2 'zpool list nvme-mirror3'
```

### local-zfs2 (Bulk Storage)

- **Type**: ZFS mirror
- **Devices**: 2x WD Red 6TB HDD
- **Capacity**: ~6 TB usable
- **Purpose**: Bulk/archival storage
- **Power Management**: 30-minute spindown configured
  - Saves ~10-16W when idle
  - Udev rule: `/etc/udev/rules.d/69-hdd-spindown.rules` (see the sketch after this section)
  - Command: `hdparm -S 241` (30 min)

**Notes**:
- Pool had only 768 KB used as of 2024-12-16
- Drives configured to spin down after 30 min idle
- Good for archival, NOT for active workloads

**Check status**:
```bash
ssh pve2 'zpool status local-zfs2'
ssh pve2 'zpool list local-zfs2'

# Check if drives are spun down
ssh pve2 'hdparm -C /dev/sdX'  # Shows active/standby
```
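**Spindown rule sketch** - the udev rule itself isn't captured here, so this is a minimal sketch of what `/etc/udev/rules.d/69-hdd-spindown.rules` likely contains; the match pattern is an assumption (the real rule may target the WD Reds by serial/ID rather than all rotational disks):

```bash
# /etc/udev/rules.d/69-hdd-spindown.rules (hypothetical contents)
# Apply a 30-minute spindown timer (-S 241) to rotational disks as they appear.
# ATTR{queue/rotational}=="1" matches spinning HDDs only, so NVMe/SSDs are untouched.
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?", ATTR{queue/rotational}=="1", RUN+="/usr/sbin/hdparm -S 241 /dev/%k"
```

Apply without a reboot via `ssh pve2 'udevadm control --reload-rules && udevadm trigger'`.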
---

## TrueNAS (VM 100 @ 10.10.10.200) - Central Storage

### ZFS Pool: vault

**Primary storage pool** for all shared data.

**Devices**: ❓ Needs investigation
- EMC storage enclosure with multiple drives
- SAS connection via LSI SAS2308 HBA (passed through to VM)

**Capacity**: ❓ Needs investigation

**Check pool status**:
```bash
ssh truenas 'zpool status vault'
ssh truenas 'zpool list vault'

# Get detailed capacity
ssh truenas 'zfs list -o name,used,avail,refer,mountpoint'
```

### Datasets (Known)

Based on Syncthing configuration, these datasets likely exist:

| Dataset | Purpose | Synced Devices | Notes |
|---------|---------|----------------|-------|
| vault/documents | Personal documents | Mac Mini, MacBook, Windows PC, Phone | ~11 GB |
| vault/downloads | Downloads folder | Mac Mini, TrueNAS | ~38 GB |
| vault/pictures | Photos | Mac Mini, MacBook, Phone | Unknown size |
| vault/notes | Note files | Mac Mini, MacBook, Phone | Unknown size |
| vault/desktop | Desktop sync | Unknown | 7.2 GB |
| vault/movies | Movie library | Unknown | Unknown size |
| vault/config | Config files | Mac Mini, MacBook | Unknown size |

**Get complete dataset list**:
```bash
ssh truenas 'zfs list -r vault'
```

### NFS/SMB Shares

**Status**: ❓ Not documented

**Needs investigation**:
```bash
# List NFS exports
ssh truenas 'showmount -e localhost'

# List SMB shares
ssh truenas 'smbclient -L localhost -N'

# Via TrueNAS API/UI:
# Sharing → Unix Shares (NFS)
# Sharing → Windows Shares (SMB)
```

**Expected shares**:
- Media libraries for Plex (on Saltbox VM)
- Document storage
- VM backups?
- ISO storage?

### EMC Storage Enclosure

**Model**: EMC KTN-STL4 (or similar)
**Connection**: SAS via LSI SAS2308 HBA (passthrough to TrueNAS VM)
**Drives**: ❓ Unknown count and capacity (see the inventory sketch below)

**See [EMC-ENCLOSURE.md](EMC-ENCLOSURE.md)** for:
- SES commands
- Fan control
- LCC (Link Control Card) troubleshooting
- Maintenance procedures

**Check enclosure status**:
```bash
ssh truenas 'sg_ses --page=0x02 /dev/sgX'  # Enclosure status page
ssh truenas 'smartctl --scan'              # List all drives
```
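**Drive inventory sketch** - to fill in the unknown drive count/capacity/model, a small loop over standard smartmontools commands (`smartctl --scan`, `smartctl -i`); the labels printed differ between SATA and SAS drives, hence the loose case-insensitive grep:

```bash
# Enumerate every drive TrueNAS can see and print identity details.
# SATA drives report "Device Model"; SAS drives report "Vendor"/"Product".
ssh truenas 'for dev in $(smartctl --scan | awk "{print \$1}"); do
  echo "== $dev =="
  smartctl -i "$dev" | grep -Ei "model|product|serial|capacity"
done'
```

Paste the output into the table above once the enclosure has been audited.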
---

## Storage Network Architecture

### Internal Storage Network (10.10.20.0/24)

**Purpose**: Dedicated network for NFS/iSCSI traffic, to reduce congestion on the main network.

**Bridge**: vmbr3 on PVE (virtual bridge, no physical NIC)
**Subnet**: 10.10.20.0/24
**DHCP**: No
**Gateway**: No (internal only, no internet)

**Connected VMs**:
- TrueNAS VM (secondary NIC)
- Saltbox VM (secondary NIC) - for NFS mounts
- Other VMs needing storage access

**Configuration**:
```bash
# On TrueNAS VM - check second NIC
ssh truenas 'ip addr show enp6s19'

# On Saltbox - check NFS mounts
ssh saltbox 'mount | grep nfs'
```

**Benefits**:
- Separates storage traffic from general network
- Prevents NFS/SMB from saturating the main network
- Better performance for storage-heavy workloads

---

## Storage Capacity Planning

### Current Usage (Estimate)

**Needs actual audit**:
```bash
# PVE pools
ssh pve 'zpool list -o name,size,alloc,free'

# PVE2 pools
ssh pve2 'zpool list -o name,size,alloc,free'

# TrueNAS vault pool
ssh truenas 'zpool list vault'

# Get detailed breakdown
ssh truenas 'zfs list -r vault -o name,used,avail'
```

### Growth Rate

**Needs tracking** - recommend monthly snapshots of capacity:

```bash
#!/bin/bash
# Save as ~/bin/storage-capacity-report.sh
DATE=$(date +%Y-%m-%d)
REPORT=~/Backups/storage-reports/capacity-$DATE.txt

mkdir -p ~/Backups/storage-reports

{
  echo "Storage Capacity Report - $DATE"
  echo "================================"
  echo ""
  echo "PVE Pools:"
  ssh pve 'zpool list'
  echo ""
  echo "PVE2 Pools:"
  ssh pve2 'zpool list'
  echo ""
  echo "TrueNAS Pools:"
  ssh truenas 'zpool list'
  echo ""
  echo "TrueNAS Datasets:"
  ssh truenas 'zfs list -r vault -o name,used,avail'
} > "$REPORT"

echo "Report saved to $REPORT"
```

**Run monthly via cron**:
```cron
0 9 1 * * ~/bin/storage-capacity-report.sh
```

### Expansion Planning

**When to expand**:
- Pool reaches 80% capacity
- Performance degrades
- New workloads require more space

**Expansion options**:
1. Add drives to existing pools (for mirrors, add another mirror vdev)
2. Add new NVMe drives to PVE/PVE2
3. Expand EMC enclosure (add more drives)
4. Add second EMC enclosure

**Cost estimates**: TBD

---

## ZFS Health Monitoring

### Daily Health Checks

```bash
# Check for errors on all pools
ssh pve 'zpool status -x'   # Shows only unhealthy pools
ssh pve2 'zpool status -x'
ssh truenas 'zpool status -x'

# Check scrub status
ssh pve 'zpool status | grep scrub'
ssh pve2 'zpool status | grep scrub'
ssh truenas 'zpool status | grep scrub'
```

### Scrub Schedule

**Recommended**: Monthly scrub on all pools

**Configure scrub**:
```bash
# Via Proxmox UI: Node → Disks → ZFS → Select pool → Scrub
# Or via cron:
0 2 1 * * /sbin/zpool scrub nvme-mirror1
0 2 1 * * /sbin/zpool scrub rpool
```

**On TrueNAS**:
- Configure via UI: Storage → Pools → Scrub Tasks
- Recommended: 1st of every month at 2 AM

### SMART Monitoring

**Check drive health** (a combined health script follows below):
```bash
# PVE
ssh pve 'smartctl -a /dev/nvme0'
ssh pve 'smartctl -a /dev/sda'

# TrueNAS
ssh truenas 'smartctl --scan'
ssh truenas 'smartctl -a /dev/sdX'  # For each drive
```
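**Combined daily check** - a minimal sketch that rolls the per-host pool checks above into one cron-able script. It assumes key-based SSH to all three hosts and matches the healthy-pool message current OpenZFS prints:

```bash
#!/bin/bash
# Save as ~/bin/zfs-health-check.sh (hypothetical path).
# Prints a warning for any host whose pools are not all healthy;
# no output means everything reported clean.
for host in pve pve2 truenas; do
  out=$(ssh "$host" 'zpool status -x')
  if [ "$out" != "all pools are healthy" ]; then
    echo "WARNING: $host reports pool problems:"
    echo "$out"
  fi
done
```

Run it daily from cron and let cron mail any output; silence means every pool is healthy (see Alerts below).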
**Configure SMART tests**:
- TrueNAS UI: Tasks → S.M.A.R.T. Tests
- Recommended: Weekly short test, monthly long test

### Alerts

**Set up email alerts for**:
- ZFS pool errors
- SMART test failures
- Pool capacity > 80%
- Scrub failures

---

## Storage Performance Tuning

### ZFS ARC (Cache)

**Check ARC usage**:
```bash
ssh pve 'arc_summary'
ssh truenas 'arc_summary'
```

**Tuning** (if needed):
- PVE/PVE2: Set max ARC in `/etc/modprobe.d/zfs.conf`
- TrueNAS: Configure via UI (System → Advanced → Tunables)

### NFS Performance

**Mount options** (on clients like Saltbox):
```
rsize=131072,wsize=131072,hard,timeo=600,retrans=2,vers=3
```

**Verify NFS mounts**:
```bash
ssh saltbox 'mount | grep nfs'
```

### Record Size Optimization

**Different workloads need different record sizes**:
- VMs: 64K (note: the ZFS default recordsize is 128K; Proxmox VM disks on zvols use `volblocksize` instead)
- Databases: 8K or 16K
- Media files: 1M (large sequential reads)

**Set record size** (on TrueNAS datasets):
```bash
ssh truenas 'zfs set recordsize=1M vault/movies'
```

---

## Disaster Recovery

### Pool Recovery

**If a pool fails to import**:
```bash
# Import under a different name (-N also skips mounting datasets)
zpool import -f -N poolname newpoolname

# Import read-only to inspect without risking further writes
zpool import -f -o readonly=on poolname

# Recovery-mode import, discarding the most recent transactions (last resort)
zpool import -f -F poolname
```

### Drive Replacement

**When a drive fails**:
```bash
# Identify failed drive
zpool status poolname

# Replace drive
zpool replace poolname old-device new-device

# Monitor resilver
watch zpool status poolname
```

### Data Recovery

**If pool is completely lost**:
1. Restore from offsite backup (see [BACKUP-STRATEGY.md](BACKUP-STRATEGY.md))
2. Recreate pool structure
3. Restore data

**Critical**: This is why we need offsite backups!

---

## Quick Reference

### Common Commands

```bash
# Pool status
zpool status [poolname]
zpool list

# Dataset usage
zfs list
zfs list -r vault

# Check pool health (only unhealthy)
zpool status -x

# Scrub pool
zpool scrub poolname

# Get pool IO stats
zpool iostat -v 1

# Snapshot management
zfs snapshot poolname/dataset@snapname
zfs list -t snapshot
zfs rollback poolname/dataset@snapname
zfs destroy poolname/dataset@snapname
```

### Storage Locations by Use Case

| Use Case | Recommended Storage | Why |
|----------|---------------------|-----|
| VM OS disk | nvme-mirror1 (PVE) | Fastest IO |
| Database | nvme-mirror1/2 | Low latency |
| Media files | TrueNAS vault | Large capacity |
| Development | nvme-mirror2 | Fast, mid-tier |
| Containers | rpool | Good performance |
| Backups | TrueNAS or rpool | Large capacity |
| Archive | local-zfs2 (PVE2) | Cheap, can spin down |

---

## Investigation Needed

- [ ] Get complete TrueNAS dataset list
- [ ] Document NFS/SMB share configuration
- [ ] Inventory EMC enclosure drives (count, capacity, model)
- [ ] Document current pool usage percentages
- [ ] Set up monthly capacity reports
- [ ] Configure ZFS scrub schedules
- [ ] Set up storage health alerts

---

## Related Documentation

- [BACKUP-STRATEGY.md](BACKUP-STRATEGY.md) - Backup and snapshot strategy
- [EMC-ENCLOSURE.md](EMC-ENCLOSURE.md) - Storage enclosure maintenance
- [VMS.md](VMS.md) - VM storage assignments
- [NETWORK.md](NETWORK.md) - Storage network configuration

---

**Last Updated**: 2025-12-22