Hardware Inventory
Complete hardware specifications for all homelab equipment.
Servers
PVE (10.10.10.120) - Primary Proxmox Server
CPU
- Model: AMD Ryzen Threadripper PRO 3975WX
- Cores: 32 cores / 64 threads
- Base Clock: 3.5 GHz
- Boost Clock: 4.2 GHz
- TDP: 280W
- Architecture: Zen 2 (7nm)
- Socket: sWRX8
- Features: ECC support, PCIe 4.0
RAM
- Capacity: 128 GB
- Type: DDR4 ECC Registered
- Speed: Unknown (needs investigation)
- Channels: 8-channel (Threadripper PRO platform; populated DIMM count needs investigation)
- Idle Power: ~30-40W
Storage
OS/VM Storage:
| Pool | Devices | Type | Capacity | Purpose |
|---|---|---|---|---|
| nvme-mirror1 | 2x Sabrent Rocket Q NVMe | ZFS Mirror | 3.6 TB usable | High-performance VM storage |
| nvme-mirror2 | 2x Kingston SFYRD 2TB NVMe | ZFS Mirror | 1.8 TB usable | Additional fast VM storage |
| rpool | 2x Samsung 870 QVO 4TB SSD | ZFS Mirror | 3.6 TB usable | Proxmox OS, containers, backups |
Total Storage: ~9 TB usable
GPUs
| Model | Slot | VRAM | TDP | Purpose | Passed To |
|---|---|---|---|---|---|
| NVIDIA Quadro P2000 | PCIe slot 1 | 5 GB GDDR5 | 75W | Plex transcoding | Host |
| NVIDIA TITAN RTX | PCIe slot 2 | 24 GB GDDR6 | 280W | AI workloads | Saltbox (101), lmdev1 (111) |
Total GPU Power: 75W + 280W = 355W (under load)
Network Cards
| Interface | Model | Speed | Purpose | Bridge |
|---|---|---|---|---|
| enp1s0 | Intel I210 (onboard) | 1 Gb | Management | vmbr0 |
| enp35s0f0 | Intel X520 (dual-port SFP+) | 10 Gb | High-speed LXC | vmbr1 |
| enp35s0f1 | Intel X520 (dual-port SFP+) | 10 Gb | High-speed VM | vmbr2 |
10Gb Transceivers: Intel FTLX8571D3BCV (SFP+ 10GBASE-SR, 850nm, multimode)
Storage Controllers
| Model | Interface | Purpose |
|---|---|---|
| LSI SAS2308 HBA | PCIe 3.0 x8 | Passed to TrueNAS VM for EMC enclosure |
| Samsung NVMe controller | PCIe | Passed to TrueNAS VM for ZFS caching |
Motherboard
- Model: Unknown - needs investigation
- Chipset: AMD WRX80
- Form Factor: ATX/EATX
- PCIe Slots: Multiple PCIe 4.0 slots
- Features: IOMMU support, ECC memory
Power Supply
- Model: Unknown
- Wattage: Likely 1000W+ (needs investigation)
- Type: ATX, 80+ certification unknown
Cooling
- CPU Cooler: Unknown - likely large tower or AIO
- Case Fans: Unknown quantity
- Note: CPU temps 70-80°C under load (healthy)
PVE2 (10.10.10.102) - Secondary Proxmox Server
CPU
- Model: AMD Ryzen Threadripper PRO 3975WX
- Specs: Same as PVE (32C/64T, 280W TDP)
RAM
- Capacity: 128 GB DDR4 ECC
- Same specs as PVE
Storage
| Pool | Devices | Type | Capacity | Purpose |
|---|---|---|---|---|
| nvme-mirror3 | 2x NVMe (model unknown) | ZFS Mirror | Unknown | High-performance VM storage |
| local-zfs2 | 2x WD Red 6TB HDD | ZFS Mirror | ~6 TB usable | Bulk/archival storage (spins down) |
HDD Spindown: Configured for 30-min idle spindown (saves ~10-16W)
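How the 30-minute spindown is applied on PVE2 is not recorded here. If it is set with `hdparm -S` (an assumption), the timeout must be encoded rather than given in minutes. A minimal sketch of that encoding, with `hdparm_standby_value` as a hypothetical helper name:

```shell
# Convert an idle timeout in minutes to the value expected by `hdparm -S`.
# hdparm encoding: 1-240 are multiples of 5 seconds (5 s to 20 min);
# 241-251 are 1-11 units of 30 minutes (30 min to 5.5 h).
# Assumption: spindown is managed with hdparm; this doc does not confirm that.
hdparm_standby_value() {
    minutes=$1
    if [ "$minutes" -ge 1 ] && [ "$minutes" -le 20 ]; then
        echo $(( minutes * 60 / 5 ))
    elif [ "$minutes" -ge 30 ] && [ "$minutes" -le 330 ] && [ $(( minutes % 30 )) -eq 0 ]; then
        echo $(( 240 + minutes / 30 ))
    else
        echo "unsupported timeout: ${minutes} min" >&2
        return 1
    fi
}
```

For the 30-minute timeout, `hdparm_standby_value 30` gives the value to pass as `hdparm -S <value> /dev/sdX`, where `/dev/sdX` is a placeholder for each WD Red drive.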
GPUs
| Model | Slot | VRAM | TDP | Purpose | Passed To |
|---|---|---|---|---|---|
| NVIDIA RTX A6000 | PCIe slot 1 | 48 GB GDDR6 | 300W | AI trading workloads | trading-vm (301) |
Network Cards
| Interface | Model | Speed | Purpose |
|---|---|---|---|
| nic1 | Unknown (onboard) | 1 Gb | Management |
Note: MTU set to 9000 for jumbo frames
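A 9000-byte MTU only helps if every hop on the path carries it. One way to check is a ping with fragmentation disallowed and a payload sized to fill the frame. The sketch below only prints the command; `icmp_payload_for_mtu` and `jumbo_ping_cmd` are illustrative names, and `-M do` assumes the Linux iputils `ping`.

```shell
# Largest ICMP payload that fits in one frame of the given MTU:
# MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print (do not run) a ping command that tests a path with the
# Don't Fragment bit set. Second argument is the MTU, default 9000.
jumbo_ping_cmd() {
    target=$1
    mtu=${2:-9000}
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") $target"
}
```

Running the output of `jumbo_ping_cmd 10.10.10.120` on PVE2 would test the path to PVE. If a hop has a smaller MTU, the ping reports an error instead of replies.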
Motherboard
- Model: Unknown
- Chipset: AMD TRX40
- Similar to PVE
Network Equipment
UniFi Cloud Gateway Fiber (UCG-Fiber)
- Model: UniFi Cloud Gateway Fiber
- IP: 10.10.10.1
- Ports: Multiple 1Gb + SFP+ uplink
- Features: Router, firewall, VPN, IDS/IPS
- MTU: 9216 (supports jumbo frames)
- Tailscale: Installed for VPN failover
Switches
Details needed - investigate current switch setup:
- 10Gb switch for high-speed connections?
- 1Gb switch for general devices?
- PoE capabilities?
# Check link state of the 10Gb interfaces
ssh pve 'ip link show enp35s0f0'
ssh pve 'ip link show enp35s0f1'
Storage Hardware
EMC Storage Enclosure
See EMC-ENCLOSURE.md for complete details
- Model: EMC KTN-STL4 (or similar)
- Form Factor: 4U rackmount
- Drive Bays: 25x 3.5" SAS/SATA
- Controllers: Dual LCC (Link Control Cards)
- Connection: SAS via LSI SAS2308 HBA
- Passed to: TrueNAS VM (VMID 100)
Current Status:
- LCC A: Active (working)
- LCC B: Failed (replacement ordered)
Drive Inventory: Unknown - needs audit
# Get drive list from TrueNAS
ssh truenas 'smartctl --scan'
ssh truenas 'lsblk'
NVMe Drives
| Model | Quantity | Capacity | Location | Pool |
|---|---|---|---|---|
| Sabrent Rocket Q | 2 | Unknown | PVE | nvme-mirror1 |
| Kingston SFYRD | 2 | 2 TB each | PVE | nvme-mirror2 |
| Unknown model | 2 | Unknown | PVE2 | nvme-mirror3 |
| Samsung (model unknown) | 1 | Unknown | TrueNAS (passed) | ZFS cache |
SSDs
| Model | Quantity | Capacity | Location | Pool |
|---|---|---|---|---|
| Samsung 870 QVO | 2 | 4 TB each | PVE | rpool |
HDDs
| Model | Quantity | Capacity | Location | Pool |
|---|---|---|---|---|
| WD Red | 2 | 6 TB each | PVE2 | local-zfs2 |
| Unknown (in EMC) | Unknown | Unknown | TrueNAS | vault |
UPS
Current UPS
| Specification | Value |
|---|---|
| Model | CyberPower OR2200PFCRT2U |
| Capacity | 2200VA / 1320W |
| Form Factor | 2U rackmount |
| Input | NEMA 5-15P (rewired from 5-20P) |
| Outlets | 2x 5-20R + 6x 5-15R |
| Output | PFC Sinewave |
| Runtime | ~15-20 min @ 33% load |
| Interface | USB (connected to PVE) |
See UPS.md for configuration details
Client Devices
Mac Mini (Hutson's Workstation)
- Model: Unknown generation
- CPU: Unknown
- RAM: Unknown
- Storage: Unknown
- Network: 1Gb Ethernet (en0) - MTU 9000
- Tailscale IP: 100.108.89.58
- Local IP: 10.10.10.125 (static)
- Purpose: Primary workstation, Happy Coder daemon host
MacBook (Mobile)
- Model: Unknown
- Network: Wi-Fi + Ethernet adapter
- Tailscale IP: Unknown
- Purpose: Mobile work, development
Windows PC
- Model: Unknown
- CPU: Unknown
- Network: 1Gb Ethernet
- IP: 10.10.10.150
- Purpose: Gaming, Windows development, Syncthing node
Phone (Android)
- Model: Unknown
- IP: 10.10.10.54 (when on Wi-Fi)
- Purpose: Syncthing mobile node, Happy Coder client
Rack Layout (If Applicable)
Needs documentation - Current rack configuration unknown
Suggested format:
U42: Blank panel
U41: UPS (CyberPower 2U)
U40: UPS (CyberPower 2U)
U39: Switch (10Gb)
U38-U35: EMC Storage Enclosure (4U)
U34: PVE Server
U33: PVE2 Server
...
Power Consumption
Measured Power Draw
| Component | Idle | Typical | Peak | Notes |
|---|---|---|---|---|
| PVE Server | 250-350W | 500W | 750W | CPU + GPUs + storage |
| PVE2 Server | 200-300W | 400W | 600W | CPU + GPU + storage |
| Network Gear | ~50W | ~50W | ~50W | Router + switches |
| Total | 500-700W | ~950W | ~1400W | Exceeds UPS under peak load |
UPS Capacity: 1320W
Idle Load: ~38-53% of capacity (500-700W)
Typical Load: ~72% of capacity (~950W)
Peak Load: ~106% of capacity (~1400W); can exceed UPS capacity temporarily (acceptable)
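Load as a share of UPS capacity can be derived from the totals in the table and the 1320W rating. A small sketch of that arithmetic, with `ups_load_pct` as a hypothetical helper name:

```shell
# Percentage of UPS capacity used by a given load in watts,
# rounded to the nearest integer. Capacity defaults to 1320 W,
# the rating listed in this document.
ups_load_pct() {
    load_w=$1
    capacity_w=${2:-1320}
    echo $(( (load_w * 100 + capacity_w / 2) / capacity_w ))
}
```

Calling it with the idle, typical, and peak totals (500, 700, 950, 1400) shows how much headroom each scenario leaves. Anything above 100 means the load exceeds what the UPS can carry.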
Power Optimizations Applied
See POWER-MANAGEMENT.md for details
- KSM (ksmd) disabled: ~60-80W saved
- CPU governors: ~60-120W saved
- Syncthing rescans: ~60-80W saved
- HDD spindown: ~10-16W saved when idle
- Total savings: ~190-300W (sum of the items above)
Thermal Management
CPU Cooling
PVE & PVE2:
- CPU cooler: Unknown model
- Thermal paste: Unknown, likely needs refresh if temps >85°C
- Target temp: 70-80°C under load
- Max safe: 90°C Tctl (Threadripper PRO spec)
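The thresholds above can be turned into a quick check against `sensors` output. This is a sketch only: `parse_tctl` and `classify_cpu_temp` are hypothetical helpers, and the parser assumes the k10temp line has the form `Tctl:         +72.5°C`.

```shell
# Extract the integer part of the Tctl reading from `sensors` output on stdin.
# Assumes a k10temp line such as "Tctl:         +72.5°C".
parse_tctl() {
    awk '/^Tctl:/ { gsub(/[^0-9.]/, "", $2); print int($2); exit }'
}

# Classify a CPU temperature (integer degrees C) against this document's thresholds.
classify_cpu_temp() {
    t=$1
    if [ "$t" -ge 90 ]; then
        echo "critical"      # at or above the 90°C max safe Tctl
    elif [ "$t" -gt 85 ]; then
        echo "check-paste"   # above 85°C: thermal paste likely needs a refresh
    elif [ "$t" -gt 80 ]; then
        echo "warm"          # above the 70-80°C load target
    else
        echo "ok"
    fi
}
```

A typical use would be `classify_cpu_temp "$(ssh pve 'sensors' | parse_tctl)"`, run while the server is under load.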
GPU Cooling
All GPUs run on their stock coolers; heat output follows power draw:
- TITAN RTX: 2-3W idle, 280W load
- RTX A6000: 11W idle, 300W load
- Quadro P2000: 25W constant (Plex active)
Case Airflow
Unknown - needs investigation:
- Case model?
- Fan configuration?
- Positive or negative pressure?
Cable Management
Network Cables
| Connection | Type | Length | Speed |
|---|---|---|---|
| PVE → Switch (10Gb) | OM3 fiber | Unknown | 10Gb |
| PVE2 → Router | Cat6 | Unknown | 1Gb |
| Mac Mini → Switch | Cat6 | Unknown | 1Gb |
| TrueNAS → EMC | SAS cable | Unknown | 6Gb/s |
Power Cables
Critical: All servers on UPS battery-backed outlets
Maintenance Schedule
Periodic Maintenance
- Clean dust from servers (every 6-12 months)
- Check thermal paste on CPUs (every 2-3 years)
- Test UPS battery runtime (annually)
- Verify all fans operational
- Check for bulging capacitors on PSUs
Drive Health
# Check SMART status on all drives
ssh pve 'smartctl -a /dev/nvme0'
ssh pve2 'smartctl -a /dev/sda'
ssh truenas 'smartctl --scan | while read -r dev rest; do echo "=== $dev ==="; smartctl -a "$dev" | grep -E "Model|Serial|Health|Reallocated|Current_Pending"; done'
Temperature Monitoring
# Check all temps (needs lm-sensors installed)
ssh pve 'sensors'
ssh pve2 'sensors'
Warranty & Purchase Info
Needs documentation:
- When were servers purchased?
- Where were components bought?
- Any warranties still active?
- Replacement part sources?
Upgrade Path
Short-term Upgrades (< 6 months)
- 20A circuit for UPS (restore original 5-20P plug)
- Document missing hardware specs
- Label all cables
- Create rack diagram
Medium-term Upgrades (6-12 months)
- Additional 10Gb NIC for PVE2?
- More NVMe storage?
- Upgrade network switches?
- Replace EMC enclosure with newer model?
Long-term Upgrades (1-2 years)
- CPU upgrade to newer Threadripper?
- RAM expansion to 256GB?
- Additional GPU for AI workloads?
- Migrate to PCIe 5.0 storage?
Investigation Needed
High-priority items to document:
- Get exact motherboard model (both servers)
- Get PSU model and wattage
- CPU cooler models
- Network switch models and configuration
- Complete drive inventory in EMC enclosure
- RAM speed and timings
- Case models
- Exact NVMe models for all drives
Commands to gather info:
# Motherboard
ssh pve 'dmidecode -t baseboard'
# CPU details
ssh pve 'lscpu'
# RAM details
ssh pve 'dmidecode -t memory | grep -E "Size|Speed|Manufacturer"'
# Storage devices
ssh pve 'lsblk -o NAME,SIZE,TYPE,TRAN,MODEL'
# Network cards
ssh pve 'lspci | grep -iE "ethernet|network"'
# GPU details
ssh pve 'lspci | grep -iE "vga|3d controller"'
ssh pve 'nvidia-smi -L' # If nvidia-smi available
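The commands above target only `pve`, while the open items cover both servers. A minimal sketch that generates the same style of command for any host; `hw_info_cmds` is a hypothetical helper, and the host aliases are assumed to match the SSH config used elsewhere in this document.

```shell
# Print the inventory commands for one host, one per line (dry run).
# `ssh -n` keeps ssh from reading stdin, so the output can be piped
# to `sh` without ssh swallowing the remaining commands.
hw_info_cmds() {
    host=$1
    for cmd in \
        'dmidecode -t baseboard' \
        'lscpu' \
        'dmidecode -t memory' \
        'lsblk -o NAME,SIZE,TYPE,TRAN,MODEL' \
        'lspci'
    do
        echo "ssh -n $host '$cmd'"
    done
}
```

Review the printed commands first, then run them with something like `for h in pve pve2; do hw_info_cmds "$h"; done | sh`, redirecting the output to a file for later transcription into this document.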
Related Documentation
- VMS.md - VM resource allocation
- STORAGE.md - Storage pools and usage
- POWER-MANAGEMENT.md - Power optimizations
- UPS.md - UPS configuration
- NETWORK.md - Network configuration
- EMC-ENCLOSURE.md - Storage enclosure details
Last Updated: 2025-12-22
Status: ⚠️ Incomplete - many specs need investigation