Initial commit: Homelab infrastructure documentation

- CLAUDE.md: Main homelab assistant context and instructions
- IP-ASSIGNMENTS.md: Complete IP address assignments
- NETWORK.md: Network bridges, VLANs, and configuration
- EMC-ENCLOSURE.md: EMC storage enclosure documentation
- SYNCTHING.md: Syncthing setup and device list
- SHELL-ALIASES.md: ZSH aliases for Claude Code sessions
- HOMEASSISTANT.md: Home Assistant API and automations
- INFRASTRUCTURE.md: Server hardware and power management
- configs/: Shared shell configurations
- scripts/: Utility scripts
- mcp-central/: MCP server configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Hutson committed 2025-12-20 02:31:02 -05:00 · commit 93821d1557
17 changed files with 3267 additions and 0 deletions

.gitignore (new file, 22 lines)
# Secrets and credentials
.env
*.credentials
*-credentials*.txt
# macOS
.DS_Store
.AppleDouble
.LSOverride
# Editor/IDE
.obsidian/
.claude/
.vscode/
*.swp
*.swo
*~
# Temporary files
*.tmp
*.bak
nul

CHANGELOG.md (new file, 197 lines)
# Homelab Changelog
## 2024-12-16
### Power Investigation
Investigated UPS power limit issues across both Proxmox servers.
#### Findings
1. **KSMD (Kernel Same-page Merging Daemon)** was consuming 50-57% CPU constantly on PVE
- `sleep_millisecs` set to 12ms (extremely aggressive, default is 200ms)
- `general_profit` was **negative** (-320MB) meaning it was wasting CPU
- No memory overcommit situation (98GB allocated on 128GB RAM)
- Diverse workloads (TrueNAS, Windows, Linux) = few duplicate pages to merge
2. **GPU Power Draw** identified as major consumers:
- RTX A6000 on PVE2: up to 300W TDP
- TITAN RTX on PVE: up to 280W TDP
- Quadro P2000 on PVE: up to 75W TDP
3. **TrueNAS VM** occasionally spiking to 86% CPU (needs investigation)
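The counters referenced above can be read straight from sysfs; a quick check sketch (note `general_profit` only exists on newer kernels):
```bash
# Inspect KSM state and its cost/benefit counters on the host
ssh pve 'cat /sys/kernel/mm/ksm/run /sys/kernel/mm/ksm/sleep_millisecs'
ssh pve 'cat /sys/kernel/mm/ksm/pages_sharing /sys/kernel/mm/ksm/general_profit 2>/dev/null'
```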
#### Changes Made
- [x] **Disabled KSMD on PVE** (10.10.10.120)
```bash
echo 0 > /sys/kernel/mm/ksm/run
```
- Immediate result: KSMD CPU dropped from 51-57% to 0%
- Load average dropped from 1.88 to 1.28
- Estimated savings: ~7-10W continuous
#### Additional Changes
- [x] **Made KSMD disable persistent on both hosts**
- Note: KSM is controlled via sysfs, not sysctl
- Created systemd service `/etc/systemd/system/disable-ksm.service`:
```ini
[Unit]
Description=Disable KSM (Kernel Same-page Merging)
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c "echo 0 > /sys/kernel/mm/ksm/run"
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
```
- Enabled on both PVE and PVE2: `systemctl enable disable-ksm.service`
### Syncthing Rescan Interval Fix
**Root Cause**: Syncthing on TrueNAS was rescanning 56GB of data every 60 seconds, causing constant 100% CPU usage (~3172 minutes CPU time in 3 days).
**Folders affected** (changed from 60s to 3600s):
- downloads (38GB)
- documents (11GB)
- desktop (7.2GB)
- config, movies, notes, pictures
**Fix applied**:
```bash
# Downloaded config from TrueNAS
ssh pve 'qm guest exec 100 -- cat /mnt/.ix-apps/app_mounts/syncthing/config/config/config.xml'
# Changed all rescanIntervalS="60" to rescanIntervalS="3600"
sed -i 's/rescanIntervalS="60"/rescanIntervalS="3600"/g' config.xml
# Uploaded and restarted Syncthing
curl -X POST -H "X-API-Key: xxx" http://localhost:20910/rest/system/restart
```
**Note**: fsWatcher is enabled, so changes are detected in real-time. The rescan is just a safety net.
**Estimated savings**: ~60-80W (TrueNAS VM CPU will drop from 86% to ~5-10% at idle)
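As a follow-up sanity check, the active intervals and the fsWatcher flag can be read back from the Syncthing REST API (a sketch, using the same placeholder API key as above):
```bash
# List each folder's rescan interval and whether fsWatcher is enabled
curl -s -H "X-API-Key: xxx" http://localhost:20910/rest/config/folders \
  | python3 -c 'import sys,json; [print(f["id"], f["rescanIntervalS"], f["fsWatcherEnabled"]) for f in json.load(sys.stdin)]'
```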
### GPU Power State Investigation
| GPU | VM | Idle Power | P-State | Status |
|-----|-----|-----------|---------|--------|
| RTX A6000 | trading-vm (301) | **11W** | P8 | Optimal |
| TITAN RTX | lmdev1 (111) | **2W** | P8 | Excellent! |
| Quadro P2000 | saltbox (101) | **25W** | P0 | Stuck due to Plex |
**Findings**:
- RTX A6000: Properly entering P8 (11W idle) - excellent
- TITAN RTX: Only 2W at idle despite ComfyUI/Python processes (436MiB VRAM used)
- Modern GPUs have much better idle power management
- Quadro P2000: Stuck in P0 at 25W because Plex Transcoder holds GPU memory
- Older Quadro cards don't idle as efficiently with processes attached
- Power limit fixed at 75W (not adjustable)
**Changes made**:
- [x] Installed QEMU guest agent on lmdev1 (VM 111)
- [x] Added SSH key access to lmdev1 (10.10.10.111)
- [x] Updated ~/.ssh/config with lmdev1 entry
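With the SSH access added above, the table can be re-verified from inside the guests (a sketch; assumes `nvidia-smi` is installed in each VM and the `lmdev1`/`trading-vm` SSH aliases):
```bash
# Report GPU name, P-state, current draw, and VRAM use from inside the VMs
ssh lmdev1 'nvidia-smi --query-gpu=name,pstate,power.draw,memory.used --format=csv,noheader'
ssh trading-vm 'nvidia-smi --query-gpu=name,pstate,power.draw,memory.used --format=csv,noheader'
```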
### CPU Governor Optimization
**Issue**: Both servers using `performance` CPU governor, keeping CPUs at high frequencies (3-4GHz) even when 99% idle.
**Changes**:
#### PVE (10.10.10.120)
- **Driver**: `amd-pstate-epp` (modern AMD P-State with Energy Performance Preference)
- **Change**: Governor `performance` → `powersave`, EPP `performance` → `balance_power`
- **Result**: Idle frequencies dropped from ~4GHz to ~1.7GHz
- **Persistence**: Created `/etc/systemd/system/cpu-powersave.service`
```ini
[Unit]
Description=Set CPU governor to powersave with balance_power EPP
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo powersave > "$gov"; done; for epp in /sys/devices/system/cpu/cpu*/cpufreq/energy_performance_preference; do echo balance_power > "$epp"; done'
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
```
#### PVE2 (10.10.10.102)
- **Driver**: `acpi-cpufreq` (older driver)
- **Change**: Governor `performance` → `schedutil`
- **Result**: Idle frequencies dropped from ~4GHz to ~2.2GHz
- **Persistence**: Created `/etc/systemd/system/cpu-powersave.service`
```ini
[Unit]
Description=Set CPU governor to schedutil for power savings
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo schedutil > "$gov"; done'
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
```
**Estimated savings**: 30-60W per server (60-120W total)
### ksmtuned Service Disabled
**Issue**: The `ksmtuned` (KSM tuning daemon) was still running on both servers even after KSMD was disabled, consuming ~39 min of CPU time on PVE and ~12 min on PVE2 over 3 days.
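A quick verification sketch after enabling the services:
```bash
# Confirm governor/EPP took effect and CPUs are clocking down at idle
ssh pve 'cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference'
ssh pve2 'cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'
ssh pve 'grep "cpu MHz" /proc/cpuinfo | head -3'
```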
**Fix**:
```bash
systemctl stop ksmtuned
systemctl disable ksmtuned
```
Applied to both PVE and PVE2.
**Estimated savings**: ~2-5W
### HDD Spindown on PVE2
**Issue**: Two WD Red 6TB drives (local-zfs2 pool) spinning 24/7 despite pool having only 768KB used. Each drive uses 5-8W spinning.
**Fix**:
```bash
# Set 30-minute spindown timeout
hdparm -S 241 /dev/sda /dev/sdb
```
**Persistence**: Created udev rule `/etc/udev/rules.d/69-hdd-spindown.rules`:
```
ACTION=="add", KERNEL=="sd[a-z]", ATTRS{model}=="WDC WD60EFRX-68L*", RUN+="/usr/sbin/hdparm -S 241 /dev/%k"
```
**Estimated savings**: ~10-16W (when drives spin down)
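To confirm the timeout is working, the drives' power state can be polled without waking them (sketch):
```bash
# "standby" means spun down; "active/idle" means still spinning
ssh pve2 'hdparm -C /dev/sda /dev/sdb'
```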
#### Pending Changes
- [ ] Monitor overall power consumption after all optimizations
- [ ] Consider PCIe ASPM optimization
- [ ] Consider NMI watchdog disable
### SSH Key Setup
- Added SSH key authentication to both Proxmox servers
- Updated `~/.ssh/config` with entries for `pve` and `pve2`
---
## Notes
### What is KSMD?
Kernel Same-page Merging Daemon - scans memory for duplicate pages across VMs and merges them. Trades CPU cycles for RAM savings. Useful when:
- Overcommitting memory
- Running many identical VMs
Not useful when:
- Plenty of RAM headroom (our case)
- Diverse workloads with few duplicate pages
- `general_profit` is negative
### What is Memory Ballooning?
Guest-cooperative memory management. Hypervisor can request VMs to give back unused RAM. Independent from KSMD. Both are Proxmox/KVM memory optimization features but serve different purposes.

CLAUDE.md (new file, 962 lines)
# Homelab Infrastructure
## Quick Reference - Common Tasks
| Task | Section | Quick Command |
|------|---------|---------------|
| **Add new public service** | [Reverse Proxy](#reverse-proxy-architecture-traefik) | Create Traefik config + Cloudflare DNS |
| **Add Cloudflare DNS** | [Cloudflare API](#cloudflare-api-access) | `curl -X POST cloudflare.com/...` |
| **Check server temps** | [Temperature Check](#server-temperature-check) | `ssh pve 'grep Tctl ...'` |
| **Syncthing issues** | [Troubleshooting](#troubleshooting-runbooks) | Check API connections |
| **SSL cert issues** | [Traefik DNS Challenge](#ssl-certificates) | Use `cloudflare` resolver |
**Key Credentials (see sections for full details):**
- Cloudflare: `cloudflare@htsn.io` / API Key in [Cloudflare API](#cloudflare-api-access)
- SSH Password: `GrilledCh33s3#`
- Traefik: CT 202 @ 10.10.10.250
---
## Role
You are the **Homelab Assistant** - a Claude Code session dedicated to managing and maintaining Hutson's home infrastructure. Your responsibilities include:
- **Infrastructure Management**: Proxmox servers, VMs, containers, networking
- **File Sync**: Syncthing configuration across all devices (Mac Mini, MacBook, Windows PC, TrueNAS, Android)
- **Network Administration**: Router config, SSH access, Tailscale, device management
- **Power Optimization**: CPU governors, GPU power states, service tuning
- **Documentation**: Keep CLAUDE.md, SYNCTHING.md, and SHELL-ALIASES.md up to date
- **Automation**: Shell aliases, startup scripts, scheduled tasks
You have full access to all homelab devices via SSH and APIs. Use this context to help troubleshoot, configure, and optimize the infrastructure.
### Proactive Behaviors
When the user mentions issues or asks questions, proactively:
- **"sync not working"** → Check Syncthing status on ALL devices, identify which is offline
- **"device offline"** → Ping both local and Tailscale IPs, check if service is running
- **"slow"** → Check CPU usage, running processes, Syncthing rescan activity
- **"check status"** → Run full health check across all systems
- **"something's wrong"** → Run diagnostics on likely culprits based on context
### Quick Health Checks
Run these to get a quick overview of the homelab:
```bash
# === FULL HEALTH CHECK ===
# Syncthing connections (Mac Mini)
curl -s -H "X-API-Key: oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5" "http://127.0.0.1:8384/rest/system/connections" | python3 -c "import sys,json; d=json.load(sys.stdin)['connections']; [print(f\"{v.get('name',k[:7])}: {'UP' if v['connected'] else 'DOWN'}\") for k,v in d.items()]"
# Proxmox VMs
ssh pve 'qm list' 2>/dev/null || echo "PVE: unreachable"
ssh pve2 'qm list' 2>/dev/null || echo "PVE2: unreachable"
# Ping critical devices
ping -c 1 -W 1 10.10.10.200 >/dev/null && echo "TrueNAS: UP" || echo "TrueNAS: DOWN"
ping -c 1 -W 1 10.10.10.1 >/dev/null && echo "Router: UP" || echo "Router: DOWN"
# Check Windows PC Syncthing (often goes offline)
nc -zw1 10.10.10.150 22000 && echo "Windows Syncthing: UP" || echo "Windows Syncthing: DOWN"
```
### Troubleshooting Runbooks
| Symptom | Check | Fix |
|---------|-------|-----|
| Device not syncing | `curl Syncthing API → connections` | Check if device online, restart Syncthing |
| Windows PC offline | `ping 10.10.10.150` then `nc -zw1 10.10.10.150 22000` | SSH in, `Start-ScheduledTask -TaskName "Syncthing"` |
| Phone not syncing | Phone Syncthing app in background? | User must open app, keep screen on |
| High CPU on TrueNAS | Syncthing rescan? KSM? | Check rescan intervals, disable KSM |
| VM won't start | Storage available? RAM free? | `ssh pve 'qm start VMID'`, check logs |
| Tailscale offline | `tailscale status` | `tailscale up` or restart service |
| Sync stuck at X% | Folder errors? Conflicts? | Check `rest/folder/errors?folder=NAME` |
| Server running hot | Check KSM, check CPU processes | Disable KSM, identify runaway process |
| Storage enclosure loud | Check fan speed via SES | See [EMC-ENCLOSURE.md](EMC-ENCLOSURE.md) |
| Drives not detected | Check SAS link, LCC status | Switch LCC, rescan SCSI hosts |
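For the folder-error row above, a concrete example against the Mac Mini instance (a sketch; `FOLDER` is a placeholder as elsewhere in this doc):
```bash
# List per-folder errors reported by Syncthing
curl -s -H "X-API-Key: oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5" \
  "http://127.0.0.1:8384/rest/folder/errors?folder=FOLDER" | python3 -m json.tool
```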
### Server Temperature Check
```bash
# Check temps on both servers (Threadripper PRO max safe: 90°C Tctl)
ssh pve 'for f in /sys/class/hwmon/hwmon*/temp*_input; do label=$(cat ${f%_input}_label 2>/dev/null); if [ "$label" = "Tctl" ]; then echo "PVE Tctl: $(($(cat $f)/1000))°C"; fi; done'
ssh pve2 'for f in /sys/class/hwmon/hwmon*/temp*_input; do label=$(cat ${f%_input}_label 2>/dev/null); if [ "$label" = "Tctl" ]; then echo "PVE2 Tctl: $(($(cat $f)/1000))°C"; fi; done'
```
**Healthy temps**: 70-80°C under load. **Warning**: >85°C. **Throttle**: 90°C.
### Service Dependencies
```
TrueNAS (10.10.10.200)
├── Central Syncthing hub - if down, sync breaks between devices
├── NFS/SMB shares for VMs
└── Media storage for Plex
PiHole (CT 200)
└── DNS for entire network - if down, name resolution fails
Traefik (CT 202)
└── Reverse proxy - if down, external access to services fails
Router (10.10.10.1)
└── Everything - gateway for all traffic
```
### API Quick Reference
| Service | Device | Endpoint | Auth |
|---------|--------|----------|------|
| Syncthing | Mac Mini | `http://127.0.0.1:8384/rest/` | `X-API-Key: oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5` |
| Syncthing | MacBook | `http://127.0.0.1:8384/rest/` (via SSH) | `X-API-Key: qYkNdVLwy9qZZZ6MqnJr7tHX7KKdxGMJ` |
| Syncthing | Phone | `https://10.10.10.54:8384/rest/` | `X-API-Key: Xxz3jDT4akUJe6psfwZsbZwG2LhfZuDM` |
| Proxmox | PVE | `https://10.10.10.120:8006/api2/json/` | SSH key auth |
| Proxmox | PVE2 | `https://10.10.10.102:8006/api2/json/` | SSH key auth |
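Since no Proxmox API tokens are configured, the API is easiest to reach locally over SSH with `pvesh` (a sketch, not the only option):
```bash
# Dump all VM resources across the cluster as JSON
ssh pve 'pvesh get /cluster/resources --type vm --output-format json'
```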
### Common Maintenance Tasks
When user asks for maintenance or you notice issues:
1. **Check Syncthing sync status** - Any folders behind? Errors?
2. **Verify all devices connected** - Run connection check
3. **Check disk space** - `ssh pve 'df -h'`, `ssh pve2 'df -h'`
4. **Review ZFS pool health** - `ssh pve 'zpool status'`
5. **Check for stuck processes** - High CPU? Memory pressure?
6. **Verify backups** - Are critical folders syncing?
### Emergency Commands
```bash
# Restart VM on Proxmox
ssh pve 'qm stop VMID && qm start VMID'
# Check what's using CPU
ssh pve 'ps aux --sort=-%cpu | head -10'
# Check ZFS pool status (via QEMU agent)
ssh pve 'qm guest exec 100 -- bash -c "zpool status vault"'
# Check EMC enclosure fans
ssh pve 'qm guest exec 100 -- bash -c "sg_ses --index=coo,-1 --get=speed_code /dev/sg15"'
# Force Syncthing rescan
curl -X POST "http://127.0.0.1:8384/rest/db/scan?folder=FOLDER" -H "X-API-Key: API_KEY"
# Restart Syncthing on Windows (when stuck)
sshpass -p 'GrilledCh33s3#' ssh claude@10.10.10.150 'Stop-Process -Name syncthing -Force; Start-ScheduledTask -TaskName "Syncthing"'
# Get all device IPs from router
expect -c 'spawn ssh root@10.10.10.1 "cat /proc/net/arp"; expect "Password:"; send "GrilledCh33s3#\r"; expect eof'
```
## Overview
Two Proxmox servers running various VMs and containers for home infrastructure, media, development, and AI workloads.
## Servers
### PVE (10.10.10.120) - Primary
- **CPU**: AMD Ryzen Threadripper PRO 3975WX (32-core, 64 threads, 280W TDP)
- **RAM**: 128 GB
- **Storage**:
- `nvme-mirror1`: 2x Sabrent Rocket Q NVMe (3.6TB usable)
- `nvme-mirror2`: 2x Kingston SFYRD 2TB (1.8TB usable)
- `rpool`: 2x Samsung 870 QVO 4TB SSD mirror (3.6TB usable)
- **GPUs**:
- NVIDIA Quadro P2000 (75W TDP) - Plex transcoding
- NVIDIA TITAN RTX (280W TDP) - AI workloads, passed to saltbox/lmdev1
- **Role**: Primary VM host, TrueNAS, media services
### PVE2 (10.10.10.102) - Secondary
- **CPU**: AMD Ryzen Threadripper PRO 3975WX (32-core, 64 threads, 280W TDP)
- **RAM**: 128 GB
- **Storage**:
- `nvme-mirror3`: 2x NVMe mirror
- `local-zfs2`: 2x WD Red 6TB HDD mirror
- **GPUs**:
- NVIDIA RTX A6000 (300W TDP) - passed to trading-vm
- **Role**: Trading platform, development
## SSH Access
### SSH Key Authentication (All Hosts)
SSH keys are configured in `~/.ssh/config` on both Mac Mini and MacBook. Use the `~/.ssh/homelab` key.
| Host Alias | IP | User | Type | Notes |
|------------|-----|------|------|-------|
| `pve` | 10.10.10.120 | root | Proxmox | Primary server |
| `pve2` | 10.10.10.102 | root | Proxmox | Secondary server |
| `truenas` | 10.10.10.200 | root | VM | NAS/storage |
| `saltbox` | 10.10.10.100 | hutson | VM | Media automation |
| `lmdev1` | 10.10.10.111 | hutson | VM | AI/LLM development |
| `docker-host` | 10.10.10.206 | hutson | VM | Docker services |
| `fs-dev` | 10.10.10.5 | hutson | VM | Development |
| `copyparty` | 10.10.10.201 | hutson | VM | File sharing |
| `gitea-vm` | 10.10.10.220 | hutson | VM | Git server |
| `trading-vm` | 10.10.10.221 | hutson | VM | AI trading platform |
| `pihole` | 10.10.10.10 | root | LXC | DNS/Ad blocking |
| `traefik` | 10.10.10.250 | root | LXC | Reverse proxy |
| `findshyt` | 10.10.10.8 | root | LXC | Custom app |
**Usage examples:**
```bash
ssh pve 'qm list' # List VMs
ssh truenas 'zpool status vault' # Check ZFS pool
ssh saltbox 'docker ps' # List containers
ssh pihole 'pihole status' # Check Pi-hole
```
### Password Auth (Special Cases)
| Device | IP | User | Auth Method | Notes |
|--------|-----|------|-------------|-------|
| UniFi Router | 10.10.10.1 | root | expect (keyboard-interactive) | Gateway |
| Windows PC | 10.10.10.150 | claude | sshpass | PowerShell, use `;` not `&&` |
| HomeAssistant | 10.10.10.110 | - | QEMU agent only | No SSH server |
**Router access (requires expect):**
```bash
# Run command on router
expect -c 'spawn ssh root@10.10.10.1 "hostname"; expect "Password:"; send "GrilledCh33s3#\r"; expect eof'
# Get ARP table (all device IPs)
expect -c 'spawn ssh root@10.10.10.1 "cat /proc/net/arp"; expect "Password:"; send "GrilledCh33s3#\r"; expect eof'
```
**Windows PC access:**
```bash
sshpass -p 'GrilledCh33s3#' ssh claude@10.10.10.150 'Get-Process | Select -First 5'
```
**HomeAssistant (no SSH, use QEMU agent):**
```bash
ssh pve 'qm guest exec 110 -- bash -c "ha core info"'
```
## VMs and Containers
### PVE (10.10.10.120)
| VMID | Name | vCPUs | RAM | Purpose | GPU/Passthrough | QEMU Agent |
|------|------|-------|-----|---------|-----------------|------------|
| 100 | truenas | 8 | 32GB | NAS, storage | LSI SAS2308 HBA, Samsung NVMe | Yes |
| 101 | saltbox | 16 | 16GB | Media automation | TITAN RTX | Yes |
| 105 | fs-dev | 10 | 8GB | Development | - | Yes |
| 110 | homeassistant | 2 | 2GB | Home automation | - | No |
| 111 | lmdev1 | 8 | 32GB | AI/LLM development | TITAN RTX | Yes |
| 201 | copyparty | 2 | 2GB | File sharing | - | Yes |
| 206 | docker-host | 2 | 4GB | Docker services | - | Yes |
| 200 | pihole (CT) | - | - | DNS/Ad blocking | - | N/A |
| 202 | traefik (CT) | - | - | Reverse proxy | - | N/A |
| 205 | findshyt (CT) | - | - | Custom app | - | N/A |
### PVE2 (10.10.10.102)
| VMID | Name | vCPUs | RAM | Purpose | GPU/Passthrough | QEMU Agent |
|------|------|-------|-----|---------|-----------------|------------|
| 300 | gitea-vm | 2 | 4GB | Git server | - | Yes |
| 301 | trading-vm | 16 | 32GB | AI trading platform | RTX A6000 | Yes |
### QEMU Guest Agent
VMs with QEMU agent can be managed via `qm guest exec`:
```bash
# Execute command in VM
ssh pve 'qm guest exec 100 -- bash -c "zpool status vault"'
# Get VM IP addresses
ssh pve 'qm guest exec 100 -- bash -c "ip addr"'
```
Only VM 110 (homeassistant) lacks QEMU agent - use its web UI instead.
## Power Management
### Estimated Power Draw
- **PVE**: 500-750W (CPU + TITAN RTX + P2000 + storage + HBAs)
- **PVE2**: 450-600W (CPU + RTX A6000 + storage)
- **Combined**: ~1000-1350W under load
### Optimizations Applied
1. **KSMD Disabled** (updated 2024-12-17)
- Was consuming 44-57% CPU on PVE with negative profit
- Caused CPU temp to rise from 74°C to 83°C
- Savings: ~7-10W + significant temp reduction
- Made permanent via:
- systemd service: `/etc/systemd/system/disable-ksm.service`
- **ksmtuned masked**: `systemctl mask ksmtuned` (prevents re-enabling)
- **Note**: KSM can get re-enabled by Proxmox updates. If CPU is hot, check:
```bash
cat /sys/kernel/mm/ksm/run # Should be 0
ps aux | grep ksmd # Should show 0% CPU
# If KSM is running (run=1), disable it:
echo 0 > /sys/kernel/mm/ksm/run
systemctl mask ksmtuned
```
2. **Syncthing Rescan Intervals** (2024-12-16)
- Changed aggressive 60s rescans to 3600s for large folders
- Affected: downloads (38GB), documents (11GB), desktop (7.2GB), movies, pictures, notes, config
- Savings: ~60-80W (TrueNAS VM was at constant 86% CPU)
3. **CPU Governor Optimization** (2024-12-16)
- PVE: `powersave` governor + `balance_power` EPP (amd-pstate-epp driver)
- PVE2: `schedutil` governor (acpi-cpufreq driver)
- Made permanent via systemd service: `/etc/systemd/system/cpu-powersave.service`
- Savings: ~60-120W combined (CPUs now idle at 1.7-2.2GHz vs 4GHz)
4. **GPU Power States** (2024-12-16) - Verified optimal
- RTX A6000: 11W idle (P8 state)
- TITAN RTX: 2-3W idle (P8 state)
- Quadro P2000: 25W (P0 - Plex keeps it active)
5. **ksmtuned Disabled** (2024-12-16)
- KSM tuning daemon was still running after KSMD disabled
- Stopped and disabled on both servers
- Savings: ~2-5W
6. **HDD Spindown on PVE2** (2024-12-16)
- local-zfs2 pool (2x WD Red 6TB) had only 768KB used but drives spinning 24/7
- Set 30-minute spindown via `hdparm -S 241`
- Persistent via udev rule: `/etc/udev/rules.d/69-hdd-spindown.rules`
- Savings: ~10-16W when spun down
### Potential Optimizations
- [ ] PCIe ASPM power management
- [ ] NMI watchdog disable
## Memory Configuration
- Ballooning enabled on most VMs but not actively used
- No memory overcommit (98GB allocated on 128GB physical for PVE)
- KSMD was wasting CPU with no benefit (negative general_profit)
## Network
See [NETWORK.md](NETWORK.md) for full details.
### Network Ranges
| Network | Range | Purpose |
|---------|-------|---------|
| LAN | 10.10.10.0/24 | Primary network, all external access |
| Internal | 10.10.20.0/24 | Inter-VM only (storage, NFS/iSCSI) |
### PVE Bridges (10.10.10.120)
| Bridge | NIC | Speed | Purpose | Use For |
|--------|-----|-------|---------|---------|
| vmbr0 | enp1s0 | 1 Gb | Management | General VMs/CTs |
| vmbr1 | enp35s0f0 | 10 Gb | High-speed LXC | Bandwidth-heavy containers |
| vmbr2 | enp35s0f1 | 10 Gb | High-speed VM | TrueNAS, Saltbox, storage VMs |
| vmbr3 | (none) | Virtual | Internal only | NFS/iSCSI traffic, no internet |
### Quick Reference
```bash
# Add VM to standard network (1Gb)
qm set VMID --net0 virtio,bridge=vmbr0
# Add VM to high-speed network (10Gb)
qm set VMID --net0 virtio,bridge=vmbr2
# Add secondary NIC for internal storage network
qm set VMID --net1 virtio,bridge=vmbr3
```
- MTU 9000 (jumbo frames) on all bridges
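To confirm jumbo frames are actually applied, check each bridge (sketch):
```bash
# The first line of each interface shows its MTU (expect mtu 9000)
ssh pve 'for br in vmbr0 vmbr1 vmbr2 vmbr3; do ip link show "$br" | head -1; done'
```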
## Common Commands
```bash
# Check VM status
ssh pve 'qm list'
ssh pve2 'qm list'
# Check container status
ssh pve 'pct list'
# Monitor CPU/power
ssh pve 'top -bn1 | head -20'
# Check ZFS pools
ssh pve 'zpool status'
# Check GPU (if nvidia-smi installed in VM)
ssh pve 'lspci | grep -i nvidia'
```
## Remote Claude Code Sessions (Mac Mini)
### Overview
The Mac Mini (`hutson-mac-mini.local`) runs the Happy Coder daemon, enabling on-demand Claude Code sessions accessible from anywhere via the Happy Coder mobile app. Sessions are created when you need them - no persistent tmux sessions required.
### Architecture
```
Mac Mini (100.108.89.58 via Tailscale)
├── launchd (auto-starts on boot)
│ └── com.hutson.happy-daemon.plist (starts Happy daemon)
├── Happy Coder daemon (manages remote sessions)
└── Tailscale (secure remote access)
```
### How It Works
1. Happy daemon runs on Mac Mini (auto-starts on boot)
2. Open Happy Coder app on phone/tablet
3. Start a new Claude session from the app
4. Session runs in any working directory you choose
5. Session ends when you're done - no cleanup needed
### Quick Commands
```bash
# Check daemon status
happy daemon list
# Start a new session manually (from Mac Mini terminal)
cd ~/Projects/homelab && happy claude
# Check active sessions
happy daemon list
```
### Mobile Access Setup (One-time)
1. Download Happy Coder app:
- iOS: https://apps.apple.com/us/app/happy-claude-code-client/id6748571505
- Android: https://play.google.com/store/apps/details?id=com.ex3ndr.happy
2. On Mac Mini, run: `happy auth` and scan QR code with the app
3. Daemon auto-starts on boot via launchd
### Daemon Management
```bash
happy daemon start # Start daemon
happy daemon stop # Stop daemon
happy daemon status # Check status
happy daemon list # List active sessions
```
### Remote Access via SSH + Tailscale
From any device on Tailscale network:
```bash
# SSH to Mac Mini
ssh hutson@100.108.89.58
# Or via hostname
ssh hutson@mac-mini
# Start Claude in desired directory
cd ~/Projects/homelab && happy claude
```
### Files & Configuration
| File | Purpose |
|------|---------|
| `~/Library/LaunchAgents/com.hutson.happy-daemon.plist` | launchd auto-start Happy daemon |
| `~/.happy/` | Happy Coder config and logs |
### Troubleshooting
```bash
# Check if daemon is running
pgrep -f "happy.*daemon"
# Check launchd status
launchctl list | grep happy
# List active sessions
happy daemon list
# Restart daemon
happy daemon stop && happy daemon start
# If Tailscale is disconnected
/Applications/Tailscale.app/Contents/MacOS/Tailscale up
```
## Agent and Tool Guidelines
### Background Agents
- **Always spin up background agents when doing multiple independent tasks**
- Background agents allow parallel execution of tasks that don't depend on each other
- This improves efficiency and reduces total execution time
- Use background agents for tasks like running tests, builds, or searches simultaneously
### MCP Tools for Web Searches
#### ref.tools - Documentation Lookups
- **`mcp__Ref__ref_search_documentation`**: Search through documentation for specific topics
- **`mcp__Ref__ref_read_url`**: Read and parse content from documentation URLs
#### Exa MCP - General Web and Code Searches
- **`mcp__exa__web_search_exa`**: General web searches for current information
- **`mcp__exa__get_code_context_exa`**: Code-related searches and repository lookups
### MCP Tools Reference Table
| Tool Name | Provider | Purpose | Use Case |
|-----------|----------|---------|----------|
| `mcp__Ref__ref_search_documentation` | ref.tools | Search documentation | Finding specific topics in official docs |
| `mcp__Ref__ref_read_url` | ref.tools | Read documentation URLs | Parsing and extracting content from doc pages |
| `mcp__exa__web_search_exa` | Exa MCP | General web search | Current events, general information lookup |
| `mcp__exa__get_code_context_exa` | Exa MCP | Code-specific search | Finding code examples, repository searches |
## Reverse Proxy Architecture (Traefik)
### Overview
There are **TWO separate Traefik instances** handling different services:
| Instance | Location | IP | Purpose | Manages |
|----------|----------|-----|---------|---------|
| **Traefik-Primary** | CT 202 | **10.10.10.250** | General services | All non-Saltbox services |
| **Traefik-Saltbox** | VM 101 (Docker) | **10.10.10.100** | Saltbox services | Plex, *arr apps, media stack |
### ⚠️ CRITICAL RULE: Which Traefik to Use
**When adding ANY new service:**
- ✅ **Use Traefik-Primary (10.10.10.250)** - Unless service lives inside Saltbox VM
- ❌ **DO NOT touch Traefik-Saltbox** - It manages Saltbox services with their own certificates
**Why this matters:**
- Traefik-Saltbox has complex Saltbox-managed configs
- Messing with it breaks Plex, Sonarr, Radarr, and all media services
- Each Traefik has its own Let's Encrypt certificates
- Mixing them causes certificate conflicts
### Traefik-Primary (CT 202) - For New Services
**Location**: `/etc/traefik/` on Container 202
**Config**: `/etc/traefik/traefik.yaml`
**Dynamic Configs**: `/etc/traefik/conf.d/*.yaml`
**Services using Traefik-Primary (10.10.10.250):**
- excalidraw.htsn.io → 10.10.10.206:8080 (docker-host)
- findshyt.htsn.io → 10.10.10.205 (CT 205)
- gitea (git.htsn.io) → 10.10.10.220:3000
- homeassistant → 10.10.10.110
- lmdev → 10.10.10.111
- pihole → 10.10.10.200
- truenas → 10.10.10.200
- proxmox → 10.10.10.120
- copyparty → 10.10.10.201
- aitrade → trading server
- pulse.htsn.io → 10.10.10.206:7655 (Pulse monitoring)
**Access Traefik config:**
```bash
# From Mac Mini:
ssh pve 'pct exec 202 -- cat /etc/traefik/traefik.yaml'
ssh pve 'pct exec 202 -- ls /etc/traefik/conf.d/'
# Edit a service config:
ssh pve 'pct exec 202 -- vi /etc/traefik/conf.d/myservice.yaml'
```
### Traefik-Saltbox (VM 101) - DO NOT MODIFY
**Location**: `/opt/traefik/` inside Saltbox VM
**Managed by**: Saltbox Ansible playbooks
**Mounts**: Docker bind mount from `/opt/traefik` → `/etc/traefik` in container
**Services using Traefik-Saltbox (10.10.10.100):**
- Plex (plex.htsn.io)
- Sonarr, Radarr, Lidarr
- SABnzbd, NZBGet, qBittorrent
- Overseerr, Tautulli, Organizr
- Jackett, NZBHydra2
- Authelia (SSO)
- All other Saltbox-managed containers
**View Saltbox Traefik (read-only):**
```bash
ssh pve 'qm guest exec 101 -- bash -c "docker exec traefik cat /etc/traefik/traefik.yml"'
```
### Adding a New Public Service - Complete Workflow
Follow these steps to deploy a new service and make it publicly accessible at `servicename.htsn.io`.
#### Step 0. Deploy Your Service
First, deploy your service on the appropriate host:
**Option A: Docker on docker-host (10.10.10.206)**
```bash
ssh hutson@10.10.10.206
sudo mkdir -p /opt/myservice
sudo tee /opt/myservice/docker-compose.yml > /dev/null << 'EOF'
version: "3.8"
services:
  myservice:
    image: myimage:latest
    ports:
      - "8080:80"
    restart: unless-stopped
EOF
cd /opt/myservice && sudo docker-compose up -d
```
**Option B: New LXC Container on PVE**
```bash
ssh pve 'pct create CTID local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
--hostname myservice --memory 2048 --cores 2 \
--net0 name=eth0,bridge=vmbr0,ip=10.10.10.XXX/24,gw=10.10.10.1 \
--rootfs local-zfs:8 --unprivileged 1 --start 1'
```
**Option C: New VM on PVE**
```bash
ssh pve 'qm create VMID --name myservice --memory 2048 --cores 2 \
--net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci'
```
#### Step 1. Create Traefik Config File
Use this template for new services on **Traefik-Primary (CT 202)**:
```yaml
# /etc/traefik/conf.d/myservice.yaml
http:
  routers:
    # HTTPS router
    myservice-secure:
      entryPoints:
        - websecure
      rule: "Host(`myservice.htsn.io`)"
      service: myservice
      tls:
        certResolver: cloudflare  # Use 'cloudflare' for proxied domains, 'letsencrypt' for DNS-only
      priority: 50
    # HTTP → HTTPS redirect
    myservice-redirect:
      entryPoints:
        - web
      rule: "Host(`myservice.htsn.io`)"
      middlewares:
        - myservice-https-redirect
      service: myservice
      priority: 50
  services:
    myservice:
      loadBalancer:
        servers:
          - url: "http://10.10.10.XXX:PORT"
  middlewares:
    myservice-https-redirect:
      redirectScheme:
        scheme: https
        permanent: true
```
### SSL Certificates
Traefik has **two certificate resolvers** configured:
| Resolver | Use When | Challenge Type | Notes |
|----------|----------|----------------|-------|
| `letsencrypt` | Cloudflare DNS-only (gray cloud) | HTTP-01 | Requires port 80 reachable |
| `cloudflare` | Cloudflare Proxied (orange cloud) | DNS-01 | Works with Cloudflare proxy |
**⚠️ Important:** If Cloudflare proxy is enabled (orange cloud), HTTP challenge fails because Cloudflare redirects HTTP→HTTPS. Use `cloudflare` resolver instead.
**Cloudflare API credentials** are configured in `/etc/systemd/system/traefik.service`:
```bash
Environment="CF_API_EMAIL=cloudflare@htsn.io"
Environment="CF_API_KEY=849ebefd163d2ccdec25e49b3e1b3fe2cdadc"
```
**Certificate storage:**
- HTTP challenge certs: `/etc/traefik/acme.json`
- DNS challenge certs: `/etc/traefik/acme-cf.json`
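To see which hostnames each resolver has already issued certificates for, the ACME stores can be read directly (a read-only sketch; assumes the standard Traefik v2 `acme.json` layout keyed by the resolver names above):
```bash
# Certificates issued by the HTTP-01 (letsencrypt) resolver
ssh pve 'pct exec 202 -- bash -c "jq -r \".letsencrypt.Certificates[].domain.main\" /etc/traefik/acme.json"'
# Certificates issued by the DNS-01 (cloudflare) resolver
ssh pve 'pct exec 202 -- bash -c "jq -r \".cloudflare.Certificates[].domain.main\" /etc/traefik/acme-cf.json"'
```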
**Deploy the config:**
```bash
# Create file on CT 202
ssh pve 'pct exec 202 -- bash -c "cat > /etc/traefik/conf.d/myservice.yaml << '\''EOF'\''
<paste config here>
EOF"'
# Traefik auto-reloads (watches conf.d directory)
# Check logs:
ssh pve 'pct exec 202 -- tail -f /var/log/traefik/traefik.log'
```
#### Step 2. Add Cloudflare DNS Entry
**Cloudflare Credentials:**
- Email: `cloudflare@htsn.io`
- API Key: `849ebefd163d2ccdec25e49b3e1b3fe2cdadc`
**Manual method (via Cloudflare Dashboard):**
1. Go to https://dash.cloudflare.com/
2. Select `htsn.io` domain
3. DNS → Add Record
4. Type: `A`, Name: `myservice`, IPv4: `70.237.94.174`, Proxied: ☑️
**Automated method (CLI script):**
Save this as `~/bin/add-cloudflare-dns.sh`:
```bash
#!/bin/bash
# Add DNS record to Cloudflare for htsn.io
SUBDOMAIN="$1"
CF_EMAIL="cloudflare@htsn.io"
CF_API_KEY="849ebefd163d2ccdec25e49b3e1b3fe2cdadc"
ZONE_ID="c0f5a80448c608af35d39aa820a5f3af" # htsn.io zone
PUBLIC_IP="70.237.94.174" # Update if IP changes: curl -s ifconfig.me
if [ -z "$SUBDOMAIN" ]; then
echo "Usage: $0 <subdomain>"
echo "Example: $0 myservice # Creates myservice.htsn.io"
exit 1
fi
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
-H "X-Auth-Email: $CF_EMAIL" \
-H "X-Auth-Key: $CF_API_KEY" \
-H "Content-Type: application/json" \
--data "{
\"type\":\"A\",
\"name\":\"$SUBDOMAIN\",
\"content\":\"$PUBLIC_IP\",
\"ttl\":1,
\"proxied\":true
}" | jq .
```
**Usage:**
```bash
chmod +x ~/bin/add-cloudflare-dns.sh
~/bin/add-cloudflare-dns.sh excalidraw # Creates excalidraw.htsn.io
```
#### Step 3. Testing
```bash
# Check if DNS resolves
dig myservice.htsn.io
# Test HTTP redirect
curl -I http://myservice.htsn.io
# Test HTTPS
curl -I https://myservice.htsn.io
# Check Traefik dashboard (if enabled)
# Access: http://10.10.10.250:8080/dashboard/
```
#### Step 4. Update Documentation
After deploying, update these files:
1. **IP-ASSIGNMENTS.md** - Add to Services & Reverse Proxy Mapping table
2. **CLAUDE.md** - Add to "Services using Traefik-Primary" list (line ~495)
### Quick Reference - One-Liner Commands
```bash
# === DEPLOY SERVICE (example: myservice on docker-host port 8080) ===
# 1. Create Traefik config
ssh pve 'pct exec 202 -- bash -c "cat > /etc/traefik/conf.d/myservice.yaml << EOF
http:
  routers:
    myservice-secure:
      entryPoints: [websecure]
      rule: Host(\\\`myservice.htsn.io\\\`)
      service: myservice
      tls: {certResolver: letsencrypt}
  services:
    myservice:
      loadBalancer:
        servers:
          - url: http://10.10.10.206:8080
EOF"'
# 2. Add Cloudflare DNS
curl -s -X POST "https://api.cloudflare.com/client/v4/zones/c0f5a80448c608af35d39aa820a5f3af/dns_records" \
-H "X-Auth-Email: cloudflare@htsn.io" \
-H "X-Auth-Key: 849ebefd163d2ccdec25e49b3e1b3fe2cdadc" \
-H "Content-Type: application/json" \
--data '{"type":"A","name":"myservice","content":"70.237.94.174","proxied":true}'
# 3. Test (wait a few seconds for DNS propagation)
curl -I https://myservice.htsn.io
```
### Traefik Troubleshooting
```bash
# View Traefik logs (CT 202)
ssh pve 'pct exec 202 -- tail -f /var/log/traefik/traefik.log'
# Check if config is valid
ssh pve 'pct exec 202 -- cat /etc/traefik/conf.d/myservice.yaml'
# List all dynamic configs
ssh pve 'pct exec 202 -- ls -la /etc/traefik/conf.d/'
# Check certificate
ssh pve 'pct exec 202 -- cat /etc/traefik/acme.json | jq'
# Restart Traefik (if needed)
ssh pve 'pct exec 202 -- systemctl restart traefik'
```
### Certificate Management
**Let's Encrypt certificates** are automatically managed by Traefik.
**Certificate storage:**
- Traefik-Primary: `/etc/traefik/acme.json` on CT 202
- Traefik-Saltbox: `/opt/traefik/acme.json` on VM 101
**Certificate renewal:**
- Automatic via HTTP-01 challenge
- Traefik checks every 24h
- Renews 30 days before expiry
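To confirm a renewal actually reached the edge, check the live certificate dates (a sketch; uses the `myservice.htsn.io` placeholder from earlier). Note that for proxied (orange-cloud) hostnames this shows Cloudflare's edge certificate rather than Traefik's:
```bash
# Print notBefore/notAfter for the certificate currently being served
echo | openssl s_client -connect myservice.htsn.io:443 -servername myservice.htsn.io 2>/dev/null \
  | openssl x509 -noout -dates
```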
**If certificates fail:**
```bash
# Check acme.json permissions (must be 600)
ssh pve 'pct exec 202 -- ls -la /etc/traefik/acme.json'
# Check Traefik can reach Let's Encrypt
ssh pve 'pct exec 202 -- curl -I https://acme-v02.api.letsencrypt.org/directory'
# Delete bad certificate (Traefik will re-request)
ssh pve 'pct exec 202 -- rm /etc/traefik/acme.json'
ssh pve 'pct exec 202 -- touch /etc/traefik/acme.json'
ssh pve 'pct exec 202 -- chmod 600 /etc/traefik/acme.json'
ssh pve 'pct exec 202 -- systemctl restart traefik'
```
### Docker Service with Traefik Labels (Alternative)
If deploying a service via Docker on `docker-host` (VM 206), you can use Traefik labels instead of config files:
```yaml
# docker-compose.yml
services:
  myservice:
    image: myimage:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.myservice.rule=Host(`myservice.htsn.io`)"
      - "traefik.http.routers.myservice.entrypoints=websecure"
      - "traefik.http.routers.myservice.tls.certresolver=letsencrypt"
      - "traefik.http.services.myservice.loadbalancer.server.port=8080"
    networks:
      - traefik
networks:
  traefik:
    external: true
```
**Note**: This requires Traefik to have access to the Docker socket and to be on the same Docker network.
## Cloudflare API Access
**Credentials** (stored in Saltbox config):
- Email: `cloudflare@htsn.io`
- API Key: `849ebefd163d2ccdec25e49b3e1b3fe2cdadc`
- Domain: `htsn.io`
**Retrieve from Saltbox:**
```bash
ssh pve 'qm guest exec 101 -- bash -c "cat /srv/git/saltbox/accounts.yml | grep -A2 cloudflare"'
```
**Cloudflare API Documentation:**
- API Docs: https://developers.cloudflare.com/api/
- DNS Records: https://developers.cloudflare.com/api/operations/dns-records-for-a-zone-create-dns-record
**Common API operations:**
```bash
# Set credentials
CF_EMAIL="cloudflare@htsn.io"
CF_API_KEY="849ebefd163d2ccdec25e49b3e1b3fe2cdadc"
ZONE_ID="c0f5a80448c608af35d39aa820a5f3af"
# List all DNS records
curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
-H "X-Auth-Email: $CF_EMAIL" \
-H "X-Auth-Key: $CF_API_KEY" | jq
# Add A record
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
-H "X-Auth-Email: $CF_EMAIL" \
-H "X-Auth-Key: $CF_API_KEY" \
-H "Content-Type: application/json" \
--data '{"type":"A","name":"subdomain","content":"IP","proxied":true}'
# Delete record
curl -X DELETE "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
-H "X-Auth-Email: $CF_EMAIL" \
-H "X-Auth-Key: $CF_API_KEY"
```
## Related Documentation
| File | Description |
|------|-------------|
| [EMC-ENCLOSURE.md](EMC-ENCLOSURE.md) | EMC storage enclosure (SES commands, LCC troubleshooting, maintenance) |
| [HOMEASSISTANT.md](HOMEASSISTANT.md) | Home Assistant API access, automations, integrations |
| [NETWORK.md](NETWORK.md) | Network bridges, VLANs, which bridge to use for new VMs |
| [IP-ASSIGNMENTS.md](IP-ASSIGNMENTS.md) | Complete IP address assignments for all devices and services |
| [SYNCTHING.md](SYNCTHING.md) | Syncthing setup, API access, device list, troubleshooting |
| [SHELL-ALIASES.md](SHELL-ALIASES.md) | ZSH aliases for Claude Code (`chomelab`, `ctrading`, etc.) |
| [configs/](configs/) | Symlinks to shared shell configs |
---
## Backlog
Future improvements and maintenance tasks:
| Priority | Task | Notes |
|----------|------|-------|
| Medium | **Re-IP all devices** | Current IP scheme is inconsistent. Plan: VMs 10.10.10.100-199, LXCs 10.10.10.200-249, Services 10.10.10.250-254 |
| Low | Install SSH on HomeAssistant | Currently only accessible via QEMU agent |
| Low | Set up SSH key for router | Currently requires expect/password |
---
## Changelog
### 2024-12-20
**SSH Key Deployment - All Systems**
- Added SSH keys to ALL VMs and LXCs (13 total hosts now accessible via key)
- Updated `~/.ssh/config` with complete host aliases
- Fixed permissions: FindShyt LXC `.ssh` ownership, enabled PermitRootLogin on LXCs
- Hosts now accessible: pve, pve2, truenas, saltbox, lmdev1, docker-host, fs-dev, copyparty, gitea-vm, trading-vm, pihole, traefik, findshyt
**Documentation Updates**
- Rewrote SSH Access section with complete host table
- Added Password Auth section for router/Windows/HomeAssistant
- Added Backlog section with re-IP task
### 2024-12-19
**EMC Storage Enclosure - LCC B Failure**
- Diagnosed loud fan issue (speed code 5 → 4160 RPM)
- Root cause: Faulty LCC B controller causing false readings
- Resolution: Switched SAS cable to LCC A, fans now quiet (speed code 3 → 2670 RPM)
- Replacement ordered: EMC 303-108-000E ($14.95 eBay)
- Created [EMC-ENCLOSURE.md](EMC-ENCLOSURE.md) with full documentation
**SSH Key Consolidation**
- Renamed `~/.ssh/ai_trading_ed25519` → `~/.ssh/homelab`
- Updated `~/.ssh/config` on MacBook with all homelab hosts
- SSH key auth now works for: pve, pve2, docker-host, fs-dev, copyparty, lmdev1, gitea-vm, trading-vm
- No more sshpass needed for PVE servers
**QEMU Guest Agent Deployment**
- Installed on: docker-host (206), fs-dev (105), copyparty (201)
- All PVE VMs now have agent except homeassistant (110)
- Can now use `qm guest exec` for remote commands
**VM Configuration Updates**
- docker-host: Fixed SSH key in cloud-init
- fs-dev: Fixed `.ssh` directory ownership (1000 → 1001)
- copyparty: Changed from DHCP to static IP (10.10.10.201)
**Documentation Updates**
- Updated CLAUDE.md SSH section (removed sshpass examples)
- Added QEMU Agent column to VM tables
- Added storage enclosure troubleshooting to runbooks

EMC-ENCLOSURE.md (new file, 247 lines)
# EMC Storage Enclosure Documentation
## Hardware Overview
| Component | Details |
|-----------|---------|
| **Model** | EMC ESES Viper DAE (KTN-STL3) |
| **Capacity** | 15x 3.5" SAS/SATA drive bays |
| **SES Device** | `/dev/sg15` (on TrueNAS) |
| **Connection** | SAS to LSI SAS2308 HBA (mpt2sas driver) |
| **Location** | Connected to PVE (10.10.10.120) via TrueNAS VM |
## Components
### LCC Controllers (Link Control Cards)
The enclosure has **dual LCC controllers** for redundancy:
| Controller | Slot | Status | Notes |
|------------|------|--------|-------|
| **LCC A** | Left | Working | Currently in use |
| **LCC B** | Right | Faulty | Causes high fan speed, SAS discovery failure |
**Replacement Part**: EMC 303-108-000E VIPER 6G SAS LCC (~$15 on eBay)
### Power Supplies
Two redundant PSUs with integrated fans.
### Fans
Multiple cooling fans controlled by enclosure firmware. Fan speeds are **automatically managed** based on temperature - manual override is not supported on EMC ESES enclosures.
**Fan Speed Codes**:
| Code | Description | RPM (approx) |
|------|-------------|--------------|
| 1 | Lowest | ~1500 |
| 2 | Second lowest | ~2000 |
| 3 | Third lowest | ~2670 |
| 4 | Medium | ~3300 |
| 5 | Fifth | ~4160 |
| 6 | Sixth | ~4800 |
| 7 | Highest | ~5500+ |
## ZFS Pool Using This Enclosure
```
Pool: vault
Size: 164TB raidz1
Drives: 13x HDD in raidz1 + special mirror + NVMe cache/log
Mount: /mnt/vault on TrueNAS
```
## SES Commands Reference
All commands run from TrueNAS (VM 100):
```bash
# Check overall enclosure status
sg_ses -p 0x02 /dev/sg15
# Check fan speeds
sg_ses --index=coo,-1 --get=speed_code /dev/sg15
# Check temperatures
sg_ses -p 0x02 /dev/sg15 | grep -E "(Temperature|Cooling)"
# Check PSU status
sg_ses -p 0x02 /dev/sg15 | grep -A5 "Power supply"
# Check LCC controller status
sg_ses -p 0x02 /dev/sg15 | grep -A5 "Enclosure services controller"
# List all SES elements
sg_ses -p 0x07 /dev/sg15
# Identify enclosure (flash LEDs)
sg_ses --index=enc,0 --set=ident:1 /dev/sg15
```
### Running SES Commands via Proxmox
```bash
# From Mac (via SSH key auth)
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15"'
# Quick fan check
ssh pve 'qm guest exec 100 -- bash -c "sg_ses --index=coo,-1 --get=speed_code /dev/sg15"'
# Quick temp check
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15 | grep Temperature"'
```
## Troubleshooting
### Symptom: Fans Running Loud (Speed 5+)
**Possible Causes**:
1. **Faulty LCC controller** - Switch to other LCC
2. **High temperatures** - Check temp sensors
3. **PSU issue** - Check PSU status via SES
4. **Failed drive** - Check drive status LEDs
**Diagnosis Steps**:
```bash
# 1. Check current fan speed
ssh pve 'qm guest exec 100 -- bash -c "sg_ses --index=coo,-1 --get=speed_code /dev/sg15"'
# Normal: 1-3, High: 4-5, Critical: 6-7
# 2. Check temperatures
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15 | grep Temperature"'
# Normal: 25-40C, Warning: 45-50C, Critical: 55C+
# 3. Check for component failures
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15 | grep -i fail"'
# 4. If no obvious cause, try switching LCC
# Power down enclosure, move SAS cable to other LCC port
```
### Symptom: Drives Not Detected After Enclosure Power Cycle
**Possible Causes**:
1. Enclosure not fully initialized (wait for green LEDs to stop blinking)
2. Faulty LCC controller
3. SAS cable loose
4. HBA needs rescan
**Diagnosis Steps**:
```bash
# 1. Check SAS link status
cat /sys/class/sas_phy/*/negotiated_linkrate
# 2. Check for expanders (should show enclosure)
lsscsi -g | grep -i enclo
# 3. Force HBA rescan
echo "- - -" > /sys/class/scsi_host/host0/scan
# 4. If no expander, check SAS cable and try other LCC port
```
### Symptom: Pool Won't Import After Enclosure Maintenance
```bash
# 1. Wait for enclosure to fully initialize (1-2 minutes)
# 2. Rescan for devices
echo "- - -" > /sys/class/scsi_host/host0/scan
# 3. Import pool
zpool import vault
# 4. If read-only mount issues, reboot TrueNAS
ssh pve 'qm reboot 100'
```
## Maintenance Procedures
### Safe Shutdown for Enclosure Maintenance
```bash
# 1. Stop services using the pool
ssh pve 'qm guest exec 101 -- bash -c "docker stop \$(docker ps -q)"'
# 2. Shutdown TrueNAS (auto-exports ZFS pool)
ssh pve 'qm shutdown 100 --timeout 120'
# 3. Wait for TrueNAS to fully stop
ssh pve 'while qm status 100 | grep -q running; do sleep 5; done'
# 4. Power off enclosure
# (Physical switch or PDU)
# 5. Perform maintenance
# 6. Power on enclosure, wait for initialization (green LEDs solid)
# 7. Start TrueNAS
ssh pve 'qm start 100'
# 8. Verify pool imported
ssh pve 'qm guest exec 100 -- bash -c "zpool status vault"'
```
### Hot-Swap LCC Controller
LCCs can be hot-swapped while the enclosure is running (verification sketch follows the steps):
1. Order replacement LCC (EMC 303-108-000E)
2. Move SAS cable to working LCC (if not already)
3. Wait for drives to come online via new LCC
4. Remove faulty LCC
5. Install replacement LCC
6. Optionally move SAS cable back to original port
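Verification sketch after the swap, reusing commands from elsewhere in this doc:
```bash
# Fan speed should settle back to code 1-3
ssh pve 'qm guest exec 100 -- bash -c "sg_ses --index=coo,-1 --get=speed_code /dev/sg15"'
# The enclosure/expander should be visible again
ssh pve 'qm guest exec 100 -- bash -c "lsscsi -g | grep -i enclo"'
# And the pool should be ONLINE
ssh pve 'qm guest exec 100 -- bash -c "zpool status vault | head -5"'
```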
## Incident Log
### 2024-12-19: LCC B Failure
**Symptoms**:
- Fans running at speed code 5 (~4160 RPM) - very loud
- After enclosure power cycle, drives not detected
- SAS link UP (4 PHYs at 6.0 Gbit) but no expander discovery
**Root Cause**:
LCC B controller malfunction causing:
- False temperature/error readings → high fan speed
- SAS expander not responding → drives not enumerated
**Resolution**:
1. Moved SAS cable from LCC B to LCC A
2. Drives immediately appeared
3. Fan speed dropped to code 3 (2670 RPM) - quiet
4. Imported vault pool, all data intact
**Replacement Ordered**:
- Part: EMC 303-108-000E VIPER 6G SAS LCC
- Source: eBay
- Price: $14.95 + free shipping
## LED Status Reference
### Drive LEDs
| LED | Color | Status |
|-----|-------|--------|
| Solid Blue | Power | Drive has power |
| Blinking Blue | Activity | I/O in progress |
| Solid Amber | Fault | Drive failed |
| Blinking Amber | Identify | Drive being located |
### LCC LEDs
| LED | Color | Status |
|-----|-------|--------|
| Solid Green | Link | SAS connection active |
| Blinking Green | Activity | Data transfer |
| Amber | Fault | LCC issue |
### PSU LEDs
| LED | Color | Status |
|-----|-------|--------|
| Solid Green | OK | Power supply healthy |
| Off | No Power | No AC input |
| Amber | Fault | PSU failure |
## Related Documentation
- [CLAUDE.md](CLAUDE.md) - Main homelab documentation
- [IP-ASSIGNMENTS.md](IP-ASSIGNMENTS.md) - Network configuration
- TrueNAS Web UI: https://10.10.10.200

HOMEASSISTANT.md (new file, 145 lines)
# Home Assistant
## Overview
| Setting | Value |
|---------|-------|
| VM ID | 110 |
| Host | PVE (10.10.10.120) |
| IP Address | 10.10.10.210 (DHCP - should be static) |
| Port | 8123 |
| Web UI | http://10.10.10.210:8123 |
| OS | Home Assistant OS 16.3 |
| Version | 2025.11.3 (update available: 2025.12.3) |
## API Access
Home Assistant uses Long-Lived Access Tokens for API authentication.
### Getting an API Token
1. Go to http://10.10.10.210:8123
2. Click your profile (bottom left)
3. Scroll to "Long-Lived Access Tokens"
4. Click "Create Token"
5. Name it (e.g., "Claude Code")
6. Copy the token (only shown once!)
### API Configuration
```
API_URL: http://10.10.10.210:8123/api
API_TOKEN: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiIwZThjZmJjMzVlNDA0NzYwOTMzMjg3MTQ5ZjkwOGU2NyIsImlhdCI6MTc2NTk5MjQ4OCwiZXhwIjoyMDgxMzUyNDg4fQ.r743tsb3E5NNlrwEEu9glkZdiI4j_3SKIT1n5PGUytY
```
### API Examples
```bash
# Set these variables
HA_URL="http://10.10.10.210:8123"
HA_TOKEN="your-token-here"
# Check API is working
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/"
# Get all states
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" | jq
# Get specific entity state
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states/light.living_room" | jq
# Turn on a light
curl -X POST -H "Authorization: Bearer $HA_TOKEN" \
-H "Content-Type: application/json" \
-d '{"entity_id": "light.living_room"}' \
"$HA_URL/api/services/light/turn_on"
# Turn off a light
curl -X POST -H "Authorization: Bearer $HA_TOKEN" \
-H "Content-Type: application/json" \
-d '{"entity_id": "light.living_room"}' \
"$HA_URL/api/services/light/turn_off"
# Call any service
curl -X POST -H "Authorization: Bearer $HA_TOKEN" \
-H "Content-Type: application/json" \
-d '{"entity_id": "switch.my_switch"}' \
"$HA_URL/api/services/switch/toggle"
```
## Common Tasks
### List All Entities
```bash
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" | jq '.[].entity_id'
```
### List Entities by Domain
```bash
# All lights
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" | jq '[.[] | select(.entity_id | startswith("light."))]'
# All switches
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" | jq '[.[] | select(.entity_id | startswith("switch."))]'
# All sensors
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" | jq '[.[] | select(.entity_id | startswith("sensor."))]'
```
### Get Entity History
```bash
# Last 24 hours for an entity
curl -s -H "Authorization: Bearer $HA_TOKEN" \
"$HA_URL/api/history/period?filter_entity_id=sensor.temperature" | jq
```
## Device Summary
**265 total entities**
| Domain | Count | Examples |
|--------|-------|----------|
| scene | 87 | Lighting scenes |
| light | 41 | Kitchen, Living room, Bedroom, Office, Cabinet, etc. |
| switch | 36 | Automations, Sonos controls, Motion sensors |
| sensor | 28 | Various sensors |
| number | 21 | Settings/controls |
| event | 17 | Event triggers |
| binary_sensor | 13 | Motion, door sensors |
| media_player | 8 | Sonos speakers (Bedroom, Living Room, Kitchen, Console) |
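The per-domain counts above can be recomputed from the live API (a sketch; assumes `HA_URL`/`HA_TOKEN` are set as in the API examples):
```bash
# Count entities by domain (scene, light, switch, ...)
curl -s -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" \
  | jq -r '.[].entity_id | split(".")[0]' | sort | uniq -c | sort -rn
```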
### Lights by Room
- **Kitchen**: Kitchen light
- **Living Room**: Living room, Living Room Lamp, TV Bias
- **Bedroom**: Bedroom, Bedside Lamp 1 & 2, Dresser
- **Office**: Office, Office Floor Lamp, Office Lamp
- **Guest Room**: Guest Bed Left, Guest Lamp Right
- **Other**: Cabinet 1 & 2, Pantry, Bathroom, Front Porch, etc.
### Sonos Speakers
- Bedroom (with surround)
- Living Room (with surround)
- Kitchen
- Console
### Motion Sensors
- Kitchen Motion
- Office Sensor
## Integrations
- **Philips Hue** - Lights
- **Sonos** - Speakers
- **Motion Sensors** - Various locations
## Automations
TODO: Document key automations
## TODO
- [ ] Set static IP (currently DHCP at .210, should be .110)
- [ ] Add API token to this document
- [ ] Document installed integrations
- [ ] Document automations
- [ ] Set up Traefik reverse proxy (ha.htsn.io)

INFRASTRUCTURE.md (new file, 330 lines)
# Homelab Infrastructure Documentation
## Network Topology
```
┌─────────────────┐
│ Internet │
└────────┬────────┘
┌────────▼────────┐
│ Router/Firewall │
│ 10.10.10.1 │
└────────┬────────┘
┌────────────────────────┼────────────────────────┐
│ │ │
┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐
│ Main Switch │ │ Storage VLAN │ │ Tailscale │
│ vmbr0/vmbr2 │ │ vmbr3 │ │ 100.x.x.x/8 │
│ 10.10.10.0/24 │ │ (Jumbo 9000) │ │ │
└────────┬────────┘ └────────┬────────┘ └─────────────────┘
│ │
┌───────────┼───────────┐ │
│ │ │ │
┌────▼───┐ ┌────▼───┐ ┌────▼───┐ │
│ PVE │ │ PVE2 │ │ Other │ │
│ .120 │ │ .102 │ │ Devices│ │
└────┬───┘ └────┬───┘ └────────┘ │
│ │ │
└───────────┴────────────────────────┘
┌───────▼───────┐
│ TrueNAS │
│ (Storage via │
│ HBA/NVMe) │
└───────────────┘
```
## IP Address Assignments
### Management Network (10.10.10.0/24)
| IP Address | Hostname | Description |
|------------|----------|-------------|
| 10.10.10.1 | router | Gateway/Firewall |
| 10.10.10.102 | pve2 | Proxmox Server 2 |
| 10.10.10.120 | pve | Proxmox Server 1 (Primary) |
| 10.10.10.123 | mac-mini | Mac Mini (Syncthing node) |
| 10.10.10.150 | windows-pc | Windows PC (Syncthing node) |
| 10.10.10.147 | macbook | MacBook Pro (Syncthing node) |
| 10.10.10.200 | truenas | TrueNAS (Storage/Syncthing hub) |
| 10.10.10.220 | gitea-vm | Git Server |
| 10.10.10.221 | trading-vm | AI Trading Platform |
### Tailscale Network (100.x.x.x)
| IP Address | Hostname | Description |
|------------|----------|-------------|
| 100.88.161.110 | macbook | MacBook |
| 100.106.175.37 | phone | Mobile Device |
| 100.108.89.58 | mac-mini | Mac Mini |
---
## Server Hardware
### PVE (10.10.10.120) - Primary Virtualization Host
| Component | Specification |
|-----------|---------------|
| **CPU** | AMD Ryzen Threadripper PRO 3975WX (32C/64T, 280W TDP) |
| **RAM** | 128 GB DDR4 ECC |
| **Boot** | Samsung 870 QVO 4TB (mirrored) |
| **NVMe Pool 1** | 2x Sabrent Rocket Q NVMe (nvme-mirror1, 3.6TB) |
| **NVMe Pool 2** | 2x Kingston SFYRD 2TB (nvme-mirror2, 1.8TB) |
| **GPU 1** | NVIDIA Quadro P2000 (75W) - Plex transcoding |
| **GPU 2** | NVIDIA TITAN RTX (280W) - AI workloads |
| **HBA** | LSI SAS2308 - Passed to TrueNAS |
| **NVMe Controller** | Samsung PM9A1 - Passed to TrueNAS |
### PVE2 (10.10.10.102) - Secondary Virtualization Host
| Component | Specification |
|-----------|---------------|
| **CPU** | AMD Ryzen Threadripper PRO 3975WX (32C/64T, 280W TDP) |
| **RAM** | 128 GB DDR4 ECC |
| **NVMe Pool** | 2x NVMe (nvme-mirror3) |
| **HDD Pool** | 2x WD Red 6TB (local-zfs2, mirrored) |
| **GPU** | NVIDIA RTX A6000 (300W) - AI Trading |
---
## Virtual Machines
### PVE (10.10.10.120)
| VMID | Name | vCPUs | RAM | Storage | Purpose | Passthrough |
|------|------|-------|-----|---------|---------|-------------|
| 100 | truenas | 8 | 32GB | rpool | NAS/Storage | LSI SAS2308 HBA, Samsung NVMe |
| 101 | saltbox | 16 | 16GB | rpool/nvme-mirror1/2 | Media automation | TITAN RTX |
| 105 | fs-dev | 10 | 8GB | nvme-mirror1 | Development | - |
| 110 | homeassistant | 2 | 2GB | nvme-mirror2 | Home automation | - |
| 111 | lmdev1 | 8 | 32GB | nvme-mirror1 | AI/LLM development | TITAN RTX |
| 201 | copyparty | 2 | 2GB | nvme-mirror1 | File sharing | - |
| 206 | docker-host | 2 | 4GB | rpool | Docker services | - |
### PVE2 (10.10.10.102)
| VMID | Name | vCPUs | RAM | Storage | Purpose | Passthrough |
|------|------|-------|-----|---------|---------|-------------|
| 300 | gitea-vm | 2 | 4GB | nvme-mirror3 | Git server | - |
| 301 | trading-vm | 16 | 32GB | nvme-mirror3 | AI trading platform | RTX A6000 |
---
## LXC Containers
### PVE (10.10.10.120)
| VMID | Name | Purpose | Status |
|------|------|---------|--------|
| 200 | pihole | DNS/Ad blocking | Running |
| 202 | traefik | Reverse proxy | Running |
| 205 | findshyt | Custom application | Running |
| 500 | dev1 | Development | Stopped |
---
## Storage Architecture
```
PVE (10.10.10.120)
├── rpool (Samsung 870 QVO 4TB mirror)
│ ├── Proxmox system
│ ├── VM 100 (truenas) boot
│ ├── VM 101 (saltbox) boot
│ └── VM 206 (docker-host)
├── nvme-mirror1 (Sabrent Rocket Q mirror, 3.6TB)
│ ├── VM 101 (saltbox) data
│ ├── VM 105 (fs-dev)
│ ├── VM 111 (lmdev1)
│ └── VM 201 (copyparty)
└── nvme-mirror2 (Kingston SFYRD mirror, 1.8TB)
├── VM 101 (saltbox) data
└── VM 110 (homeassistant)
PVE2 (10.10.10.102)
├── nvme-mirror3 (NVMe mirror)
│ ├── VM 300 (gitea-vm)
│ └── VM 301 (trading-vm)
└── local-zfs2 (WD Red 6TB mirror)
└── Backup/archive storage
TrueNAS (VM 100 on PVE)
├── HBA Passthrough (LSI SAS2308)
│ └── [Physical drives managed by TrueNAS]
└── NVMe Passthrough (Samsung PM9A1)
└── [NVMe drives managed by TrueNAS]
```
---
## Services Map
```
┌─────────────────────────────────────────────────────────────────┐
│ EXTERNAL ACCESS │
├─────────────────────────────────────────────────────────────────┤
│ Tailscale VPN ──► All services accessible via 100.x.x.x │
│ Traefik (CT 202) ──► Reverse proxy for web services │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ CORE SERVICES │
├─────────────────────────────────────────────────────────────────┤
│ PiHole (CT 200) ──► DNS + Ad blocking │
│ TrueNAS (VM 100) ──► NAS, Syncthing, Storage │
│ Gitea (VM 300) ──► Git repository hosting │
│ Home Assistant (VM 110) ──► Home automation │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MEDIA SERVICES │
├─────────────────────────────────────────────────────────────────┤
│ Saltbox (VM 101) ──► Plex, *arr stack, media automation │
│ CopyParty (VM 201) ──► File sharing │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ DEVELOPMENT/AI │
├─────────────────────────────────────────────────────────────────┤
│ Trading VM (VM 301) ──► AI trading platform (RTX A6000) │
│ LMDev1 (VM 111) ──► LLM development (TITAN RTX) │
│ FS-Dev (VM 105) ──► General development │
│ Docker Host (VM 206) ──► Containerized services │
└─────────────────────────────────────────────────────────────────┘
```
---
## Syncthing Topology
```
┌─────────────────┐
│ TrueNAS │
│ (Hub/Server) │
│ Port 20910 │
└────────┬────────┘
┌───────────────────┼───────────────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌────▼────┐
│ MacBook │ │ Mac Mini│ │ Windows │
│ .147 │ │ .123 │ │ PC .150 │
└─────────┘ └─────────┘ └─────────┘
Synced Folders:
├── antigravity (310MB)
├── bin (23KB)
├── claude-code (257MB)
├── claude-desktop (784MB)
├── config (436KB)
├── cursor (459MB)
├── desktop (7.2GB)
├── documents (11GB)
├── dotconfig (212MB)
├── downloads (38GB)
├── movies (334MB)
├── music (606KB)
├── notes (73KB)
├── pictures (259MB)
└── projects (3.1GB)
```
---
## Power Consumption
### Estimated Power Draw
| Component | Idle | Load | Notes |
|-----------|------|------|-------|
| **PVE CPU** | 50W | 280W | TR PRO 3975WX |
| **PVE2 CPU** | 50W | 280W | TR PRO 3975WX |
| **TITAN RTX** | 20W | 280W | Passthrough to saltbox/lmdev1 |
| **RTX A6000** | 25W | 300W | Passthrough to trading-vm |
| **Quadro P2000** | 10W | 75W | Plex transcoding |
| **Storage (per server)** | 30W | 50W | NVMe + SSD mirrors |
| **Base system (each)** | 50W | 60W | Motherboard, RAM, fans |
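Actual GPU draw can be spot-checked with nvidia-smi inside the VMs that own the cards (a sketch; assumes the NVIDIA driver is installed in each guest):
```bash
# e.g. on lmdev1 (TITAN RTX) or trading-vm (RTX A6000)
nvidia-smi --query-gpu=name,power.draw,power.limit --format=csv
```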
### Total Estimates
- **Idle**: ~400-500W combined
- **Moderate load**: ~700-900W combined
- **Full load**: ~1200-1400W combined
### Power Optimizations Applied
1. KSMD disabled on both hosts (saved ~10W)
2. Syncthing rescan intervals increased (saved ~60-80W from TrueNAS CPU)
3. CPU governor optimization (saved ~60-120W; see the sketch after this list)
- PVE: `powersave` + `balance_power` EPP (amd-pstate-epp)
- PVE2: `schedutil` (acpi-cpufreq)
4. ksmtuned service disabled on both hosts (saved ~2-5W)
5. HDD spindown on PVE2 - 30 min timeout (saved ~10-16W)
- local-zfs2 pool (2x WD Red 6TB) essentially empty
**Total estimated savings: ~142-231W**
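A minimal sketch of the underlying commands (assumptions: amd-pstate-epp exposes `energy_performance_preference` on PVE, acpi-cpufreq on PVE2, and `/dev/sdX` / `/dev/sdY` are placeholders for the WD Reds; these need to run at boot, e.g. from a oneshot systemd unit, to persist):
```bash
# PVE: powersave governor + balance_power EPP (amd-pstate-epp)
for c in /sys/devices/system/cpu/cpu[0-9]*/cpufreq; do
  echo powersave     > "$c/scaling_governor"
  echo balance_power > "$c/energy_performance_preference"
done

# PVE2: schedutil governor (acpi-cpufreq, no EPP knob)
for c in /sys/devices/system/cpu/cpu[0-9]*/cpufreq; do
  echo schedutil > "$c/scaling_governor"
done

# PVE2: spin the WD Reds down after 30 minutes of idle (-S 241 = 30 min)
hdparm -S 241 /dev/sdX /dev/sdY
```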
---
## SSH Access
### Credentials
| Host | IP Address | Username | Password | Notes |
|------|------------|----------|----------|-------|
| Hutson-PC | 10.10.10.150 | claude | GrilledCh33s3# | Windows PC |
| MacBook | 10.10.10.147 | hutson | GrilledCh33s3# | MacBook Pro |
| TrueNAS | 10.10.10.200 | truenas_admin | GrilledCh33s3# | SSH key configured |
### SSH Keys
The Mac Mini has an SSH key configured at `~/.ssh/id_ed25519` for passwordless authentication to Proxmox hosts and other infrastructure.
For Proxmox servers (PVE and PVE2), SSH access is configured in `~/.ssh/config`:
```
Host pve
HostName 10.10.10.120
User root
IdentityFile ~/.ssh/ai_trading_ed25519
Host pve2
HostName 10.10.10.102
User root
IdentityFile ~/.ssh/ai_trading_ed25519
```
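With those entries in place the hosts are reachable by alias, which most examples in this repo assume:
```bash
ssh pve 'qm list'     # VMs on the primary host
ssh pve2 'qm list'    # VMs on the secondary host
```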
---
## Credentials Management
Sensitive credentials are stored in `/Users/hutson/Projects/homelab/.env` for use with infrastructure management scripts and automation.
This file contains:
- Service passwords
- API keys
- Database credentials
- Other sensitive configuration values
**Note**: The `.env` file is git-ignored and should never be committed to version control.
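A minimal sketch of consuming that file from a script (the variable name below is illustrative, not an actual key from `.env`):
```bash
#!/bin/bash
# Export everything defined in .env into the environment
set -a
source ~/Projects/homelab/.env
set +a

# Example use of an assumed variable name
curl -s -H "X-API-Key: ${SYNCTHING_API_KEY}" "http://127.0.0.1:8384/rest/system/status"
```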
---
## Configuration Backups
Configuration files are backed up in `/Users/hutson/Projects/homelab/configs/` directory.
### Current Backups
| File | Description |
|------|-------------|
| ghostty.conf | Ghostty terminal emulator configuration |
This directory serves as a centralized location for storing configuration backups from various systems and applications in the homelab environment.

139
IP-ASSIGNMENTS.md Normal file
View File

@@ -0,0 +1,139 @@
# IP Address Assignments
This document tracks all IP addresses in the homelab infrastructure.
## Network Overview
| Network | Range | Purpose |
|---------|-------|---------|
| Management VLAN | 10.10.10.0/24 | Primary network for all devices |
| Storage VLAN | 10.10.20.0/24 | NFS/iSCSI storage traffic |
| Tailscale | 100.x.x.x | VPN overlay network |
## Infrastructure Devices
| IP Address | Device | Type | Notes |
|------------|--------|------|-------|
| 10.10.10.1 | UniFi UCG-Fiber | Router | Gateway for all traffic |
| 10.10.10.120 | PVE | Proxmox Host | Primary server (Threadripper PRO 3975WX) |
| 10.10.10.102 | PVE2 | Proxmox Host | Secondary server (Threadripper PRO 3975WX) |
## Virtual Machines - PVE (10.10.10.120)
| VMID | Name | IP Address | Purpose | Status |
|------|------|------------|---------|--------|
| 100 | truenas | 10.10.10.200 | NAS, central Syncthing hub | Running |
| 101 | saltbox | 10.10.10.100 | Media automation, Plex, *arr apps | Running |
| 105 | fs-dev | 10.10.10.5 | Development environment | Running |
| 110 | homeassistant | 10.10.10.110 | Home automation | Running |
| 111 | lmdev1 | 10.10.10.111 | AI/LLM development (TITAN RTX) | Running |
| 201 | copyparty | 10.10.10.201 | File sharing | Running |
| 206 | docker-host | 10.10.10.206 | Docker services (Excalidraw, etc.) | Running |
## Containers (LXC) - PVE (10.10.10.120)
| CTID | Name | IP Address | Purpose | Status |
|------|------|------------|---------|--------|
| 200 | pihole | 10.10.10.10 | DNS/Ad blocking | Running |
| 202 | traefik | 10.10.10.250 | Reverse proxy (Traefik-Primary) | Running |
| 205 | findshyt | 10.10.10.8 | Custom app | Running |
| 500 | dev1 | DHCP | Development container | Stopped |
## Virtual Machines - PVE2 (10.10.10.102)
| VMID | Name | IP Address | Purpose | Status |
|------|------|------------|---------|--------|
| 300 | gitea-vm | 10.10.10.220 | Git server | Running |
| 301 | trading-vm | 10.10.10.221 | AI trading platform (RTX A6000) | Running |
## Workstations & Personal Devices
| IP Address | Tailscale IP | Device | User | Notes |
|------------|--------------|--------|------|-------|
| 10.10.10.147 | 100.88.161.1 | MacBook Pro | hutson | Laptop |
| 10.10.10.148 | 100.108.89.58 | Mac Mini | hutson | Persistent Claude sessions |
| 10.10.10.150 | 100.120.97.76 | Hutson-PC (Windows) | claude/micro | Windows workstation |
| 10.10.10.54 | - | Android Phone | hutson | Syncthing mobile |
## Services & Reverse Proxy Mapping
| Service | Domain | Backend IP:Port | Traefik Instance |
|---------|--------|-----------------|------------------|
| Traefik-Primary | - | 10.10.10.250 | Self (CT 202) |
| Traefik-Saltbox | - | 10.10.10.100 | Self (VM 101) |
| FindShyt | findshyt.htsn.io | 10.10.10.8:3000 | Traefik-Primary |
| Gitea | git.htsn.io | 10.10.10.220:3000 | Traefik-Primary |
| Home Assistant | ha.htsn.io | 10.10.10.110:8123 | Traefik-Primary |
| TrueNAS | nas.htsn.io | 10.10.10.200 | Traefik-Primary |
| Proxmox | pve.htsn.io | 10.10.10.120:8006 | Traefik-Primary |
| CopyParty | cp.htsn.io | 10.10.10.201:3923 | Traefik-Primary |
| LMDev | lmdev.htsn.io | 10.10.10.111 | Traefik-Primary |
| Excalidraw | excalidraw.htsn.io | 10.10.10.206:8080 | Traefik-Primary |
| Plex | plex.htsn.io | 10.10.10.100:32400 | Traefik-Saltbox |
| Sonarr | sonarr.htsn.io | 10.10.10.100:8989 | Traefik-Saltbox |
| Radarr | radarr.htsn.io | 10.10.10.100:7878 | Traefik-Saltbox |
## Reserved/Available IPs
### Currently Used (10.10.10.x)
- .1 - Router (gateway)
- .5 - fs-dev
- .8 - FindShyt
- .10 - PiHole (DNS)
- .54 - Android Phone
- .100 - Saltbox (Traefik-Saltbox)
- .102 - PVE2
- .110 - Home Assistant
- .111 - LMDev1
- .120 - PVE
- .147 - MacBook Pro
- .148 - Mac Mini
- .150 - Windows PC
- .200 - TrueNAS
- .201 - CopyParty
- .206 - Docker-host
- .220 - Gitea
- .221 - Trading VM
- .250 - Traefik-Primary
### Available Ranges
- 10.10.10.2 - 10.10.10.4 (3 IPs)
- 10.10.10.6 - 10.10.10.7 (2 IPs)
- 10.10.10.9 (1 IP)
- 10.10.10.11 - 10.10.10.53 (43 IPs)
- 10.10.10.55 - 10.10.10.99 (45 IPs)
- 10.10.10.101 (1 IP)
- 10.10.10.103 - 10.10.10.109 (7 IPs)
- 10.10.10.112 - 10.10.10.119 (8 IPs)
- 10.10.10.121 - 10.10.10.146 (26 IPs)
- 10.10.10.149 (1 IP)
- 10.10.10.151 - 10.10.10.199 (49 IPs)
- 10.10.10.202 - 10.10.10.205 (4 IPs)
- 10.10.10.207 - 10.10.10.219 (13 IPs)
- 10.10.10.222 - 10.10.10.249 (28 IPs)
- 10.10.10.251 - 10.10.10.254 (4 IPs)
## Docker Host Services (10.10.10.206)
| Service | Port | Purpose |
|---------|------|---------|
| Excalidraw | 8080 | Whiteboard/diagramming (excalidraw.htsn.io) |
| Portainer CE | 9000, 9443 | Local Docker management UI |
| Portainer Agent | 9001 | Remote management from other Portainer |
| Gotenberg | 3000 | PDF generation API |
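To confirm what is actually running and listening on the docker host (a quick sketch; assumes SSH access as `hutson`):
```bash
ssh hutson@10.10.10.206 'docker ps --format "table {{.Names}}\t{{.Ports}}"'
```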
## Syncthing API Endpoints
| Device | IP | Port | API Key |
|--------|-----|------|---------|
| Mac Mini | 127.0.0.1 | 8384 | oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5 |
| MacBook | 127.0.0.1 (via SSH) | 8384 | qYkNdVLwy9qZZZ6MqnJr7tHX7KKdxGMJ |
| Android Phone | 10.10.10.54 | 8384 | Xxz3jDT4akUJe6psfwZsbZwG2LhfZuDM |
| TrueNAS | 10.10.10.200 | 8384 | (check TrueNAS config) |
## Notes
- **MTU 9000** (jumbo frames) enabled on storage networks
- **Tailscale** provides VPN access from anywhere
- **DNS** handled by PiHole at 10.10.10.10 (quick check after this list)
- All new services should use **Traefik-Primary (10.10.10.250)** unless they're Saltbox services
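A quick way to confirm a hostname is being answered by PiHole rather than an upstream resolver (sketch):
```bash
dig @10.10.10.10 git.htsn.io +short
```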

226
NETWORK.md Normal file
View File

@@ -0,0 +1,226 @@
# Network Architecture
## Network Ranges
| Network | Range | Purpose | Gateway |
|---------|-------|---------|---------|
| LAN | 10.10.10.0/24 | Primary network, management, general access | 10.10.10.1 (UniFi Router) |
| Storage/Internal | 10.10.20.0/24 | Inter-VM traffic, NFS/iSCSI, no external access | 10.10.20.1 (vmbr3) |
| Tailscale | 100.x.x.x | VPN overlay for remote access | N/A |
## PVE (10.10.10.120) - Network Bridges
### Physical NICs
| Interface | Speed | Type | MAC Address | Connected To |
|-----------|-------|------|-------------|--------------|
| enp1s0 | 1 Gbps | Onboard NIC | e0:4f:43:e6:41:6c | Switch → UniFi eth5 |
| enp35s0f0 | 10 Gbps | Intel X550 Port 0 | b4:96:91:39:86:98 | Switch → UniFi eth5 |
| enp35s0f1 | 10 Gbps | Intel X550 Port 1 | b4:96:91:39:86:99 | Switch → UniFi eth5 |
**Note:** All three NICs connect through a switch to the UniFi Gateway's 10Gb SFP+ port (eth5). No direct firewall connection.
### Bridge Configuration
#### vmbr0 - Management Bridge (1Gb)
- **Physical NIC**: enp1s0 (1 Gbps onboard)
- **IP**: 10.10.10.120/24
- **Gateway**: 10.10.10.1
- **MTU**: 9000
- **Purpose**: General VM/CT networking, management access
- **Use for**: Most VMs and containers that need basic internet access
**VMs/CTs on vmbr0:**
| VMID | Name | IP |
|------|------|-----|
| 105 | fs-dev | 10.10.10.5 |
| 110 | homeassistant | 10.10.10.110 |
| 201 | copyparty | DHCP |
| 206 | docker-host | 10.10.10.206 |
| 200 | pihole (CT) | 10.10.10.10 |
| 205 | findshyt (CT) | 10.10.10.8 |
---
#### vmbr1 - High-Speed LXC Bridge (10Gb)
- **Physical NIC**: enp35s0f0 (10 Gbps Intel X550)
- **IP**: 10.10.10.121/24
- **Gateway**: 10.10.10.1
- **MTU**: 9000
- **Purpose**: High-bandwidth LXC containers and VMs
- **Use for**: Containers/VMs that need high throughput to network
**VMs/CTs on vmbr1:**
| VMID | Name | IP |
|------|------|-----|
| 111 | lmdev1 | 10.10.10.111 |
---
#### vmbr2 - High-Speed VM Bridge (10Gb)
- **Physical NIC**: enp35s0f1 (10 Gbps Intel X550)
- **IP**: 10.10.10.122/24
- **Gateway**: (none configured)
- **MTU**: 9000
- **Purpose**: High-bandwidth VMs, storage traffic
- **Use for**: VMs that need high throughput (TrueNAS, Saltbox)
**VMs/CTs on vmbr2:**
| VMID | Name | IP |
|------|------|-----|
| 100 | truenas | 10.10.10.200 |
| 101 | saltbox | 10.10.10.100 |
| 202 | traefik (CT) | 10.10.10.250 |
---
#### vmbr3 - Internal-Only Bridge (Virtual)
- **Physical NIC**: None (isolated virtual network)
- **IP**: 10.10.20.1/24
- **Gateway**: N/A (no external routing)
- **MTU**: 9000
- **Purpose**: Inter-VM communication without external access
- **Use for**: Storage traffic (NFS/iSCSI), internal APIs, secure VM-to-VM
**VMs with secondary interface on vmbr3:**
| VMID | Name | Internal IP | Notes |
|------|------|-------------|-------|
| 100 | truenas | (check TrueNAS config) | NFS/iSCSI server |
| 101 | saltbox | (check VM config) | Media storage access |
| 111 | lmdev1 | (check VM config) | AI model storage |
| 201 | copyparty | 10.10.20.201 | Confirmed via cloud-init |
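For reference, an isolated bridge like vmbr3 is typically defined in `/etc/network/interfaces` roughly as follows (a sketch; verify against the actual file on PVE before changing anything):
```bash
# /etc/network/interfaces (excerpt)
auto vmbr3
iface vmbr3 inet static
        address 10.10.20.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0
        mtu 9000
```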
---
## PVE2 (10.10.10.102) - Network Bridges
### Physical NICs
| Interface | Speed | Type | MAC Address | Connected To |
|-----------|-------|------|-------------|--------------|
| nic0 | Unknown | Unused | e0:4f:43:e6:1b:e3 | Not connected |
| nic1 | 10 Gbps | Primary NIC | a0:36:9f:26:b9:bc | **Direct to UCG-Fiber (10Gb negotiated)** |
**Note:** PVE2 connects directly to the UCG-Fiber. Link negotiates at 10Gb.
### Bridge Configuration
#### vmbr0 - Single Bridge (10Gb)
- **Physical NIC**: nic1 (10 Gbps)
- **IP**: 10.10.10.102/24
- **Gateway**: 10.10.10.1
- **Purpose**: All VMs on PVE2
**VMs on vmbr0:**
| VMID | Name | IP |
|------|------|-----|
| 300 | gitea-vm | 10.10.10.220 |
| 301 | trading-vm | 10.10.10.221 |
---
## Which Bridge to Use?
| Scenario | Bridge | Reason |
|----------|--------|--------|
| General VM/CT | vmbr0 | Standard networking, 1Gb is sufficient |
| High-bandwidth VM (media, AI) | vmbr1 or vmbr2 | 10Gb for large file transfers |
| Storage-heavy VM (NAS access) | vmbr2 + vmbr3 | 10Gb external + internal storage network |
| Isolated internal service | vmbr3 only | No external access, secure |
| VM needing both external + internal | vmbr0/1/2 + vmbr3 | Dual-homed configuration |
## Traffic Flow
```
Internet
┌─────────────────────────────────────────────────────────────┐
│ UCG-Fiber (10.10.10.1) │
│ │
│ eth5 (10Gb SFP+) switch0 (eth0-eth4, 10Gb) │
│ │ │ │
└────────┼───────────────────────────────┼────────────────────┘
│ │
▼ │
┌─────────────────────┐ │
│ 10Gb Switch │ │
└─────────────────────┘ │
│ │ │ │
│ │ │ │
▼ ▼ ▼ ▼
enp1s0 enp35s0f0 enp35s0f1 nic1
(1Gb) (10Gb) (10Gb) (10Gb)
│ │ │ │
▼ ▼ ▼ ▼
vmbr0 vmbr1 vmbr2 vmbr0
│ │ │ │
│ │ │ │
PVE PVE PVE PVE2
General lmdev1 TrueNAS, gitea-vm,
VMs Saltbox, trading-vm
Traefik
Internal Only (no external access):
┌─────────────────────────────────────┐
│ vmbr3 (10.10.20.0/24) - Virtual │
│ No physical NIC - inter-VM only │
│ │
│ TrueNAS ◄──► Saltbox │
│ ▲ ▲ │
│ │ │ │
│ └─── lmdev1 ┘ │
│ ▲ │
│ │ │
│ copyparty │
└─────────────────────────────────────┘
```
## Determining Physical Connections
To determine which 10Gb port goes where, check:
1. **Physical cable tracing** - Follow cables from server to switch/firewall
2. **Switch port status** - Check UniFi controller for connected ports
3. **MAC addresses** - Compare `ip link show` MACs with switch ARP table
```bash
# On PVE - get MAC addresses
ip link show enp35s0f0 | grep ether
ip link show enp35s0f1 | grep ether
# On router - check ARP
ssh root@10.10.10.1 'cat /proc/net/arp'
```
## Adding a New VM to a Specific Network
```bash
# Add VM to vmbr0 (standard)
qm set VMID --net0 virtio,bridge=vmbr0
# Add VM to vmbr2 (10Gb)
qm set VMID --net0 virtio,bridge=vmbr2
# Add second NIC for internal network
qm set VMID --net1 virtio,bridge=vmbr3
# For containers
pct set CTID --net0 name=eth0,bridge=vmbr0,ip=10.10.10.XXX/24,gw=10.10.10.1
```
## MTU Configuration
All bridges use **MTU 9000** (jumbo frames) for optimal storage performance.
If adding a new VM that will access NFS/iSCSI storage, ensure the guest OS also uses MTU 9000:
```bash
# Linux guest
ip link set eth0 mtu 9000
# Permanent (netplan)
# /etc/netplan/00-installer-config.yaml
network:
  version: 2
  ethernets:
    eth0:
      mtu: 9000
```

147
SHELL-ALIASES.md Normal file
View File

@@ -0,0 +1,147 @@
# Shell Aliases & Shortcuts
## Overview
ZSH aliases for quickly launching Claude Code in project directories with `--dangerously-skip-permissions` enabled. Aliases sync across devices via Syncthing.
## Setup
### File Locations
```
~/.config/shell/shared.zsh # Main shared config (sourced by .zshrc)
~/.config/shell/claude-aliases.zsh # Claude Code aliases
~/Projects/homelab/configs/ # Symlinks for reference
```
### Installation
Add to `~/.zshrc`:
```bash
source ~/.config/shell/shared.zsh
```
## Claude Code Aliases
### Quick Start (--continue)
Continue the most recent session in each project:
| Alias | Directory | Command |
|-------|-----------|---------|
| `chomelab` | ~/Projects/homelab | `claude --dangerously-skip-permissions --continue` |
| `ctrading` | ~/Projects/ai-trading-platform | `claude --dangerously-skip-permissions --continue` |
| `cnotes` | ~/Notes | `claude --dangerously-skip-permissions --continue --ide` |
| `chome` | ~ | `claude --dangerously-skip-permissions --continue` |
| `cfindshyt` | ~/Desktop/findshyt-working-folder | `claude --dangerously-skip-permissions --continue` |
| `ciconik` | ~/Projects/iconik-uploader | `claude --dangerously-skip-permissions --continue` |
| `cghostty` | ~/.config/ghostty | `claude --dangerously-skip-permissions --continue` |
| `cprojects` | ~/Projects | `claude --dangerously-skip-permissions --continue` |
| `cclaudeui` | ~/Projects/claude-ui | `claude --dangerously-skip-permissions --continue` |
| `clucid` | ~/Projects/lucidlink-upgrade | `claude --dangerously-skip-permissions --continue` |
| `cbeeper` | ~/Projects/beeper | `claude --dangerously-skip-permissions --continue` |
### Resume (--resume)
Show list of sessions to pick from:
| Alias | Directory |
|-------|-----------|
| `chomelab-r` | ~/Projects/homelab |
| `ctrading-r` | ~/Projects/ai-trading-platform |
| `cnotes-r` | ~/Notes |
| `chome-r` | ~ |
| `ciconik-r` | ~/Projects/iconik-uploader |
| `cbeeper-r` | ~/Projects/beeper |
### Fresh Start (no flags)
Start a new session without resuming:
| Alias | Directory |
|-------|-----------|
| `chomelab-new` | ~/Projects/homelab |
| `ctrading-new` | ~/Projects/ai-trading-platform |
| `cnotes-new` | ~/Notes |
| `chome-new` | ~ |
## Usage Examples
```bash
# Continue homelab session
chomelab
# Pick from recent homelab sessions
chomelab-r
# Start fresh homelab session
chomelab-new
# Quick AI trading work
ctrading
```
## Adding New Aliases
Edit `~/.config/shell/claude-aliases.zsh`:
```bash
# Template for new project
alias cproject='cd ~/Projects/new-project && claude --dangerously-skip-permissions --continue'
alias cproject-r='cd ~/Projects/new-project && claude --dangerously-skip-permissions --resume'
alias cproject-new='cd ~/Projects/new-project && claude --dangerously-skip-permissions'
```
Changes sync automatically to all devices via Syncthing (~/.config folder).
## Enterprise/Work Aliases (claude-gateway)
Use `ec` prefix for work Claude account via `claude-gateway`:
### Quick Start (--continue)
| Alias | Directory |
|-------|-----------|
| `echomelab` | ~/Projects/homelab |
| `ectrading` | ~/Projects/ai-trading-platform |
| `ecnotes` | ~/Notes |
| `echome` | ~ |
| `ecfindshyt` | ~/Desktop/findshyt-working-folder |
| `eciconik` | ~/Projects/iconik-uploader |
| `ecghostty` | ~/.config/ghostty |
| `ecprojects` | ~/Projects |
| `ecclaudeui` | ~/Projects/claude-ui |
| `eclucid` | ~/Projects/lucidlink-upgrade |
| `ecbeeper` | ~/Projects/beeper |
### Resume & Fresh
- Resume: `echomelab-r`, `ectrading-r`, `ecnotes-r`, `echome-r`, `eciconik-r`, `ecbeeper-r`
- Fresh: `echomelab-new`, `ectrading-new`, `ecnotes-new`, `echome-new`
## Full Alias File
Located at: `~/.config/shell/claude-aliases.zsh`
```bash
# Claude Code Project Aliases
# Main projects
alias chome='cd ~ && claude --dangerously-skip-permissions --continue'
alias ctrading='cd ~/Projects/ai-trading-platform && claude --dangerously-skip-permissions --continue'
alias ciconik='cd ~/Projects/iconik-uploader && claude --dangerously-skip-permissions --continue'
alias cnotes='cd ~/Notes && claude --dangerously-skip-permissions --continue --ide'
alias chomelab='cd ~/Projects/homelab && claude --dangerously-skip-permissions --continue'
alias cfindshyt='cd ~/Desktop/findshyt-working-folder && claude --dangerously-skip-permissions --continue'
alias cghostty='cd ~/.config/ghostty && claude --dangerously-skip-permissions --continue'
alias cprojects='cd ~/Projects && claude --dangerously-skip-permissions --continue'
alias cclaudeui='cd ~/projects/claude-ui && claude --dangerously-skip-permissions --continue'
alias clucid='cd ~/Projects/lucidlink-upgrade && claude --dangerously-skip-permissions --continue'
alias cbeeper='cd ~/Projects/beeper && claude --dangerously-skip-permissions --continue'
# Resume variants
alias chome-r='cd ~ && claude --dangerously-skip-permissions --resume'
alias ctrading-r='cd ~/Projects/ai-trading-platform && claude --dangerously-skip-permissions --resume'
alias ciconik-r='cd ~/Projects/iconik-uploader && claude --dangerously-skip-permissions --resume'
alias cnotes-r='cd ~/Notes && claude --dangerously-skip-permissions --resume --ide'
alias chomelab-r='cd ~/Projects/homelab && claude --dangerously-skip-permissions --resume'
alias cbeeper-r='cd ~/Projects/beeper && claude --dangerously-skip-permissions --resume'
# Fresh start
alias chome-new='cd ~ && claude --dangerously-skip-permissions'
alias ctrading-new='cd ~/Projects/ai-trading-platform && claude --dangerously-skip-permissions'
alias cnotes-new='cd ~/Notes && claude --dangerously-skip-permissions --ide'
alias chomelab-new='cd ~/Projects/homelab && claude --dangerously-skip-permissions'
```

166
SYNCTHING.md Normal file
View File

@@ -0,0 +1,166 @@
# Syncthing Setup
## Overview
Syncthing provides real-time file synchronization across all devices. Files sync automatically when devices connect.
## Devices
| Device | ID Prefix | Local IP | Tailscale IP | Port | Role |
|--------|-----------|----------|--------------|------|------|
| Mac Mini | L3PJR73 | 10.10.10.123 | 100.108.89.58 | 22000 | Primary workstation |
| MacBook Pro | 3TFMYEI | 10.10.10.147 | 100.88.161.1 | 22000 | Laptop |
| TrueNAS | TPO72EY | 10.10.10.200 | 100.100.94.71 | 20978 | Storage server (central hub) |
| Windows PC | YDCPUQK | 10.10.10.150 | 100.120.97.76 | 22000 | Windows workstation |
| Phone (Android) | XLMZCCH | 10.10.10.54 | 100.106.175.37 | 22000 | Android, Notes only, HTTPS API |
## Network Configuration
**IPv4 Only** - All devices configured with explicit IPv4 addresses (no dynamic/IPv6):
- Local network: `10.10.10.0/24`
- Tailscale network: `100.x.x.x`
Device address format: `tcp4://IP:PORT` (e.g., `tcp4://10.10.10.123:22000`)
## Synced Folders
| Folder | Path | Devices | Notes |
|--------|------|---------|-------|
| Downloads | ~/Downloads | Mac Mini, MacBook, TrueNAS, Windows | Large folder, 3600s rescan |
| Notes | ~/Notes | Mac Mini, MacBook, TrueNAS | Documentation |
| Projects | ~/Projects | Mac Mini, MacBook, TrueNAS | Code repositories |
| bin | ~/bin | Mac Mini, MacBook, TrueNAS | Scripts and tools |
| Documents | ~/Documents | Mac Mini, MacBook, TrueNAS | Personal documents |
| Desktop | ~/Desktop | Mac Mini, MacBook, TrueNAS | Desktop files |
| config | ~/.config | Mac Mini, MacBook | Shell configs, app settings |
| Antigravity | ~/.gemini | Mac Mini, MacBook, TrueNAS | Gemini config |
## API Access
### Mac Mini
```bash
API_KEY="oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5"
curl -s "http://127.0.0.1:8384/rest/system/status" -H "X-API-Key: $API_KEY"
```
### MacBook Pro
```bash
API_KEY="qYkNdVLwy9qZZZ6MqnJr7tHX7KKdxGMJ"
curl -s "http://127.0.0.1:8384/rest/system/status" -H "X-API-Key: $API_KEY"
```
### Windows PC
```bash
API_KEY="KPHGteJv6APPE7zFun33b3qM3Vn5KSA7"
curl -s "http://10.10.10.150:8384/rest/system/status" -H "X-API-Key: $API_KEY"
```
### Phone (Android) - Uses HTTPS
```bash
API_KEY="Xxz3jDT4akUJe6psfwZsbZwG2LhfZuDM"
# Access via local IP (use -k to skip cert verification)
curl -sk "https://10.10.10.54:8384/rest/system/status" -H "X-API-Key: $API_KEY"
# Or via Tailscale
curl -sk "https://100.106.175.37:8384/rest/system/status" -H "X-API-Key: $API_KEY"
```
## Common Commands
### Check Status
```bash
# Folder status
curl -s "http://127.0.0.1:8384/rest/db/status?folder=downloads" -H "X-API-Key: $API_KEY"
# Connection status
curl -s "http://127.0.0.1:8384/rest/system/connections" -H "X-API-Key: $API_KEY"
# Device completion for a folder
curl -s "http://127.0.0.1:8384/rest/db/completion?folder=downloads&device=DEVICE_ID" -H "X-API-Key: $API_KEY"
```
### Check Errors
```bash
curl -s "http://127.0.0.1:8384/rest/folder/errors?folder=downloads" -H "X-API-Key: $API_KEY"
```
### Rescan Folder
```bash
curl -X POST "http://127.0.0.1:8384/rest/db/scan?folder=downloads" -H "X-API-Key: $API_KEY"
```
## Configuration Files
| Device | Config Path |
|--------|-------------|
| Mac Mini | ~/Library/Application Support/Syncthing/config.xml |
| MacBook Pro | ~/Library/Application Support/Syncthing/config.xml |
| TrueNAS | /mnt/tank/syncthing/config/config.xml |
## Performance Tuning
### Speed Optimizations (2024-12-17)
#### Global Options
| Setting | Value | Effect |
|---------|-------|--------|
| `numConnections` | 4 | Parallel transfers per device |
| `compression` | never | No CPU overhead on fast LAN |
| `setLowPriority` | false | Normal CPU priority |
| `connectionPriorityQuicLan` | 10 | QUIC preferred on LAN |
| `connectionPriorityTcpLan` | 20 | TCP fallback on LAN |
| `connectionPriorityQuicWan` | 30 | QUIC preferred on WAN |
| `connectionPriorityTcpWan` | 40 | TCP fallback on WAN |
| `progressUpdateIntervalS` | -1 | Disabled progress updates (reduces overhead) |
| `maxConcurrentIncomingRequestKiB` | 1048576 | 1GB buffer for incoming requests |
**Applied to**: Mac Mini, MacBook, Windows PC (Phone uses 512MB buffer)
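These options can be inspected or adjusted through the REST config API rather than editing config.xml by hand (a sketch, run against the local instance with its `$API_KEY`):
```bash
# Current global options
curl -s -H "X-API-Key: $API_KEY" "http://127.0.0.1:8384/rest/config/options" | python3 -m json.tool

# Patch a single option (numConnections shown as an example)
curl -s -X PATCH -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
  -d '{"numConnections": 4}' "http://127.0.0.1:8384/rest/config/options"
```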
#### Folder-Level Settings
| Setting | Value | Effect |
|---------|-------|--------|
| `pullerMaxPendingKiB` | 131072-262144 | 128-256MB pending data buffer per folder |
**Applied to**: downloads, projects, documents, desktop, notes folders
### Rescan Intervals (set to 3600s for large folders)
Large folders like Downloads use 1-hour rescan intervals to reduce CPU usage:
- File system watcher handles real-time changes
- Hourly rescan catches anything missed
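The interval for any folder can be checked or changed the same way (sketch; `downloads` used as the folder ID):
```bash
curl -s -H "X-API-Key: $API_KEY" "http://127.0.0.1:8384/rest/config/folders/downloads" \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['rescanIntervalS'])"
curl -s -X PATCH -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
  -d '{"rescanIntervalS": 3600}' "http://127.0.0.1:8384/rest/config/folders/downloads"
```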
### Power Optimization
From CLAUDE.md - Syncthing rescan optimization saved ~60-80W on TrueNAS VM.
## Troubleshooting
### Device Not Syncing
1. Check connection status:
```bash
curl -s "http://127.0.0.1:8384/rest/system/connections" -H "X-API-Key: $API_KEY" | python3 -c "import sys,json; d=json.load(sys.stdin)['connections']; [print(f'{k[:7]}: {v[\"connected\"]}') for k,v in d.items()]"
```
2. Check folder completion:
```bash
curl -s "http://127.0.0.1:8384/rest/db/status?folder=FOLDER" -H "X-API-Key: $API_KEY"
```
3. Check for errors:
```bash
curl -s "http://127.0.0.1:8384/rest/folder/errors?folder=FOLDER" -H "X-API-Key: $API_KEY"
```
### Many Pending Deletes
If a device shows thousands of "needDeletes", it means files were deleted elsewhere and need to propagate. This is normal after reorganization - let it complete.
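To watch that number drain (a sketch; fill in the folder and device ID):
```bash
curl -s -H "X-API-Key: $API_KEY" \
  "http://127.0.0.1:8384/rest/db/completion?folder=FOLDER&device=DEVICE_ID" \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['needDeletes'])"
```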
### Web UI
Access Syncthing web interface at http://127.0.0.1:8384
## SSH Access to Devices
### MacBook Pro (via Tailscale)
```bash
sshpass -p 'GrilledCh33s3#' ssh -o StrictHostKeyChecking=no hutson@100.88.161.1
```
### Check Syncthing remotely
```bash
sshpass -p 'GrilledCh33s3#' ssh hutson@100.88.161.1 'curl -s "http://127.0.0.1:8384/rest/db/status?folder=downloads" -H "X-API-Key: qYkNdVLwy9qZZZ6MqnJr7tHX7KKdxGMJ"'
```

1
configs/claude-aliases.zsh Symbolic link
View File

@@ -0,0 +1 @@
/Users/hutson/.config/shell/claude-aliases.zsh

5
configs/ghostty.conf Normal file
View File

@@ -0,0 +1,5 @@
theme = Gruvbox Dark
font-feature = -liga
font-size = 16
font-family = "JetBrains Mono"
split-divider-color = #83a598

16
mcp-central/.env.example Normal file
View File

@@ -0,0 +1,16 @@
# MCP Central Server Environment Variables
# Copy to .env and fill in your values
# Airtable
AIRTABLE_API_KEY=patIrM3XYParyuHQL.xxxxx
# Exa
EXA_API_KEY=your_exa_api_key
# TickTick (if using)
TICKTICK_CLIENT_ID=your_client_id
TICKTICK_CLIENT_SECRET=your_client_secret
# Slack (if using)
SLACK_BOT_TOKEN=xoxb-xxxxx
SLACK_USER_TOKEN=xoxp-xxxxx

129
mcp-central/README.md Normal file
View File

@@ -0,0 +1,129 @@
# Centralized MCP Servers for Homelab
## Current State of MCP Remote Access
**The Problem**: Most MCP servers use `stdio` transport (local process communication).
Claude Code clients expect to spawn local processes.
**The Solution**: Use `mcp-remote` to bridge local clients to remote servers.
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ docker-host (10.10.10.206) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ airtable-mcp│ │ exa-mcp │ │ ticktick-mcp│ ... │
│ │ :3001/sse │ │ :3002/sse │ │ :3003/sse │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────┘
▲ ▲ ▲
│ │ │
┌──────┴───────────────┴───────────────┴──────┐
│ Tailscale / LAN │
└──────┬───────────────┬───────────────┬──────┘
│ │ │
┌─────────▼─────┐ ┌───────▼───────┐ ┌─────▼─────────┐
│ MacBook │ │ Mac Mini │ │ Windows PC │
│ Claude Code │ │ Claude Code │ │ Claude Code │
│ mcp-remote │ │ mcp-remote │ │ mcp-remote │
└───────────────┘ └───────────────┘ └───────────────┘
```
## Setup
### Step 1: Deploy MCP Servers on docker-host
```bash
ssh hutson@10.10.10.206
cd /opt/mcp-central
docker-compose up -d
```
### Step 2: Configure Claude Code Clients
Each device needs `mcp-remote` installed and configured.
**Install mcp-remote:**
```bash
npm install -g mcp-remote
```
**Update ~/.claude/settings.json:**
```json
{
  "mcpServers": {
    "airtable": {
      "command": "npx",
      "args": ["mcp-remote", "http://10.10.10.206:3001/sse"]
    },
    "exa": {
      "command": "npx",
      "args": ["mcp-remote", "http://10.10.10.206:3002/sse"]
    },
    "ticktick": {
      "command": "npx",
      "args": ["mcp-remote", "http://10.10.10.206:3003/sse"]
    }
  }
}
```
**For remote access via Tailscale, use Tailscale IP:**
```json
{
  "mcpServers": {
    "airtable": {
      "command": "npx",
      "args": ["mcp-remote", "http://100.x.x.x:3001/sse"]
    }
  }
}
```
## Which Servers Can Be Centralized?
| Server | Centralizable | Notes |
|--------|--------------|-------|
| Airtable | Yes | Just needs API key |
| Exa | Yes | Just needs API key |
| TickTick | Yes | OAuth token stored server-side |
| Slack | Yes | Bot token stored server-side |
| Ref | Yes | API key only |
| Beeper | No | Needs local Beeper Desktop |
| Google Sheets | Partial | OAuth flow needs user interaction |
| Monarch Money | Partial | Credentials stored server-side |
## Alternative: Shared Config File
If full centralization is too complex, you can at least share the config:
1. Store `settings.json` in a synced folder (e.g., Syncthing `configs/`)
2. Symlink from each device:
```bash
ln -s ~/Sync/configs/claude-settings.json ~/.claude/settings.json
```
This doesn't centralize the servers, but ensures all devices have the same config.
## Traefik Integration (Optional)
Add to Traefik for HTTPS access:
```yaml
# /etc/traefik/conf.d/mcp.yaml
http:
  routers:
    mcp-airtable:
      rule: "Host(`mcp-airtable.htsn.io`)"
      service: mcp-airtable
      tls:
        certResolver: cloudflare
  services:
    mcp-airtable:
      loadBalancer:
        servers:
          - url: "http://10.10.10.206:3001"
```
Then use: `https://mcp-airtable.htsn.io/sse` in your config.

View File

@@ -0,0 +1,58 @@
# Centralized MCP Server Stack
# Deploy on docker-host (10.10.10.206)
# All Claude Code clients connect via HTTP/SSE
version: "3.8"

services:
  # MCP Gateway - Routes all MCP requests
  mcp-gateway:
    image: node:20-slim
    container_name: mcp-gateway
    working_dir: /app
    volumes:
      - ./gateway:/app
    ports:
      - "3100:3100"
    command: node server.js
    restart: unless-stopped
    environment:
      - PORT=3100
    networks:
      - mcp-network

  # Airtable MCP Server
  airtable-mcp:
    image: node:20-slim
    container_name: airtable-mcp
    working_dir: /app
    command: sh -c "npm install airtable-mcp-server && npx airtable-mcp-server"
    environment:
      - AIRTABLE_API_KEY=${AIRTABLE_API_KEY}
      - MCP_TRANSPORT=sse
      - MCP_PORT=3001
    ports:
      - "3001:3001"
    restart: unless-stopped
    networks:
      - mcp-network

  # Exa MCP Server
  exa-mcp:
    image: node:20-slim
    container_name: exa-mcp
    working_dir: /app
    command: sh -c "npm install @anthropic/mcp-server-exa && npx @anthropic/mcp-server-exa"
    environment:
      - EXA_API_KEY=${EXA_API_KEY}
      - MCP_TRANSPORT=sse
      - MCP_PORT=3002
    ports:
      - "3002:3002"
    restart: unless-stopped
    networks:
      - mcp-network

networks:
  mcp-network:
    driver: bridge

View File

@@ -0,0 +1,159 @@
#!/bin/bash
#
# Fix Immich RAF files that were mislabeled as JPG
# This script:
# 1. Finds all JPG files that are actually Fujifilm RAF (RAW) files
# 2. Renames them from .jpg to .raf on the filesystem
# 3. Updates Immich's database to match
# 4. Triggers thumbnail regeneration
#
# Run from Mac Mini or any machine with SSH access to PVE
#
set -e
# Config
SSH_PASS="GrilledCh33s3#"
PVE_IP="10.10.10.120"
SSH_OPTS="-o StrictHostKeyChecking=no"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
echo "=========================================="
echo " Immich RAF File Fixer"
echo "=========================================="
echo ""
# Test connectivity
echo "Testing connection to Saltbox..."
if ! sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE_IP 'qm status 101' &>/dev/null; then
echo -e "${RED}Error: Cannot connect to PVE or Saltbox VM not running${NC}"
exit 1
fi
echo -e "${GREEN}Connected${NC}"
echo ""
# Step 1: Find mislabeled files
echo "Step 1: Finding JPG files that are actually RAF..."
echo ""
MISLABELED_COUNT=$(sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE_IP 'qm guest exec 101 -- bash -c "echo \"SELECT COUNT(*) FROM asset a JOIN asset_exif e ON a.id = e.\\\"assetId\\\" WHERE a.\\\"originalFileName\\\" ILIKE '"'"'%.jpg'"'"' AND e.\\\"fileSizeInByte\\\" > 35000000 AND e.make = '"'"'FUJIFILM'"'"';\" | docker exec -i immich-postgres psql -U hutson -d immich -t"' 2>/dev/null | grep -o '[0-9]*' | head -1)
echo -e "Found ${YELLOW}${MISLABELED_COUNT}${NC} mislabeled files"
echo ""
if [ "$MISLABELED_COUNT" -eq 0 ]; then
echo -e "${GREEN}No mislabeled files found. Nothing to fix!${NC}"
exit 0
fi
# Confirm before proceeding
read -p "Proceed with fixing these files? (y/N) " -n 1 -r
echo ""
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "Aborted."
exit 0
fi
echo ""
echo "Step 2: Creating fix script on Saltbox..."
# Create the fix script on Saltbox
sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE_IP 'qm guest exec 101 -- bash -c "cat > /tmp/fix-raf-files.sh << '"'"'SCRIPT'"'"'
#!/bin/bash
set -e
echo "Getting list of mislabeled files..."
# Get list of files to fix
docker exec -i immich-postgres psql -U hutson -d immich -t -A -F\",\" -c "
SELECT a.id, a.\"originalPath\", a.\"originalFileName\"
FROM asset a
JOIN asset_exif e ON a.id = e.\"assetId\"
WHERE a.\"originalFileName\" ILIKE '"'"'"'"'"'"'"'"'%.jpg'"'"'"'"'"'"'"'"'
AND e.\"fileSizeInByte\" > 35000000
AND e.make = '"'"'"'"'"'"'"'"'FUJIFILM'"'"'"'"'"'"'"'"'
" > /tmp/files_to_fix.csv
TOTAL=$(wc -l < /tmp/files_to_fix.csv)
echo "Processing $TOTAL files..."
COUNT=0
ERRORS=0
while IFS="," read -r asset_id old_path old_filename; do
COUNT=$((COUNT + 1))
# Skip empty lines
[ -z "$asset_id" ] && continue
# Calculate new paths
new_filename=$(echo "$old_filename" | sed "s/\.[jJ][pP][gG]$/.RAF/")
new_path=$(echo "$old_path" | sed "s/\.[jJ][pP][gG]$/.raf/")
echo "[$COUNT/$TOTAL] $old_filename -> $new_filename"
# Rename file on filesystem (inside immich container)
if docker exec immich test -f "$old_path"; then
docker exec immich mv "$old_path" "$new_path" 2>/dev/null
if [ $? -ne 0 ]; then
echo " ERROR: Failed to rename file"
ERRORS=$((ERRORS + 1))
continue
fi
else
echo " WARNING: File not found at $old_path"
ERRORS=$((ERRORS + 1))
continue
fi
# Update database
docker exec -i immich-postgres psql -U hutson -d immich -c "
UPDATE asset
SET \"originalPath\" = '"'"'"'"'"'"'"'"'$new_path'"'"'"'"'"'"'"'"',
\"originalFileName\" = '"'"'"'"'"'"'"'"'$new_filename'"'"'"'"'"'"'"'"'
WHERE id = '"'"'"'"'"'"'"'"'$asset_id'"'"'"'"'"'"'"'"'::uuid;
" > /dev/null 2>&1
if [ $? -ne 0 ]; then
echo " ERROR: Failed to update database"
# Try to rename back
docker exec immich mv "$new_path" "$old_path" 2>/dev/null
ERRORS=$((ERRORS + 1))
continue
fi
done < /tmp/files_to_fix.csv
echo ""
echo "=========================================="
echo "Completed: $((COUNT - ERRORS)) fixed, $ERRORS errors"
echo "=========================================="
# Cleanup
rm -f /tmp/files_to_fix.csv
SCRIPT
chmod +x /tmp/fix-raf-files.sh"'
echo ""
echo "Step 3: Running fix script (this may take a while)..."
echo ""
# Run the fix script
sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE_IP 'qm guest exec 101 -- bash -c "/tmp/fix-raf-files.sh"' 2>&1 | grep -o '"out-data"[^}]*' | sed 's/"out-data" *: *"//' | sed 's/\\n/\n/g' | sed 's/\\t/\t/g' | sed 's/"$//'
echo ""
echo "Step 4: Restarting Immich to pick up changes..."
sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE_IP 'qm guest exec 101 -- bash -c "docker restart immich"' > /dev/null 2>&1
echo -e "${GREEN}Done!${NC}"
echo ""
echo "Next steps:"
echo "1. Go to Immich Admin -> Jobs -> Thumbnail Generation -> All -> Start"
echo "2. This will regenerate thumbnails for all assets"
echo ""

318
scripts/health-check.sh Executable file
View File

@@ -0,0 +1,318 @@
#!/bin/bash
#
# Homelab Health Check & Recovery Script
# Run this to check status and bring services online
#
# Usage: ./health-check.sh [--fix]
# Without --fix: Read-only health check
# With --fix: Attempt to start stopped services and fix issues
#
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Config
SSH_PASS="GrilledCh33s3#"
PVE_IP="10.10.10.120"
PVE2_IP="10.10.10.102"
SSH_OPTS="-o StrictHostKeyChecking=no -o ConnectTimeout=5"
FIX_MODE=false
if [[ "$1" == "--fix" ]]; then
FIX_MODE=true
echo -e "${YELLOW}Running in FIX mode - will attempt to start stopped services${NC}"
echo ""
fi
# Helper functions
ssh_pve() {
sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE_IP "$@" 2>/dev/null
}
ssh_pve2() {
sshpass -p "$SSH_PASS" ssh $SSH_OPTS root@$PVE2_IP "$@" 2>/dev/null
}
print_status() {
if [[ "$2" == "ok" ]]; then
echo -e " ${GREEN}${NC} $1"
elif [[ "$2" == "warn" ]]; then
echo -e " ${YELLOW}!${NC} $1"
else
echo -e " ${RED}${NC} $1"
fi
}
# Check if sshpass is installed
if ! command -v sshpass &> /dev/null; then
echo -e "${RED}Error: sshpass is not installed${NC}"
echo "Install with: brew install hudochenkov/sshpass/sshpass"
exit 1
fi
echo "================================"
echo " HOMELAB HEALTH CHECK"
echo " $(date '+%Y-%m-%d %H:%M:%S')"
echo "================================"
echo ""
# ============================================
# PVE (Primary Server)
# ============================================
echo "--- PVE (10.10.10.120) ---"
# Check connectivity
if ssh_pve "echo ok" > /dev/null 2>&1; then
print_status "PVE Reachable" "ok"
else
print_status "PVE Unreachable" "fail"
echo ""
echo "--- PVE2 (10.10.10.102) ---"
if ssh_pve2 "echo ok" > /dev/null 2>&1; then
print_status "PVE2 Reachable" "ok"
else
print_status "PVE2 Unreachable" "fail"
fi
exit 1
fi
# Check cluster quorum
QUORUM=$(ssh_pve "pvecm status 2>&1 | grep 'Quorate:' | awk '{print \$2}'" || echo "Unknown")
if [[ "$QUORUM" == "Yes" ]]; then
print_status "Cluster Quorum: $QUORUM" "ok"
else
print_status "Cluster Quorum: $QUORUM" "fail"
fi
# Check CPU temp
TEMP=$(ssh_pve 'for f in /sys/class/hwmon/hwmon*/temp*_input; do label=$(cat ${f%_input}_label 2>/dev/null); if [ "$label" = "Tctl" ]; then echo $(($(cat $f)/1000)); fi; done')
if [[ -n "$TEMP" ]]; then
if [[ "$TEMP" -lt 85 ]]; then
print_status "CPU Temp: ${TEMP}°C" "ok"
elif [[ "$TEMP" -lt 90 ]]; then
print_status "CPU Temp: ${TEMP}°C (warm)" "warn"
else
print_status "CPU Temp: ${TEMP}°C (HOT!)" "fail"
fi
fi
# Check ZFS pools
ZFS_STATUS=$(ssh_pve "zpool status -x" || echo "Unknown")
if [[ "$ZFS_STATUS" == "all pools are healthy" ]]; then
print_status "ZFS Pools: Healthy" "ok"
else
print_status "ZFS Pools: $ZFS_STATUS" "fail"
fi
# Check VMs
echo ""
echo " VMs:"
CRITICAL_VMS="100 101 110 206" # TrueNAS, Saltbox, HomeAssistant, Docker-host
STOPPED_VMS=""
TRUENAS_ZFS_SUSPENDED=false
while IFS= read -r line; do
VMID=$(echo "$line" | awk '{print $1}')
NAME=$(echo "$line" | awk '{print $2}')
STATUS=$(echo "$line" | awk '{print $3}')
if [[ "$STATUS" == "running" ]]; then
print_status "$VMID $NAME: $STATUS" "ok"
else
print_status "$VMID $NAME: $STATUS" "fail"
if [[ " $CRITICAL_VMS " =~ " $VMID " ]]; then
STOPPED_VMS="$STOPPED_VMS $VMID"
fi
fi
done < <(ssh_pve "qm list" | tail -n +2)
# Check TrueNAS ZFS (VM 100) if running
if ssh_pve "qm status 100" 2>/dev/null | grep -q running; then
echo ""
echo " TrueNAS ZFS:"
TRUENAS_ZFS=$(ssh_pve 'qm guest exec 100 -- bash -c "zpool list -H -o name,health vault 2>/dev/null"' 2>/dev/null | grep -o '"out-data"[^}]*' | sed 's/"out-data" : "//' | tr -d '\\n"' || echo "Unknown")
if [[ "$TRUENAS_ZFS" == *"ONLINE"* ]]; then
print_status "vault pool: ONLINE" "ok"
elif [[ "$TRUENAS_ZFS" == *"SUSPENDED"* ]]; then
print_status "vault pool: SUSPENDED (needs zpool clear)" "fail"
TRUENAS_ZFS_SUSPENDED=true
elif [[ "$TRUENAS_ZFS" == *"DEGRADED"* ]]; then
print_status "vault pool: DEGRADED" "warn"
else
print_status "vault pool: $TRUENAS_ZFS" "fail"
fi
fi
# Check Containers
echo ""
echo " Containers:"
CRITICAL_CTS="200 202" # PiHole, Traefik
STOPPED_CTS=""
while IFS= read -r line; do
CTID=$(echo "$line" | awk '{print $1}')
STATUS=$(echo "$line" | awk '{print $2}')
NAME=$(echo "$line" | awk '{print $4}')
if [[ "$STATUS" == "running" ]]; then
print_status "$CTID $NAME: $STATUS" "ok"
else
print_status "$CTID $NAME: $STATUS" "fail"
if [[ " $CRITICAL_CTS " =~ " $CTID " ]]; then
STOPPED_CTS="$STOPPED_CTS $CTID"
fi
fi
done < <(ssh_pve "pct list" | tail -n +2)
# ============================================
# PVE2 (Secondary Server)
# ============================================
echo ""
echo "--- PVE2 (10.10.10.102) ---"
if ssh_pve2 "echo ok" > /dev/null 2>&1; then
print_status "PVE2 Reachable" "ok"
# Check CPU temp
TEMP2=$(ssh_pve2 'for f in /sys/class/hwmon/hwmon*/temp*_input; do label=$(cat ${f%_input}_label 2>/dev/null); if [ "$label" = "Tctl" ]; then echo $(($(cat $f)/1000)); fi; done')
if [[ -n "$TEMP2" ]]; then
if [[ "$TEMP2" -lt 85 ]]; then
print_status "CPU Temp: ${TEMP2}°C" "ok"
elif [[ "$TEMP2" -lt 90 ]]; then
print_status "CPU Temp: ${TEMP2}°C (warm)" "warn"
else
print_status "CPU Temp: ${TEMP2}°C (HOT!)" "fail"
fi
fi
# Check VMs
echo ""
echo " VMs:"
while IFS= read -r line; do
VMID=$(echo "$line" | awk '{print $1}')
NAME=$(echo "$line" | awk '{print $2}')
STATUS=$(echo "$line" | awk '{print $3}')
if [[ "$STATUS" == "running" ]]; then
print_status "$VMID $NAME: $STATUS" "ok"
else
print_status "$VMID $NAME: $STATUS" "fail"
fi
done < <(ssh_pve2 "qm list" | tail -n +2)
else
print_status "PVE2 Unreachable" "fail"
fi
# ============================================
# FIX MODE - Start stopped services
# ============================================
if $FIX_MODE && [[ -n "$STOPPED_VMS" || -n "$STOPPED_CTS" || "$TRUENAS_ZFS_SUSPENDED" == "true" ]]; then
echo ""
echo "================================"
echo " RECOVERY MODE"
echo "================================"
# Fix TrueNAS ZFS SUSPENDED state first (critical for mounts)
if [[ "$TRUENAS_ZFS_SUSPENDED" == "true" ]]; then
echo ""
echo "Clearing TrueNAS ZFS pool errors..."
ZFS_CLEAR_RESULT=$(ssh_pve 'qm guest exec 100 -- bash -c "zpool clear vault 2>&1 && zpool list -H -o health vault"' 2>/dev/null | grep -o '"out-data"[^}]*' | sed 's/"out-data" : "//' | tr -d '\\n"' || echo "FAILED")
if [[ "$ZFS_CLEAR_RESULT" == *"ONLINE"* ]]; then
print_status "vault pool recovered: ONLINE" "ok"
else
print_status "vault pool recovery failed: $ZFS_CLEAR_RESULT" "fail"
fi
sleep 5 # Give ZFS time to stabilize
fi
# Start TrueNAS first (it provides storage)
if [[ " $STOPPED_VMS " =~ " 100 " ]]; then
echo ""
echo "Starting TrueNAS (VM 100) first..."
ssh_pve "qm start 100" && print_status "TrueNAS started" "ok" || print_status "Failed to start TrueNAS" "fail"
echo "Waiting 60s for TrueNAS to boot..."
sleep 60
fi
# Start other VMs
for VMID in $STOPPED_VMS; do
if [[ "$VMID" != "100" ]]; then
NAME=$(ssh_pve "qm config $VMID | grep '^name:' | awk '{print \$2}'")
echo "Starting VM $VMID ($NAME)..."
ssh_pve "qm start $VMID" && print_status "$NAME started" "ok" || print_status "Failed to start $NAME" "fail"
sleep 5
fi
done
# Start containers
for CTID in $STOPPED_CTS; do
NAME=$(ssh_pve "pct config $CTID | grep '^hostname:' | awk '{print \$2}'")
echo "Starting CT $CTID ($NAME)..."
ssh_pve "pct start $CTID" && print_status "$NAME started" "ok" || print_status "Failed to start $NAME" "fail"
sleep 3
done
# Mount TrueNAS shares on Saltbox if Saltbox is running
if ssh_pve "qm status 101" 2>/dev/null | grep -q running; then
echo ""
echo "Checking TrueNAS mounts on Saltbox..."
sleep 10 # Give services time to start
MOUNT_STATUS=$(ssh_pve 'qm guest exec 101 -- bash -c "mount | grep -c Media"' 2>/dev/null | grep -o '"out-data"[^}]*' | grep -o '[0-9]' || echo "0")
if [[ "$MOUNT_STATUS" == "0" ]]; then
echo "Mounting TrueNAS shares..."
ssh_pve 'qm guest exec 101 -- bash -c "mount /mnt/local/Media; mount /mnt/local/downloads"' 2>/dev/null
print_status "TrueNAS mounts attempted" "ok"
# Restart Immich
echo "Restarting Immich..."
ssh_pve 'qm guest exec 101 -- bash -c "docker restart immich"' 2>/dev/null
print_status "Immich restarted" "ok"
else
print_status "TrueNAS mounts already present" "ok"
fi
fi
fi
# ============================================
# Summary
# ============================================
echo ""
echo "================================"
echo " SUMMARY"
echo "================================"
ISSUES=0
if [[ -n "$STOPPED_VMS" ]] && ! $FIX_MODE; then
echo -e "${YELLOW}Stopped critical VMs:${NC}$STOPPED_VMS"
ISSUES=$((ISSUES + 1))
fi
if [[ -n "$STOPPED_CTS" ]] && ! $FIX_MODE; then
echo -e "${YELLOW}Stopped critical containers:${NC}$STOPPED_CTS"
ISSUES=$((ISSUES + 1))
fi
if [[ "$TRUENAS_ZFS_SUSPENDED" == "true" ]] && ! $FIX_MODE; then
echo -e "${RED}TrueNAS ZFS pool SUSPENDED!${NC} SMB mounts will fail."
ISSUES=$((ISSUES + 1))
fi
if [[ "$ISSUES" -eq 0 ]]; then
echo -e "${GREEN}All critical services healthy!${NC}"
else
echo ""
echo -e "Run ${YELLOW}./health-check.sh --fix${NC} to attempt recovery"
fi
echo ""
echo "Done: $(date '+%Y-%m-%d %H:%M:%S')"