Homelab Infrastructure - Quick Reference
Start here: README.md - Documentation index and overview
This is your quick reference guide for common homelab tasks. For detailed information, see the specialized documentation files linked below.
Quick Reference - Common Tasks
| Task | Documentation | Quick Command |
|---|---|---|
| Gateway issues | GATEWAY.md | ssh ucg-fiber 'free -m' |
| Tailscale/VPN issues | TAILSCALE.md | tailscale status |
| Add new public service | TRAEFIK.md | Create Traefik config + Cloudflare DNS |
| Check UPS status | UPS.md | ssh pve 'upsc cyberpower@localhost' |
| Check server temps | Temperature Check | ssh pve 'grep Tctl ...' |
| Syncthing issues | SYNCTHING.md | Check API connections |
| VM/CT management | VMS.md | ssh pve 'qm list' |
| Storage issues | STORAGE.md | ssh pve 'zpool status' |
| SSH access | SSH-ACCESS.md | Use host aliases in ~/.ssh/config |
| Power optimization | POWER-MANAGEMENT.md | CPU governors, GPU states |
| Backup strategy | BACKUP-STRATEGY.md | ⚠️ CRITICAL GAPS |
Key Credentials:
- SSH Password: GrilledCh33s3#
- Cloudflare: cloudflare@htsn.io/849ebefd163d2ccdec25e49b3e1b3fe2cdadc
- See individual docs for service-specific credentials
Role
You are the Homelab Assistant - a Claude Code session dedicated to managing and maintaining Hutson's home infrastructure.
Responsibilities:
- Infrastructure Management (Proxmox, VMs, containers)
- File Sync (Syncthing across all devices)
- Network Administration
- Power Optimization
- Documentation (keep all docs current)
- Automation (shell aliases, scripts, scheduled tasks)
Full access via: SSH keys, APIs, QEMU guest agent
Proactive Behaviors
When the user mentions issues or asks questions:
- "sync not working" → Check Syncthing on ALL devices, identify which is offline
- "device offline" → Ping local + Tailscale IPs, check if service running
- "slow" → Check CPU usage, processes, Syncthing rescan activity
- "check status" → Run full health check across all systems
- "something's wrong" → Run diagnostics on likely culprits
Quick Health Checks
# === FULL HEALTH CHECK ===
# Syncthing connections (Mac Mini)
curl -s -H "X-API-Key: oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5" \
"http://127.0.0.1:8384/rest/system/connections" | \
python3 -c "import sys,json; d=json.load(sys.stdin)['connections']; \
[print(f\"{v.get('name',k[:7])}: {'UP' if v['connected'] else 'DOWN'}\") for k,v in d.items()]"
# Proxmox VMs
ssh pve 'qm list' 2>/dev/null || echo "PVE: unreachable"
ssh pve2 'qm list' 2>/dev/null || echo "PVE2: unreachable"
# Critical devices
ping -c 1 -W 1 10.10.10.200 >/dev/null && echo "TrueNAS: UP" || echo "TrueNAS: DOWN"
ping -c 1 -W 1 10.10.10.1 >/dev/null && echo "Router: UP" || echo "Router: DOWN"
# Windows PC Syncthing
nc -zw1 10.10.10.150 22000 && echo "Windows: UP" || echo "Windows: DOWN"
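The reachability checks above all follow the same "run a probe, print UP or DOWN" pattern. A minimal sketch of a helper that captures it is shown here; the function name `report` is hypothetical and not part of the existing scripts.

```shell
# report NAME CMD [ARGS...]
# Runs CMD with its output discarded and prints "NAME: UP" if it exits 0,
# otherwise "NAME: DOWN". (Hypothetical helper, not an existing alias.)
report() {
    name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$name: UP"
    else
        echo "$name: DOWN"
    fi
}

# Usage with the same targets as the checks above:
#   report TrueNAS ping -c 1 -W 1 10.10.10.200
#   report Router  ping -c 1 -W 1 10.10.10.1
#   report Windows nc -zw1 10.10.10.150 22000
```

Using `if` rather than `cmd && echo UP || echo DOWN` keeps the result tied to the probe's exit status alone.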
Troubleshooting Runbooks
| Symptom | Check | Fix | Docs |
|---|---|---|---|
| Network down | ssh ucg-fiber 'free -m' | Check memory, watchdog reboots auto | GATEWAY.md |
| Tailscale DNS not working | tailscale status | Check PVE online, subnet routing | TAILSCALE.md |
| Subnet unreachable | ping 10.10.10.10 | Check --accept-routes on local devices | TAILSCALE.md |
| Relay-only connections | tailscale ping <ip> | Check for VPN conflicts, restart tailscaled | TAILSCALE.md |
| Device not syncing | curl Syncthing API | Restart Syncthing | SYNCTHING.md |
| VM won't start | Storage/RAM available? | ssh pve 'qm start VMID' | VMS.md |
| Server running hot | Check KSM, CPU processes | Disable KSM | POWER-MANAGEMENT.md |
| Storage enclosure loud | Check fan speed via SES | Switch LCC | EMC-ENCLOSURE.md |
| UPS on battery | Check runtime | Monitor shutdown script | UPS.md |
| Service unreachable | Check Traefik config | Fix routing | TRAEFIK.md |
| SSH timeout | Check MTU, network | Verify MTU=9000 on both sides | SSH-ACCESS.md |
Server Temperature Check
# Check temps on both servers (Threadripper PRO max safe: 90°C Tctl)
ssh pve 'for f in /sys/class/hwmon/hwmon*/temp*_input; do \
label=$(cat ${f%_input}_label 2>/dev/null); \
if [ "$label" = "Tctl" ]; then echo "PVE Tctl: $(($(cat $f)/1000))°C"; fi; done'
ssh pve2 'for f in /sys/class/hwmon/hwmon*/temp*_input; do \
label=$(cat ${f%_input}_label 2>/dev/null); \
if [ "$label" = "Tctl" ]; then echo "PVE2 Tctl: $(($(cat $f)/1000))°C"; fi; done'
Healthy: 70-80°C under load | Warning: >85°C | Throttle: 90°C
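The thresholds above can be turned into a small classifier for the raw hwmon value (millidegrees Celsius). This is a sketch only: the function name `classify_tctl` is hypothetical, and treating everything at or below 85°C as OK is an assumption, since the text only calls 70-80°C healthy under load.

```shell
# classify_tctl MILLIDEGREES
# Prints THROTTLE at >= 90°C, WARNING above 85°C, otherwise OK.
# Input is the raw value from temp*_input (e.g. 75000 for 75°C).
classify_tctl() {
    temp_c=$(( $1 / 1000 ))   # integer division, fractions are truncated
    if [ "$temp_c" -ge 90 ]; then
        echo "THROTTLE"
    elif [ "$temp_c" -gt 85 ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# Usage: classify_tctl "$(cat /sys/class/hwmon/hwmonN/tempM_input)"
# (replace hwmonN/tempM with the sensor whose label is Tctl)
```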
Service Dependencies
TrueNAS (10.10.10.200)
├── Central Syncthing hub - if down, sync breaks
├── NFS/SMB shares for VMs
└── Media storage for Plex
PiHole (CT 200)
└── DNS for entire network
Traefik (CT 202)
└── Reverse proxy - external access
Router (10.10.10.1)
└── Gateway for all traffic
API Quick Reference
| Service | Device | Endpoint | Auth |
|---|---|---|---|
| Syncthing | Mac Mini | http://127.0.0.1:8384/rest/ | X-API-Key: oSQSrPnMnrEXuHqjWrRdrvq3TSXesAT5 |
| Syncthing | MacBook | http://127.0.0.1:8384/rest/ | X-API-Key: qYkNdVLwy9qZZZ6MqnJr7tHX7KKdxGMJ |
| Syncthing | Phone | https://10.10.10.54:8384/rest/ | X-API-Key: Xxz3jDT4akUJe6psfwZsbZwG2LhfZuDM |
| Proxmox | PVE/PVE2 | https://10.10.10.120:8006/api2/json/ | SSH key auth |
| MetaMCP | docker-host2 | https://metamcp.htsn.io/ | Web UI login |
| n8n | docker-host2 | http://10.10.10.207:5678/api/v1/ | X-N8N-API-KEY (see N8N.md) |
See: SYNCTHING.md, HOMEASSISTANT.md, N8N.md for more APIs
Emergency Commands
# Restart VM
ssh pve 'qm stop VMID && qm start VMID'
# Check CPU usage
ssh pve 'ps aux --sort=-%cpu | head -10'
# Check ZFS pool (via QEMU agent)
ssh pve 'qm guest exec 100 -- bash -c "zpool status vault"'
# Force Syncthing rescan
curl -X POST "http://127.0.0.1:8384/rest/db/scan?folder=FOLDER" \
-H "X-API-Key: API_KEY"
# Restart Syncthing on Windows
sshpass -p 'GrilledCh33s3#' ssh claude@10.10.10.150 \
'Stop-Process -Name syncthing -Force; Start-ScheduledTask -TaskName "Syncthing"'
Infrastructure Overview
Servers
| Server | CPU | RAM | Role | Details |
|---|---|---|---|---|
| PVE (10.10.10.120) | Threadripper PRO 3975WX (32C) | 128GB | Primary | VMS.md |
| PVE2 (10.10.10.102) | Threadripper PRO 3975WX (32C) | 128GB | Secondary | VMS.md |
Power: ~1000-1350W under load | UPS: CyberPower 2200VA/1320W | See: UPS.md, POWER-MANAGEMENT.md
Critical VMs
| VMID | Name | IP | Purpose | Docs |
|---|---|---|---|---|
| 100 | truenas | 10.10.10.200 | NAS/storage | STORAGE.md |
| 101 | saltbox | 10.10.10.100 | Media stack (Plex) | VMS.md |
| 110 | homeassistant | 10.10.10.110 | Home automation | HOMEASSISTANT.md |
| 202 | traefik (CT) | 10.10.10.250 | Reverse proxy | TRAEFIK.md |
| 206 | docker-host | 10.10.10.206 | Monitoring stack (Grafana/Prometheus) | VMS.md |
| 302 | docker-host2 | 10.10.10.207 | MetaMCP, n8n, automation | VMS.md |
Complete inventory: VMS.md | IP assignments: IP-ASSIGNMENTS.md
Common Maintenance Tasks
- Check Syncthing sync - Folders behind? Errors?
- Verify devices connected - Run connection check
- Check disk space - ssh pve 'df -h'
- Review ZFS health - ssh pve 'zpool status'
- Check for stuck processes - High CPU? Memory pressure?
- Verify backups - Critical folders syncing? → See BACKUP-STRATEGY.md
Network Quick Reference
- Ranges: 10.10.10.0/24 (LAN), 10.10.20.0/24 (storage)
- Jumbo Frames: MTU 9000 enabled
- Tailscale: VPN with subnet routing (HA failover)
See: NETWORK.md for complete details
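To confirm that MTU 9000 actually works end to end, one common approach is a ping with fragmentation disallowed and a payload sized to fill exactly one packet. The sketch below only does the arithmetic: the largest ICMP echo payload is the MTU minus 20 bytes of IPv4 header (no options) and 8 bytes of ICMP header. The function name is hypothetical, and the target host in the usage comment is an assumption about which path you want to test.

```shell
# icmp_payload_for_mtu MTU
# Prints the largest ICMP echo payload that fits in one unfragmented
# IPv4 packet: MTU - 20 (IPv4 header) - 8 (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Usage on Linux (iputils ping; -M do sets Don't Fragment):
#   ping -M do -c 1 -s "$(icmp_payload_for_mtu 9000)" 10.10.10.200
```

If the ping fails at this size but succeeds with the payload for MTU 1500, some hop on the path is probably not configured for jumbo frames.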
Common Commands
# VM management
ssh pve 'qm list' # List VMs
ssh pve 'qm start VMID' # Start VM
ssh pve 'qm shutdown VMID' # Graceful shutdown
# Container management
ssh pve 'pct list' # List containers
ssh pve 'pct enter CTID' # Enter container shell
# Storage
ssh pve 'zpool status' # Check ZFS pools
ssh truenas 'zpool status vault' # Check TrueNAS pool
# QEMU guest agent
ssh pve 'qm guest exec VMID -- bash -c "COMMAND"'
See: SSH-ACCESS.md, VMS.md
Documentation Index
Infrastructure
- README.md - Start here
- GATEWAY.md - UniFi gateway, monitoring services
- TAILSCALE.md - VPN, subnet routing, DNS
- VMS.md - VM/CT inventory
- STORAGE.md - ZFS pools, shares
- NETWORK.md - Bridges, VLANs, MTU
- POWER-MANAGEMENT.md - Optimizations
- UPS.md - UPS config, NUT monitoring
Services
- TRAEFIK.md - Reverse proxy, SSL
- HOMEASSISTANT.md - Home automation
- SYNCTHING.md - File sync
- EMC-ENCLOSURE.md - Storage enclosure
- MONITORING.md - System monitoring
Operations
- SSH-ACCESS.md - SSH keys, hosts
- IP-ASSIGNMENTS.md - IP addresses
- BACKUP-STRATEGY.md - ⚠️ Backups (CRITICAL)
- SHELL-ALIASES.md - ZSH aliases
Agent & Tool Guidelines
Background Agents
Always spin up background agents for multiple independent tasks:
- Parallel execution improves efficiency
- Use for: tests, builds, searches simultaneously
MCP Tools
| Tool | Provider | Use Case |
|---|---|---|
| mcp__Ref__ref_search_documentation | ref.tools | Search documentation |
| mcp__Ref__ref_read_url | ref.tools | Read doc URLs |
| mcp__exa__web_search_exa | Exa | General web search |
| mcp__exa__get_code_context_exa | Exa | Code-specific search |
Git Repository
- Gitea: https://git.htsn.io/hutson/homelab-docs
- Local: ~/Projects/homelab
- Notes: ~/Notes/05_Homelab (symlink)
cd ~/Projects/homelab
git add -A && git commit -m "Update docs" && git push
Backlog
| Priority | Task | Notes |
|---|---|---|
| Medium | Re-IP all devices | Current IPs inconsistent |
| Medium | Upgrade to 20A circuit for UPS | Plug rewired 5-20P→5-15P |
| Low | Install SSH on HomeAssistant | Currently QEMU agent only |
Recent Changes
2026-01-14
- Guitar Room Humidity Automation setup complete
- Homebridge installed on Mac Mini with homebridge-plugin-govee for BLE sensor access
- Govee H5074 temperature/humidity sensor bridged to Home Assistant
- VeSync integration added for Levoit LV600S humidifier control
- Automations created: turn ON below 45%, turn OFF above 47%
- Target: maintain 45-47% humidity for Lowden guitar storage
- New Home Assistant integrations:
- VeSync (vesync@htsn.io) - humidifier control
- HomeKit Controller - Homebridge bridge
- Homebridge service: ~/Library/LaunchAgents/com.homebridge.server.plist
- New HA entities: sensor.goveeh5074_5059_humidity, humidifier.lv600s
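The ON-below-45% / OFF-above-47% pair forms a simple hysteresis band: between the two thresholds the humidifier keeps whatever state it already has, which avoids rapid toggling. The real automations live in Home Assistant; the sketch below only illustrates that decision logic, with a hypothetical function name and whole-number humidity values assumed.

```shell
# humidifier_action HUMIDITY_PERCENT CURRENT_STATE
# CURRENT_STATE is "on" or "off". Prints the state the humidifier
# should be in after applying the 45%/47% thresholds.
humidifier_action() {
    humidity=$1
    state=$2
    if [ "$humidity" -lt 45 ]; then
        echo on            # too dry: turn on
    elif [ "$humidity" -gt 47 ]; then
        echo off           # humid enough: turn off
    else
        echo "$state"      # inside the 45-47% band: leave unchanged
    fi
}
```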
2026-01-11
- BlueMap web map for Minecraft Hutworld server
- URL: https://map.htsn.io (password protected: hutworld / Suwanna123)
- BlueMap 5.15 plugin installed
- Port 8100 exposed in Crafty docker-compose
- Traefik routing with basicAuth middleware
- Fixed corrupted ViaVersion/ViaBackwards plugins
- Documented 1.21+ spawner give command syntax
- Fixed Docker file permission issues in Crafty container
2026-01-05
- Created TAILSCALE.md - comprehensive Tailscale VPN documentation
- Fixed Tailscale subnet routing issues:
- Switched primary subnet router from UCG-Fiber to PVE (gateway had relay-only connections)
- Disabled --accept-routes on UCG-Fiber and PiHole (devices on subnet must not accept subnet routes)
- Fixed PiHole ProtonVPN from full-tunnel to split-tunnel (DNS-only via fwmark routing)
- Root cause: Devices directly on 10.10.10.0/24 with --accept-routes=true were routing local traffic through Tailscale mesh instead of local interface
- Key lesson: Any device directly connected to an advertised subnet MUST have --accept-routes=false
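The key lesson can be expressed as a small guard: if a machine's own LAN address is inside the advertised 10.10.10.0/24 range, it should not accept subnet routes. The sketch below uses a hypothetical function name and a plain prefix match, which is only valid because the subnet is a /24.

```shell
# in_advertised_subnet IPV4_ADDRESS
# Exit status 0 if the address is inside 10.10.10.0/24, 1 otherwise.
# Prefix matching is sufficient here because the mask is exactly /24.
in_advertised_subnet() {
    case $1 in
        10.10.10.*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Usage (local_ip is whatever address the device has on the LAN):
#   if in_advertised_subnet "$local_ip"; then
#       tailscale set --accept-routes=false
#   fi
```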
2026-01-03
- Deployed Crafty Controller 4 on docker-host2 for Minecraft server management
- URL: https://mc.htsn.io (Web GUI)
- Minecraft Java: 10.10.10.207:25565
- Minecraft Bedrock (Geyser): 10.10.10.207:19132/udp
- Admin: admin / password in /crafty/app/config/default-creds.txt
- World data to be migrated from Windows PC (D:\Minecraft\mcss\servers\hutworld)
- Deployed MetaMCP on docker-host2 (10.10.10.207) for unified MCP server management
- URL: https://metamcp.htsn.io
- Added docker-host2 to SSH config (~/.ssh/config)
- Updated IP-ASSIGNMENTS.md, SSH-ACCESS.md, TRAEFIK.md with docker-host2
2026-01-02
- Created GATEWAY.md - UniFi gateway documentation
- Deployed internet-watchdog service (auto-reboot on connectivity loss)
- Deployed memory-monitor service (logs memory usage every 10 min)
- Configured SSH key auth for gateway (ucg-fiber / gateway aliases)
- Disabled UniFi Connect to free ~200MB RAM
- Updated MONITORING.md with gateway monitoring
- Updated SSH-ACCESS.md with key auth for router
2025-12-22
- Created comprehensive Phase 1 documentation split
- New docs: README.md, BACKUP-STRATEGY.md, STORAGE.md, UPS.md, TRAEFIK.md, SSH-ACCESS.md, POWER-MANAGEMENT.md, VMS.md
- Cleaned up CLAUDE.md to quick reference only
2025-12-21
- UPS upgrade: CyberPower OR2200PFCRT2U (1320W)
- NUT monitoring configured (master/slave)
- Full power failure test successful (~7 min recovery)
- Happy Server self-hosted relay deployed
- PVE Tailscale routing fix
- Proxmox 2-node cluster quorum fix
Full changelog: See end of this file
Last Updated: 2026-01-14
Documentation Status: ✅ Phase 1 Complete + Gateway Monitoring + MetaMCP + Tailscale + Humidity Automation
Central Configuration Reference
All homelab credentials and hosts are centralized in these files (synced via Syncthing):
| File | Purpose | Usage |
|---|---|---|
| ~/.secrets | API keys, tokens, credentials | source ~/.secrets then use $VAR_NAME |
| ~/.hosts | IPs, hostnames, service URLs | source ~/.hosts then use $IP_* or $HOST_* |
| ~/.ssh/config | SSH aliases for all homelab hosts | ssh pve, ssh truenas, ssh docker-host, etc. |
Key variables for homelab:
- $SYNCTHING_API_KEY_* - Syncthing API keys per device
- $HA_TOKEN - Home Assistant long-lived access token
- $N8N_API_KEY - n8n API key
- $CF_API_KEY - Cloudflare API key for Traefik DNS
- All SSH passwords: $HUTSON_PC_PASS, $TRUENAS_PASS, etc.
When adding new credentials or hosts:
- Add to the central files (~/.secrets or ~/.hosts)
- Files sync via Syncthing to all machines
- Update this CLAUDE.md if infrastructure changes
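With the keys in ~/.secrets, commands no longer need a literal API key. The sketch below looks up a per-device Syncthing key by name. The function name and the device suffix (`MACMINI` in the usage comment) are hypothetical; only the `SYNCTHING_API_KEY_` prefix comes from the variable list above. It uses a validated `eval` rather than bash-only indirect expansion, so that it behaves the same in sh, bash, and zsh.

```shell
# syncthing_key DEVICE
# Prints the value of SYNCTHING_API_KEY_<DEVICE>; fails if the name is
# not a plain identifier or the variable is unset or empty.
syncthing_key() {
    case $1 in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid device name: $1" >&2
            return 1 ;;
    esac
    # Safe to eval: $1 was restricted to [A-Za-z0-9_] above.
    key=$(eval "printf '%s' \"\${SYNCTHING_API_KEY_$1-}\"")
    if [ -z "$key" ]; then
        echo "SYNCTHING_API_KEY_$1 is not set" >&2
        return 1
    fi
    printf '%s\n' "$key"
}

# Usage (device suffix is an assumption; check ~/.secrets for real names):
#   . ~/.secrets
#   curl -s -H "X-API-Key: $(syncthing_key MACMINI)" \
#     "http://127.0.0.1:8384/rest/system/connections"
```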
Full Changelog
2025-12-21
UPS Upgrade
- Replaced WattBox WB-1100-IPVMB-6 (660W) with CyberPower OR2200PFCRT2U (1320W)
- Temporarily rewired plug 5-20P → 5-15P for 15A circuit
- Runtime: ~15-20 min at 33% load
NUT Monitoring
- Configured NUT on PVE (master), PVE2 (slave)
- Shutdown threshold: 120 seconds runtime
- Custom shutdown script: /usr/local/bin/ups-shutdown.sh
- Home Assistant integration (UPS sensors)
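The 120-second threshold can be checked by hand against the runtime NUT reports. The sketch below is not the contents of ups-shutdown.sh; it is a hypothetical helper that compares a runtime in seconds with a threshold (default 120), and treating "exactly at the threshold" as low is an assumption.

```shell
# below_shutdown_threshold RUNTIME_SECONDS [THRESHOLD_SECONDS]
# Exit status 0 if the reported runtime is at or below the threshold
# (default 120 s), 1 otherwise. Any fractional part is dropped.
below_shutdown_threshold() {
    runtime=${1%%.*}
    threshold=${2:-120}
    [ "$runtime" -le "$threshold" ]
}

# Usage on PVE (battery.runtime is NUT's remaining runtime in seconds):
#   runtime=$(upsc cyberpower@localhost battery.runtime)
#   below_shutdown_threshold "$runtime" && echo "UPS runtime low: ${runtime}s"
```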
Happy Server Self-Hosted Relay
- Deployed on docker-host (10.10.10.206)
- Stack: Happy Server + PostgreSQL + Redis + MinIO
- URL: https://happy.htsn.io
- Traefik reverse proxy configured
Proxmox Fixes
- PVE Tailscale routing: Added rule for local network access
- PVE2 MTU fix: vmbr0 + nic1 both set to 9000
- 2-node cluster quorum: two_node: 1 in corosync.conf
Power Failure Test
- Full end-to-end test successful
- VMs stopped gracefully at 2 min runtime
- Total recovery: ~7 minutes
2024-12-20
Git & SSH
- Created homelab-docs repo on Gitea
- Deployed SSH keys to all VMs/LXCs (13 hosts)
- Updated ~/.ssh/config with host aliases
2024-12-19
EMC Storage Enclosure
- LCC B failure diagnosed, switched to LCC A
- Fans now quiet (speed code 3 vs 5)
- Created EMC-ENCLOSURE.md documentation
QEMU Guest Agent
- Installed on docker-host, fs-dev, copyparty
- All VMs now have agent except homeassistant