VMs and Containers

Complete inventory of all virtual machines and LXC containers across both Proxmox servers.

Overview

| Server | VMs | LXCs | Total |
|---------------------|----|------|-------|
| PVE (10.10.10.120)  | 7  | 3    | 10    |
| PVE2 (10.10.10.102) | 3  | 0    | 3     |
| Total               | 10 | 3    | 13    |

PVE (10.10.10.120) - Primary Server

Virtual Machines

| VMID | Name | IP | vCPUs | RAM | Storage | Purpose | GPU/Passthrough | QEMU Agent |
|------|------|----|-------|-----|---------|---------|-----------------|------------|
| 100 | truenas | 10.10.10.200 | 8 | 32GB | nvme-mirror1 | NAS, central file storage | LSI SAS2308 HBA, Samsung NVMe | Yes |
| 101 | saltbox | 10.10.10.100 | 16 | 16GB | nvme-mirror1 | Media automation (Plex, *arr) | TITAN RTX | Yes |
| 105 | fs-dev | 10.10.10.5 | 10 | 8GB | rpool | Development environment | - | Yes |
| 110 | homeassistant | 10.10.10.110 | 2 | 2GB | rpool | Home automation platform | - | No |
| 111 | lmdev1 | 10.10.10.111 | 8 | 32GB | nvme-mirror1 | AI/LLM development | TITAN RTX | Yes |
| 201 | copyparty | 10.10.10.201 | 2 | 2GB | rpool | File sharing service | - | Yes |
| 206 | docker-host | 10.10.10.206 | 2 | 4GB | rpool | Docker services (Excalidraw, Happy, Pulse) | - | Yes |

LXC Containers

| CTID | Name | IP | RAM | Storage | Purpose |
|------|------|----|-----|---------|---------|
| 200 | pihole | 10.10.10.10 | - | rpool | DNS, ad blocking |
| 202 | traefik | 10.10.10.250 | - | rpool | Reverse proxy (primary) |
| 205 | findshyt | 10.10.10.8 | - | rpool | Custom app |

PVE2 (10.10.10.102) - Secondary Server

Virtual Machines

| VMID | Name | IP | vCPUs | RAM | Storage | Purpose | GPU/Passthrough | QEMU Agent |
|------|------|----|-------|-----|---------|---------|-----------------|------------|
| 300 | gitea-vm | 10.10.10.220 | 2 | 4GB | nvme-mirror3 | Git server (Gitea) | - | Yes |
| 301 | trading-vm | 10.10.10.221 | 16 | 32GB | nvme-mirror3 | AI trading platform | RTX A6000 | Yes |
| 302 | docker-host2 | 10.10.10.207 | 4 | 8GB | nvme-mirror3 | Docker host (n8n, automation) | - | Yes |

LXC Containers

None on PVE2.


VM Details

100 - TrueNAS (Storage Server)

Purpose: Central NAS for all file storage, NFS/SMB shares, and media libraries

Specs:

  • OS: TrueNAS SCALE
  • vCPUs: 8
  • RAM: 32 GB
  • Storage: nvme-mirror1 (OS), EMC storage enclosure (data pool via HBA passthrough)
  • Network:
    • Primary: 10 Gb (vmbr2)
    • Secondary: Internal storage network (vmbr3 @ 10.10.20.x)

Hardware Passthrough:

  • LSI SAS2308 HBA (for EMC enclosure drives)
  • Samsung NVMe (for ZFS caching)

ZFS Pools:

  • vault: Main storage pool on EMC drives
  • Boot pool on passed-through NVMe

See: STORAGE.md, EMC-ENCLOSURE.md
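
To mount a TrueNAS NFS export from another VM, something like the following works (a sketch - the export path /mnt/vault/media is an assumption, check the actual share names in the TrueNAS UI):

# On the client VM (export path is an assumption - verify in TrueNAS)
sudo apt install -y nfs-common
showmount -e 10.10.10.200                          # list exports offered by TrueNAS
sudo mkdir -p /mnt/media
sudo mount -t nfs 10.10.10.200:/mnt/vault/media /mnt/media

# Persist via /etc/fstab if the mount should survive reboots:
# 10.10.10.200:/mnt/vault/media  /mnt/media  nfs  defaults,_netdev  0  0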


101 - Saltbox (Media Automation)

Purpose: Media server stack - Plex, Sonarr, Radarr, SABnzbd, Overseerr, etc.

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 16
  • RAM: 16 GB
  • Storage: nvme-mirror1
  • Network: 10 Gb (vmbr2)

GPU Passthrough:

  • NVIDIA TITAN RTX (for Plex hardware transcoding)

Services:

  • Plex Media Server (plex.htsn.io)
  • Sonarr, Radarr, Lidarr (TV/movie/music automation)
  • SABnzbd, NZBGet (downloaders)
  • Overseerr (request management)
  • Tautulli (Plex stats)
  • Organizr (dashboard)
  • Authelia (SSO authentication)
  • Traefik (reverse proxy - separate from CT 202)

Managed By: Saltbox Ansible playbooks

See: SALTBOX.md (coming soon)
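
To check what is actually running on Saltbox, a quick look over SSH (a sketch - assumes direct SSH access to the VM; Saltbox deploys its services as Docker containers):

# List Saltbox containers and confirm the GPU is visible for transcoding
ssh 10.10.10.100 'docker ps --format "table {{.Names}}\t{{.Status}}"'
ssh 10.10.10.100 'nvidia-smi'    # should show the TITAN RTX if passthrough is active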


105 - fs-dev (Development Environment)

Purpose: General development work, testing, prototyping

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 10
  • RAM: 8 GB
  • Storage: rpool
  • Network: 1 Gb (vmbr0)

110 - Home Assistant (Home Automation)

Purpose: Smart home automation platform

Specs:

  • OS: Home Assistant OS
  • vCPUs: 2
  • RAM: 2 GB
  • Storage: rpool
  • Network: 1 Gb (vmbr0)

Access:

Special Notes:

  • No QEMU agent (Home Assistant OS doesn't support it)
  • No SSH server by default (access via web terminal)
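
With no QEMU agent, agent-based queries fail and status checks have to go through Proxmox itself (a sketch):

# Works: basic VM state from the hypervisor
ssh pve 'qm status 110'

# Expected to fail, since Home Assistant OS has no guest agent
ssh pve 'qm agent 110 ping'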

111 - lmdev1 (AI/LLM Development)

Purpose: AI model development, fine-tuning, inference

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 8
  • RAM: 32 GB
  • Storage: nvme-mirror1
  • Network: 1 Gb (vmbr0)

GPU Passthrough:

  • NVIDIA TITAN RTX (shared with Saltbox, but can be dedicated if needed)
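
A passed-through GPU can only be attached to one running VM at a time. A sketch for checking which VM currently holds the TITAN RTX and handing it over (assumes the card is mapped as hostpci0 in both configs):

# See which VMs have the GPU in their config
ssh pve 'qm config 101 | grep hostpci'
ssh pve 'qm config 111 | grep hostpci'

# To give the GPU to lmdev1: shut down Saltbox first, then start lmdev1
ssh pve 'qm shutdown 101 && qm start 111'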

Installed:

  • CUDA toolkit
  • Python 3.11+
  • PyTorch, TensorFlow
  • Hugging Face transformers

201 - Copyparty (File Sharing)

Purpose: Simple HTTP file sharing server

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 2
  • RAM: 2 GB
  • Storage: rpool
  • Network: 1 Gb (vmbr0)

Access: https://copyparty.htsn.io


206 - docker-host (Docker Services)

Purpose: General-purpose Docker host for miscellaneous services

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 2
  • RAM: 4 GB
  • Storage: rpool
  • Network: 1 Gb (vmbr0)
  • CPU: host passthrough (for x86-64-v3 support)

Services Running:

  • Excalidraw (excalidraw.htsn.io) - Whiteboard
  • Happy Coder relay server (happy.htsn.io) - Self-hosted relay for Happy Coder mobile app
  • Pulse (pulse.htsn.io) - Monitoring dashboard

Docker Compose Files: /opt/*/docker-compose.yml
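
A quick way to see what is deployed there (a sketch - assumes SSH access to the VM and Docker Compose v2; /opt/pulse is an example of the path pattern above, not a confirmed directory name):

# List running stacks and containers
ssh 10.10.10.206 'docker compose ls'
ssh 10.10.10.206 'docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"'

# Restart a single stack in place (directory name assumed)
ssh 10.10.10.206 'docker compose -f /opt/pulse/docker-compose.yml up -d'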


300 - gitea-vm (Git Server)

Purpose: Self-hosted Git server

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 2
  • RAM: 4 GB
  • Storage: nvme-mirror3 (PVE2)
  • Network: 1 Gb (vmbr0)

Access: https://git.htsn.io

Repositories:

  • homelab-docs (this documentation)
  • Personal projects
  • Private repos
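
Cloning over HTTPS or SSH works as usual (a sketch - <owner> is a placeholder, check the actual repository path in the Gitea UI):

# HTTPS
git clone https://git.htsn.io/<owner>/homelab-docs.git

# SSH (requires a key added in Gitea)
git clone git@git.htsn.io:<owner>/homelab-docs.git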

301 - trading-vm (AI Trading Platform)

Purpose: Algorithmic trading system with AI models

Specs:

  • OS: Ubuntu 22.04
  • vCPUs: 16
  • RAM: 32 GB
  • Storage: nvme-mirror3 (PVE2)
  • Network: 1 Gb (vmbr0)

GPU Passthrough:

  • NVIDIA RTX A6000 (300W TDP, 48GB VRAM)

Software:

  • Trading algorithms
  • AI models for market prediction
  • Real-time data feeds
  • Backtesting infrastructure

LXC Container Details

200 - Pi-hole (DNS & Ad Blocking)

Purpose: Network-wide DNS server and ad blocker

  • Type: LXC (unprivileged)
  • OS: Ubuntu 22.04
  • IP: 10.10.10.10
  • Storage: rpool

Access:

Configuration:

  • Upstream DNS: Cloudflare (1.1.1.1)
  • DHCP: Disabled (router handles DHCP)
  • Interface: All interfaces

Usage: Set router DNS to 10.10.10.10 for network-wide ad blocking
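
To confirm resolution and blocking from any client (a sketch - whether a given domain is blocked depends on the configured blocklists):

# Normal lookup should resolve through Pi-hole
dig @10.10.10.10 example.com +short

# A domain on a blocklist should come back as 0.0.0.0 (or NXDOMAIN, depending on blocking mode)
dig @10.10.10.10 doubleclick.net +short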


202 - Traefik (Reverse Proxy)

Purpose: Primary reverse proxy for all public-facing services

  • Type: LXC (unprivileged)
  • OS: Ubuntu 22.04
  • IP: 10.10.10.250
  • Storage: rpool

Configuration: /etc/traefik/

Dynamic Configs: /etc/traefik/conf.d/*.yaml
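
To inspect the dynamic configs or check the service from the Proxmox host (a sketch - assumes Traefik runs as a systemd service named traefik inside the container):

# List dynamic config files inside the container
ssh pve 'pct exec 202 -- ls -l /etc/traefik/conf.d/'

# Check the service after editing a config
ssh pve 'pct exec 202 -- systemctl status traefik --no-pager'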

See: TRAEFIK.md for complete documentation

⚠️ Important: This is the PRIMARY Traefik instance. Do NOT confuse with Saltbox's Traefik (VM 101).


205 - FindShyt (Custom App)

Purpose: Custom application (details TBD)

  • Type: LXC (unprivileged)
  • OS: Ubuntu 22.04
  • IP: 10.10.10.8
  • Storage: rpool

Access: https://findshyt.htsn.io


VM Startup Order & Dependencies

Power-On Sequence

When servers boot (after power failure or restart), VMs/CTs start in this order:

PVE (10.10.10.120)

| Order | Wait | VMID | Name | Reason |
|-------|------|------|------|--------|
| 1 | 30s | 100 | TrueNAS | ⚠️ Storage must start first - other VMs depend on NFS |
| 2 | 60s | 101 | Saltbox | Depends on TrueNAS NFS mounts for media |
| 3 | 10s | 105, 110, 111, 201, 206 | Other VMs | General VMs, no critical dependencies |
| 4 | 5s | 200, 202, 205 | Containers | Lightweight, start quickly |

Configure startup order (already set):

# View current config
ssh pve 'qm config 100 | grep -E "startup|onboot"'

# Set startup order (example)
ssh pve 'qm set 100 --onboot 1 --startup order=1,up=30'
ssh pve 'qm set 101 --onboot 1 --startup order=2,up=60'

PVE2 (10.10.10.102)

| Order | Wait | VMID | Name |
|-------|------|------|------|
| 1 | 10s | 300, 301, 302 | All VMs |

Less critical - no dependencies between PVE2 VMs.
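
If the same onboot/startup behavior is wanted on PVE2, the equivalent of the PVE example above (a sketch):

ssh pve2 'qm set 300 --onboot 1 --startup order=1,up=10'
ssh pve2 'qm set 301 --onboot 1 --startup order=1,up=10'
ssh pve2 'qm set 302 --onboot 1 --startup order=1,up=10'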


Resource Allocation Summary

Total Allocated (PVE)

| Resource | Allocated | Physical | % of Physical |
|----------|-----------|----------|---------------|
| vCPUs | 56 | 64 (32 cores × 2 threads) | 88% |
| RAM | 98 GB | 128 GB | 77% |

Note: vCPU overcommit is acceptable (VMs rarely use all cores simultaneously)

Total Allocated (PVE2)

| Resource | Allocated | Physical | % of Physical |
|----------|-----------|----------|---------------|
| vCPUs | 22 | 64 | 34% |
| RAM | 44 GB | 128 GB | 34% |

PVE2 has significant headroom for additional VMs.
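
Actual usage (as opposed to allocation) can be pulled from the Proxmox API (a sketch - the node name in the last command is assumed to match the SSH alias):

# Per-guest CPU/memory usage
ssh pve 'pvesh get /cluster/resources --type vm'
ssh pve2 'pvesh get /cluster/resources --type vm'

# Node-level totals
ssh pve 'pvesh get /nodes/pve/status'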


Adding a New VM

Quick Template

# Create VM
ssh pve 'qm create VMID \
  --name myvm \
  --memory 4096 \
  --cores 2 \
  --net0 virtio,bridge=vmbr0 \
  --scsihw virtio-scsi-pci \
  --scsi0 nvme-mirror1:32 \
  --boot order=scsi0 \
  --ostype l26 \
  --agent enabled=1'

# Attach ISO for installation
ssh pve 'qm set VMID --ide2 local:iso/ubuntu-22.04.iso,media=cdrom'

# Start VM
ssh pve 'qm start VMID'

# Access console
ssh pve 'qm vncproxy VMID' # Then connect with VNC client
# Or via Proxmox web UI

Cloud-Init Template (Faster)

Use cloud-init for automated VM deployment:

# Download cloud image
ssh pve 'wget https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img -O /var/lib/vz/template/iso/ubuntu-22.04-cloud.img'

# Create VM
ssh pve 'qm create VMID --name myvm --memory 4096 --cores 2 --net0 virtio,bridge=vmbr0'

# Import disk
ssh pve 'qm importdisk VMID /var/lib/vz/template/iso/ubuntu-22.04-cloud.img nvme-mirror1'

# Attach disk
ssh pve 'qm set VMID --scsi0 nvme-mirror1:vm-VMID-disk-0'

# Add cloud-init drive
ssh pve 'qm set VMID --ide2 nvme-mirror1:cloudinit'

# Set boot disk
ssh pve 'qm set VMID --boot order=scsi0'

# Configure cloud-init (user, SSH key, network)
ssh pve 'qm set VMID --ciuser hutson --sshkeys ~/.ssh/homelab.pub --ipconfig0 ip=10.10.10.XXX/24,gw=10.10.10.1'

# Enable QEMU agent
ssh pve 'qm set VMID --agent enabled=1'

# Resize disk (cloud images are small by default)
ssh pve 'qm resize VMID scsi0 +30G'

# Start VM
ssh pve 'qm start VMID'

Cloud-init VMs boot ready-to-use with SSH keys, static IP, and user configured.
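
A quick post-boot check (a sketch, using the placeholder VMID/IP from the commands above):

# Confirm the guest agent is up and the network came up as configured
ssh pve 'qm agent VMID ping'
ssh pve 'qm guest cmd VMID network-get-interfaces'

# Then log in directly with the injected key
ssh hutson@10.10.10.XXX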


Adding a New LXC Container

# Download template (if not already downloaded)
ssh pve 'pveam update'
ssh pve 'pveam available | grep ubuntu'
ssh pve 'pveam download local ubuntu-22.04-standard_22.04-1_amd64.tar.zst'

# Create container
ssh pve 'pct create CTID local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname mycontainer \
  --memory 2048 \
  --cores 2 \
  --net0 name=eth0,bridge=vmbr0,ip=10.10.10.XXX/24,gw=10.10.10.1 \
  --rootfs local-zfs:8 \
  --unprivileged 1 \
  --features nesting=1 \
  --start 1'

# Set root password
ssh pve 'pct exec CTID -- passwd'

# Add SSH key
ssh pve 'pct exec CTID -- mkdir -p /root/.ssh'
ssh pve 'pct exec CTID -- bash -c "echo \"$(cat ~/.ssh/homelab.pub)\" >> /root/.ssh/authorized_keys"'
ssh pve 'pct exec CTID -- bash -c "chmod 700 /root/.ssh && chmod 600 /root/.ssh/authorized_keys"'
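
Sanity checks after creation (a sketch):

# Confirm the container is running and picked up its address
ssh pve 'pct status CTID'
ssh pve 'pct exec CTID -- ip -4 addr show eth0'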

GPU Passthrough Configuration

Current GPU Assignments

| GPU | Location | Passed To | VMID | Purpose |
|-----|----------|-----------|------|---------|
| NVIDIA Quadro P2000 | PVE | - | - | Proxmox host (Plex transcoding via driver) |
| NVIDIA TITAN RTX | PVE | saltbox, lmdev1 | 101, 111 | Media transcoding + AI dev (shared) |
| NVIDIA RTX A6000 | PVE2 | trading-vm | 301 | AI trading (dedicated) |

How to Pass GPU to VM

  1. Identify GPU PCI ID:

    ssh pve 'lspci | grep -i nvidia'
    # Example output:
    # 81:00.0 VGA compatible controller: NVIDIA Corporation TU102 [TITAN RTX] (rev a1)
    # 81:00.1 Audio device: NVIDIA Corporation TU102 High Definition Audio Controller (rev a1)
    
  2. Pass GPU to VM (include both VGA and Audio):

    ssh pve 'qm set VMID -hostpci0 81:00.0,pcie=1'
    # If multi-function device (GPU + Audio), use:
    ssh pve 'qm set VMID -hostpci0 81:00,pcie=1'
    
  3. Configure VM for GPU:

    # Set machine type to q35
    ssh pve 'qm set VMID --machine q35'
    
    # Set BIOS to OVMF (UEFI)
    ssh pve 'qm set VMID --bios ovmf'
    
    # Add EFI disk
    ssh pve 'qm set VMID --efidisk0 nvme-mirror1:1,format=raw,efitype=4m,pre-enrolled-keys=1'
    
  4. Reboot VM and install NVIDIA drivers inside the VM
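
Once the VM is back up, verify the card is visible inside the guest before and after installing drivers (a sketch for an Ubuntu guest):

# Inside the VM: the GPU should appear as a PCI device
lspci | grep -i nvidia

# After installing drivers (e.g. ubuntu-drivers autoinstall), confirm they loaded
nvidia-smi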

See: GPU-PASSTHROUGH.md (coming soon) for detailed guide


Backup Priority

See BACKUP-STRATEGY.md for complete backup plan.

Critical VMs (Must Backup)

| Priority | VMID | Name | Reason |
|----------|------|------|--------|
| 🔴 CRITICAL | 100 | truenas | All storage lives here - catastrophic if lost |
| 🟡 HIGH | 101 | saltbox | Complex media stack config |
| 🟡 HIGH | 110 | homeassistant | Home automation config |
| 🟡 HIGH | 300 | gitea-vm | Git repositories (code, docs) |
| 🟡 HIGH | 301 | trading-vm | Trading algorithms and AI models |

Medium Priority

| VMID | Name | Notes |
|------|------|-------|
| 200 | pihole | Easy to rebuild, but DNS config valuable |
| 202 | traefik | Config files backed up separately |

Low Priority (Ephemeral/Rebuildable)

| VMID | Name | Notes |
|------|------|-------|
| 105 | fs-dev | Development - code is in Git |
| 111 | lmdev1 | Ephemeral development |
| 201 | copyparty | Simple app, easy to redeploy |
| 206 | docker-host | Docker Compose files backed up separately |
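
A hedged one-off backup example for the most critical VM (the target storage name is a placeholder - see BACKUP-STRATEGY.md for the real schedule and destination):

# Snapshot-mode backup of the TrueNAS VM to a backup storage (placeholder name)
ssh pve 'vzdump 100 --storage <backup-storage> --mode snapshot --compress zstd'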

Quick Reference Commands

# List all VMs
ssh pve 'qm list'
ssh pve2 'qm list'

# List all containers
ssh pve 'pct list'

# Start/stop VM
ssh pve 'qm start VMID'
ssh pve 'qm stop VMID'
ssh pve 'qm shutdown VMID'  # Graceful

# Start/stop container
ssh pve 'pct start CTID'
ssh pve 'pct stop CTID'
ssh pve 'pct shutdown CTID'  # Graceful

# VM console
ssh pve 'qm terminal VMID'

# Container console
ssh pve 'pct enter CTID'

# Clone VM
ssh pve 'qm clone VMID NEW_VMID --name newvm'

# Delete VM
ssh pve 'qm destroy VMID'

# Delete container
ssh pve 'pct destroy CTID'


Last Updated: 2025-12-22