# VMs and Containers

Complete inventory of all virtual machines and LXC containers across both Proxmox servers.

## Overview

| Server | VMs | LXCs | Total |
|--------|-----|------|-------|
| **PVE** (10.10.10.120) | 7 | 3 | 10 |
| **PVE2** (10.10.10.102) | 3 | 0 | 3 |
| **Total** | **10** | **3** | **13** |
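
To regenerate these counts straight from the hypervisors, a quick sweep over both nodes (assuming the `pve`/`pve2` SSH aliases used throughout this doc):

```bash
# Print every VM and container on each node
for h in pve pve2; do
  echo "== $h =="
  ssh "$h" 'qm list && pct list'
done
```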

---

## PVE (10.10.10.120) - Primary Server

### Virtual Machines

| VMID | Name | IP | vCPUs | RAM | Storage | Purpose | GPU/Passthrough | QEMU Agent |
|------|------|-----|-------|-----|---------|---------|-----------------|------------|
| **100** | truenas | 10.10.10.200 | 8 | 32GB | nvme-mirror1 | NAS, central file storage | LSI SAS2308 HBA, Samsung NVMe | ✅ Yes |
| **101** | saltbox | 10.10.10.100 | 16 | 16GB | nvme-mirror1 | Media automation (Plex, *arr) | TITAN RTX | ✅ Yes |
| **105** | fs-dev | 10.10.10.5 | 10 | 8GB | rpool | Development environment | - | ✅ Yes |
| **110** | homeassistant | 10.10.10.110 | 2 | 2GB | rpool | Home automation platform | - | ❌ No |
| **111** | lmdev1 | 10.10.10.111 | 8 | 32GB | nvme-mirror1 | AI/LLM development | TITAN RTX | ✅ Yes |
| **201** | copyparty | 10.10.10.201 | 2 | 2GB | rpool | File sharing service | - | ✅ Yes |
| **206** | docker-host | 10.10.10.206 | 2 | 4GB | rpool | Docker services (Excalidraw, Happy, Pulse) | - | ✅ Yes |

### LXC Containers

| CTID | Name | IP | RAM | Storage | Purpose |
|------|------|-----|-----|---------|---------|
| **200** | pihole | 10.10.10.10 | - | rpool | DNS, ad blocking |
| **202** | traefik | 10.10.10.250 | - | rpool | Reverse proxy (primary) |
| **205** | findshyt | 10.10.10.8 | - | rpool | Custom app |

---

## PVE2 (10.10.10.102) - Secondary Server

### Virtual Machines

| VMID | Name | IP | vCPUs | RAM | Storage | Purpose | GPU/Passthrough | QEMU Agent |
|------|------|-----|-------|-----|---------|---------|-----------------|------------|
| **300** | gitea-vm | 10.10.10.220 | 2 | 4GB | nvme-mirror3 | Git server (Gitea) | - | ✅ Yes |
| **301** | trading-vm | 10.10.10.221 | 16 | 32GB | nvme-mirror3 | AI trading platform | RTX A6000 | ✅ Yes |
| **302** | docker-host2 | 10.10.10.207 | 4 | 8GB | nvme-mirror3 | Docker host (n8n, automation) | - | ✅ Yes |

### LXC Containers

None on PVE2.

---

## VM Details

### 100 - TrueNAS (Storage Server)

**Purpose**: Central NAS for all file storage, NFS/SMB shares, and media libraries

**Specs**:
- **OS**: TrueNAS SCALE
- **vCPUs**: 8
- **RAM**: 32 GB
- **Storage**: nvme-mirror1 (OS), EMC storage enclosure (data pool via HBA passthrough)
- **Network**:
  - Primary: 10 Gb (vmbr2)
  - Secondary: Internal storage network (vmbr3 @ 10.10.20.x)

**Hardware Passthrough**:
- LSI SAS2308 HBA (for EMC enclosure drives)
- Samsung NVMe (for ZFS caching)

**ZFS Pools**:
- `vault`: Main storage pool on EMC drives
- Boot pool on passed-through NVMe
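
Other VMs consume this storage over NFS; a minimal client-side sketch, where the export path is hypothetical (list the real exports with `showmount` first):

```bash
# On a client VM; the /mnt/vault/media path below is illustrative
showmount -e 10.10.10.200
sudo mount -t nfs 10.10.10.200:/mnt/vault/media /mnt/media
```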

**See**: [STORAGE.md](STORAGE.md), [EMC-ENCLOSURE.md](EMC-ENCLOSURE.md)

---

### 101 - Saltbox (Media Automation)

**Purpose**: Media server stack - Plex, Sonarr, Radarr, SABnzbd, Overseerr, etc.

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 16
- **RAM**: 16 GB
- **Storage**: nvme-mirror1
- **Network**: 10 Gb (vmbr2)

**GPU Passthrough**:
- NVIDIA TITAN RTX (for Plex hardware transcoding)

**Services**:
- Plex Media Server (plex.htsn.io)
- Sonarr, Radarr, Lidarr (TV/movie/music automation)
- SABnzbd, NZBGet (downloaders)
- Overseerr (request management)
- Tautulli (Plex stats)
- Organizr (dashboard)
- Authelia (SSO authentication)
- Traefik (reverse proxy - separate from CT 202)

**Managed By**: Saltbox Ansible playbooks

**See**: [SALTBOX.md](#) (coming soon)

---

### 105 - fs-dev (Development Environment)

**Purpose**: General development work, testing, prototyping

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 10
- **RAM**: 8 GB
- **Storage**: rpool
- **Network**: 1 Gb (vmbr0)

---

### 110 - Home Assistant (Home Automation)

**Purpose**: Smart home automation platform

**Specs**:
- **OS**: Home Assistant OS
- **vCPUs**: 2
- **RAM**: 2 GB
- **Storage**: rpool
- **Network**: 1 Gb (vmbr0)

**Access**:
- Web UI: https://homeassistant.htsn.io
- API: See [HOMEASSISTANT.md](HOMEASSISTANT.md)

**Special Notes**:
- ❌ No QEMU agent (Home Assistant OS doesn't support it)
- No SSH server by default (access via web terminal)
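
With no SSH or QEMU agent, liveness checks go through the REST API; a sketch assuming a long-lived access token in `$HASS_TOKEN` (created under Profile → Long-lived access tokens):

```bash
# Returns {"message": "API running."} when healthy
curl -s -H "Authorization: Bearer $HASS_TOKEN" https://homeassistant.htsn.io/api/
```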

---

### 111 - lmdev1 (AI/LLM Development)

**Purpose**: AI model development, fine-tuning, inference

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 8
- **RAM**: 32 GB
- **Storage**: nvme-mirror1
- **Network**: 1 Gb (vmbr0)

**GPU Passthrough**:
- NVIDIA TITAN RTX (shared with Saltbox; only one running VM can hold a passed-through GPU at a time, but it can be dedicated to lmdev1 if needed)

**Installed**:
- CUDA toolkit
- Python 3.11+
- PyTorch, TensorFlow
- Hugging Face transformers

---

### 201 - Copyparty (File Sharing)

**Purpose**: Simple HTTP file sharing server

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 2
- **RAM**: 2 GB
- **Storage**: rpool
- **Network**: 1 Gb (vmbr0)

**Access**: https://copyparty.htsn.io

---

### 206 - docker-host (Docker Services)

**Purpose**: General-purpose Docker host for miscellaneous services

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 2
- **RAM**: 4 GB
- **Storage**: rpool
- **Network**: 1 Gb (vmbr0)
- **CPU**: `host` passthrough (for x86-64-v3 support)

**Services Running**:
- Excalidraw (excalidraw.htsn.io) - Whiteboard
- Happy Coder relay server (happy.htsn.io) - Self-hosted relay for Happy Coder mobile app
- Pulse (pulse.htsn.io) - Monitoring dashboard

**Docker Compose Files**: `/opt/*/docker-compose.yml`
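
One way to check every service at once, assuming each project keeps its compose file directly under `/opt/<name>/` as the glob suggests:

```bash
# Run inside VM 206; shows container status per compose project
for f in /opt/*/docker-compose.yml; do
  echo "== $f =="
  docker compose -f "$f" ps
done
```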

---

### 300 - gitea-vm (Git Server)

**Purpose**: Self-hosted Git server

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 2
- **RAM**: 4 GB
- **Storage**: nvme-mirror3 (PVE2)
- **Network**: 1 Gb (vmbr0)

**Access**: https://git.htsn.io

**Repositories**:
- homelab-docs (this documentation)
- Personal projects
- Private repos
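
Cloning over HTTPS works as usual; the owner/repo path here is hypothetical:

```bash
# Substitute the real owner and repository name
git clone https://git.htsn.io/hutson/homelab-docs.git
```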

---

### 301 - trading-vm (AI Trading Platform)

**Purpose**: Algorithmic trading system with AI models

**Specs**:
- **OS**: Ubuntu 22.04
- **vCPUs**: 16
- **RAM**: 32 GB
- **Storage**: nvme-mirror3 (PVE2)
- **Network**: 1 Gb (vmbr0)

**GPU Passthrough**:
- NVIDIA RTX A6000 (300W TDP, 48GB VRAM)

**Software**:
- Trading algorithms
- AI models for market prediction
- Real-time data feeds
- Backtesting infrastructure

---

## LXC Container Details

### 200 - Pi-hole (DNS & Ad Blocking)

**Purpose**: Network-wide DNS server and ad blocker

**Type**: LXC (unprivileged)
**OS**: Ubuntu 22.04
**IP**: 10.10.10.10
**Storage**: rpool

**Access**:
- Web UI: http://10.10.10.10/admin
- Public URL: https://pihole.htsn.io

**Configuration**:
- Upstream DNS: Cloudflare (1.1.1.1)
- DHCP: Disabled (router handles DHCP)
- Interface: All interfaces

**Usage**: Set router DNS to 10.10.10.10 for network-wide ad blocking
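
To verify resolution and blocking from any LAN host (the blocked domain is just an example; Pi-hole's default blocking mode answers with 0.0.0.0):

```bash
dig @10.10.10.10 example.com +short        # should resolve normally
dig @10.10.10.10 doubleclick.net +short    # should return 0.0.0.0 if on a blocklist
```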

---

### 202 - Traefik (Reverse Proxy)

**Purpose**: Primary reverse proxy for all public-facing services

**Type**: LXC (unprivileged)
**OS**: Ubuntu 22.04
**IP**: 10.10.10.250
**Storage**: rpool

**Configuration**: `/etc/traefik/`
**Dynamic Configs**: `/etc/traefik/conf.d/*.yaml`

**See**: [TRAEFIK.md](TRAEFIK.md) for complete documentation

**⚠️ Important**: This is the PRIMARY Traefik instance. Do NOT confuse with Saltbox's Traefik (VM 101).
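
The configs can be inspected from the PVE host without entering the container; a sketch assuming Traefik runs as a systemd unit named `traefik`:

```bash
# List dynamic configs and tail the service log from the host
ssh pve 'pct exec 202 -- ls /etc/traefik/conf.d/'
ssh pve 'pct exec 202 -- journalctl -u traefik -n 20 --no-pager'
```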

---

### 205 - FindShyt (Custom App)

**Purpose**: Custom application (details TBD)

**Type**: LXC (unprivileged)
**OS**: Ubuntu 22.04
**IP**: 10.10.10.8
**Storage**: rpool

**Access**: https://findshyt.htsn.io

---

## VM Startup Order & Dependencies

### Power-On Sequence

When servers boot (after power failure or restart), VMs/CTs start in this order:

#### PVE (10.10.10.120)

| Order | Wait | VMID | Name | Reason |
|-------|------|------|------|--------|
| **1** | 30s | 100 | TrueNAS | ⚠️ Storage must start first - other VMs depend on NFS |
| **2** | 60s | 101 | Saltbox | Depends on TrueNAS NFS mounts for media |
| **3** | 10s | 105, 110, 111, 201, 206 | Other VMs | General VMs, no critical dependencies |
| **4** | 5s | 200, 202, 205 | Containers | Lightweight, start quickly |

**Configure startup order** (already set):
```bash
# View current config
ssh pve 'qm config 100 | grep -E "startup|onboot"'

# Set startup order (example)
ssh pve 'qm set 100 --onboot 1 --startup order=1,up=30'
ssh pve 'qm set 101 --onboot 1 --startup order=2,up=60'
```
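
Containers take the same flags via `pct set`; a minimal sketch for the order-4 containers above:

```bash
# Containers 200/202/205 start last with a short delay
for ct in 200 202 205; do
  ssh pve "pct set $ct --onboot 1 --startup order=4,up=5"
done
```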

#### PVE2 (10.10.10.102)

| Order | Wait | VMID | Name |
|-------|------|------|------|
| **1** | 10s | 300, 301, 302 | All VMs |

**Less critical** - no dependencies between PVE2 VMs.

---

## Resource Allocation Summary

### Total Allocated (PVE)

| Resource | Allocated | Physical | % Used |
|----------|-----------|----------|--------|
| **vCPUs** | 56 | 64 (32 cores × 2 threads) | 88% |
| **RAM** | 98 GB | 128 GB | 77% |

**Note**: vCPU overcommit is acceptable (VMs rarely use all cores simultaneously)

### Total Allocated (PVE2)

| Resource | Allocated | Physical | % Used |
|----------|-----------|----------|--------|
| **vCPUs** | 22 | 64 | 34% |
| **RAM** | 44 GB | 128 GB | 34% |

**PVE2** has significant headroom for additional VMs.
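
These tallies can be recomputed from the API instead of by hand; a sketch assuming `jq` is installed locally:

```bash
# Sum allocated vCPUs and RAM for all guests on node "pve"
ssh pve 'pvesh get /cluster/resources --type vm --output-format json' \
  | jq '[.[] | select(.node == "pve")]
        | {vcpus: (map(.maxcpu) | add), ram_gb: ((map(.maxmem) | add) / 1073741824)}'
```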

---

## Adding a New VM

### Quick Template

```bash
# Create VM
ssh pve 'qm create VMID \
  --name myvm \
  --memory 4096 \
  --cores 2 \
  --net0 virtio,bridge=vmbr0 \
  --scsihw virtio-scsi-pci \
  --scsi0 nvme-mirror1:32 \
  --boot order=scsi0 \
  --ostype l26 \
  --agent enabled=1'

# Attach ISO for installation
ssh pve 'qm set VMID --ide2 local:iso/ubuntu-22.04.iso,media=cdrom'

# Start VM
ssh pve 'qm start VMID'

# Access console
ssh pve 'qm vncproxy VMID'  # Then connect with VNC client
# Or via Proxmox web UI
```

### Cloud-Init Template (Faster)

Use cloud-init for automated VM deployment:

```bash
# Download cloud image
ssh pve 'wget https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img -O /var/lib/vz/template/iso/ubuntu-22.04-cloud.img'

# Create VM
ssh pve 'qm create VMID --name myvm --memory 4096 --cores 2 --net0 virtio,bridge=vmbr0'

# Import disk
ssh pve 'qm importdisk VMID /var/lib/vz/template/iso/ubuntu-22.04-cloud.img nvme-mirror1'

# Attach disk
ssh pve 'qm set VMID --scsi0 nvme-mirror1:vm-VMID-disk-0'

# Add cloud-init drive
ssh pve 'qm set VMID --ide2 nvme-mirror1:cloudinit'

# Set boot disk
ssh pve 'qm set VMID --boot order=scsi0'

# Configure cloud-init (user, SSH key, network)
ssh pve 'qm set VMID --ciuser hutson --sshkeys ~/.ssh/homelab.pub --ipconfig0 ip=10.10.10.XXX/24,gw=10.10.10.1'

# Enable QEMU agent
ssh pve 'qm set VMID --agent enabled=1'

# Resize disk (cloud images are small by default)
ssh pve 'qm resize VMID scsi0 +30G'

# Start VM
ssh pve 'qm start VMID'
```

**Cloud-init VMs boot ready-to-use** with SSH keys, static IP, and user configured.
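
To confirm the agent and network came up after first boot (VMID as placeholder):

```bash
# Queries the guest agent; fails until the agent is running inside the VM
ssh pve 'qm guest cmd VMID network-get-interfaces'
```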

---

## Adding a New LXC Container

```bash
# Download template (if not already downloaded)
ssh pve 'pveam update'
ssh pve 'pveam available | grep ubuntu'
ssh pve 'pveam download local ubuntu-22.04-standard_22.04-1_amd64.tar.zst'

# Create container
ssh pve 'pct create CTID local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname mycontainer \
  --memory 2048 \
  --cores 2 \
  --net0 name=eth0,bridge=vmbr0,ip=10.10.10.XXX/24,gw=10.10.10.1 \
  --rootfs local-zfs:8 \
  --unprivileged 1 \
  --features nesting=1 \
  --start 1'

# Set root password (-t: passwd needs a TTY)
ssh -t pve 'pct exec CTID -- passwd'

# Add SSH key
ssh pve 'pct exec CTID -- mkdir -p /root/.ssh'
ssh pve 'pct exec CTID -- bash -c "echo \"$(cat ~/.ssh/homelab.pub)\" >> /root/.ssh/authorized_keys"'
# Wrap both chmods in one shell so they run inside the container, not on the PVE host
ssh pve 'pct exec CTID -- bash -c "chmod 700 /root/.ssh && chmod 600 /root/.ssh/authorized_keys"'
```
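
A quick post-create check (CTID as placeholder):

```bash
# Confirm the container is running and review its config
ssh pve 'pct status CTID'
ssh pve 'pct config CTID'
```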

---

## GPU Passthrough Configuration

### Current GPU Assignments

| GPU | Location | Passed To | VMID | Purpose |
|-----|----------|-----------|------|---------|
| **NVIDIA Quadro P2000** | PVE | - | - | Proxmox host (Plex transcoding via driver) |
| **NVIDIA TITAN RTX** | PVE | saltbox, lmdev1 | 101, 111 | Media transcoding + AI dev (shared) |
| **NVIDIA RTX A6000** | PVE2 | trading-vm | 301 | AI trading (dedicated) |
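
Passthrough assumes IOMMU is already enabled on the host (VT-d/AMD-Vi in BIOS plus the kernel parameter); a quick sanity check before assigning anything:

```bash
# Both should produce output on a passthrough-ready host
ssh pve 'dmesg | grep -e DMAR -e IOMMU'
ssh pve 'ls /sys/kernel/iommu_groups/'
```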

### How to Pass GPU to VM

1. **Identify GPU PCI ID**:
   ```bash
   ssh pve 'lspci | grep -i nvidia'
   # Example output:
   # 81:00.0 VGA compatible controller: NVIDIA Corporation TU102 [TITAN RTX] (rev a1)
   # 81:00.1 Audio device: NVIDIA Corporation TU102 High Definition Audio Controller (rev a1)
   ```

2. **Pass GPU to VM** (include both VGA and Audio):
   ```bash
   ssh pve 'qm set VMID -hostpci0 81:00.0,pcie=1'
   # If multi-function device (GPU + Audio), use:
   ssh pve 'qm set VMID -hostpci0 81:00,pcie=1'
   ```

3. **Configure VM for GPU**:
   ```bash
   # Set machine type to q35
   ssh pve 'qm set VMID --machine q35'

   # Set BIOS to OVMF (UEFI)
   ssh pve 'qm set VMID --bios ovmf'

   # Add EFI disk
   ssh pve 'qm set VMID --efidisk0 nvme-mirror1:1,format=raw,efitype=4m,pre-enrolled-keys=1'
   ```

4. **Reboot VM** and install NVIDIA drivers inside the VM (see the sketch below)
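
A minimal driver install sketch inside an Ubuntu 22.04 guest (the driver branch is illustrative; pick whichever current release supports the card):

```bash
# Inside the VM, not on the Proxmox host
sudo apt update && sudo apt install -y nvidia-driver-535
sudo reboot
# After reboot, the passed-through GPU should enumerate:
nvidia-smi
```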

**See**: [GPU-PASSTHROUGH.md](#) (coming soon) for detailed guide

---

## Backup Priority

See [BACKUP-STRATEGY.md](BACKUP-STRATEGY.md) for complete backup plan.

### Critical VMs (Must Backup)

| Priority | VMID | Name | Reason |
|----------|------|------|--------|
| 🔴 **CRITICAL** | 100 | truenas | All storage lives here - catastrophic if lost |
| 🟡 **HIGH** | 101 | saltbox | Complex media stack config |
| 🟡 **HIGH** | 110 | homeassistant | Home automation config |
| 🟡 **HIGH** | 300 | gitea-vm | Git repositories (code, docs) |
| 🟡 **HIGH** | 301 | trading-vm | Trading algorithms and AI models |

### Medium Priority

| CTID | Name | Notes |
|------|------|-------|
| 200 | pihole | Easy to rebuild, but DNS config valuable |
| 202 | traefik | Config files backed up separately |

### Low Priority (Ephemeral/Rebuildable)

| VMID | Name | Notes |
|------|------|-------|
| 105 | fs-dev | Development - code is in Git |
| 111 | lmdev1 | Ephemeral development |
| 201 | copyparty | Simple app, easy to redeploy |
| 206 | docker-host | Docker Compose files backed up separately |
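
Proxmox's own `vzdump` covers the critical tier; a sketch with a hypothetical backup storage target:

```bash
# Snapshot-mode backups of the critical VMs; "backup-storage" is a placeholder
ssh pve 'vzdump 100 101 110 --mode snapshot --storage backup-storage --compress zstd'
ssh pve2 'vzdump 300 301 --mode snapshot --storage backup-storage --compress zstd'
```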

---

## Quick Reference Commands

```bash
# List all VMs
ssh pve 'qm list'
ssh pve2 'qm list'

# List all containers
ssh pve 'pct list'

# Start/stop VM
ssh pve 'qm start VMID'
ssh pve 'qm stop VMID'
ssh pve 'qm shutdown VMID'  # Graceful

# Start/stop container
ssh pve 'pct start CTID'
ssh pve 'pct stop CTID'
ssh pve 'pct shutdown CTID'  # Graceful

# VM console
ssh pve 'qm terminal VMID'

# Container console
ssh pve 'pct enter CTID'

# Clone VM
ssh pve 'qm clone VMID NEW_VMID --name newvm'

# Delete VM
ssh pve 'qm destroy VMID'

# Delete container
ssh pve 'pct destroy CTID'
```

---

## Related Documentation

- [STORAGE.md](STORAGE.md) - Storage pool assignments
- [SSH-ACCESS.md](SSH-ACCESS.md) - How to access VMs
- [BACKUP-STRATEGY.md](BACKUP-STRATEGY.md) - VM backup strategy
- [POWER-MANAGEMENT.md](POWER-MANAGEMENT.md) - VM resource optimization
- [NETWORK.md](NETWORK.md) - Which bridge to use for new VMs

---

**Last Updated**: 2025-12-22