Auto-sync: 20260105-172251
20
CLAUDE.md
@@ -11,6 +11,7 @@ This is your **quick reference guide** for common homelab tasks. For detailed in
| Task | Documentation | Quick Command |
|------|--------------|---------------|
| **Gateway issues** | [GATEWAY.md](GATEWAY.md) | `ssh ucg-fiber 'free -m'` |
| **Tailscale/VPN issues** | [TAILSCALE.md](TAILSCALE.md) | `tailscale status` |
| **Add new public service** | [TRAEFIK.md](TRAEFIK.md) | Create Traefik config + Cloudflare DNS |
| **Check UPS status** | [UPS.md](UPS.md) | `ssh pve 'upsc cyberpower@localhost'` |
| **Check server temps** | [Temperature Check](#server-temperature-check) | `ssh pve 'grep Tctl ...'` |
@@ -85,6 +86,9 @@ nc -zw1 10.10.10.150 22000 && echo "Windows: UP" || echo "Windows: DOWN"
| Symptom | Check | Fix | Docs |
|---------|-------|-----|------|
| **Network down** | `ssh ucg-fiber 'free -m'` | Check memory, watchdog reboots auto | [GATEWAY.md](GATEWAY.md) |
| **Tailscale DNS not working** | `tailscale status` | Check PVE online, subnet routing | [TAILSCALE.md](TAILSCALE.md) |
| **Subnet unreachable** | `ping 10.10.10.10` | Check `--accept-routes` on local devices | [TAILSCALE.md](TAILSCALE.md) |
| **Relay-only connections** | `tailscale ping <ip>` | Check for VPN conflicts, restart tailscaled | [TAILSCALE.md](TAILSCALE.md) |
| Device not syncing | `curl Syncthing API` | Restart Syncthing | [SYNCTHING.md](SYNCTHING.md) |
| VM won't start | Storage/RAM available? | `ssh pve 'qm start VMID'` | [VMS.md](VMS.md) |
| Server running hot | Check KSM, CPU processes | Disable KSM | [POWER-MANAGEMENT.md](POWER-MANAGEMENT.md) |
@@ -246,9 +250,10 @@ ssh pve 'qm guest exec VMID -- bash -c "COMMAND"'
### Infrastructure
- [README.md](README.md) - Start here
- [GATEWAY.md](GATEWAY.md) - UniFi gateway, monitoring services
- [TAILSCALE.md](TAILSCALE.md) - VPN, subnet routing, DNS
- [VMS.md](VMS.md) - VM/CT inventory
- [STORAGE.md](STORAGE.md) - ZFS pools, shares
- [NETWORK.md](NETWORK.md) - Bridges, VLANs, Tailscale
- [NETWORK.md](NETWORK.md) - Bridges, VLANs, MTU
- [POWER-MANAGEMENT.md](POWER-MANAGEMENT.md) - Optimizations
- [UPS.md](UPS.md) - UPS config, NUT monitoring

@@ -310,6 +315,15 @@ git add -A && git commit -m "Update docs" && git push
## Recent Changes

### 2026-01-05
- Created [TAILSCALE.md](TAILSCALE.md) - comprehensive Tailscale VPN documentation
- **Fixed Tailscale subnet routing issues:**
  - Switched primary subnet router from UCG-Fiber to PVE (gateway had relay-only connections)
  - Disabled `--accept-routes` on UCG-Fiber and PiHole (devices on subnet must not accept subnet routes)
  - Fixed PiHole ProtonVPN from full-tunnel to split-tunnel (DNS-only via fwmark routing)
  - **Root cause:** Devices directly on 10.10.10.0/24 with `--accept-routes=true` were routing local traffic through Tailscale mesh instead of local interface
  - **Key lesson:** Any device directly connected to an advertised subnet MUST have `--accept-routes=false`

### 2026-01-03
- Deployed **Crafty Controller 4** on docker-host2 for Minecraft server management
  - URL: https://mc.htsn.io (Web GUI)
@@ -348,8 +362,8 @@ git add -A && git commit -m "Update docs" && git push
---

**Last Updated**: 2026-01-03
**Documentation Status**: ✅ Phase 1 Complete + Gateway Monitoring + MetaMCP
**Last Updated**: 2026-01-05
**Documentation Status**: ✅ Phase 1 Complete + Gateway Monitoring + MetaMCP + Tailscale

---

296
TAILSCALE.md
Normal file
@@ -0,0 +1,296 @@
# Tailscale VPN Configuration

## Overview

Tailscale provides secure remote access to the homelab via a mesh VPN. This document covers the configuration, subnet routing, and critical gotchas learned from troubleshooting.

---

## Network Architecture

```
     Remote Clients (MacBook, Phone)
                    │
                    ▼ Tailscale Mesh (100.x.x.x)
                    │
         ┌──────────┴───────────┐
         │                      │
         ▼                      ▼
PVE (Subnet Router)    UCG-Fiber (Gateway)
  100.113.177.80          100.94.246.32
         │                      │
         │    10.10.10.0/24     │
         └──────────┬───────────┘
                    │
             ┌──────┴──────┐
             │             │
           PiHole       TrueNAS
        10.10.10.10  10.10.10.200
```

---

## Device Configuration

| Device | Tailscale IP | Role | Accept Routes | Advertise Routes |
|--------|--------------|------|---------------|------------------|
| **PVE** | 100.113.177.80 | Subnet Router (Primary) | **NO** | 10.10.10.0/24, 10.10.20.0/24 |
| **UCG-Fiber** | 100.94.246.32 | Gateway (backup) | **NO** | (disabled) |
| **PiHole** | 100.112.59.128 | DNS Server | **NO** | None |
| **TrueNAS** | 100.100.94.71 | NAS | Yes | None |
| **Mac-Mini** | 100.108.89.58 | Desktop | Yes | None |
| **MacBook** | 100.88.161.1 | Laptop | Yes | None |
| **Phone** | 100.106.175.37 | Mobile | Yes | None |

---

## Critical Configuration Rules

### 1. Devices on the Advertised Subnet MUST Have `--accept-routes=false`

**Problem:** If a device is directly connected to 10.10.10.0/24 AND has `--accept-routes=true`, Tailscale will route local subnet traffic through the mesh instead of the local interface.

**Symptom:** Device can't reach neighbors on the same subnet; `ip route get 10.10.10.X` shows `dev tailscale0` instead of the local interface.

**Fix:**
```bash
# On any device directly connected to 10.10.10.0/24
tailscale set --accept-routes=false
```

**Affected devices:**
- UCG-Fiber (gateway) - directly on 10.10.10.0/24
- PiHole - directly on 10.10.10.0/24
- PVE - directly on 10.10.10.0/24; it is the subnet router that advertises this subnet, and it also runs with accept-routes off (see Device Configuration)

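Rule 1 can be checked mechanically. The following is a minimal Python sketch, assuming you already know a device's LAN address and the routes advertised on the tailnet. The helper name `must_disable_accept_routes` is hypothetical, and the sketch does not query Tailscale itself.

```python
import ipaddress

# Routes advertised by the subnet router (taken from the Device Configuration table).
ADVERTISED_ROUTES = ["10.10.10.0/24", "10.10.20.0/24"]


def must_disable_accept_routes(lan_ip, advertised=ADVERTISED_ROUTES):
    """Return True if lan_ip sits inside any advertised subnet.

    Hypothetical helper: such a device should run with --accept-routes=false.
    """
    addr = ipaddress.ip_address(lan_ip)
    return any(addr in ipaddress.ip_network(route) for route in advertised)


# Example LAN addresses; the laptop address is made up for illustration.
for name, ip in [("PiHole", "10.10.10.10"), ("remote laptop", "192.168.1.50")]:
    print(f"{name}: accept-routes must be off = {must_disable_accept_routes(ip)}")
```

The check uses the device's LAN address, not its 100.x.x.x Tailscale address, because the rule is about which physical subnet the device sits on.
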
### 2. Only ONE Device Should Be Primary Subnet Router

**Problem:** Multiple devices advertising the same subnet can cause routing conflicts or failover issues.

**Current Setup:**
- **PVE** is the primary subnet router for both 10.10.10.0/24 and 10.10.20.0/24
- **UCG-Fiber** has subnet advertisement DISABLED (was causing relay-only connections)

**To change subnet router:**
1. Go to https://login.tailscale.com/admin/machines
2. Disable route on old device, enable on new device
3. Or, if both devices advertise the route, set which one is primary

### 3. VPNs on Tailscale Devices Can Break Connectivity

**Problem:** A full-tunnel VPN (like ProtonVPN with `AllowedIPs = 0.0.0.0/0`) will route Tailscale's DERP/STUN traffic through the VPN, breaking NAT traversal.

**Symptom:** Device shows relay-only connections with asymmetric traffic (high TX, near-zero RX).

**Fix:** Use split-tunnel configuration that excludes Tailscale traffic. See [PiHole ProtonVPN Configuration](#pihole-protonvpn-split-tunnel) below.

---

## DNS Configuration

### Tailscale Admin DNS Settings
- **Nameserver:** 10.10.10.10 (PiHole via subnet route)
- **Fallback:** None configured

### How DNS Works
1. Remote client enables "Use Tailscale DNS"
2. DNS queries go to 10.10.10.10
3. Traffic routes through PVE (subnet router) to PiHole
4. PiHole resolves via Unbound (recursive) through ProtonVPN

---

## Subnet Routing

### Current Primary Routes
```
PVE advertises:
- 10.10.10.0/24 (LAN)
- 10.10.20.0/24 (Storage network)
```

### Verifying Routes
```bash
# From MacBook - check who's advertising routes
tailscale status --json | python3 -c "
import sys, json
data = json.load(sys.stdin)
for peer in data.get('Peer', {}).values():
    routes = peer.get('PrimaryRoutes', [])
    if routes:
        print(f\"{peer.get('HostName')}: {routes}\")"
```

### Testing Subnet Connectivity
```bash
# Test from remote client
ping 10.10.10.10             # PiHole
ping 10.10.10.120            # PVE
ping 10.10.10.1              # Gateway
dig @10.10.10.10 google.com  # DNS
```

---

## PiHole ProtonVPN Split-Tunnel

PiHole runs a WireGuard tunnel to ProtonVPN for encrypted upstream DNS queries. The configuration uses policy-based routing to ONLY route Unbound's DNS traffic through the VPN.

### Configuration File: `/etc/wireguard/piehole.conf`

```ini
[Interface]
PrivateKey = <key>
Address = 10.2.0.2/32
# CRITICAL: Disable automatic routing - we handle it manually
Table = off

# Policy routing: only route Unbound DNS through VPN
PostUp = ip route add default dev %i table 51820
PostUp = ip rule add fwmark 0x51820 table 51820 priority 100
PostUp = iptables -t mangle -N UNBOUND_VPN 2>/dev/null || true
PostUp = iptables -t mangle -F UNBOUND_VPN
PostUp = iptables -t mangle -A UNBOUND_VPN -d 10.0.0.0/8 -j RETURN
PostUp = iptables -t mangle -A UNBOUND_VPN -d 127.0.0.0/8 -j RETURN
PostUp = iptables -t mangle -A UNBOUND_VPN -d 100.64.0.0/10 -j RETURN
PostUp = iptables -t mangle -A UNBOUND_VPN -d 192.168.0.0/16 -j RETURN
PostUp = iptables -t mangle -A UNBOUND_VPN -d 172.16.0.0/12 -j RETURN
PostUp = iptables -t mangle -A UNBOUND_VPN -j MARK --set-mark 0x51820
PostUp = iptables -t mangle -A OUTPUT -p udp --dport 53 -m owner --uid-owner unbound -j UNBOUND_VPN
PostUp = iptables -t mangle -A OUTPUT -p tcp --dport 53 -m owner --uid-owner unbound -j UNBOUND_VPN
PostUp = iptables -t nat -A POSTROUTING -o %i -j MASQUERADE

PostDown = iptables -t mangle -D OUTPUT -p udp --dport 53 -m owner --uid-owner unbound -j UNBOUND_VPN
PostDown = iptables -t mangle -D OUTPUT -p tcp --dport 53 -m owner --uid-owner unbound -j UNBOUND_VPN
PostDown = iptables -t mangle -F UNBOUND_VPN
PostDown = iptables -t mangle -X UNBOUND_VPN
PostDown = ip rule del fwmark 0x51820 table 51820 priority 100
PostDown = ip route del default dev %i table 51820
PostDown = iptables -t nat -D POSTROUTING -o %i -j MASQUERADE

[Peer]
PublicKey = <ProtonVPN-key>
AllowedIPs = 0.0.0.0/0, ::/0
Endpoint = 149.102.242.1:51820
PersistentKeepalive = 25
```

**Key Points:**
- `Table = off` prevents wg-quick from adding default routes
- Only traffic from the `unbound` user to port 53 gets marked and routed through VPN
- Local, private, and Tailscale (100.64.0.0/10) traffic is excluded

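To reason about which packets end up in the tunnel, the mangle rules above can be modelled in a few lines of Python. This is only a sketch of the rule logic for IPv4 destinations, not something that runs on the PiHole; the function name `goes_through_vpn` is hypothetical.

```python
import ipaddress

# Destinations that hit a RETURN rule in the UNBOUND_VPN chain, in rule order.
RETURN_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "127.0.0.0/8",
        "100.64.0.0/10",
        "192.168.0.0/16",
        "172.16.0.0/12",
    )
]


def goes_through_vpn(dest_ip, dest_port, uid_owner):
    """Model of the rules: True if the packet would get fwmark 0x51820."""
    # The OUTPUT rules only match port 53 (UDP or TCP) packets owned by 'unbound'.
    if uid_owner != "unbound" or dest_port != 53:
        return False
    addr = ipaddress.ip_address(dest_ip)
    # Any RETURN match leaves the packet unmarked, so it uses the main table.
    if any(addr in net for net in RETURN_NETWORKS):
        return False
    return True


print(goes_through_vpn("1.1.1.1", 53, "unbound"))        # True: public DNS via VPN
print(goes_through_vpn("100.100.94.71", 53, "unbound"))  # False: Tailscale range
print(goes_through_vpn("1.1.1.1", 443, "unbound"))       # False: not port 53
```

Marked packets match `ip rule ... fwmark 0x51820 table 51820` and leave through the WireGuard interface; everything else, including tailscaled's own traffic, follows the main routing table.
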
---

## Troubleshooting

### Symptom: Can't reach subnet (10.10.10.x) from remote

**Check 1:** Is PVE online and advertising routes?
```bash
tailscale status | grep pve
# Should show "active" not "offline"
```

**Check 2:** Is PVE the primary subnet router?
```bash
tailscale status --json | python3 -c "..."  # See above
```

**Check 3:** Can PVE reach the target on local network?
```bash
ssh pve 'ping -c 1 10.10.10.10'
```

### Symptom: Device shows "relay" with asymmetric traffic (high TX, low RX)

**Cause:** Usually a VPN or firewall blocking Tailscale's UDP traffic.

**Check:** Run netcheck on the affected device:
```bash
tailscale netcheck
```

Look for:
- Wrong external IP (indicates VPN routing issue)
- Missing DERP latencies
- `MappingVariesByDestIP: true` with no direct connections

### Symptom: Local devices can't reach each other

**Cause:** `--accept-routes=true` on a device that's directly on the subnet.

**Fix:**
```bash
# Check current setting
tailscale debug prefs | grep -i route

# Disable accept-routes
tailscale set --accept-routes=false
```

### Symptom: Gateway can ping Tailscale IPs but not local IPs

**Check routing:**
```bash
ip route get 10.10.10.120
# If it shows "dev tailscale0" instead of "dev br0", that's the problem
```

**Fix:** `tailscale set --accept-routes=false` on the gateway

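The routing check above can be scripted. Below is a small Python sketch that extracts the outgoing interface from `ip route get` output and flags the bad case. The function names are hypothetical and the sample lines are illustrative, not captured from the gateway.

```python
def route_device(ip_route_output):
    """Return the interface name that follows the 'dev' keyword, or None."""
    tokens = ip_route_output.split()
    for i, token in enumerate(tokens[:-1]):
        if token == "dev":
            return tokens[i + 1]
    return None


def is_routed_via_tailscale(ip_route_output, tailscale_iface="tailscale0"):
    """True when the kernel would send the packet out the Tailscale interface."""
    return route_device(ip_route_output) == tailscale_iface


# Illustrative output lines in the usual `ip route get` layout.
bad = "10.10.10.120 dev tailscale0 table 52 src 100.94.246.32 uid 0"
good = "10.10.10.120 dev br0 src 10.10.10.1 uid 0"
print(is_routed_via_tailscale(bad))   # True
print(is_routed_via_tailscale(good))  # False
```

On a real host the input string would come from running `ip route get <ip>`, for example through `subprocess.run(..., capture_output=True, text=True).stdout`.
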
---

## Maintenance Commands

### Restart Tailscale
```bash
# On Linux
systemctl restart tailscaled

# Check status
tailscale status
```

### Re-advertise Routes (PVE)
```bash
tailscale set --advertise-routes=10.10.10.0/24,10.10.20.0/24
```

### Check Connection Type
```bash
# Shows direct vs relay for each peer
tailscale status

# Detailed ping with path info
tailscale ping <tailscale-ip>
```

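For a scripted view of direct versus relayed peers, the JSON form of `tailscale status` can be classified with a few lines of Python. This is a sketch that assumes the peer fields `HostName`, `Online`, and `CurAddr` carry the meaning used below (verify against your Tailscale version); the sample data is made up.

```python
def classify_peers(status):
    """Map each peer's HostName to 'offline', 'direct', or 'relay or idle'."""
    result = {}
    for peer in (status.get("Peer") or {}).values():
        name = peer.get("HostName", "unknown")
        if not peer.get("Online", False):
            result[name] = "offline"
        elif peer.get("CurAddr"):
            # A non-empty CurAddr is taken to mean a direct UDP path is in use.
            result[name] = "direct"
        else:
            # No direct address: traffic is relayed via DERP, or the peer is idle.
            result[name] = "relay or idle"
    return result


# Made-up sample shaped like the status JSON; not captured from this tailnet.
SAMPLE_STATUS = {
    "Peer": {
        "key-a": {"HostName": "pve", "Online": True, "CurAddr": "203.0.113.7:41641"},
        "key-b": {"HostName": "ucg-fiber", "Online": True, "CurAddr": ""},
        "key-c": {"HostName": "phone", "Online": False, "CurAddr": ""},
    }
}

for host, kind in sorted(classify_peers(SAMPLE_STATUS).items()):
    print(f"{host}: {kind}")
```

With real data, load the dictionary from `tailscale status --json` via `json.load`. An idle peer also has no direct address, so run `tailscale ping <tailscale-ip>` first if you need to tell relay apart from idle.
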
### Force Re-connection
```bash
tailscale down && tailscale up
```

---

## Known Issues

### UCG-Fiber Relay-Only Connections
The UniFi gateway sometimes fails to establish direct Tailscale connections and falls back to relay. This appears related to memory pressure or the gateway's NAT implementation. Current workaround: use PVE as the subnet router instead.

### Gateway Memory Pressure
The UCG-Fiber has limited RAM (~3GB) and can become unstable under load. The internet-watchdog service auto-reboots the gateway if connectivity is lost. See [GATEWAY.md](GATEWAY.md).

---

## Change History

### 2026-01-05
- Switched subnet router from UCG-Fiber to PVE
- Fixed PiHole ProtonVPN from full-tunnel to split-tunnel (DNS-only)
- Disabled `--accept-routes` on UCG-Fiber and PiHole
- Documented critical configuration rules

---

**Last Updated:** 2026-01-05