UniFi Gateway (UCG-Fiber)

Documentation for the UniFi Cloud Gateway Fiber (10.10.10.1) - the primary network gateway and router.

Overview

| Property | Value |
|----------|-------|
| Device | UniFi Cloud Gateway Fiber (UCG-Fiber) |
| IP Address | 10.10.10.1 |
| SSH User | root |
| SSH Auth | SSH key (~/.ssh/id_ed25519) |
| Host Aliases | ucg-fiber, gateway |
| Firmware | v4.4.9 (as of 2026-01-02) |
| UniFi Core | 4.4.19 |
| RAM | 2.9 GB (shared with UniFi apps) |

SSH Access

SSH key authentication is configured. Use host aliases:

# Quick access
ssh ucg-fiber 'hostname'
ssh gateway 'free -m'

# Or use IP directly
ssh root@10.10.10.1 'uptime'

Note: The SSH key may need to be re-deployed after a firmware update if UniFi clears authorized_keys.


Monitoring Services

Two custom monitoring services run on the gateway to prevent and diagnose issues.

Internet Watchdog Service

Purpose: Auto-reboots gateway if internet connectivity is lost for 5+ minutes

Location: /data/scripts/internet-watchdog.sh

How it works:

  1. Pings 1.1.1.1, 8.8.8.8, 208.67.222.222 every 60 seconds
  2. If all three fail, increments failure counter
  3. After 5 consecutive failures (~5 minutes), triggers reboot
  4. Logs all activity to /var/log/internet-watchdog.log
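The steps above can be sketched in shell as follows. This is a minimal illustration, not the deployed /data/scripts/internet-watchdog.sh: the function names, the PING_CMD override, the ping flags, and the "Rebooting gateway" log message are all assumptions.

```shell
# Hypothetical sketch of the watchdog logic; not the deployed script.
TARGETS="1.1.1.1 8.8.8.8 208.67.222.222"   # hosts listed above
MAX_FAILURES=5                             # consecutive failed rounds before reboot
INTERVAL=60                                # seconds between rounds
LOG_FILE="${LOG_FILE:-/var/log/internet-watchdog.log}"
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"     # assumption: one probe, 2 s timeout

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $*" >> "$LOG_FILE"
}

# Succeeds if at least one target answers, so a single dead host
# does not count as an outage.
internet_up() {
    for host in $TARGETS; do
        $PING_CMD "$host" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Prints the new failure count for a given old count and logs the transition.
next_failure_count() {
    failures=$1
    if internet_up; then
        [ "$failures" -gt 0 ] && log "Internet restored after $failures failures"
        echo 0
    else
        failures=$((failures + 1))
        log "Internet check failed ($failures/$MAX_FAILURES)"
        echo "$failures"
    fi
}

# Main loop: one check per INTERVAL, reboot once MAX_FAILURES is reached.
watchdog_loop() {
    failures=0
    log "Watchdog started"
    while true; do
        failures=$(next_failure_count "$failures")
        if [ "$failures" -ge "$MAX_FAILURES" ]; then
            log "Rebooting gateway"        # assumed message, not from the real log
            reboot
        fi
        sleep "$INTERVAL"
    done
}
```

Requiring all three targets to fail before counting a failure keeps one unreachable DNS provider from triggering a reboot.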

Commands:

# Check service status
ssh ucg-fiber 'systemctl status internet-watchdog'

# View recent logs
ssh ucg-fiber 'tail -50 /var/log/internet-watchdog.log'

# Stop temporarily (if troubleshooting)
ssh ucg-fiber 'systemctl stop internet-watchdog'

# Restart
ssh ucg-fiber 'systemctl restart internet-watchdog'

Log Format:

2026-01-02 22:45:01 - Watchdog started
2026-01-02 22:46:01 - Internet check failed (1/5)
2026-01-02 22:47:01 - Internet restored after 1 failures
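For a quick overview of a long log, the entries can be counted with awk. A small sketch, assuming the message wording shown above; the function name watchdog_summary is hypothetical.

```shell
# watchdog_summary FILE
# Counts failed checks and recoveries in a watchdog log (format as shown above).
watchdog_summary() {
    awk '
        /Internet check failed/ { failed++ }
        /Internet restored/     { restored++ }
        END { printf "failed_checks=%d recoveries=%d\n", failed, restored }
    ' "$1"
}
```

To run it against the gateway from a workstation, pipe the log in and pass - as the file: ssh ucg-fiber 'cat /var/log/internet-watchdog.log' | watchdog_summary -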

Memory Monitor Service

Purpose: Logs memory usage and top processes every 10 minutes for diagnostics

Location: /data/scripts/memory-monitor.sh

Log File: /data/logs/memory-history.log

How it works:

  1. Every 10 minutes, logs current memory usage (free -m)
  2. Logs top 12 memory-consuming processes
  3. Auto-rotates log when it exceeds 10 MB (keeps one .old file)
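A minimal sketch of this behaviour, for reference when rebuilding the script. It is not the deployed /data/scripts/memory-monitor.sh; the function names and the exact rotation check are assumptions.

```shell
# Hypothetical sketch of the memory monitor; not the deployed script.
LOG_FILE="${LOG_FILE:-/data/logs/memory-history.log}"
MAX_BYTES=$((10 * 1024 * 1024))   # rotate above 10 MB
INTERVAL=600                      # 10 minutes

# Move the log aside once it grows past the limit, keeping a single .old copy.
rotate_if_needed() {
    [ -f "$LOG_FILE" ] || return 0
    size=$(( $(wc -c < "$LOG_FILE") ))
    if [ "$size" -gt "$MAX_BYTES" ]; then
        mv -f "$LOG_FILE" "$LOG_FILE.old"
    fi
}

# Append one snapshot: timestamp header, free -m output, top processes by RSS.
write_snapshot() {
    {
        echo "========== $(date '+%Y-%m-%d %H:%M:%S') =========="
        echo "--- MEMORY ---"
        free -m
        echo "--- TOP MEMORY PROCESSES ---"
        ps -eo pid,rss,comm --sort=-rss | head -13   # header line + 12 processes
    } >> "$LOG_FILE"
}

monitor_loop() {
    while true; do
        rotate_if_needed
        write_snapshot
        sleep "$INTERVAL"
    done
}
```

Overwriting the previous .old file on each rotation caps disk use at roughly twice the size limit.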

Commands:

# Check service status
ssh ucg-fiber 'systemctl status memory-monitor'

# View recent memory history
ssh ucg-fiber 'tail -100 /data/logs/memory-history.log'

# Check current memory usage
ssh ucg-fiber 'free -m'

# See the top 12 memory consumers right now (header line + 12 processes)
ssh ucg-fiber 'ps -eo pid,rss,comm --sort=-rss | head -13'

Log Format:

========== 2026-01-02 22:30:00 ==========
--- MEMORY ---
              total        used        free      shared  buff/cache   available
Mem:           2892        1890         102         456         899        1002
Swap:           512          88         424
--- TOP MEMORY PROCESSES ---
  PID   RSS COMMAND
 1234 327456 unifi-protect
 2345 252108 mongod
 3456 236544 java
...
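Because each snapshot spans many lines, a one-line-per-snapshot view makes trends easier to spot. The sketch below assumes the exact layout shown above (timestamp header, then free -m output with "available" as the seventh field of the Mem: line); the function name summarize_memory_log is hypothetical.

```shell
# summarize_memory_log FILE
# Prints one line per snapshot: timestamp, available memory, swap used.
summarize_memory_log() {
    awk '
        /^========== / { ts = $2 " " $3 }      # snapshot header carries the timestamp
        /^Mem:/        { avail = $7 }          # 7th field of free -m is "available"
        /^Swap:/       { print ts, "available=" avail "MB", "swap_used=" $3 "MB" }
    ' "$1"
}
```

Usage from a workstation: ssh ucg-fiber 'cat /data/logs/memory-history.log' | summarize_memory_log -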

Known Memory Consumers

| Process | Typical Memory | Purpose |
|---------|----------------|---------|
| unifi-protect | ~320 MB | Camera/NVR management |
| mongod | ~250 MB | UniFi configuration database |
| java (controller) | ~230 MB | UniFi Network controller |
| postgres | ~180 MB | PostgreSQL database |
| unifi-core | ~150 MB | UniFi OS core |
| tailscaled | ~80 MB | Tailscale VPN |

  • Total available: ~2.9 GB
  • Typical usage: ~1.8-2.0 GB (leaves ~1 GB free)
  • Warning threshold: <500 MB free
  • Critical: <200 MB free or swap >50% used
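The warning and critical thresholds can be turned into a small check. This sketch assumes "free" here means the "available" column of free -m (the ~1 GB figure matches that column in the log sample); the function names are hypothetical.

```shell
# classify_memory AVAILABLE_MB SWAP_USED_PERCENT
# Prints ok, warning, or critical using the thresholds listed above.
classify_memory() {
    available_mb=$1
    swap_pct=$2
    if [ "$available_mb" -lt 200 ] || [ "$swap_pct" -gt 50 ]; then
        echo critical
    elif [ "$available_mb" -lt 500 ]; then
        echo warning
    else
        echo ok
    fi
}

# Reads the live values from free -m and classifies them.
# Assumes a procps free that prints an "available" column.
current_memory_state() {
    available_mb=$(free -m | awk '/^Mem:/ { print $7 }')
    swap_pct=$(free -m | awk '/^Swap:/ { pct = ($2 > 0) ? int($3 * 100 / $2) : 0; print pct }')
    classify_memory "$available_mb" "$swap_pct"
}
```

For the sample snapshot in the log format (1002 MB available, 88 of 512 MB swap used, about 17%), classify_memory 1002 17 prints ok.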


Disabled Services

The following services were disabled to reduce memory usage:

| Service | Memory Saved | Reason Disabled |
|---------|--------------|-----------------|
| UniFi Connect | ~200 MB | Not needed (cameras use Protect) |

To re-enable if needed:

ssh ucg-fiber 'systemctl enable unifi-connect && systemctl start unifi-connect'

Common Issues

Gateway Freeze / Network Loss

Symptoms:

  • All devices lose internet
  • Cannot ping 10.10.10.1
  • Physical reboot required

Root Cause: Memory exhaustion causing soft lockup

Prevention:

  1. Internet watchdog auto-reboots after 5 min outage
  2. Memory monitor logs help identify runaway processes
  3. UniFi Connect disabled to free ~200 MB

Post-Incident Analysis:

# Check memory history for spike before freeze
ssh ucg-fiber 'grep -B5 "Swap:" /data/logs/memory-history.log | tail -50'

# Check watchdog logs
ssh ucg-fiber 'cat /var/log/internet-watchdog.log'

# Check system logs for errors
ssh ucg-fiber 'dmesg | tail -100'
ssh ucg-fiber 'journalctl -p err --since "1 hour ago"'

High Memory Usage

Check current state:

ssh ucg-fiber 'free -m && echo "---" && ps -eo pid,rss,comm --sort=-rss | head -15'

If swap is heavily used:

# Check swap usage
ssh ucg-fiber 'cat /proc/swaps'

# See what's in swap
ssh ucg-fiber 'for pid in $(ls /proc | grep -E "^[0-9]+$"); do
  swap=$(grep VmSwap /proc/$pid/status 2>/dev/null | awk "{print \$2}");
  [ "$swap" -gt 10000 ] 2>/dev/null && echo "$pid: ${swap}kB - $(cat /proc/$pid/comm)";
done | sort -t: -k2 -rn | head -10'

Consider reboot if:

  • Available memory <200 MB
  • Swap usage >300 MB
  • System becoming unresponsive

Tailscale Issues

Check Tailscale status:

ssh ucg-fiber 'tailscale status'

Common errors and fixes:

| Error | Fix |
|-------|-----|
| DNS resolution failed | Check upstream DNS (Pi-hole at 10.10.10.10) |
| TLS handshake failed | Usually temporary; Tailscale auto-reconnects |
| Not connected | ssh ucg-fiber 'tailscale up' |

Firmware Updates

Check current version:

ssh ucg-fiber 'ubnt-systool version'

Update process:

  1. Check UniFi site for latest stable firmware
  2. Download via UI or CLI
  3. Schedule update during low-usage time

After update:

  • Verify SSH key still works
  • Check custom services still running
  • Verify Tailscale reconnects
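The three post-update checks can be run in one pass from a workstation. A sketch only: the function name, the SSH_CMD override, and the output wording are assumptions, and it relies on tailscale status exiting non-zero when the node is not connected.

```shell
# post_update_check [HOST]
# Runs the post-update checklist and returns non-zero if any check fails.
post_update_check() {
    host="${1:-ucg-fiber}"
    ssh_cmd="${SSH_CMD:-ssh -o BatchMode=yes}"   # BatchMode: fail instead of prompting
    status=0

    # 1. SSH key still accepted
    if ! $ssh_cmd "$host" true; then
        echo "FAIL: SSH key login (re-deploy with ssh-copy-id)"
        return 1
    fi
    echo "OK: SSH key login"

    # 2. Custom services still running
    for svc in internet-watchdog memory-monitor; do
        if $ssh_cmd "$host" "systemctl is-active --quiet $svc"; then
            echo "OK: $svc running"
        else
            echo "FAIL: $svc not running"
            status=1
        fi
    done

    # 3. Tailscale reconnected
    if $ssh_cmd "$host" "tailscale status >/dev/null 2>&1"; then
        echo "OK: Tailscale connected"
    else
        echo "FAIL: Tailscale not connected"
        status=1
    fi
    return "$status"
}
```

The SSH check comes first and returns early, since every later check depends on it.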

Re-deploy SSH key if needed:

ssh-copy-id -i ~/.ssh/id_ed25519 root@10.10.10.1

Service Locations

| File | Purpose |
|------|---------|
| /data/scripts/internet-watchdog.sh | Watchdog script |
| /data/scripts/memory-monitor.sh | Memory monitor script |
| /etc/systemd/system/internet-watchdog.service | Watchdog systemd unit |
| /etc/systemd/system/memory-monitor.service | Memory monitor systemd unit |
| /var/log/internet-watchdog.log | Watchdog log |
| /data/logs/memory-history.log | Memory history log |

Note: /data/ persists across firmware updates. /var/log/ may not.
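If a unit file under /etc/systemd/system/ is lost, it can be regenerated from the script path. The unit contents below are an assumption (a plain long-running service restarted on failure), not a copy of the deployed units, and install_unit is a hypothetical helper.

```shell
# install_unit NAME SCRIPT [UNIT_DIR]
# Writes a simple systemd unit that runs SCRIPT as a long-lived service.
install_unit() {
    name=$1
    script=$2
    unit_dir="${3:-/etc/systemd/system}"
    cat > "$unit_dir/$name.service" <<EOF
[Unit]
Description=$name (custom homelab service)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/bin/sh $script
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
}
```

On the gateway, follow it with systemctl daemon-reload and systemctl enable --now internet-watchdog (or memory-monitor) to load and start the unit.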


Quick Reference Commands

# System status
ssh ucg-fiber 'uptime && free -m'

# Check both monitoring services
ssh ucg-fiber 'systemctl status internet-watchdog memory-monitor'

# Recent memory history (last 60 lines)
ssh ucg-fiber 'tail -60 /data/logs/memory-history.log'

# Watchdog activity
ssh ucg-fiber 'tail -20 /var/log/internet-watchdog.log'

# Network devices (ARP table)
ssh ucg-fiber 'cat /proc/net/arp'

# Tailscale status
ssh ucg-fiber 'tailscale status'

# System logs
ssh ucg-fiber 'journalctl -p warning --since "1 hour ago" | head -50'

Backup Considerations

Custom scripts in /data/scripts/ persist across firmware updates, but after a major update you may need to:

  • Re-enable the systemd services
  • Re-apply script permissions if they were wiped

Backup critical files:

# Copy scripts locally for reference
# (the remote glob is quoted so the local shell does not try to expand it)
scp 'ucg-fiber:/data/scripts/*.sh' ~/Projects/homelab/data/scripts/
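Since the systemd units live outside /data/, it can help to back them up alongside the scripts. A sketch under assumptions: the function name, the SCP_CMD override, and the systemd/ subdirectory are hypothetical; the paths come from the Service Locations table.

```shell
# backup_gateway_files [HOST] [DEST_DIR]
# Copies the custom scripts and their systemd units to a local directory.
backup_gateway_files() {
    host="${1:-ucg-fiber}"
    dest="${2:-$HOME/Projects/homelab/data}"
    scp_cmd="${SCP_CMD:-scp}"

    mkdir -p "$dest/scripts" "$dest/systemd" || return 1

    # Quoted so the glob is expanded on the gateway, not locally.
    $scp_cmd "$host:/data/scripts/*.sh" "$dest/scripts/" || return 1

    for unit in internet-watchdog memory-monitor; do
        $scp_cmd "$host:/etc/systemd/system/$unit.service" "$dest/systemd/" || return 1
    done
}
```

Run it as backup_gateway_files with no arguments to use the ucg-fiber alias and the default local path.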


Incident History

2025-12-27 to 2025-12-29: Gateway Freeze

Timeline:

  • Dec 7: Firmware update to v4.4.9
  • Dec 24: Last healthy system logs
  • Dec 27-29: "No internet detected" errors in logs
  • Dec 29+: Complete silence (gateway frozen)
  • Jan 2: Physical reboot restored access

Root Cause: Memory exhaustion causing soft lockup (no crash dump saved)

Resolution:

  • Deployed internet-watchdog service
  • Deployed memory-monitor service
  • Disabled UniFi Connect (~200 MB saved)
  • Configured SSH key auth

Last Updated: 2026-01-02