homelab-docs/EMC-ENCLOSURE.md
Hutson 93821d1557 Initial commit: Homelab infrastructure documentation
2025-12-20 02:31:02 -05:00

EMC Storage Enclosure Documentation

Hardware Overview

| Component | Details |
|-----------|---------|
| Model | EMC ESES Viper DAE (KTN-STL3) |
| Capacity | 15x 3.5" SAS/SATA drive bays |
| SES Device | /dev/sg15 (on TrueNAS) |
| Connection | SAS to LSI SAS2308 HBA (mpt2sas driver) |
| Location | Connected to PVE (10.10.10.120) via TrueNAS VM |

Components

The enclosure has dual LCC controllers for redundancy:

| Controller | Slot | Status | Notes |
|------------|------|--------|-------|
| LCC A | Left | Working | Currently in use |
| LCC B | Right | Faulty | Causes high fan speed, SAS discovery failure |

Replacement Part: EMC 303-108-000E VIPER 6G SAS LCC (~$15 on eBay)

Power Supplies

Two redundant PSUs with integrated fans.

Fans

Multiple cooling fans are controlled by the enclosure firmware. Fan speeds are managed automatically based on temperature; manual override is not supported on EMC ESES enclosures.

Fan Speed Codes:

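| Code | Description | RPM (approx) |
|------|-------------|--------------|
| 1 | Lowest | ~1500 |
| 2 | Second lowest | ~2000 |
| 3 | Third lowest | ~2670 |
| 4 | Medium | ~3300 |
| 5 | Fifth | ~4160 |
| 6 | Sixth | ~4800 |
| 7 | Highest | ~5500+ |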
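
For scripting, the speed-code table can be expressed as a small lookup. This is a minimal sketch: `fan_code_to_rpm` is a hypothetical helper name, and the RPM figures are the approximate values from the table, not readings taken from the enclosure.

```shell
# Hypothetical helper: map an SES cooling speed code (1-7) to the
# approximate RPM listed in the fan speed code table.
fan_code_to_rpm() {
  case "$1" in
    1) echo "~1500" ;;
    2) echo "~2000" ;;
    3) echo "~2670" ;;
    4) echo "~3300" ;;
    5) echo "~4160" ;;
    6) echo "~4800" ;;
    7) echo "~5500+" ;;
    *) echo "unknown"; return 1 ;;   # anything outside 1-7
  esac
}
```

For example, `fan_code_to_rpm 5` prints `~4160`, the speed seen during the LCC B incident.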

ZFS Pool Using This Enclosure

Pool: vault
Size: 164TB raidz1
Drives: 13x HDD in raidz1 + special mirror + NVMe cache/log
Mount: /mnt/vault on TrueNAS

SES Commands Reference

All commands run from TrueNAS (VM 100):

# Check overall enclosure status
sg_ses -p 0x02 /dev/sg15

# Check fan speeds
sg_ses --index=coo,-1 --get=speed_code /dev/sg15

# Check temperatures
sg_ses -p 0x02 /dev/sg15 | grep -E "(Temperature|Cooling)"

# Check PSU status
sg_ses -p 0x02 /dev/sg15 | grep -A5 "Power supply"

# Check LCC controller status
sg_ses -p 0x02 /dev/sg15 | grep -A5 "Enclosure services controller"

# List all SES elements
sg_ses -p 0x07 /dev/sg15

# Identify enclosure (flash LEDs); use --clear=ident to stop
sg_ses --index=enc,0 --set=ident=1 /dev/sg15

Running SES Commands via Proxmox

# From Mac (via SSH key auth)
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15"'

# Quick fan check
ssh pve 'qm guest exec 100 -- bash -c "sg_ses --index=coo,-1 --get=speed_code /dev/sg15"'

# Quick temp check
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15 | grep Temperature"'
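
The repeated `ssh pve 'qm guest exec 100 -- bash -c "..."'` pattern could be wrapped in a function. The sketch below is one possible shape, not an existing alias: `PVE_HOST`, `TRUENAS_VMID`, `build_guest_cmd` and `emc_ses` are all hypothetical names, with defaults taken from the commands above.

```shell
# Hypothetical wrapper around the ssh + qm guest exec pattern.
# PVE_HOST and TRUENAS_VMID are assumed names; the defaults match the
# values used in this document.
PVE_HOST="${PVE_HOST:-pve}"
TRUENAS_VMID="${TRUENAS_VMID:-100}"

# Build the command line that will run on the Proxmox host.
# Limitation: the guest command must not contain double quotes, dollar
# signs, backticks or backslashes, because it is embedded in a
# double-quoted string that the PVE shell parses.
build_guest_cmd() {
  printf 'qm guest exec %s -- bash -c "%s"\n' "$TRUENAS_VMID" "$*"
}

# Run a command inside the TrueNAS VM via the QEMU guest agent.
emc_ses() {
  ssh "$PVE_HOST" "$(build_guest_cmd "$@")"
}
```

Usage would then look like `emc_ses sg_ses -p 0x02 /dev/sg15`, or `emc_ses "sg_ses -p 0x02 /dev/sg15 | grep Temperature"` when a pipe has to run inside the guest.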

Troubleshooting

Symptom: Fans Running Loud (Speed 5+)

Possible Causes:

  1. Faulty LCC controller - Switch to other LCC
  2. High temperatures - Check temp sensors
  3. PSU issue - Check PSU status via SES
  4. Failed drive - Check drive status LEDs

Diagnosis Steps:

# 1. Check current fan speed
ssh pve 'qm guest exec 100 -- bash -c "sg_ses --index=coo,-1 --get=speed_code /dev/sg15"'
# Normal: 1-3, High: 4-5, Critical: 6-7

# 2. Check temperatures
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15 | grep Temperature"'
# Normal: 25-40C, Warning: 45-50C, Critical: 55C+

# 3. Check for component failures
ssh pve 'qm guest exec 100 -- bash -c "sg_ses -p 0x02 /dev/sg15 | grep -i fail"'

# 4. If no obvious cause, try switching LCC
# Power down enclosure, move SAS cable to other LCC port
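
The thresholds quoted in the comments above can be applied mechanically. This is a rough sketch with hypothetical helper names. The documented temperature bands leave gaps (41-44C and 51-54C), so the code assumes a reading in a gap belongs to the next band up; that choice is an assumption, not something the enclosure reports.

```shell
# Hypothetical helpers applying the thresholds from the diagnosis steps.

# Fan speed code: Normal 1-3, High 4-5, Critical 6-7.
classify_fan_code() {
  case "$1" in
    1|2|3) echo "normal" ;;
    4|5)   echo "high" ;;
    6|7)   echo "critical" ;;
    *)     echo "unknown"; return 1 ;;
  esac
}

# Temperature in whole, non-negative degrees C.
# Documented bands: Normal 25-40, Warning 45-50, Critical 55+.
# Assumption: 41-44 is treated as warning, 51-54 as critical, and
# anything below 25 as normal.
classify_temp_c() {
  case "$1" in
    ''|*[!0-9]*) echo "unknown"; return 1 ;;   # reject non-integer input
  esac
  if [ "$1" -ge 51 ]; then
    echo "critical"
  elif [ "$1" -ge 41 ]; then
    echo "warning"
  else
    echo "normal"
  fi
}
```

Both functions take the value as an argument rather than parsing `sg_ses` output, so they do not depend on a particular output layout.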

Symptom: Drives Not Detected After Enclosure Power Cycle

Possible Causes:

  1. Enclosure not fully initialized (wait for green LEDs to stop blinking)
  2. Faulty LCC controller
  3. SAS cable loose
  4. HBA needs rescan

Diagnosis Steps:

# 1. Check SAS link status
cat /sys/class/sas_phy/*/negotiated_linkrate

# 2. Check for expanders (should show enclosure)
lsscsi -g | grep -i enclo

# 3. Force HBA rescan (all SCSI hosts, since the HBA is not necessarily host0)
for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done

# 4. If no expander, check SAS cable and try other LCC port
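
To summarize the link check from step 1, the per-PHY values can be counted. A minimal sketch, assuming a hypothetical helper `count_linked_phys` and assuming that a linked PHY reports a rate ending in `Gbit` (for example `6.0 Gbit`) while an unlinked one reports something else such as `Unknown`.

```shell
# Hypothetical helper: count PHYs that negotiated a link.
# Reads negotiated_linkrate values on stdin, one per line.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean while still printing the count.
count_linked_phys() {
  grep -c 'Gbit' || true
}

# Example (run on TrueNAS):
#   cat /sys/class/sas_phy/*/negotiated_linkrate | count_linked_phys
```

A non-zero count combined with no enclosure in `lsscsi` matches the pattern from the LCC B incident: the link is up but the expander is not responding, which points at the LCC rather than the cable.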

Symptom: Pool Won't Import After Enclosure Maintenance

# 1. Wait for enclosure to fully initialize (1-2 minutes)

# 2. Rescan for devices (all SCSI hosts, since the HBA is not necessarily host0)
for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done

# 3. Import pool
zpool import vault

# 4. If read-only mount issues, reboot TrueNAS
ssh pve 'qm reboot 100'
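
Instead of waiting a fixed 1-2 minutes, the wait in step 1 could poll until the enclosure is visible. This is a sketch under stated assumptions: `enclosure_present` and `wait_for_enclosure` are hypothetical names, the check reuses the `lsscsi -g | grep -i enclo` test from the previous section, and the 120 s / 5 s defaults are arbitrary.

```shell
# Returns 0 if an enclosure device is visible to the SCSI layer.
enclosure_present() {
  lsscsi -g | grep -qi enclo
}

# Hypothetical helper: poll until the enclosure shows up or the timeout
# expires. Arguments: timeout in seconds (default 120) and poll interval
# in seconds (default 5); both defaults are assumptions.
wait_for_enclosure() {
  timeout="${1:-120}"
  interval="${2:-5}"
  waited=0
  while ! enclosure_present; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "enclosure not visible after ${waited}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}
```

On TrueNAS this could be chained as `wait_for_enclosure 120 5 && zpool import vault`, so the import is only attempted once the enclosure has been discovered.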

Maintenance Procedures

Safe Shutdown for Enclosure Maintenance

# 1. Stop services using the pool
ssh pve 'qm guest exec 101 -- bash -c "docker stop \$(docker ps -q)"'

# 2. Shutdown TrueNAS (auto-exports ZFS pool)
ssh pve 'qm shutdown 100 --timeout 120'

# 3. Wait for TrueNAS to fully stop
ssh pve 'while qm status 100 | grep -q running; do sleep 5; done'

# 4. Power off enclosure
# (Physical switch or PDU)

# 5. Perform maintenance

# 6. Power on enclosure, wait for initialization (green LEDs solid)

# 7. Start TrueNAS
ssh pve 'qm start 100'

# 8. Verify pool imported
ssh pve 'qm guest exec 100 -- bash -c "zpool status vault"'
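
Step 8 can be reduced to a pass/fail check for use in scripts. The sketch below is meant to run on TrueNAS itself and uses `zpool list -H -o health`, which prints only the health column without a header; `pool_is_online` is a hypothetical helper name.

```shell
# Hypothetical helper, intended to run on TrueNAS: succeed only if the
# named pool reports ONLINE. Any other state (DEGRADED, FAULTED, ...) or
# a pool that is not imported makes the function return non-zero.
pool_is_online() {
  [ "$(zpool list -H -o health "$1" 2>/dev/null)" = "ONLINE" ]
}
```

For example, `pool_is_online vault || zpool status -x vault` stays quiet when the pool is healthy and prints the problem report otherwise. A DEGRADED result after maintenance would suggest that a drive did not come back.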

Hot-Swap LCC Controller

LCCs can be hot-swapped while the enclosure is running:

  1. Order replacement LCC (EMC 303-108-000E)
  2. Move SAS cable to working LCC (if not already)
  3. Wait for drives to come online via new LCC
  4. Remove faulty LCC
  5. Install replacement LCC
  6. Optionally move SAS cable back to original port

Incident Log

2024-12-19: LCC B Failure

Symptoms:

  • Fans running at speed code 5 (~4160 RPM) - very loud
  • After enclosure power cycle, drives not detected
  • SAS link UP (4 PHYs at 6.0 Gbit) but no expander discovery

Root Cause: LCC B controller malfunction causing:

  • False temperature/error readings → high fan speed
  • SAS expander not responding → drives not enumerated

Resolution:

  1. Moved SAS cable from LCC B to LCC A
  2. Drives immediately appeared
  3. Fan speed dropped to code 3 (~2670 RPM) - quiet
  4. Imported vault pool, all data intact

Replacement Ordered:

  • Part: EMC 303-108-000E VIPER 6G SAS LCC
  • Source: eBay
  • Price: $14.95 + free shipping

LED Status Reference

Drive LEDs

| LED | Status | Meaning |
|-----|--------|---------|
| Solid Blue | Power | Drive has power |
| Blinking Blue | Activity | I/O in progress |
| Solid Amber | Fault | Drive failed |
| Blinking Amber | Identify | Drive being located |

LCC LEDs

| LED | Status | Meaning |
|-----|--------|---------|
| Solid Green | Link | SAS connection active |
| Blinking Green | Activity | Data transfer |
| Amber | Fault | LCC issue |

PSU LEDs

| LED | Status | Meaning |
|-----|--------|---------|
| Solid Green | OK | Power supply healthy |
| Off | No Power | No AC input |
| Amber | Fault | PSU failure |