Auto-sync: 20260116-150510
@@ -16,8 +16,9 @@ Documentation for system monitoring, health checks, and alerting across the home
| **Network** | ✅ Partial | Gateway watchdog | ✅ Auto-reboot | Connectivity check every 60s |
| **Services** | ❌ No | - | ❌ No | No health checks |
| **Backups** | ❌ No | - | ❌ No | No verification |
| **Claude Code** | ✅ Yes | Prometheus + Grafana | ✅ Yes | Token usage, burn rate, cost tracking |

**Overall Status**: ⚠️ **PARTIAL** - Gateway monitoring active, most else is manual

**Overall Status**: ⚠️ **PARTIAL** - Gateway monitoring active, Claude Code active, most else is manual

---
@@ -87,6 +88,102 @@ ssh ucg-fiber 'free -m && ps -eo pid,rss,comm --sort=-rss | head -12'
---

### Claude Code Token Monitoring

**Status**: ✅ **Active with alerts**

Monitors Claude Code token usage across all machines to track subscription consumption and prevent hitting weekly limits.

**Architecture**:
```
Claude Code (MacBook/Mac Mini)
        │
        ▼ (OpenTelemetry Prometheus exporter :9464)
        │
Prometheus (docker-host:9090)
        │
        ├──► Grafana Dashboard
        │
        └──► Alertmanager (burn rate alerts)
```

**Monitored Devices**:
| Device | IP Address | Metrics Port |
|--------|------------|--------------|
| MacBook | 10.10.10.147 | 9464 |
| Mac Mini | 10.10.10.123 | 9464 |

**What's monitored**:
- Token usage (input/output/cache) over time
- Burn rate (tokens/hour)
- Cost tracking (USD)
- Usage by model (Opus, Sonnet, Haiku)
- Session count
- Per-device breakdown

**Dashboard**: https://grafana.htsn.io/d/claude-code-usage/claude-code-token-usage

**Alerts Configured**:
| Alert | Threshold | Severity |
|-------|-----------|----------|
| High Burn Rate | >100k tokens/hour for 15min | Warning |
| Weekly Limit Risk | Projected >5M tokens/week | Critical |
| No Metrics | Scrape fails for 5min | Info |
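
The thresholds in this table pair a condition with a hold time ("for 15min"), so a short spike should not page anyone. The Python sketch below illustrates that behaviour for the High Burn Rate row only. It assumes samples arrive as `(unix_timestamp, tokens_per_hour)` pairs in time order; the function name and the sample layout are hypothetical, and the real evaluation is done by Prometheus from the rules file, not by code like this.

```python
from typing import Iterable, Tuple

BURN_RATE_THRESHOLD = 100_000   # tokens/hour, from the "High Burn Rate" row
HOLD_SECONDS = 15 * 60          # "for 15min"


def high_burn_rate_fires(samples: Iterable[Tuple[float, float]],
                         threshold: float = BURN_RATE_THRESHOLD,
                         hold_seconds: float = HOLD_SECONDS) -> bool:
    """Return True if the rate stays above threshold for hold_seconds.

    samples: (unix_timestamp, tokens_per_hour) pairs in time order.
    """
    breach_start = None
    for ts, rate in samples:
        if rate > threshold:
            if breach_start is None:
                breach_start = ts          # condition just became true
            if ts - breach_start >= hold_seconds:
                return True                # held long enough: alert fires
        else:
            breach_start = None            # condition cleared: reset the timer
    return False
```

A ten-minute burst above 100k tokens/hour that then drops back resets the timer, so the sketch reports no alert for it.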

**Configuration Files**:
- Claude settings: `~/.claude/settings.json` (on each Mac)
- Prometheus scrape: `/opt/monitoring/prometheus/prometheus.yml` (docker-host)
- Alert rules: `/opt/monitoring/prometheus/rules/claude-code.yml` (docker-host)

**Claude Code Settings** (in `~/.claude/settings.json`):
```json
{
  "env": {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "prometheus",
    "OTEL_EXPORTER_PROMETHEUS_PORT": "9464",
    "OTEL_METRIC_EXPORT_INTERVAL": "60000"
  }
}
```
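
Because the same four variables have to be present on each Mac, a small check can catch a typo or a missing key before it shows up as a gap in the dashboard. The following is a minimal Python sketch, assuming the expected values are exactly those in the settings block above; `telemetry_problems` and `load_settings` are hypothetical helper names, not part of Claude Code.

```python
import json
from pathlib import Path

# Keys and values copied from the settings block above.
EXPECTED_ENV = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "prometheus",
    "OTEL_EXPORTER_PROMETHEUS_PORT": "9464",
    "OTEL_METRIC_EXPORT_INTERVAL": "60000",
}


def telemetry_problems(settings: dict) -> list[str]:
    """Return readable problems; an empty list means the env block matches."""
    env = settings.get("env", {})
    problems = []
    for key, expected in EXPECTED_ENV.items():
        actual = env.get(key)
        if actual is None:
            problems.append(f"{key} is missing")
        elif actual != expected:
            problems.append(f"{key} is {actual!r}, expected {expected!r}")
    return problems


def load_settings(path: Path = Path.home() / ".claude" / "settings.json") -> dict:
    """Read the settings file from the location named in this section."""
    return json.loads(path.read_text())
```

Running `telemetry_problems(load_settings())` on a Mac would list any mismatches; an empty list means the telemetry settings match this document.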

**Prometheus Scrape Config**:
```yaml
- job_name: "claude-code"
  scrape_interval: 60s
  static_configs:
    - targets: ["10.10.10.147:9464"]
      labels:
        device: "macbook"
    - targets: ["10.10.10.123:9464"]
      labels:
        device: "mac-mini"
```
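
When a target shows as down, it can help to scrape it by hand and confirm that the token counter is exposed at all. The Python sketch below parses Prometheus text output and sums one metric. It rests on two assumptions: that the exporter serves its metrics at the `/metrics` path, and that the output uses the plain `name{labels} value` form without timestamps. `sum_metric` and `fetch_total` are hypothetical helper names.

```python
import re
import urllib.request

# Matches `name{labels} value` or `name value`; a simplification of the
# Prometheus text format (no timestamps, no braces inside label values).
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)$')


def sum_metric(exposition: str, metric: str = "claude_code_token_usage_total") -> float:
    """Sum every series of `metric` in a block of Prometheus text output."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks, HELP and TYPE lines
        match = _SAMPLE.match(line)
        if match and match.group(1) == metric:
            total += float(match.group(2))
    return total


def fetch_total(target: str, timeout: float = 5.0) -> float:
    """Scrape one target by hand; the /metrics path is an assumption."""
    with urllib.request.urlopen(f"http://{target}/metrics", timeout=timeout) as resp:
        return sum_metric(resp.read().decode("utf-8"))
```

For example, `fetch_total("10.10.10.147:9464")` would query the MacBook target from the scrape config, provided Claude Code is running there.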

**Useful PromQL Queries**:
```promql
# Total tokens this session
sum(claude_code_token_usage_total)

# Burn rate (tokens/hour)
sum(rate(claude_code_token_usage_total[1h])) * 3600

# Usage by device
sum(claude_code_token_usage_total) by (device)

# Projected weekly usage
sum(increase(claude_code_token_usage_total[24h])) * 7
```
```
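
The arithmetic behind the burn-rate and projection queries is simple enough to check by hand. The Python sketch below mirrors it under simplifying assumptions: the counter does not reset inside the window, and the two readings sit exactly at the window edges (Prometheus itself also handles resets and extrapolation, which this does not). The 5M figure comes from the Weekly Limit Risk alert; the function names are hypothetical.

```python
WEEKLY_LIMIT_ALERT = 5_000_000  # tokens/week, from the "Weekly Limit Risk" alert


def burn_rate_per_hour(first: float, last: float, window_seconds: float) -> float:
    """Mirror rate(counter[window]) * 3600 for a counter that did not reset."""
    return (last - first) * 3600 / window_seconds


def projected_weekly(tokens_last_24h: float) -> float:
    """Mirror increase(counter[24h]) * 7: assume every day looks like the last."""
    return tokens_last_24h * 7


def weekly_limit_at_risk(tokens_last_24h: float,
                         limit: float = WEEKLY_LIMIT_ALERT) -> bool:
    """True when the 24h-based projection exceeds the weekly alert threshold."""
    return projected_weekly(tokens_last_24h) > limit
```

Note that the projection extrapolates a single day, so one unusually heavy day can trip the weekly alert even when the rest of the week is quiet.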

**Important Notes**:
- Claude Code must be restarted after changing telemetry settings
- Metrics only flow while Claude Code is running
- Weekly subscription resets Monday 1am (America/New_York)
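
To relate current usage to the weekly window, it helps to know when the next reset falls. The following Python sketch computes the next Monday 1am in America/New_York from the note above. It assumes `now` is a timezone-aware `datetime` and that time zone data is available to `zoneinfo`; `next_weekly_reset` is a hypothetical helper name.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

RESET_TZ = ZoneInfo("America/New_York")
RESET_WEEKDAY = 0          # Monday, in datetime.weekday() numbering
RESET_TIME = time(1, 0)    # 1am local time


def next_weekly_reset(now: datetime) -> datetime:
    """Return the first Monday 01:00 New York time strictly after `now`."""
    local = now.astimezone(RESET_TZ)
    days_ahead = (RESET_WEEKDAY - local.weekday()) % 7
    candidate = datetime.combine(local.date() + timedelta(days=days_ahead),
                                 RESET_TIME, tzinfo=RESET_TZ)
    if candidate <= local:
        # Already past this week's reset: move to next Monday.
        candidate = datetime.combine(candidate.date() + timedelta(days=7),
                                     RESET_TIME, tzinfo=RESET_TZ)
    return candidate
```

Building the result from a calendar date plus a wall-clock time keeps the reset at 1am local time across daylight saving changes.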

**Added**: 2026-01-16

---

### Syncthing Monitoring

**Status**: ⚠️ **Partial** - API available, no automated monitoring