Fleet Status Log - 2026-02-05

🟢 Monitoring & Alerting (Federation Hub)

  • Grafana Centralized: Deployed at http://192.168.1.13:3001 (Memory-Alpha).
  • Telegram Integration: Linked to @the_lal_net_bot. All critical alerts route to Telegram.
  • Noise Control: Alerts fire only after an outage has lasted 2 minutes, which filters out transient blips.
  • Rules Deployed:
    • Node Offline (Critical)
    • Service Failure (Warning - tracks Plex, Paperless, Ollama, etc.)
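
Sketch of the 2-minute noise-control behavior, for reference. In Grafana this is normally configured per rule as a pending period; the Python below only illustrates the logic and is not the deployed rule. The class name OutageDebouncer and the sample timestamps are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_SECONDS = 120  # the 2-minute threshold from this log


@dataclass
class OutageDebouncer:
    """Fires once when a target has been down continuously for hold_seconds."""

    hold_seconds: float = HOLD_SECONDS
    down_since: Optional[float] = None
    fired: bool = False

    def observe(self, is_up: bool, now: float) -> bool:
        """Record one health sample; return True when an alert should be sent."""
        if is_up:
            # Any recovery resets the timer, so short blips never alert.
            self.down_since = None
            self.fired = False
            return False
        if self.down_since is None:
            self.down_since = now
        if not self.fired and now - self.down_since >= self.hold_seconds:
            self.fired = True  # alert once per outage, not on every sample
            return True
        return False
```

A 90-second blip never reaches the threshold; a node that stays down alerts exactly once, at the 120-second mark.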

🛠️ Autonomous Remediation (Damage Control)

  • Script: /home/vivianl/projects/AI/tools/damage_control.py (running on Memory-Alpha).
  • Function: Queries Prometheus every 30s. Auto-restarts failed monitoring stacks via SSH.
  • Verified: Tested on Holodeck-Lab; system detected down state and restored service within 45s.
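
Rough sketch of the poll-and-restart loop described above. This is not the contents of damage_control.py. The Prometheus address, the `up == 0` query, and the restart command are all assumptions made for illustration; only the 30s interval and the use of SSH come from this log.

```python
import json
import subprocess
import time
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumption: address not stated in the log
POLL_SECONDS = 30                         # from the log
# Hypothetical restart command; the real one depends on how each stack is deployed.
RESTART_COMMAND = "docker compose -f /opt/monitoring/docker-compose.yml up -d"


def query_prometheus(expr: str) -> dict:
    """Run an instant query against the Prometheus HTTP API."""
    url = f"{PROMETHEUS_URL}/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def down_hosts(response: dict) -> list[str]:
    """Extract unique host names from an instant-query response for `up == 0`."""
    if response.get("status") != "success":
        return []
    hosts: list[str] = []
    for sample in response["data"]["result"]:
        instance = sample["metric"].get("instance", "")
        host = instance.rsplit(":", 1)[0]  # strip the ":port" suffix
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def restart_command(host: str) -> list[str]:
    """Build the SSH invocation; BatchMode avoids hanging on a password prompt."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, RESTART_COMMAND]


def run_forever() -> None:
    """Poll every POLL_SECONDS and try to restart anything reported down."""
    while True:
        try:
            for host in down_hosts(query_prometheus("up == 0")):
                subprocess.run(restart_command(host), check=False, timeout=60)
        except (OSError, ValueError, subprocess.TimeoutExpired) as exc:
            print(f"damage control pass failed: {exc}")
        time.sleep(POLL_SECONDS)
```

With a 30s poll plus restart time, a recovery inside 45s (as seen on Holodeck-Lab) is plausible for this shape of loop. A real script would likely also rate-limit restarts so a flapping stack is not restarted on every pass.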

🎧 Audiobook Pipeline (Specialized)

  • qBittorrent (Risa - .21): Port 8085. High-speed NFS access. No VPN.
  • Atomic Mover: abs-mover.service running as root on Risa.
    • Trigger: Watches /mnt/vault/media/downloads/audiobooks/complete.
    • Action: Instantly hardlinks to /mnt/vault/media/audiobooks.
    • Notification: Sends Telegram alert upon library linking.
  • Nightly Dedupe: dedupe_audiobooks.py via cron (03:00). Keeps the newest version of each duplicate.
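
Minimal sketch of the hardlink step, assuming both directories sit on the same filesystem (hardlinks cannot cross filesystems). This is not the code behind abs-mover.service; the function name hardlink_tree is hypothetical, and the directory-watch trigger and Telegram notification are left out.

```python
import os
from pathlib import Path


def hardlink_tree(src_root: Path, dst_root: Path) -> list[Path]:
    """Hardlink every file under src_root into dst_root, keeping the layout."""
    linked: list[Path] = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        if dst.exists():
            continue  # already linked on an earlier pass
        dst.parent.mkdir(parents=True, exist_ok=True)
        os.link(src, dst)  # second name for the same inode; no data is copied
        linked.append(dst)
    return linked
```

Usage would be along the lines of hardlink_tree(Path("/mnt/vault/media/downloads/audiobooks/complete"), Path("/mnt/vault/media/audiobooks")). Because both names point at the same inode, qBittorrent can keep seeding from the download path while the library sees the file immediately.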

🔐 General Downloads (VPN Secure)

  • qBittorrent (Starfleet - .35): Port 8086. Routed through Gluetun (Surfshark - Switzerland).
  • Access: Local Network Bypass enabled for 192.168.1.0/24. Login is automatic from home network.

💾 Standardized Backups

  • Frequency: 03:00 (Full OS to PBS), 04:00 (Configs to Vault).
  • Rate Limit: All backups strictly capped at 100 MB/s (--rate 100M / --bwlimit=100000).
  • Alerts: All backup scripts updated to send Telegram Success/Fail notifications.
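
Unit check on the rate-limit flags, assuming --bwlimit here is rsync's option, where a bare number is read in units of 1024 bytes per second. Under that assumption 100000 is slightly above a decimal 100 MB/s. The helper names are made up for illustration.

```python
KIB = 1024  # rsync reads a bare --bwlimit value as KiB per second


def rsync_bwlimit_to_mb_per_s(bwlimit: int) -> float:
    """Convert a bare rsync --bwlimit value to decimal megabytes per second."""
    return bwlimit * KIB / 1_000_000


def mb_per_s_to_rsync_bwlimit(mb_per_s: float) -> int:
    """Largest bare --bwlimit value that stays at or under the given MB/s."""
    return int(mb_per_s * 1_000_000 // KIB)


print(rsync_bwlimit_to_mb_per_s(100000))  # 102.4
print(mb_per_s_to_rsync_bwlimit(100))     # 97656
```

The difference is about 2.4%, so the current setting is fine as a practical cap; it only matters if 100 MB/s is meant as a hard ceiling.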

🚀 Maintenance & Recovery

  • Scheduled Reboot: Friday 2026-02-06 at 05:00.
    • Orchestrated by /root/fleet_reboot.sh on Proxmox.
    • Graceful shutdown sequence: Docker -> Guests -> Storage -> Hosts.
  • Recovery Audit: reboot_health_check.py on Memory-Alpha runs at boot via @reboot.
    • Checks NFS mount status on all nodes.
    • Queries Prometheus for service health.
    • Sends "Fleet Reboot Recovery Report" to Telegram.
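
Sketch of the two pure pieces of the recovery audit: detecting NFS mounts from /proc/mounts-style text and formatting the report. This is not the contents of reboot_health_check.py. The function names and the report layout are assumptions; how mount data is collected from each node, the Prometheus query, and the Telegram send are left out.

```python
def nfs_mount_points(proc_mounts: str) -> set[str]:
    """Return mount points whose filesystem type is nfs or nfs4."""
    points: set[str] = set()
    for line in proc_mounts.splitlines():
        fields = line.split()  # device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.add(fields[1])
    return points


def recovery_report(missing_by_node: dict[str, list[str]], down_services: list[str]) -> str:
    """Format the per-node NFS status and any failed services as plain text."""
    lines = ["Fleet Reboot Recovery Report"]
    for node, missing in sorted(missing_by_node.items()):
        status = "OK" if not missing else "MISSING " + ", ".join(missing)
        lines.append(f"{node}: NFS {status}")
    services = ", ".join(down_services) if down_services else "none"
    lines.append(f"Services down: {services}")
    return "\n".join(lines)
```

For each node, the expected mount points minus nfs_mount_points(...) gives the missing list passed to recovery_report. Which mount points are expected on which node is not recorded in this log.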

📋 Credentials & Access

  • Grafana: admin / federation_admin
  • qBit (.21): admin / myPass123,
  • qBit (.35): admin / myPass123, (Autologin from local network enabled)
  • Telegram Bot: @the_lal_net_bot