Home Server Monitoring with Grafana and Prometheus

You don't know your server until you monitor it. CPU spikes, memory leaks, storage approaching capacity—these problems announce themselves through crashes if you don't have visibility. Grafana and Prometheus provide the monitoring foundation that scales from single-server homelabs to multi-machine installations.

The Prometheus Data Model

Prometheus collects time-series metrics: numerical values labeled with key-value pairs, sampled at intervals. A typical metric like node_memory_MemAvailable_bytes reports available memory in bytes, and each sample carries a timestamp. Queries can filter by labels (instance="server1"), aggregate across dimensions (sum by (job) (rate(http_requests_total[5m]))), and compute rates over time.

The data model is simple but powerful. Prometheus scrapes targets at configured intervals, stores the metrics locally, and provides a query language (PromQL) for analysis. Retention is typically 15-30 days for home use (15 days is the default), long enough to spot trends without consuming excessive storage.
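Retention is set with a command-line flag rather than in the Prometheus configuration file. A minimal sketch of a Compose override, assuming the official prom/prometheus image and a 30-day window (both the window and the restated paths are assumptions to check against your image version):

```yaml
services:
  prometheus:
    image: prom/prometheus
    command:
      # Restated because `command` replaces the image's default arguments
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      # Keep 30 days of samples (assumed value; the default is 15d)
      - --storage.tsdb.retention.time=30d
```

Longer retention costs disk space roughly in proportion to the number of series and the scrape frequency, so it is worth watching the data directory after changing it.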

Node Exporter: System Metrics

The Prometheus node_exporter exposes the system metrics that matter: CPU, memory, disk, network, and kernel statistics. Running node_exporter on every Linux or other Unix-like server you want to monitor provides a consistent metrics interface; Windows hosts need the separate windows_exporter, and the exact metric names vary somewhat between operating systems.

The key metrics for home servers:

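node_cpu_seconds_total: cumulative CPU seconds per core and mode (user, system, idle, iowait, and others)
node_memory_MemAvailable_bytes: memory available to applications without swapping
node_filesystem_avail_bytes: filesystem space available to non-root users
node_disk_io_time_seconds_total: cumulative seconds each disk spent on I/O; its rate approximates utilization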
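
Most of these metrics are raw counters or byte counts, so they usually need a little PromQL before they read as percentages. One way to sketch that is a Prometheus rules file with recording rules. The group and rule names here are hypothetical, the 5m window is an assumed choice, and the fstype filter is an assumption about which filesystems you want to ignore:

```yaml
groups:
  - name: home_server_usage   # hypothetical group name
    rules:
      # Percent of CPU time not spent idle, averaged over all cores
      - record: instance:node_cpu_used:percent
        expr: 100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))
      # Percent of memory in use
      - record: instance:node_memory_used:percent
        expr: 100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
      # Percent of space still available on each non-temporary filesystem
      - record: instance:node_filesystem_avail:percent
        expr: |
          100 * node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
            / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}
      # Fraction of each second the device was busy (0 to 1)
      - record: instance_device:node_disk_io_time:rate5m
        expr: rate(node_disk_io_time_seconds_total[5m])
```

The same expressions work directly in the Prometheus query box or a Grafana panel; recording them just saves recomputing them on every dashboard load.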

Docker Metrics

Docker can expose the daemon's own metrics through a metrics endpoint (enabled with metrics-addr in daemon.json), but these describe the engine rather than individual containers. Per-container metrics come from cAdvisor, which runs as a container and reports CPU, memory, network, and filesystem usage for each container. Prometheus scrapes these alongside node_exporter metrics, giving you visibility into both system and application layers.
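
A rough sketch of a cAdvisor service for a Compose file, assuming the upstream gcr.io/cadvisor/cadvisor image and a standard Docker data directory. The mounts follow the pattern in cAdvisor's own instructions, but some hosts need additional ones:

```yaml
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor
    volumes:
      # Read-only host mounts cAdvisor uses to inspect containers
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"   # cAdvisor serves /metrics on this port
```

Prometheus then needs a scrape job pointing at cadvisor:8080 before any container metrics appear.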

The container_memory_usage_bytes metric tracks per-container memory, useful for identifying containers that consume excessive resources. Because the figure includes page cache, container_memory_working_set_bytes is often a better indicator of memory pressure. Combined with service labels, you can aggregate usage across all containers belonging to a particular application.
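
As an illustration of aggregating by service, the recording rule below sums memory per Compose service. It assumes cAdvisor is exporting Docker labels with its container_label_ prefix and that the containers were started by Docker Compose, which sets the com.docker.compose.service label. The group and rule names are hypothetical, and it is worth checking in the Prometheus UI which labels your cAdvisor version actually exposes:

```yaml
groups:
  - name: container_usage   # hypothetical group name
    rules:
      # Total memory per Compose service; containers without the label are skipped
      - record: compose_service:container_memory_usage_bytes:sum
        expr: |
          sum by (container_label_com_docker_compose_service) (
            container_memory_usage_bytes{container_label_com_docker_compose_service!=""}
          )
```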

Grafana Dashboards

Grafana provides the visualization layer. The Node Exporter Full dashboard (ID 1860) is the standard starting point—a comprehensive view of system metrics with gauges, graphs, and tables covering all the important numbers. Import it, point it at your Prometheus data source, and you have immediate visibility.
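
If you would rather not click through the data source setup, Grafana can also load it from a provisioning file at startup. A minimal sketch, assuming Prometheus is reachable from the Grafana container as prometheus:9090 and that the file is mounted under /etc/grafana/provisioning/datasources/ (the host-side path in the comment is hypothetical):

```yaml
# ./grafana-provisioning/datasources/prometheus.yml (hypothetical host path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend makes the requests
    url: http://prometheus:9090   # Compose service name and port
    isDefault: true
```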

The real value emerges when you build custom dashboards for your specific services. A Plex dashboard showing transcoding sessions, bandwidth, and library size. A NAS dashboard showing disk health, temperature, and throughput. Homelab dashboards showing VM and container resource consumption.

Alerting

Dashboards are passive—someone has to look at them. Alerting notifies you when attention is needed. Prometheus alerting rules trigger when conditions are met: disk space below 10%, CPU above 90% for 5 minutes, service down. Alertmanager routes alerts to various destinations: email, Slack, PagerDuty.
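
The three example conditions could be written as a Prometheus rules file along these lines. This is a sketch: the alert and group names, the severity labels, the fstype filter, and the "for" durations on the disk and target alerts are all assumptions. The file also has to be listed under rule_files in the Prometheus configuration, with an alerting section pointing at Alertmanager, before anything is delivered:

```yaml
groups:
  - name: home_server_alerts   # hypothetical group name
    rules:
      - alert: DiskSpaceLow
        expr: |
          node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
            / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} < 0.10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Less than 10% free on {{ $labels.mountpoint }} ({{ $labels.instance }})"
      - alert: HighCpu
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "CPU above 90% for 5 minutes on {{ $labels.instance }}"
      - alert: TargetDown
        expr: up == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.job }} on {{ $labels.instance }} is not responding to scrapes"
```

The "for" clause keeps a brief spike from firing an alert: the condition has to hold for the whole duration first.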

For home servers, email alerts often suffice. Send email for conditions that require action but aren't emergencies: "disk space below 20%" prompts cleanup before it becomes critical. Reserve more intrusive channels such as push notifications for conditions that require immediate attention, like service failures.
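
One way to express that split in an Alertmanager configuration is a default email receiver plus a child route for alerts labeled as critical. Everything specific here is a placeholder or assumption: the SMTP server, the addresses, the password, the severity label convention, and the webhook URL, which stands in for whatever push service you use:

```yaml
global:
  smtp_smarthost: "smtp.example.com:587"   # placeholder mail server
  smtp_from: alerts@example.com            # placeholder sender
  smtp_auth_username: alerts@example.com
  smtp_auth_password: change-me

route:
  receiver: email                # default: everything goes to email
  group_by: [alertname, instance]
  routes:
    - matchers:
        - severity="critical"    # assumes rules set a severity label
      receiver: push

receivers:
  - name: email
    email_configs:
      - to: you@example.com
  - name: push
    webhook_configs:
      - url: http://push-relay.example:8080/alert   # hypothetical push relay
```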

Docker Compose Setup

The monitoring stack deploys cleanly with Docker Compose:

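services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      # Named volume so stored metrics survive container recreation
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    volumes:
      # The container runs as a non-root user, so ./grafana must be writable by it
      - ./grafana:/var/lib/grafana
    ports:
      - "3000:3000"
  node-exporter:
    image: prom/node-exporter
    # Host root mounted read-only so filesystem metrics describe the host,
    # not the container; network metrics still reflect the container's
    # namespace unless you switch to network_mode: host
    command:
      - --path.rootfs=/host
    pid: host
    volumes:
      - /:/host:ro,rslave
    ports:
      - "9100:9100"

volumes:
  prometheus-data: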

This starts Prometheus, Grafana, and node-exporter with volumes for persistence. The Prometheus configuration file specifies scrape targets and intervals; Grafana's web interface handles datasource and dashboard setup.
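
A minimal prometheus.yml to go with this stack might look like the following. The 15s interval and the job names are assumptions; the node target uses the Compose service name, which resolves on the network Compose creates for the stack:

```yaml
global:
  scrape_interval: 15s   # assumed value; how often targets are scraped

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]       # Prometheus scraping itself
  - job_name: node
    static_configs:
      - targets: ["node-exporter:9100"]   # Compose service name and port
```

Additional servers are added as extra entries in the node job's targets list, using each machine's hostname or IP address and port 9100.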