As modern systems become more complex and distributed, observability has become a must-have skill for anyone working in DevOps, SRE, or infrastructure engineering. At the heart of modern observability stacks are Prometheus and Grafana—two open-source tools that power real-time monitoring and visual analytics.
Prometheus collects metrics with precision and efficiency. Grafana turns those metrics into beautiful, actionable dashboards. Together, they’re used by global companies like LinkedIn, SoundCloud, and Uber to monitor everything from microservices to Kubernetes clusters.
For learners and junior engineers, the best way to master these tools is by building hands-on projects. In this guide, we’ll walk you through beginner-friendly projects that help you understand the power of Prometheus and Grafana—while preparing you for real-world observability tasks.
What Are Prometheus and Grafana?
Prometheus: Metrics Collection and Alerting
Prometheus is a time-series database that scrapes metrics from configured targets (like servers, containers, and apps) and stores them using a high-performance storage engine. It also supports querying with its PromQL language and alerting with built-in rule evaluation.
Key Features:
Pull-based metrics scraping via HTTP
Support for custom exporters (Node Exporter, Blackbox Exporter, etc.)
PromQL for querying time-series data
AlertManager integration for notification routing
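To make the pull model concrete: on each scrape, Prometheus sends an HTTP GET to a target's metrics endpoint and gets back plain text, one sample per line. The sketch below parses a small hard-coded sample in that text format using only the Python standard library. The sample values and the parse_samples helper are illustrative assumptions. It handles only simple lines without timestamps or escaped label values, and real code would normally rely on a client library's parser.

```python
# Minimal sketch: read samples from Prometheus-style text exposition output.
# SAMPLE_SCRAPE is made-up data; parse_samples is a hypothetical helper.

SAMPLE_SCRAPE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 67.25
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
"""


def parse_samples(text):
    """Return a list of (series, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE metadata and blank lines carry no sample
        # The value is the last space-separated field on the line.
        series, value = line.rsplit(" ", 1)
        samples.append((series, float(value)))
    return samples


if __name__ == "__main__":
    for series, value in parse_samples(SAMPLE_SCRAPE):
        print(f"{series} = {value}")
```

Each tuple pairs a series (metric name plus labels) with its current value. Prometheus attaches the scrape timestamp itself, which is why the exporter does not need to send one.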
Grafana: Interactive Visualization
Grafana connects to Prometheus and other data sources to visualize data on interactive dashboards. It’s flexible, extensible, and ideal for real-time analytics and alerting.
Key Features:
Customizable dashboards and panels
Annotations for incidents and events
Alerting with thresholds and rules
Support for multiple data sources (including Prometheus, Loki, InfluxDB)
Tools and Environment Setup
Before jumping into projects, you’ll need to set up your local environment.
Minimum Requirements:
Docker and Docker Compose (recommended for isolated environments)
A Linux or macOS machine (Windows with WSL also works)
Basic familiarity with YAML and the terminal
Optional Enhancements:
Visual Studio Code with Docker and YAML plugins
Git for version control and configuration tracking
Project 1: Monitor Your Local System with Node Exporter
Goal: Use Prometheus and Grafana to monitor CPU, memory, and disk usage of your local machine.
Steps:
Install Node Exporter
Use Docker to run Node Exporter:
docker run -d -p 9100:9100 prom/node-exporter
Configure Prometheus to Scrape Node Exporter
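Create a prometheus.yml file. Because Prometheus will run in its own container, localhost there would refer to the Prometheus container itself, so the target uses host.docker.internal to reach the Node Exporter port published on the host (on Linux, that name resolves only if the Prometheus container is started with --add-host=host.docker.internal:host-gateway):
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['host.docker.internal:9100']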
Start Prometheus with Docker:
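docker run -d -p 9090:9090 --add-host=host.docker.internal:host-gateway -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" prom/prometheus
The --add-host flag lets the container resolve host.docker.internal on Linux; Docker Desktop on macOS and Windows already provides that name.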
Launch Grafana and connect to Prometheus as a data source:
docker run -d -p 3000:3000 grafana/grafana
Import a System Monitoring Dashboard from Grafana’s dashboard library (e.g., ID 1860)
What You Learn:
How exporters work
How to query Prometheus with PromQL
How to visualize system metrics in Grafana
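Once Node Exporter is being scraped, the PromQL you type into the Prometheus UI can also be sent to the HTTP API at /api/v1/query. The sketch below assumes Prometheus is reachable at http://localhost:9090, as in the docker run command in this project. It also assumes the Linux node_exporter metric names node_cpu_seconds_total, node_memory_MemAvailable_bytes, and node_memory_MemTotal_bytes are present (names differ on macOS). build_query_url is a hypothetical helper, and the script only builds the URLs; it does not send them.

```python
from urllib.parse import urlencode

# Assumption: the port mapping used when starting Prometheus in this project.
PROMETHEUS_URL = "http://localhost:9090"

# Starter queries for a system dashboard (Linux node_exporter metric names).
QUERIES = {
    # Percentage of CPU time not spent idle, averaged per instance.
    "cpu_busy_percent": (
        '100 - avg by (instance) '
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100'
    ),
    # Fraction of memory in use.
    "memory_used_ratio": (
        "1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes"
    ),
}


def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    for name, promql in QUERIES.items():
        print(name, "->", build_query_url(promql))
```

Pasting one of the printed URLs into a browser, or fetching it with curl, should return a JSON document with the current result once data has been collected.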
Project 2: Build a Custom Application Metrics Dashboard
Goal: Instrument a Python or Node.js app to expose custom metrics and monitor them with Prometheus and Grafana.
Steps:
Instrument Your Application
Use libraries like:
Python: prometheus_client
Node.js: prom-client
Example (Python):
from prometheus_client import start_http_server, Summary
import random
import time

# A Summary records how many calls were observed and their total duration
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# The decorator times every call to the function
@REQUEST_TIME.time()
def process_request():
    time.sleep(random.random())

if __name__ == '__main__':
    # Expose the metrics over HTTP on port 8000
    start_http_server(8000)
    while True:
        process_request()
Update Prometheus Config to scrape your app:
  - job_name: 'custom-app'
    static_configs:
      - targets: ['host.docker.internal:8000']
Create a Custom Dashboard in Grafana
Add panels for:
request_processing_seconds_count
request_processing_seconds_sum
Calculate averages using PromQL
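Both request_processing_seconds_sum and request_processing_seconds_count are ever-increasing counters, so the usual way to get an average latency is to divide how much each grew over a window. In PromQL this is commonly written as rate(request_processing_seconds_sum[5m]) / rate(request_processing_seconds_count[5m]). The sketch below does the same arithmetic by hand on two made-up scrapes taken five minutes apart; the numbers and the average_latency helper are hypothetical.

```python
def average_latency(sum_start, sum_end, count_start, count_end):
    """Average seconds per request between two scrapes of a Summary."""
    requests = count_end - count_start
    if requests == 0:
        return None  # no traffic in the window, so no meaningful average
    return (sum_end - sum_start) / requests


# Made-up values of the two series at the start and end of the window.
earlier = {"sum": 120.0, "count": 240}
later = {"sum": 195.0, "count": 390}

avg = average_latency(
    earlier["sum"], later["sum"], earlier["count"], later["count"]
)

if __name__ == "__main__":
    # 75 seconds of processing spread over 150 requests
    print(f"average latency: {avg} s")
```

Dividing the raw sum by the raw count instead would give the average since the process started, which hides recent changes in latency.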
What You Learn:
How to expose application-level metrics
PromQL basics for analyzing latency and throughput
Customizing dashboards and alerts
Project 3: Website Uptime Monitoring with Blackbox Exporter
Goal: Use Prometheus and Grafana to monitor the availability of a public website.
Steps:
Run Blackbox Exporter:
docker run -d -p 9115:9115 prom/blackbox-exporter
Configure Prometheus to use the exporter:
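  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://refontelearning.com
    relabel_configs:
      # Pass the site URL to the exporter as the ?target= parameter
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the site URL as the instance label on the resulting series
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the exporter itself; with Prometheus in a container, localhost
      # would not reach it, so use the address visible from that container
      - target_label: __address__
        replacement: host.docker.internal:9115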
Create a Grafana Dashboard to visualize:
Probe success (probe_success)
Response duration (probe_duration_seconds)
Status code summaries
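The relabel_configs in the Blackbox job can look opaque at first. They move the site URL into a target query parameter, copy it to the instance label, and then point the actual scrape at the exporter. The sketch below replays those three steps on a plain dict. It is a simplified model written for illustration: apply_blackbox_relabeling and probe_url are hypothetical helpers, not Prometheus code, and the exporter address passed in is a placeholder for whatever address the final replacement in your config names.

```python
from urllib.parse import urlencode


def apply_blackbox_relabeling(site, exporter_address):
    """Replay the three relabel steps on the labels of one target."""
    # Before relabeling, __address__ holds the entry from static_configs,
    # and each entry under params appears as a __param_<name> label.
    labels = {"__address__": site, "__param_module": "http_2xx"}
    # Step 1: copy __address__ into __param_target (becomes ?target=...).
    labels["__param_target"] = labels["__address__"]
    # Step 2: copy __param_target into the instance label.
    labels["instance"] = labels["__param_target"]
    # Step 3: overwrite __address__ so the scrape goes to the exporter.
    labels["__address__"] = exporter_address
    return labels


def probe_url(labels, metrics_path="/probe"):
    """Build the URL Prometheus would request for these labels."""
    params = {
        "module": labels["__param_module"],
        "target": labels["__param_target"],
    }
    return f"http://{labels['__address__']}{metrics_path}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder exporter address; adjust to match your own config.
    result = apply_blackbox_relabeling(
        "https://refontelearning.com", "localhost:9115"
    )
    print(result["instance"])
    print(probe_url(result))
```

The instance label keeps the site URL, so dashboards group results by website rather than by the exporter that performed the probe.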
What You Learn:
Uptime and endpoint monitoring
Customizing scrape intervals
Creating alerts for downtime
Project 4: Container Monitoring with cAdvisor and Docker Metrics
Goal: Monitor Docker containers using cAdvisor with Prometheus and Grafana.
Steps:
Run cAdvisor:
docker run -d \
  -p 8080:8080 \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:ro \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  google/cadvisor
Configure Prometheus to scrape cAdvisor:
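  # With Prometheus in its own container, localhost would not reach cAdvisor,
  # so target the port published on the host
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['host.docker.internal:8080']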
Import Dashboard: Use Grafana dashboard ID 893 from Grafana.com
What You Learn:
How to monitor container resource usage
Understanding metrics like container CPU, memory, and I/O
Real-time visualization of containerized workloads
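cAdvisor exposes container CPU as a counter, container_cpu_usage_seconds_total, so a dashboard panel plots its rate of change, not the raw value. The sketch below computes a per-second rate from a few made-up (timestamp, value) samples. It treats any drop in the counter as a restart from zero, for example after a container restart. It is deliberately simplified: the rate() function in Prometheus also extrapolates to the window edges, which this hypothetical simple_rate helper does not.

```python
def simple_rate(samples):
    """Per-second increase across (timestamp_seconds, counter_value) samples."""
    if len(samples) < 2:
        return None  # a rate needs at least two points
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # A counter only goes up, so a lower value means it restarted
            # from zero and everything seen since then is new growth.
            increase += cur
    return increase / elapsed


# Made-up CPU-seconds samples 15 s apart, with a reset before t=45.
cpu_samples = [(0, 100.0), (15, 106.0), (30, 112.0), (45, 3.0), (60, 9.0)]
cpu_rate = simple_rate(cpu_samples)

if __name__ == "__main__":
    # 21 CPU-seconds consumed over 60 s of wall-clock time
    print(f"approx. CPU cores in use: {cpu_rate:.2f}")
```

A result of 1.0 would mean the container kept one full core busy for the whole window.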
Project 5: Alerting and Incident Simulation
Goal: Set up Prometheus AlertManager and simulate an alert condition.
Steps:
Configure Prometheus Alert Rules:
groups:
  - name: instance_down
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "Target {{ $labels.instance }} has been unreachable for more than 1 minute."
Run AlertManager:
docker run -d -p 9093:9093 prom/alertmanager
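Integrate AlertManager with Prometheus by listing the rules file under rule_files and the AlertManager address under alerting.alertmanagers in prometheus.yml, then test by stopping a target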
Add a Grafana notification channel (email, Slack, or webhook); recent Grafana versions call these contact points
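The for: 1m clause in the rule means the alert does not fire the moment up == 0 is first seen. It stays pending until the expression has held true for the whole minute, and drops back to inactive if the target recovers first. The sketch below models that state logic by replaying made-up evaluation results. alert_states is a hypothetical helper, and the 15-second evaluation spacing is an assumption.

```python
def alert_states(evaluations, for_seconds=60):
    """Map (timestamp_seconds, up_value) evaluations to alert states."""
    states = []
    active_since = None  # when the expression first became true
    for timestamp, up in evaluations:
        if up != 0:
            # Target is reachable: the condition is false, so reset.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# A brief blip at t=15..30 that recovers, then a real outage from t=60.
evaluations = [
    (0, 1), (15, 0), (30, 0), (45, 1),
    (60, 0), (75, 0), (90, 0), (105, 0), (120, 0),
]
states = alert_states(evaluations)

if __name__ == "__main__":
    for (timestamp, up), state in zip(evaluations, states):
        print(f"t={timestamp:>3}s up={up} -> {state}")
```

The short blip never leaves the pending state, which is the point of the for clause: it filters out brief failures so that only sustained outages reach AlertManager.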
What You Learn:
How Prometheus alerting rules work
Real-time incident response simulation
Multi-platform alert routing
Final Thoughts: Projects That Prepare You for Real Work
Prometheus and Grafana are not just tools—they're foundational to modern DevOps, SRE, and platform engineering workflows. By completing hands-on projects like these, you not only learn how the tools work—you build practical, resume-ready experience that hiring managers look for.
Whether you're monitoring a local system or deploying dashboards in the cloud, each project reinforces key observability concepts that are transferable across tech stacks and industries. Start small, build iteratively, and keep pushing your observability skills forward.
FAQs
Do I need Kubernetes to start learning Prometheus and Grafana?
No. You can start with local Docker containers and gradually move to Kubernetes environments once you’re comfortable with the basics.
Are these tools only for DevOps roles?
Not at all. They're also useful for backend developers, platform engineers, site reliability engineers (SREs), and data engineers who need system visibility.
What programming languages do I need to know?
Basic Bash or Python is helpful for scripting and metric instrumentation, but Prometheus and Grafana themselves are language-agnostic and easy to use with exporters.
Can I use Prometheus and Grafana in a cloud environment?
Yes. All major cloud providers (AWS, Azure, GCP) support Prometheus and Grafana via managed services or open-source deployments.
What’s the best way to showcase these projects?
Publish your configurations and dashboards on GitHub, write a blog post explaining your setup, or create a walkthrough video. Recruiters and hiring managers love to see real-world implementations.