16 - Prometheus Handout

Uploaded by

kihepex735


DEVOPS BOOTCAMP

Monitoring with Prometheus
Introduction to Prometheus

Prometheus is an open-source monitoring system and alerting toolkit

Prometheus is widely used and has an active community

It gathers, organizes, and stores metrics as time series data from targets by "scraping" their metrics HTTP endpoints

It can trigger alerts when specified conditions are observed
Why we need a monitoring tool - 1
Visibility in different environments

You need visibility in all kinds of environments (bare servers, container environments), but especially in highly dynamic container environments, which are more challenging to monitor

For that you need tools like Prometheus, which are designed for monitoring these types of environments

Visibility on different levels

When you have 100s or 1000s of containers, plus components on multiple levels (infrastructure, platform, application), you need visibility and consistent monitoring across all these components

Without visibility, your environment is a black box. When things break inside a complex environment, you have no idea what is happening: you don't know what caused the issue or what is not working.
Why we need a monitoring tool - 2
Example Use Case
Problem:
Many application errors appear in the frontend to the end user
Users only see the error message, but the cause can be any of the many components in the backend

Questions you would otherwise have to answer manually:
Backend running?
Any exceptions?
Auth-Service running?
Why did Auth-Service crash?

Solution:
Monitoring helps identify the problem quickly and with little effort
Instead of manually troubleshooting across multiple components, it points you directly to the root cause

Saves you a lot of time and effort

Company saves a lot of money

Why we need a monitoring tool - 3

With Prometheus, monitoring is automated. It constantly watches for issues in real time and may even identify a potential issue before it happens, so you can prevent it.

Constantly monitors all the services

Triggers alerts when a service crashes

Helps to identify problems before they happen

Prometheus Architecture - 1
How it all works

Prometheus Server

Is the main component

Does the actual monitoring work
Scrapes and stores time series data
Prometheus Architecture - 2
How it all works: Targets & Metrics

Prometheus pulls metrics from targets

Targets: What does Prometheus monitor?

Metrics: Which units of those targets are monitored?
Prometheus Architecture - 3
How it all works: Metrics

Metrics play an important role in understanding why your application is working in a certain way

Metric Entries

Format: human-readable, text-based
Metric entries consist of TYPE and HELP attributes
Metric types:
Counter: how many times x happened
Gauge: what is the current value of x now?
Histogram: how long or how big?

How Prometheus collects Metrics Data from Targets

Prometheus pulls from HTTP endpoints
Targets must expose: [hostaddress]/metrics
Data must be in the correct format that Prometheus understands
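To make the metric entry format concrete, here is a minimal sketch that renders metrics as the text a target would return from its /metrics endpoint. The helper `render_metric` and the metric names `http_requests_total` and `queue_length` are illustrative assumptions, not taken from the handout.

```python
def render_metric(name, metric_type, help_text, samples):
    """Render one metric in the Prometheus text format.

    samples: list of (labels_dict, value) pairs.
    """
    lines = [
        f"# HELP {name} {help_text}",   # HELP: what the metric means
        f"# TYPE {name} {metric_type}",  # TYPE: counter, gauge, histogram, ...
    ]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data, for illustration only.
text = render_metric(
    "http_requests_total", "counter", "Total HTTP requests handled.",
    [({"method": "get", "code": "200"}, 1027),
     ({"method": "post", "code": "500"}, 3)],
)
text += render_metric("queue_length", "gauge", "Current items in the queue.", [({}, 12)])
print(text)
```

Each metric starts with its HELP and TYPE lines, followed by one line per labeled sample.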
Prometheus Architecture - 4
How it all works: Exporters
Some services expose /metrics endpoints by default

Others need another component for that: Exporters

Exporters

Exporters help in exporting existing metrics from third-party systems as Prometheus metrics
An exporter is a service that fetches metrics from a target, converts the data, and exposes it as Prometheus metrics
Prometheus can then scrape this endpoint as usual

Official vs Third-Party

Some exporters are maintained as part of the official Prometheus organization, others are externally contributed and maintained.
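The conversion step an exporter performs can be sketched roughly as follows. This assumes a third-party system that reports statistics as `key:value` lines; the sample data, the `demo` prefix, and the function name `convert_stats` are hypothetical. Treating every numeric field as a gauge is a simplification that real exporters do not make.

```python
def convert_stats(raw_stats, prefix="demo"):
    """Convert 'key:value' lines from a third-party system into Prometheus text.

    Simplification: every numeric field becomes a gauge.
    """
    out = []
    for line in raw_stats.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blanks, section headers, malformed lines
        key, value = line.split(":", 1)
        try:
            number = float(value)
        except ValueError:
            continue  # skip non-numeric fields such as version strings
        metric = f"{prefix}_{key}"
        out.append(f"# TYPE {metric} gauge")
        out.append(f"{metric} {number}")
    return "\n".join(out) + "\n"


# Hypothetical raw statistics from the monitored system.
RAW = """# Clients
connected_clients:4
version:7.0.1
used_memory:1048576
"""
converted = convert_stats(RAW)
print(converted)
```

An exporter would serve the converted text on its own /metrics endpoint, which Prometheus then scrapes like any other target.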
Prometheus Architecture - 5
How it all works: Exporters

Example: Monitor a Linux Server

1. Download a node exporter
2. Untar and execute
3. Converts metrics of the server
4. Exposes /metrics endpoint
5. Configure Prometheus to scrape this endpoint

Exporters are available as Docker Images

Example: Monitor own applications

Client libraries let you define and expose internal metrics via an HTTP endpoint on your application's instance
Metrics like:
How many requests?
How many exceptions?
Server resources used?
Choose a Prometheus client library that matches the language in which your application is written
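As a rough illustration of what "expose internal metrics via an HTTP endpoint" means, the sketch below serves a single request counter on /metrics using only the Python standard library. It is a hand-rolled stand-in for a real client library; the names `handle_business_request`, `start_metrics_server`, `fetch_metrics`, and `app_requests_total` are assumptions made for this example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS_TOTAL = 0        # hypothetical application counter
LOCK = threading.Lock()


def handle_business_request():
    """Stand-in for real application work; counts one request."""
    global REQUESTS_TOTAL
    with LOCK:
        REQUESTS_TOTAL += 1


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        with LOCK:
            value = REQUESTS_TOTAL
        body = (
            "# HELP app_requests_total Requests handled by the app.\n"
            "# TYPE app_requests_total counter\n"
            f"app_requests_total {value}\n"
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_metrics_server(port=0):
    """Serve /metrics in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_metrics(server):
    """Read the endpoint the way a scraper would (proxies bypassed)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    port = server.server_address[1]
    with opener.open(f"http://127.0.0.1:{port}/metrics") as resp:
        return resp.read().decode("utf-8")
```

In practice a client library takes care of the counter bookkeeping and the text rendering, so application code only declares metrics and updates them.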

Prometheus Architecture - 6
How it all works: Push vs Pull

An important difference between Prometheus and other monitoring systems like Amazon CloudWatch or New Relic

Prometheus - Pull Model

Prometheus pulls metrics from endpoints

Others - Push Model

Services push to a centralized collection platform

High load of network traffic

Monitoring can become your bottleneck

Installation of additional software to push metrics
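One consequence of the pull model is that the scraper itself notices when a target is unreachable. Prometheus records this per target in the `up` metric (1 for a successful scrape, 0 for a failed one). The sketch below imitates a single scrape round; `scrape_once`, `fake_fetch`, and the target addresses are hypothetical, and a real scraper would fetch over HTTP on a fixed interval.

```python
def scrape_once(targets, fetch):
    """Pull metrics from every target once.

    targets: list of target addresses.
    fetch:   callable(address) -> metrics text; raises on failure.
    Returns {address: {"up": 0 or 1, "body": text}}.
    """
    results = {}
    for address in targets:
        try:
            body = fetch(address)
            results[address] = {"up": 1, "body": body}
        except Exception:
            # A failed scrape is itself a signal: the target is down.
            results[address] = {"up": 0, "body": ""}
    return results


def fake_fetch(address):
    """Simulated HTTP fetch; the auth service is pretending to be down."""
    if address == "auth-service:8080":
        raise ConnectionError("connection refused")
    return "app_requests_total 5\n"


results = scrape_once(["backend:8080", "auth-service:8080"], fake_fetch)
for address, result in results.items():
    print(address, "up =", result["up"])
```

With a push model, a silent service is ambiguous: it may be down or may simply have nothing to report.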

Prometheus Architecture - 7
How it all works: Pushgateway

Pushgateway

An intermediary service that allows you to push metrics from jobs that cannot be scraped
Prometheus recommends using the Pushgateway only in certain limited cases. Usually the only valid use case is capturing the outcome of a service-level batch job
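A batch job could report its outcome to a Pushgateway roughly as sketched below. The request is only built, not sent. The gateway address, job name, and metric name are hypothetical; the `/metrics/job/<job>` path follows the Pushgateway's HTTP API.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, metrics_text):
    """Build (but do not send) a request that pushes metrics for one job."""
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",  # PUT replaces all metrics previously pushed for this job
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


request = build_push_request(
    "http://pushgateway.example:9091",  # hypothetical address
    "nightly_backup",                   # hypothetical job name
    "backup_last_success_timestamp_seconds 1700000000\n",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the push; Prometheus then scrapes the Pushgateway like any other target.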
Prometheus Architecture - 8
How it all works: Alertmanager

Alertmanager

The Alertmanager handles alerts sent by the Prometheus server

It takes care of deduplicating, grouping, and routing them to the correct receiver integrations

Receivers of these alerts can be email, PagerDuty, Slack, etc.
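Deduplicating and grouping can be pictured with the following simplified sketch. It is not the Alertmanager's actual implementation; the function `group_alerts` and the sample alerts are assumptions for illustration.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Drop alerts with identical label sets, then group by the chosen labels."""
    seen = set()
    groups = defaultdict(list)
    for alert in alerts:
        fingerprint = tuple(sorted(alert.items()))
        if fingerprint in seen:
            continue  # deduplicate: same labels means same alert
        seen.add(fingerprint)
        key = tuple(alert.get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical incoming alerts.
alerts = [
    {"alertname": "HighCpu", "cluster": "prod", "instance": "node-1"},
    {"alertname": "HighCpu", "cluster": "prod", "instance": "node-1"},  # duplicate
    {"alertname": "HighCpu", "cluster": "prod", "instance": "node-2"},
    {"alertname": "PodNotStarting", "cluster": "prod", "pod": "auth-service"},
]
grouped = group_alerts(alerts, group_by=["alertname", "cluster"])
for key, members in grouped.items():
    print(key, len(members))
```

Grouping means one notification can cover both affected nodes instead of paging once per instance.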
Prometheus Architecture - 9
How it all works: Data Storage

Prometheus Data Storage

Prometheus includes a local on-disk time series database

It optionally integrates with remote storage systems

Data in local storage is stored in a custom, highly efficient format
Prometheus Architecture - 10
How it all works: PromQL
Querying Prometheus
Prometheus provides a functional query language called PromQL
It lets users select and aggregate time series data in real time

Options to view result

1) Query target directly
2) Prometheus Web UI
3) Or use a more powerful visualization tool, e.g. Grafana

Example Queries: [query screenshots not recoverable]
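For a sense of what PromQL looks like, the sketch below lists a few illustrative expressions and builds the URL for an instant query against the Prometheus HTTP API (`/api/v1/query`). The metric name `http_requests_total`, its `status` label, and the server address are assumptions; these are not the queries from the original slide.

```python
from urllib.parse import urlencode

# Illustrative PromQL expressions (hypothetical metric and label names).
EXAMPLE_QUERIES = [
    "http_requests_total",                    # select all series of a metric
    'http_requests_total{status!~"4.."}',     # exclude 4xx status codes
    "rate(http_requests_total[5m])",          # per-second rate over 5 minutes
]


def instant_query_url(base_url, promql):
    """Build the URL of an instant query against the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


url = instant_query_url("http://localhost:9090", EXAMPLE_QUERIES[2])
print(url)
```

The Web UI and Grafana send the same kind of expression; they differ only in how the result is displayed.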
Configuring Prometheus - 1
YAML Config

You write your configuration in a prometheus.yml file

Let Prometheus know what to scrape and when:

Which targets? At what interval?

Example Config File (prometheus.yml), annotated:

How often Prometheus will scrape its targets
Rules for aggregating metric values or creating alerts when a condition is met
What resources Prometheus monitors
Targets are discovered via a service discovery mechanism
Prometheus has its own /metrics endpoint
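The annotated parts correspond to the `global`, `rule_files`, and `scrape_configs` sections of prometheus.yml. The sketch below renders a minimal file as text so the structure is visible. It uses static targets rather than service discovery to stay short, and the helper `render_prometheus_config`, the rule file name, and the job list are assumptions for illustration.

```python
def render_prometheus_config(scrape_interval, rule_files, jobs):
    """Render a minimal prometheus.yml as text.

    jobs: list of (job_name, [target, ...]) pairs using static targets.
    """
    lines = [
        "global:",
        f"  scrape_interval: {scrape_interval}      # how often to scrape targets",
        f"  evaluation_interval: {scrape_interval}  # how often to evaluate rules",
        "",
        "rule_files:",
    ]
    lines += [f"  - {path}" for path in rule_files]
    lines += ["", "scrape_configs:"]
    for job_name, targets in jobs:
        target_list = ", ".join(f'"{t}"' for t in targets)
        lines += [
            f"  - job_name: {job_name}",
            "    static_configs:",
            f"      - targets: [{target_list}]",
        ]
    return "\n".join(lines) + "\n"


config = render_prometheus_config(
    "15s",
    ["alert_rules.yml"],                        # hypothetical rule file
    [("prometheus", ["localhost:9090"]),        # Prometheus scraping itself
     ("node_exporter", ["localhost:9100"])],    # a node exporter on the same host
)
print(config)
```

The first job scrapes Prometheus's own /metrics endpoint, which is why Prometheus can monitor itself.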
Configuring Prometheus - 2

Define your own jobs

Default values for each job: metrics_path is /metrics and scheme is http

Prometheus Characteristics

Reliable

Standalone and self-contained

Works even if other parts of the infrastructure are broken

No extensive set-up needed

Less complex

Difficult to scale

Limits monitoring

Workaround:

Increase Prometheus server capacity

Limit number of metrics

Other Prometheus Features
Prometheus Federation

Allows Prometheus to scale to environments with tens of data centers and millions of nodes
Allows a Prometheus server to scrape data from other Prometheus servers

Prometheus with Docker and Kubernetes

Fully compatible

Prometheus components are available as Docker images

Can easily be deployed in container environments like K8s

Monitoring of K8s cluster node resources out-of-the-box!
Deploy Monitoring Stack - 1
3 different ways to deploy the Prometheus monitoring stack

1) Do it yourself
Create all configuration YAML files yourself
Execute them in the right order

2) Using an Operator
Manager of all Prometheus components
1. Find Prometheus operator
2. Deploy in K8s cluster

3) Using Helm
Using a Helm chart to deploy the operator
Helm: Manage initial setup
Operator: Manage setup
Deploy Monitoring Stack - 2

Overview of K8s resources deployed:

3 Deployments
Prometheus Operator
=> created Prometheus and Alertmanager StatefulSet
Grafana
Kube State Metrics
=> own Helm chart
=> dependency of this Helm chart
=> scrapes K8s components

1 DaemonSet
Node Exporter DaemonSet
=> connects to server
=> translates Worker Nodes metrics to Prometheus metrics (CPU usage, load on server)

Data Visualization - 1
1st step: Decide what to monitor

Notice when something unexpected happens

Observe any anomalies: CPU spikes, insufficient storage, high load, unauthorized requests

Analyze and react accordingly

2nd step: How to get this information?

How to get visibility of this monitoring data

What data do we have available?
Data Visualization - 2

3rd step: Use a proper data visualization tool

Grafana = a powerful open source visualization and analytics software
Already deployed with the Prometheus Operator Helm Chart

Data Visualization Tool

Grafana
With Grafana you can create dynamic and reusable dashboards that allow
you to visualize your data in any way you want

Dashboard
A dashboard is a set of one or more panels
You can create your own dashboards
Organized into one or more rows
A row is a logical divider within a dashboard
Rows are used to group panels together

Panel
The basic visualization building block in Grafana
Composed of a query and a visualization
Each panel has a query editor specific to the data source selected in the panel
Can be moved and resized within a dashboard
Alerting in Prometheus - 1

Instead of constantly checking, you want to get notified when something happens
Then you will check your dashboards

For that we need to configure our monitoring stack to notify us whenever something unexpected happens
Alerting in Prometheus - 2

Configure Alerting
Alerting with Prometheus is separated into 2 parts:
1) Alerting rules in the Prometheus server send alerts to an Alertmanager
2) The Alertmanager then manages (deduplicating, grouping, routing) those alerts, including sending out notifications

Main steps to set up alerting and notifications:
1. Set up and configure the Alertmanager
2. Configure Prometheus to talk to the Alertmanager
3. Create alerting rules in Prometheus

Prometheus server and Alertmanager each have their own configuration file

Example alert rules to configure
1st Alert: when CPU usage > 50%
2nd Alert: when Pod cannot start
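An alerting rule is essentially a condition plus a duration the condition must hold before the alert fires; in Prometheus rule files this duration is the `for` clause, and an alert is pending until it has elapsed. The sketch below imitates that behaviour for the "CPU usage > 50%" example. The function `alert_state` and the sample values are assumptions, and real rules are written as PromQL expressions rather than Python.

```python
def alert_state(samples, threshold, for_seconds):
    """Return 'inactive', 'pending', or 'firing' for a threshold rule.

    samples: list of (timestamp_seconds, value), oldest first.
    The alert fires only once the condition has held continuously
    for at least for_seconds.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # condition just became true
        else:
            breach_start = None           # condition cleared, reset
    if breach_start is None:
        return "inactive"
    held_for = samples[-1][0] - breach_start
    return "firing" if held_for >= for_seconds else "pending"


# Hypothetical CPU usage samples in percent, one per minute.
cpu = [(0, 40), (60, 55), (120, 60)]
print(alert_state(cpu, threshold=50, for_seconds=120))                # pending
print(alert_state(cpu + [(180, 70)], threshold=50, for_seconds=120))  # firing
print(alert_state(cpu + [(180, 30)], threshold=50, for_seconds=120))  # inactive
```

The waiting period avoids notifications for short spikes that resolve on their own.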
Alerting in Prometheus - 3
Alertmanager example configuration

Receiver:
These are the notification integrations
For each alert you can define its own receiver. For example:
send all K8s cluster related issues to the admin email
send all application related issues to the developer team's Slack channel
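Routing by label can be sketched as below. This is a simplified picture of the idea, not the Alertmanager's routing tree; the `team` label, the receiver names, and the function `pick_receiver` are hypothetical.

```python
def pick_receiver(alert_labels, routes, default_receiver):
    """Return the receiver of the first route whose matchers all match."""
    for matchers, receiver in routes:
        if all(alert_labels.get(k) == v for k, v in matchers.items()):
            return receiver
    return default_receiver


# Hypothetical routes: (label matchers, receiver name).
ROUTES = [
    ({"team": "platform"}, "admin-email"),
    ({"team": "developers"}, "dev-slack"),
]

cluster_alert = {"alertname": "NodeDown", "team": "platform"}
app_alert = {"alertname": "HighErrorRate", "team": "developers"}
other_alert = {"alertname": "Unknown"}

print(pick_receiver(cluster_alert, ROUTES, "default"))  # admin-email
print(pick_receiver(app_alert, ROUTES, "default"))      # dev-slack
print(pick_receiver(other_alert, ROUTES, "default"))    # default
```

Alerts that match no route fall through to a default receiver, so nothing is silently dropped.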
Monitor third party and own applications - 1

Still missing:

Configure Third-Party and own application monitoring

Monitor Kubernetes components

Monitor Resource Consumption on the Nodes

Monitor Prometheus Stack itself

Monitor third-party applications like Redis

Monitor own applications, like your online shop microservices
Monitor third party and own applications - 2
3rd-party example: Redis

Monitor Redis on application level, not on Kubernetes level

As we learnt, we can do that via an Exporter!

How to:
1. Deploy redis-exporter
2. Deploy ServiceMonitor (custom K8s resource) to tell Prometheus about this new exporter
Monitor third party and own applications - 3
Own application

No exporter available for your own application
So we have to define the metrics ourselves
As we learnt, we can do that via Client Libraries!

How to (Node.js application):
1. Expose metrics using the Node.js client library
2. Deploy the Node.js application in the cluster
3. Configure Prometheus to scrape the new target (ServiceMonitor)
4. Visualize scraped metrics in a Grafana Dashboard

Client Libraries:
Give you an abstract interface to expose your metrics
Libraries implement the Prometheus metric types: Counter, Gauge, Histogram, Summary
Choose the client library that matches the application's language
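To show what a client library does behind one of these metric types, here is a hand-rolled sketch of a histogram. It is written in Python only to keep the example self-contained and is not the Node.js client library's API; the class `Histogram`, the bucket bounds, and the metric name are assumptions.

```python
import bisect


class Histogram:
    """Minimal histogram: counts observations into cumulative buckets."""

    def __init__(self, buckets):
        self.bounds = sorted(buckets)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.total = 0.0

    def observe(self, value):
        # bisect_left puts a value equal to a bound into that bound's bucket,
        # matching the "less than or equal" meaning of the le label.
        index = bisect.bisect_left(self.bounds, value)
        self.counts[index] += 1
        self.total += value

    def render(self, name):
        lines = [f"# TYPE {name} histogram"]
        running = 0
        for bound, count in zip(self.bounds + [float("inf")], self.counts):
            running += count  # buckets are cumulative
            le = "+Inf" if bound == float("inf") else str(bound)
            lines.append(f'{name}_bucket{{le="{le}"}} {running}')
        lines.append(f"{name}_sum {self.total}")
        lines.append(f"{name}_count {running}")
        return "\n".join(lines) + "\n"


# Hypothetical request durations in seconds.
histogram = Histogram([0.25, 0.5, 1.0])
for duration in (0.25, 0.5, 0.5, 2.0):
    histogram.observe(duration)
output = histogram.render("request_duration_seconds")
print(output)
```

Because buckets are cumulative, the `+Inf` bucket always equals the total number of observations.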
Best Practices

Official Best Practices:

Metric and Label Naming: https://prometheus.io/docs/practices/naming/
Set of guidelines for instrumenting your code: https://prometheus.io/docs/practices/instrumentation/
Consoles and Dashboards: https://prometheus.io/docs/practices/consoles/
Alerting: https://prometheus.io/docs/practices/alerting/
On when to use the Pushgateway: https://prometheus.io/docs/practices/pushing/
