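What is Prometheus?
Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and was inspired by Google's Borgmon monitoring system.
In 2016 Prometheus joined the Cloud Native Computing Foundation (CNCF), a Linux Foundation project that Google helped found, as its second hosted project after Kubernetes.
Prometheus is currently very active in the open-source community.
Compared with Heapster (a now-retired Kubernetes (K8s) subproject that collected cluster performance data), Prometheus is more complete and full-featured. Its performance is also sufficient for clusters of more than ten thousand machines.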
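Features of Prometheus
- A multi-dimensional data model.
- A flexible query language.
- No dependence on distributed storage; each server node is autonomous.
- Time series are collected with a pull model over HTTP.
- Time series can also be pushed through an intermediate gateway.
- Targets are found through service discovery or static configuration.
- Many kinds of graphing and dashboard front ends are supported, such as Grafana.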
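To make the first two features concrete, here is a minimal sketch of the idea behind the data model: a series is identified by a metric name plus a set of key/value labels, and a query selects series by matching on those labels. The metric names, labels, and values below are made up for illustration, and the `select` helper is hypothetical; it only mimics what a PromQL selector such as `http_requests_total{method="GET"}` does.

```python
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    """One sample: a metric name, a label set, and a value."""
    name: str
    labels: Dict[str, str]
    value: float


# Illustrative data only; these series are not taken from a real server.
SAMPLES: List[Sample] = [
    Sample("http_requests_total", {"method": "GET", "handler": "/api"}, 1027.0),
    Sample("http_requests_total", {"method": "POST", "handler": "/api"}, 3.0),
    Sample("http_requests_total", {"method": "GET", "handler": "/static"}, 250.0),
    Sample("node_load1", {"instance": "node"}, 0.42),
]


def select(samples: List[Sample], name: str, **matchers: str) -> List[Sample]:
    """Return samples with the given name whose labels equal every matcher."""
    return [
        s for s in samples
        if s.name == name
        and all(s.labels.get(key) == value for key, value in matchers.items())
    ]


# Roughly what sum(http_requests_total{method="GET"}) would compute.
get_requests = select(SAMPLES, "http_requests_total", method="GET")
total_get = sum(s.value for s in get_requests)
```

Because every label is a separate dimension, the same data can be sliced by `method`, by `handler`, or by both without changing how it was collected.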
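Architecture diagram
Basic principle
Prometheus works by periodically scraping the state of monitored components over HTTP. Any component can be monitored as long as it provides a suitable HTTP endpoint; no SDK or other integration work is required. This makes Prometheus a good fit for monitoring virtualized environments such as VMs, Docker, and Kubernetes. The HTTP endpoint that exposes a monitored component's information is called an exporter. Most components commonly used by internet companies already have a ready-made exporter, for example Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).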
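As a rough illustration of how little an exporter needs, the sketch below serves one gauge at `/metrics` in the Prometheus text exposition format, using only the Python standard library. The metric name `demo_uptime_seconds` and port 8000 are assumptions made for this example; real exporters such as node_exporter expose far more and are what you would normally deploy.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics(now: float) -> str:
    """Render a single gauge in the Prometheus text exposition format."""
    uptime = now - START_TIME
    return (
        "# HELP demo_uptime_seconds Seconds since this process started.\n"
        "# TYPE demo_uptime_seconds gauge\n"
        f"demo_uptime_seconds {uptime:.3f}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics; every other path returns 404."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(time.time()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving metrics on the given port (8000 is arbitrary)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and adding `localhost:8000` as a target would let Prometheus scrape this process like any other exporter.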
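How it works
- The Prometheus daemon periodically scrapes metrics from its targets; each target must expose an HTTP endpoint for it to scrape. Targets can be specified through the configuration file, text files, Zookeeper, Consul, DNS SRV lookup, and other mechanisms. Prometheus monitors by pulling: the server pulls data directly from targets, while jobs that cannot be scraped push their data to an intermediate gateway, which the server then pulls from.
- Prometheus stores all scraped data locally, applies rules to clean up and aggregate it, and stores the results as new time series.
- Prometheus exposes the collected data through PromQL and other APIs for visualization. It supports many charting options, such as Grafana, the bundled PromDash (now deprecated), and its own template engine. It also provides an HTTP API for queries, so you can produce whatever custom output you need.
- Pushgateway lets clients actively push metrics to it, while Prometheus simply scrapes the gateway on its usual schedule.
- Alertmanager is a component separate from Prometheus. It receives the alerts that Prometheus fires from alerting rules written in PromQL and provides very flexible ways of sending notifications.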
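The push path described above can be sketched as follows. This is a minimal example, assuming a Pushgateway listening on its default port 9091 and job and instance names that contain no slashes; the host, job name, and metric name are placeholders. The function only builds the HTTP request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import urllib.request
from typing import Dict, Union
from urllib.parse import quote

Number = Union[int, float]


def build_push_request(
    gateway: str, job: str, metrics: Dict[str, Number], instance: str = ""
) -> urllib.request.Request:
    """Build a PUT to /metrics/job/<job>[/instance/<instance>].

    The body is the text exposition format, one "name value" line per metric.
    PUT replaces every metric previously pushed for the same grouping key.
    """
    path = "/metrics/job/" + quote(job, safe="")
    if instance:
        path += "/instance/" + quote(instance, safe="")
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        url=gateway.rstrip("/") + path,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical batch job reporting how many records it processed.
request = build_push_request(
    "http://localhost:9091", "batch_job", {"demo_records_processed": 42}, "host1"
)
# To actually send it: urllib.request.urlopen(request, timeout=5)
```

Prometheus then scrapes the gateway like any other target, which is why this path suits short-lived jobs that may exit before a scrape would reach them.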
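What this tutorial covers
- 1. Installing Prometheus Server
- 2. Installing and using node-exporter and jmx_prometheus_javaagent to monitor Linux system resources and Kafka metrics, respectively
- 3. Using Grafana
I. Installing Prometheus
- Download prometheus-2.18.1.linux-amd64.tar.gz from the official website and extract it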
[root@master prometheus]# ll
total 148584
drwxr-xr-x 2 3434 3434 4096 May 8 2020 console_libraries
drwxr-xr-x 2 3434 3434 4096 May 8 2020 consoles
drwxr-xr-x 16 root root 4096 Nov 10 13:00 data
drwxr-xr-x 2 root root 4096 Nov 4 16:28 lib
-rw-r--r-- 1 3434 3434 11357 May 8 2020 LICENSE
-rw-r--r-- 1 3434 3434 3184 May 8 2020 NOTICE
-rwxr-xr-x 1 3434 3434 87173843 May 8 2020 prometheus
-rw-r--r-- 1 3434 3434 1209 Nov 4 16:41 prometheus.yml
-rwxr-xr-x 1 3434 3434 49973547 May 8 2020 promtool
-rwxr-xr-x 1 3434 3434 14957614 May 8 2020 tsdb
- Edit the configuration file prometheus.yml
The configuration is as follows:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
- Run Prometheus
./prometheus
Open http://10.4.4.16:9090/graph. If the page shown below appears, the installation succeeded.
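Besides opening the page in a browser, the check can be scripted against the `/-/ready` management endpoint, which Prometheus answers with HTTP 200 once it is ready to serve traffic. The helper below is a small sketch using only the standard library; the function name and the 2-second timeout are choices made for this example.

```python
import urllib.error
import urllib.request


def is_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/-/ready answers with HTTP 200."""
    url = base_url.rstrip("/") + "/-/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as not ready.
        return False
```

For the server in this tutorial the call would be `is_ready("http://10.4.4.16:9090")`.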
II. Installing and using exporters
- Installing and running node_exporter
- Download node_exporter-1.0.1.linux-amd64.tar.gz and extract it
[root@master node_exporter]# ll
total 19220
-rw-r--r-- 1 3434 3434 11357 Jun 16 21:19 LICENSE
-rwxr-xr-x 1 3434 3434 19657731 Jun 16 20:44 node_exporter
-rw------- 1 root root 47 Sep 30 03:02 nohup.out
-rw-r--r-- 1 3434 3434 463 Jun 16 21:19 NOTICE
- Run it
./node_exporter
- Installing and running kafka_exporter
Method 1: non-intrusive Kafka monitoring (recommended)
- Download kafka_exporter-1.2.0.linux-amd64.tar.gz and extract it
[root@master kafka_exporter]# ll
total 13276
-rwxr-xr-x 1 2000 2000 13578776 Jul 7 2018 kafka_exporter
-rw-rw-r-- 1 2000 2000 11357 Jul 7 2018 LICENSE
- Run it
./kafka_exporter --kafka.server=127.0.0.1:9092
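Method 2: intrusive Kafka monitoring via a JMX agent (more involved; can be skipped)
Installing and running jmx_prometheus_javaagent
- Download the jmx_prometheus_javaagent-0.14.0.jar binary and the kafka-0-8-2.yml configuration file. This tutorial places them in the /opt/prometheus/lib directory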
[root@master prometheus]# ll
total 148584
drwxr-xr-x 2 3434 3434 4096 May 8 2020 console_libraries
drwxr-xr-x 2 3434 3434 4096 May 8 2020 consoles
drwxr-xr-x 17 root root 4096 Nov 10 15:00 data
drwxr-xr-x 2 root root 4096 Nov 4 16:28 lib
-rw-r--r-- 1 3434 3434 11357 May 8 2020 LICENSE
-rw-r--r-- 1 3434 3434 3184 May 8 2020 NOTICE
-rwxr-xr-x 1 3434 3434 87173843 May 8 2020 prometheus
-rw-r--r-- 1 3434 3434 1209 Nov 4 16:41 prometheus.yml
-rwxr-xr-x 1 3434 3434 49973547 May 8 2020 promtool
-rwxr-xr-x 1 3434 3434 14957614 May 8 2020 tsdb
[root@master prometheus]# ll lib
total 412
-rw-r--r-- 1 root root 413862 Nov 4 16:25 jmx_prometheus_javaagent-0.14.0.jar
-rw-r--r-- 1 root root 2820 Nov 4 16:28 kafka-0-8-2.yml
- Edit the Kafka startup script and add the following line
export KAFKA_OPTS="$KAFKA_OPTS -javaagent:/opt/prometheus/lib/jmx_prometheus_javaagent-0.14.0.jar=7071:/opt/prometheus/lib/kafka-0-8-2.yml"
- Start Kafka and confirm that the -javaagent option appears on the process command line
[root@master prometheus]# ps axu|grep kafka
root 16400 0.0 0.0 103320 896 pts/2 S+ 15:08 0:00 grep kafka
root 26565 3.9 34.4 13512408 5680568 pts/3 Sl Nov04 341:41 /opt/local/jdk/bin/java -Xmx6G -Xms2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/data/logs/kafka/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/data/logs/kafka -Dlog4j.configuration=file:bin/../config/log4j.properties -cp :/opt/local/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/local/kafka/bin/../libs/connect-api-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-file-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-json-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-runtime-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-transforms-0.10.2.1.jar:/opt/local/kafka/bin/../libs/guava-18.0.jar:/opt/local/kafka/bin/../libs/hk2-api-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/hk2-locator-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/hk2-utils-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/jackson-annotations-2.8.0.jar:/opt/local/kafka/bin/../libs/jackson-annotations-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-core-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-databind-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-jaxrs-base-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-jaxrs-json-provider-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-module-jaxb-annotations-2.8.5.jar:/opt/local/kafka/bin/../libs/javassist-3.20.0-GA.jar:/opt/local/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/local/kafka/bin/../libs/javax.inject-1.jar:/opt/local/kafka/bin/../libs/javax.inject-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/local/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/opt/local/kafka/bin/../libs/jersey-client-2.24.jar:/opt/local/kafka/bin/../libs
/jersey-common-2.24.jar:/opt/local/kafka/bin/../libs/jersey-container-servlet-2.24.jar:/opt/local/kafka/bin/../libs/jersey-container-servlet-core-2.24.jar:/opt/local/kafka/bin/../libs/jersey-guava-2.24.jar:/opt/local/kafka/bin/../libs/jersey-media-jaxb-2.24.jar:/opt/local/kafka/bin/../libs/jersey-server-2.24.jar:/opt/local/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jopt-simple-5.0.3.jar:/opt/local/kafka/bin/../libs/kafka_2.11-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka_2.11-0.10.2.1-sources.jar:/opt/local/kafka/bin/../libs/kafka_2.11-0.10.2.1-test-sources.jar:/opt/local/kafka/bin/../libs/kafka-clients-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-log4j-appender-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-streams-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-streams-examples-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-tools-0.10.2.1.jar:/opt/local/kafka/bin/../libs/log4j-1.2.17.jar:/opt/local/kafka/bin/../libs/lz4-1.3.0.jar:/opt/local/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/local/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/local/kafka/bin/../libs/reflections-0.9.10.jar:/opt/local/kafka/bin/../libs/rocksdbjni-5.0.1.jar:/opt/local/kafka/bin/../libs/scala-library-2.11.8.jar:/opt/local/kafka/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/opt/local/kafka/bin/../libs/slf4j-api-1.7.21.jar:/opt/local/kafka/bin/../libs/slf4j-log4j12-1.7.21.jar:/opt/local/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/opt/local/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/opt/local
/kafka/bin/../libs/zkclient-0.10.jar:/opt/local/kafka/bin/../libs/zookeeper-3.4.9.jar -javaagent:/opt/prometheus/lib/jmx_prometheus_javaagent-0.14.0.jar=7071:/opt/prometheus/lib/kafka-0-8-2.yml kafka.Kafka /opt/local/kafka/config/server.properties
- Edit prometheus.yml to add Linux resource monitoring and Kafka monitoring
  - job_name: linux
    static_configs:
      - targets: ['10.4.4.16:9100']
        labels:
          instance: node
  - job_name: 'kafka-exporter' # Method 1, non-intrusive (recommended)
    static_configs:
      - targets: ['10.4.4.16:9308']
  - job_name: 'kafka' # Method 2, intrusive
    static_configs:
      - targets: ['10.4.4.16:7071']
Restart Prometheus.
- Open http://10.4.4.16:9090/targets
The linux and kafka jobs now appear in the target list, as highlighted by the red boxes in the figure below:
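The same information shown on the targets page is available from the HTTP API at `/api/v1/targets`, which is handy for scripted checks. The sketch below fetches that endpoint and groups the active targets by job. The sample response is hand-written for illustration and trimmed to the fields the code reads; its health values are invented and are not output from the server in this tutorial.

```python
import json
import urllib.request
from typing import Dict, List


def fetch_targets(base_url: str) -> str:
    """Download the raw JSON from <base_url>/api/v1/targets."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def summarize_targets(payload: str) -> Dict[str, List[str]]:
    """Map each job name to a list of 'instance (health)' strings."""
    data = json.loads(payload)
    if data.get("status") != "success":
        raise ValueError("Prometheus API returned a non-success status")
    summary: Dict[str, List[str]] = {}
    for target in data["data"]["activeTargets"]:
        labels = target["labels"]
        entry = f'{labels["instance"]} ({target["health"]})'
        summary.setdefault(labels["job"], []).append(entry)
    return summary


# Hand-written sample, not real output; health values are made up.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {"activeTargets": [
        {"labels": {"job": "linux", "instance": "node"}, "health": "up"},
        {"labels": {"job": "kafka-exporter", "instance": "10.4.4.16:9308"}, "health": "up"},
        {"labels": {"job": "kafka", "instance": "10.4.4.16:7071"}, "health": "down"},
    ]},
})

summary = summarize_targets(SAMPLE_RESPONSE)
```

Against a live server, `summarize_targets(fetch_targets("http://10.4.4.16:9090"))` would return the real state. Note that the linux job reports its instance as `node` because the configuration above overrides the `instance` label.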
III. Installing and using Grafana
- Download grafana-7.2.0.linux-amd64.tar.gz and extract it
[root@master grafana]# ll
total 48
drwxr-xr-x 2 root root 4096 Sep 23 20:19 bin
drwxr-xr-x 3 root root 4096 Sep 23 20:19 conf
drwxr-xr-x 5 root root 4096 Nov 10 15:16 data
-rw-r--r-- 1 root root 11343 Sep 23 20:16 LICENSE
-rw-r--r-- 1 root root 108 Sep 23 20:16 NOTICE.md
drwxr-xr-x 4 root root 4096 Sep 23 20:19 plugins-bundled
drwxr-xr-x 12 root root 4096 Sep 23 20:19 public
-rw-r--r-- 1 root root 2799 Sep 23 20:16 README.md
drwxr-xr-x 2 root root 4096 Sep 23 20:19 scripts
-rw-r--r-- 1 root root 5 Sep 23 20:19 VERSION
- Run Grafana and confirm that the grafana-server process is up
[root@master grafana]# ps axu|grep graf
root 18047 0.0 0.0 103320 900 pts/2 S+ 15:20 0:00 grep graf
root 24002 0.0 0.2 1695092 38780 ? Sl Sep30 45:16 ./grafana-server
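Open http://10.4.4.16:3000/login. If the login page shown below appears, the installation succeeded. The default username/password is admin/admin.
- Add a data source of type Prometheus, setting the URL to the server address plus port 9090
- Download a dashboard JSON file from the official website and import it
- The node exporter dashboard looks like this:
The kafka exporter dashboard looks like this: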
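The data source step above can also be scripted through Grafana's HTTP API instead of the web UI. The sketch below builds a `POST /api/datasources` request with basic authentication. It assumes the default admin/admin credentials are still in place and that `proxy` access is wanted; the data source name "Prometheus" is an arbitrary choice. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import base64
import json
import urllib.request


def build_datasource_request(
    grafana_url: str,
    prometheus_url: str,
    user: str = "admin",
    password: str = "admin",
    name: str = "Prometheus",
) -> urllib.request.Request:
    """Build the request that registers Prometheus as a Grafana data source."""
    payload = {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend queries Prometheus on the browser's behalf
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_datasource_request("http://10.4.4.16:3000", "http://10.4.4.16:9090")
# To actually send it: urllib.request.urlopen(request, timeout=5)
```

If the admin password has been changed after the first login, pass the new credentials through the `user` and `password` parameters.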