开源监控系统Prometheus入门操作介绍

本文是一个“Hello World”风格的教程，演示了如何在简单的示例设置中安装，配置和使用Prometheus。您将在本地下载并运行Prometheus，然后将其自己看做一个应用程序来进行监控，同时使用Node Exporter采集主机数据。最后通过仪表盘来使用收集的时间序列数据。

安装

为您的平台下载最新版本的Prometheus，然后解压缩并运行它：

$ wget https://github.com/prometheus/prometheus/releases/download/v2.9.1/prometheus-2.9.1.linux-amd64.tar.gz
$ tar xvfz prometheus-2.9.1.linux-amd64.tar.gz

采集应用数据

Prometheus通过在目标应用上的HTTP端点/metrics来收集受监控目标的指标。由于Prometheus也以同样的方式公开数据，因此也可以抓取它的指标和监控自身的健康状况。

虽然Prometheus服务器收集有关自身的数据在实践中并不是很有用，但它是一个很好的示例。

配置

可以看到Prometheus的配置文件prometheus.yml：

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

这里设置了Prometheus采集本地9090端口服务。

启动

./prometheus --config.file=prometheus.yml

正常的情况下，你可以看到以下输出内容：

level=info ts=2019-04-18T03:24:43.863Z caller=main.go:285 msg="no time or size retention was set so using the default time retention" duration=15d
level=info ts=2019-04-18T03:24:43.863Z caller=main.go:321 msg="Starting Prometheus" version="(version=2.9.1, branch=HEAD, revision=ad71f2785fc321092948e33706b04f3150eee44f)"
level=info ts=2019-04-18T03:24:43.863Z caller=main.go:322 build_context="(go=go1.12.4, user=root@09f919068df4, date=20190416-17:50:04)"
level=info ts=2019-04-18T03:24:43.863Z caller=main.go:323 host_details="(Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 localhost.localdomain (none))"
level=info ts=2019-04-18T03:24:43.863Z caller=main.go:324 fd_limits="(soft=1024, hard=4096)"
level=info ts=2019-04-18T03:24:43.863Z caller=main.go:325 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-04-18T03:24:43.864Z caller=main.go:640 msg="Starting TSDB ..."
level=info ts=2019-04-18T03:24:43.874Z caller=web.go:416 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-04-18T03:24:43.876Z caller=main.go:655 msg="TSDB started"
level=info ts=2019-04-18T03:24:43.876Z caller=main.go:724 msg="Loading configuration file" filename=prometheus.yml
level=info ts=2019-04-18T03:24:43.877Z caller=main.go:751 msg="Completed loading of configuration file" filename=prometheus.yml
level=info ts=2019-04-18T03:24:43.877Z caller=main.go:609 msg="Server is ready to receive web requests."

这时Prometheus应该已经启动。您还应该通过localhost:9090浏览到自己的状态页面。过几秒钟后就会从自己的HTTP指标端点收集到有关数据。

您还可以通过浏览其指标端点来验证Prometheus是否正在提供有关自身的指标： localhost:9090/metrics

使用Node Exporter采集主机数据

Exporter的作用是暴露已有的第三方服务的 metrics 给 Prometheus。Exporter将监控数据采集的端点通过HTTP服务的形式暴露给Prometheus Server，Prometheus Server通过访问该Exporter提供的Endpoint端点，即可获取到需要采集的监控数据。

从上面的描述中可以看出Exporter可以是一个相对开放的概念，其可以是一个独立运行的程序独立于监控目标以外，也可以是直接内置在监控目标中。只要能够向Prometheus提供标准格式的监控样本数据即可。

这里为了能够采集到主机的运行指标如CPU, 内存，磁盘等信息。我们可以使用Node Exporter。

安装

$ wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
$ tar xvfz node_exporter-0.17.0.linux-amd64.tar.gz

运行

 $ cd node_exporter-0.17.0.linux-amd64
 $ ./node_exporter

看到如下输出：

INFO[0000] Starting node_exporter (version=0.17.0, branch=HEAD, revision=f6f6194a436b9a63d0439abc585c76b19a206b21)  source="node_exporter.go:82"
INFO[0000] Build context (go=go1.11.2, user=root@322511e06ced, date=20181130-15:51:33)  source="node_exporter.go:83"
INFO[0000] Enabled collectors:                           source="node_exporter.go:90"
INFO[0000]  - arp                                        source="node_exporter.go:97"
INFO[0000]  - bcache                                     source="node_exporter.go:97"
INFO[0000]  - bonding                                    source="node_exporter.go:97"
INFO[0000]  - conntrack                                  source="node_exporter.go:97"
INFO[0000]  - cpu                                        source="node_exporter.go:97"
INFO[0000]  - diskstats                                  source="node_exporter.go:97"
INFO[0000]  - edac                                       source="node_exporter.go:97"
INFO[0000]  - entropy                                    source="node_exporter.go:97"
INFO[0000]  - filefd                                     source="node_exporter.go:97"
INFO[0000]  - filesystem                                 source="node_exporter.go:97"
INFO[0000]  - hwmon                                      source="node_exporter.go:97"
INFO[0000]  - infiniband                                 source="node_exporter.go:97"
INFO[0000]  - ipvs                                       source="node_exporter.go:97"
INFO[0000]  - loadavg                                    source="node_exporter.go:97"
INFO[0000]  - mdadm                                      source="node_exporter.go:97"
INFO[0000]  - meminfo                                    source="node_exporter.go:97"
INFO[0000]  - netclass                                   source="node_exporter.go:97"
INFO[0000]  - netdev                                     source="node_exporter.go:97"
INFO[0000]  - netstat                                    source="node_exporter.go:97"
INFO[0000]  - nfs                                        source="node_exporter.go:97"
INFO[0000]  - nfsd                                       source="node_exporter.go:97"
INFO[0000]  - sockstat                                   source="node_exporter.go:97"
INFO[0000]  - stat                                       source="node_exporter.go:97"
INFO[0000]  - textfile                                   source="node_exporter.go:97"
INFO[0000]  - time                                       source="node_exporter.go:97"
INFO[0000]  - timex                                      source="node_exporter.go:97"
INFO[0000]  - uname                                      source="node_exporter.go:97"
INFO[0000]  - vmstat                                     source="node_exporter.go:97"
INFO[0000]  - xfs                                        source="node_exporter.go:97"
INFO[0000]  - zfs                                        source="node_exporter.go:97"
INFO[0000] Listening on :9100                            source="node_exporter.go:111"

代表已启动成功，可通过9100端口访问。

从Node Exporter收集监控数据

为了能够让Prometheus Server能够从当前node exporter获取到监控数据，这里需要修改Prometheus配置文件。编辑prometheus.yml并在scrape_configs节点下添加以下内容:

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'node'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9100']

重新启动Prometheus Server

访问http://localhost:9090，进入到Prometheus Server。输入“up”并且点击执行按钮以后，可以看到如下结果：

Prometheus UI

其中“1”表示正常，反之“0”则为异常。

集成Grafana

第三方的可视化工具Grafana是一个开源的可视化平台，并且提供了对Prometheus的完整支持。

下载安装

官网下载地址：Grafana

根据自己的系统版本和配置，下载对应的包。3000为Grafana的默认侦听端口，启动后打开浏览器，输入IP+端口进行访问。

系统默认用户名和密码为admin/admin，第一次登陆系统会要求修改密码。

添加数据源

首先是添加数据源，这里选择Prometheus作为默认的数据源，如下图所示，指定数据源类型为Prometheus并且设置Prometheus的访问地址即可，在配置正确的情况下点击“Save & Test”按钮，会提示连接成功的信息：

添加数据源

创建DashBoard

创建好数据源之后，就需要创建DashBoard（仪表盘），可以自定义，也可以导入你需要的仪表盘，官方提供了很多的可选仪表盘。

这里我们选择官方提供的Prometheus 2.0 Stats。

创建仪表盘

展示仪表盘

Grafana中所有的Dashboard通过JSON进行共享，下载并且导入这些JSON文件，就可以直接使用这些已经定义好的Dashboard：

仪表盘样式

可以在Explore的“Metrics”选项下通过PromQL查询需要可视化的数据，比如输入指标node_memory_MemFree_bytes查看系统可用内存：

PromQL查询

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 204,590评论 6赞 478
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 86,808评论 2赞 381
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 151,151评论 0赞 337
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 54,779评论 1赞 277
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 63,773评论 5赞 367
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 48,656评论 1赞 281
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 38,022评论 3赞 398
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 36,678评论 0赞 258
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 41,038评论 1赞 299
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 35,659评论 2赞 321
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 37,756评论 1赞 330
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 33,411评论 4赞 321
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 39,005评论 3赞 307
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 29,973评论 0赞 19
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 31,203评论 1赞 260
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 45,053评论 2赞 350
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 42,495评论 2赞 343

开源监控系统Prometheus入门操作介绍

安装

采集应用数据

配置

启动

使用Node Exporter采集主机数据

安装

运行

从Node Exporter收集监控数据

集成Grafana

下载安装

添加数据源

创建DashBoard

展示仪表盘

推荐阅读更多精彩内容