I. The zabbix service fails to start:
1. Problem: the service went down and the zabbix service cannot be started:
[root@zabbix /var/log/zabbix]#systemctl status zabbix-server.service
● zabbix-server.service - Zabbix Server
Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2020-05-13 08:11:36 CST; 764ms ago
Process: 15248 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=1/FAILURE)
Process: 15241 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
Main PID: 15243 (code=exited, status=0/SUCCESS)
May 13 08:11:36 zabbix kill[15248]: -s, --signal <sig> send specified signal
May 13 08:11:36 zabbix kill[15248]: -q, --queue <sig> use sigqueue(2) rather than kill(2)
May 13 08:11:36 zabbix kill[15248]: -p, --pid print pids without signaling them
May 13 08:11:36 zabbix kill[15248]: -l, --list [=<signal>] list signal names, or convert one to a name
May 13 08:11:36 zabbix kill[15248]: -L, --table list signal names and numbers
May 13 08:11:36 zabbix kill[15248]: -h, --help display this help and exit
May 13 08:11:36 zabbix kill[15248]: -V, --version output version information and exit
May 13 08:11:36 zabbix kill[15248]: For more details see kill(1).
May 13 08:11:36 zabbix systemd[1]: Unit zabbix-server.service entered failed state.
May 13 08:11:36 zabbix systemd[1]: zabbix-server.service failed.
2. The first thing to check is the Zabbix log:
[root@zabbix /var/log/zabbix]#more zabbix_server.log
15846:20200513:081745.479 Starting Zabbix Server. Zabbix 4.4.7 (revision 77fb8c7ee0).
15846:20200513:081745.479 ****** Enabled features ******
15846:20200513:081745.479 SNMP monitoring: YES
15846:20200513:081745.479 IPMI monitoring: YES
15846:20200513:081745.479 Web monitoring: YES
15846:20200513:081745.479 VMware monitoring: YES
15846:20200513:081745.479 SMTP authentication: YES
15846:20200513:081745.479 ODBC: YES
15846:20200513:081745.479 SSH support: YES
15846:20200513:081745.480 IPv6 support: YES
15846:20200513:081745.480 TLS support: YES
15846:20200513:081745.480 ******************************
15846:20200513:081745.480 using configuration file: /etc/zabbix/zabbix_server.conf
15846:20200513:081745.487 current database version (mandatory/optional): 04040000/04040002
15846:20200513:081745.487 required mandatory version: 04040000
15846:20200513:081745.501 server #0 started [main process]
15848:20200513:081745.502 server #1 started [configuration syncer #1]
15848:20200513:081746.189 __mem_malloc: skipped 0 asked 24 skip_min 18446744073709551615 skip_max 0
15848:20200513:081746.189 [file:dbconfig.c,line:94] __zbx_mem_realloc(): out of memory (requested 16 bytes)
15848:20200513:081746.189 [file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
15848:20200513:081746.189 === memory statistics for configuration cache ===
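Note: the log above shows that the configuration cache ran out of memory and asks for the CacheSize parameter to be increased, so the next step is to check the main Zabbix configuration file.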
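When the log is long, the hint can be pulled out directly instead of paging through it. The following is a minimal sketch, assuming the log path used in this article and the "please increase ... configuration parameter" wording seen above; the function name suggested_cache_params is made up for illustration.

```shell
# Print the names of configuration parameters that the Zabbix log asks to increase.
# $1: path to the server log (defaults to the path shown in this article).
# The function name is hypothetical, not part of Zabbix.
suggested_cache_params() {
    log_file="${1:-/var/log/zabbix/zabbix_server.log}"
    # Keep only the parameter name from lines such as
    # "... please increase CacheSize configuration parameter"
    sed -n 's/.*please increase \([A-Za-z]*\) configuration parameter.*/\1/p' "$log_file" | sort -u
}
```

Calling suggested_cache_params with no argument reads /var/log/zabbix/zabbix_server.log; for the log shown above it would print CacheSize.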
3. Check the main Zabbix configuration file (/etc/zabbix/zabbix_server.conf):
403 ### Option: CacheSize
404 # Size of configuration cache, in bytes.
405 # Shared memory size for storing host, item and trigger data.
406 #
407 # Mandatory: no
408 # Range: 128K-8G
409 # Default:
410 #CacheSize=8M
Edit line 410 (#CacheSize=8M): remove the leading # and raise the value from the default 8M; here I changed it to 2048M.
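The same edit can be scripted rather than done by hand in an editor. This is only a sketch under assumptions: the helper name set_conf_option is invented here, it handles simple Key=Value lines only, and it keeps a .bak copy next to the file.

```shell
# Uncomment (or replace) a "Key=Value" option in a Zabbix-style config file.
# $1: config file, $2: option name, $3: new value.
# Hypothetical helper; a backup is written to "<file>.bak" first.
set_conf_option() {
    conf_file="$1"; key="$2"; value="$3"
    cp -p "$conf_file" "$conf_file.bak"
    if grep -Eq "^[[:space:]]*${key}=" "$conf_file.bak"; then
        # An active line already exists: replace its value.
        sed "s|^[[:space:]]*${key}=.*|${key}=${value}|" "$conf_file.bak" > "$conf_file"
    elif grep -Eq "^#[[:space:]]*${key}=" "$conf_file.bak"; then
        # Only the commented default exists: uncomment it and set the value.
        sed "s|^#[[:space:]]*${key}=.*|${key}=${value}|" "$conf_file.bak" > "$conf_file"
    else
        # Option not present at all: append it.
        printf '%s=%s\n' "$key" "$value" >> "$conf_file"
    fi
}
```

For the change described above, the call would be: set_conf_option /etc/zabbix/zabbix_server.conf CacheSize 2048M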
4. After the change, restart the zabbix service:
[root@zabbix /var/log/zabbix]#systemctl restart zabbix-server.service
[root@zabbix ~]#systemctl status zabbix-server.service
● zabbix-server.service - Zabbix Server
Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-05-13 08:44:40 CST; 3h 12min ago
Process: 18301 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=0/SUCCESS)
Process: 18317 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
Main PID: 18319 (zabbix_server)
CGroup: /system.slice/zabbix-server.service
├─18319 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
├─18321 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 0.678445 sec, idle 60 sec]
├─18324 /usr/sbin/zabbix_server: housekeeper [deleted 815888 hist/trends, 0 items/triggers, 0 events, 0 sessions, 0 ala...
├─18325 /usr/sbin/zabbix_server: timer #1 [updated 0 hosts, suppressed 0 events in 0.000707 sec, idle 59 sec]
├─18326 /usr/sbin/zabbix_server: http poller #1 [got 0 values in 0.000843 sec, idle 5 sec]
├─18327 /usr/sbin/zabbix_server: discoverer #1 [processed 0 rules in 0.000659 sec, idle 60 sec]
├─18328 /usr/sbin/zabbix_server: history syncer #1 [processed 0 values, 0 triggers in 0.000043 sec, idle 1 sec]
├─18329 /usr/sbin/zabbix_server: history syncer #2 [processed 0 values, 0 triggers in 0.000027 sec, idle 1 sec]
├─18330 /usr/sbin/zabbix_server: history syncer #3 [processed 128 values, 101 triggers in 0.087992 sec, idle 1 sec]
├─18331 /usr/sbin/zabbix_server: history syncer #4 [processed 104 values, 96 triggers in 0.017477 sec, idle 1 sec]
├─18332 /usr/sbin/zabbix_server: escalator #1 [processed 0 escalations in 0.000794 sec, idle 3 sec]
├─18333 /usr/sbin/zabbix_server: proxy poller #1 [exchanged data with 0 proxies in 0.000068 sec, idle 5 sec]
├─18334 /usr/sbin/zabbix_server: self-monitoring [processed data in 0.000025 sec, idle 1 sec]
├─18335 /usr/sbin/zabbix_server: task manager [processed 0 task(s) in 0.000544 sec, idle 5 sec]
├─18337 /usr/sbin/zabbix_server: poller #1 [got 30 values in 0.037370 sec, idle 1 sec]
├─18339 /usr/sbin/zabbix_server: poller #2 [got 23 values in 0.026674 sec, idle 1 sec]
├─18340 /usr/sbin/zabbix_server: poller #3 [got 16 values in 0.036619 sec, idle 1 sec]
├─18341 /usr/sbin/zabbix_server: poller #4 [got 15 values in 0.016117 sec, idle 1 sec]
├─18342 /usr/sbin/zabbix_server: poller #5 [got 17 values in 0.044779 sec, idle 1 sec]
├─18343 /usr/sbin/zabbix_server: unreachable poller #1 [got 1 values in 3.009875 sec, getting values]
├─18345 /usr/sbin/zabbix_server: trapper #1 [processed data in 0.002796 sec, waiting for connection]
├─18346 /usr/sbin/zabbix_server: trapper #2 [processed data in 0.002698 sec, waiting for connection]
├─18347 /usr/sbin/zabbix_server: trapper #3 [processed data in 0.000471 sec, waiting for connection]
├─18348 /usr/sbin/zabbix_server: trapper #4 [processed data in 0.002340 sec, waiting for connection]
├─18349 /usr/sbin/zabbix_server: trapper #5 [processed data in 0.003454 sec, waiting for connection]
├─18350 /usr/sbin/zabbix_server: icmp pinger #1 [pinging hosts]
├─18351 /usr/sbin/zabbix_server: alert manager #1 [sent 0, failed 0 alerts, idle 5.005276 sec during 5.005400 sec]
├─18352 /usr/sbin/zabbix_server: alerter #1 started
├─18353 /usr/sbin/zabbix_server: alerter #2 started
├─18354 /usr/sbin/zabbix_server: alerter #3 started
├─18355 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 5, processed 532 values, idle 5.043614 sec during 5.0...
├─18356 /usr/sbin/zabbix_server: preprocessing worker #1 started
├─18357 /usr/sbin/zabbix_server: preprocessing worker #2 started
├─18358 /usr/sbin/zabbix_server: preprocessing worker #3 started
├─18359 /usr/sbin/zabbix_server: lld manager #1 [processed 1 LLD rules during 5.501500 sec]
├─18360 /usr/sbin/zabbix_server: lld worker #1 [processed 1 LLD rules, idle 5.488052 sec during 5.502938 sec]
├─18361 /usr/sbin/zabbix_server: lld worker #2 [processed 1 LLD rules, idle 13.106991 sec during 13.126135 sec]
├─18362 /usr/sbin/zabbix_server: alert syncer [queued 0 alerts(s), flushed 0 result(s) in 0.001128 sec, idle 1 sec]
├─31987 sh -c /usr/local/fping/sbin/fping -C3 -i0 2>&1 </tmp/zabbix_server_18350.pinger;
└─31988 /usr/local/fping/sbin/fping -C3 -i0
May 13 08:44:40 zabbix systemd[1]: Starting Zabbix Server...
May 13 08:44:40 zabbix systemd[1]: zabbix-server.service: Supervising process 18319 which is not our child. We'll most like... exits.
May 13 08:44:40 zabbix systemd[1]: Started Zabbix Server.
Hint: Some lines were ellipsized, use -l to show in full.
[root@zabbix ~]#netstat -lntup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:1556 0.0.0.0:* LISTEN 8363/pbx_exchange
tcp 0 0 127.0.0.1:1557 0.0.0.0:* LISTEN 8363/pbx_exchange
tcp 0 0 0.0.0.0:13782 0.0.0.0:* LISTEN 9702/bpcd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 5633/sshd
tcp 0 0 0.0.0.0:13724 0.0.0.0:* LISTEN 9697/vnetd
tcp 0 0 127.0.0.1:40509 0.0.0.0:* LISTEN 8363/pbx_exchange
tcp 0 0 0.0.0.0:10050 0.0.0.0:* LISTEN 5674/zabbix_agentd
tcp 0 0 0.0.0.0:10051 0.0.0.0:* LISTEN 18319/zabbix_server
tcp6 0 0 :::3306 :::* LISTEN 18050/mysqld
tcp6 0 0 :::80 :::* LISTEN 5635/httpd
tcp6 0 0 :::1556 :::* LISTEN 8363/pbx_exchange
tcp6 0 0 :::22 :::* LISTEN 5633/sshd
tcp6 0 0 :::10050 :::* LISTEN 5674/zabbix_agentd
tcp6 0 0 :::10051 :::* LISTEN 18319/zabbix_server
After the restart, the zabbix service started successfully and zabbix_server is listening on port 10051.
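The port check can also be done without reading the whole netstat table. The sketch below assumes the netstat -lntup column layout shown above (protocol in column 1, local address in column 4); the function name port_is_listening is hypothetical.

```shell
# Read `netstat -lntup` output on stdin and succeed (exit status 0) if a TCP
# socket is listening on the given port. Default 10051, the zabbix_server
# port seen in the output above. Hypothetical helper name.
port_is_listening() {
    port="${1:-10051}"
    awk -v p="$port" '
        $1 ~ /^tcp/ && $4 ~ (":" p "$") && /LISTEN/ { found = 1 }
        END { exit !found }
    '
}
```

Usage: netstat -lntup | port_is_listening 10051 && echo "zabbix_server is listening"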
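II. Fault: Zabbix value cache working in low memory mode
1. Problem: the error "Zabbix value cache working in low memory mode" is reported. The relevant option in the main Zabbix configuration file, after the change: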
### Option: ValueCacheSize
# Size of history value cache, in bytes.
# Shared memory size for caching item history data requests.
# Setting to 0 disables value cache.
#
# Mandatory: no
# Range: 0,128K-64G
# Default:
# ValueCacheSize=8M
ValueCacheSize=1024M
ValueCacheSize was raised from the default 8M to 1024M.
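Because the file contains both the commented default and the new active line, it can help to confirm which value is actually active before restarting. A minimal sketch, assuming plain Key=Value lines; get_conf_option is a made-up helper name, and printing the last active line is this helper's choice, not a statement about how Zabbix resolves duplicates.

```shell
# Print the active value of an option in a Zabbix-style config file.
# Commented lines ("# Key=...") are ignored. If the option appears on several
# active lines, this helper prints the last one. Hypothetical helper name.
get_conf_option() {
    conf_file="$1"; key="$2"
    sed -n "s/^[[:space:]]*${key}=//p" "$conf_file" | tail -n 1
}
```

Usage: get_conf_option /etc/zabbix/zabbix_server.conf ValueCacheSize
With the excerpt above this would print 1024M; empty output means the option is still commented out and the built-in default applies.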
2. Restart the service:
[root@zabbix ~]#systemctl restart zabbix-server.service
[root@zabbix ~]#systemctl status zabbix-server.service
● zabbix-server.service - Zabbix Server
Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2015-04-19 19:20:24 CST; 15s ago
Process: 7034 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=0/SUCCESS)
Process: 7057 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
Main PID: 7059 (zabbix_server)
CGroup: /system.slice/zabbix-server.service
├─7059 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf