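Environment: Zabbix 6.4.16
Errors
Zabbix server stops because the host configuration cache is too small
After a large number of hosts were added at once, the web frontend showed the following message: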
Zabbix server is not running: the information displayed may not be current
Error messages in the log
194:20240809:192255.350 0: /usr/sbin/zabbix_server: configuration syncer [syncing configuration](_start+0x25) [0x556d6a14d435]
194:20240809:192255.350 [file:dbconfig.c,line:114] __zbx_shmem_realloc(): out of memory (requested 366104 bytes)
194:20240809:192255.350 [file:dbconfig.c,line:114] __zbx_shmem_realloc(): please increase CacheSize configuration parameter
7:20240809:192255.371 One child process died (PID:194,exitcode/signal:1). Exiting ...
192:20240809:192255.372 HA manager has been paused
192:20240809:192255.442 HA manager has been stopped
7:20240809:192255.448 Zabbix Server stopped. Zabbix 6.4.16 (revision 0bc8c62).
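To catch this failure early, the server log can be scanned for the shared-memory allocation error. The following is a minimal Python sketch and not part of Zabbix; the function name `find_shmem_oom` and the regular expression are assumptions modeled only on the log lines shown above.

```python
import re

# Pattern modeled on the log excerpt above; other Zabbix versions may format it differently.
OOM_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) .*"
    r"__zbx_shmem_realloc\(\): out of memory \(requested (?P<bytes>\d+) bytes\)"
)

def find_shmem_oom(lines):
    """Return (pid, date, time, requested_bytes) for every out-of-memory line."""
    hits = []
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            hits.append((int(m["pid"]), m["date"], m["time"], int(m["bytes"])))
    return hits

sample = [
    "194:20240809:192255.350 [file:dbconfig.c,line:114] __zbx_shmem_realloc(): out of memory (requested 366104 bytes)",
    "194:20240809:192255.350 [file:dbconfig.c,line:114] __zbx_shmem_realloc(): please increase CacheSize configuration parameter",
]
print(find_shmem_oom(sample))  # [(194, '20240809', '192255.350', 366104)]
```

In practice the `lines` argument could be an open log file object, since iterating over a file yields its lines.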
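The server runs out of memory because a large number of hosts means a correspondingly large amount of host, item, and trigger data has to be cached, so the default shared memory size (CacheSize, 32M) needs to be increased accordingly.
Add the following parameter to the /etc/zabbix/zabbix_server.conf configuration file.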
### Option: CacheSize
# Size of configuration cache, in bytes.
# Shared memory size for storing host, item and trigger data.
#
# Mandatory: no
# Range: 128K-64G
# Default:
# CacheSize=32M
CacheSize=1024M
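As a quick sanity check on the arithmetic, the sketch below converts size values such as 32M or 1024M to bytes and tests them against the documented 128K-64G range. The helper names `size_to_bytes` and `cache_size_in_range` are hypothetical, and the suffixes are assumed to be binary multiples (1K = 1024 bytes).

```python
# Assumption: K, M and G are binary multiples; a value without a suffix is in bytes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def size_to_bytes(value):
    """Convert a size string such as '32M' to a number of bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)

def cache_size_in_range(value, low="128K", high="64G"):
    """Check a CacheSize value against the range given in the config comment."""
    return size_to_bytes(low) <= size_to_bytes(value) <= size_to_bytes(high)

print(size_to_bytes("32M"))          # 33554432
print(size_to_bytes("1024M"))        # 1073741824
print(cache_size_in_range("1024M"))  # True
```

The failed allocation in the log asked for only 366104 bytes, which shows that the 32M default was already almost completely used when the configuration syncer ran.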
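Tuning
Housekeeping
The housekeeper periodically deletes outdated information from the database. The settings below run it once every 24 hours and delete at most 100000 rows per housekeeper task in each cycle.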
### Option: HousekeepingFrequency
# How often Zabbix will perform housekeeping procedure (in hours).
# Housekeeping is removing outdated information from the database.
# To prevent Housekeeper from being overloaded, no more than 4 times HousekeepingFrequency
# hours of outdated information are deleted in one housekeeping cycle, for each item.
# To lower load on server startup housekeeping is postponed for 30 minutes after server start.
# With HousekeepingFrequency=0 the housekeeper can be only executed using the runtime control option.
# In this case the period of outdated information deleted in one housekeeping cycle is 4 times the
# period since the last housekeeping cycle, but not less than 4 hours and not greater than 4 days.
#
# Mandatory: no
# Range: 0-24
# Default:
# HousekeepingFrequency=1
HousekeepingFrequency=24
### Option: MaxHousekeeperDelete
# The table "housekeeper" contains "tasks" for housekeeping procedure in the format:
# [housekeeperid], [tablename], [field], [value].
# No more than 'MaxHousekeeperDelete' rows (corresponding to [tablename], [field], [value])
# will be deleted per one task in one housekeeping cycle.
# If set to 0 then no limit is used at all. In this case you must know what you are doing!
#
# Mandatory: no
# Range: 0-1000000
# Default:
# MaxHousekeeperDelete=5000
MaxHousekeeperDelete=100000
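The HousekeepingFrequency comment above also limits how much outdated data is removed per item in one cycle. A small Python sketch of that rule, written only from the wording of the comment (the function name `max_deleted_window_hours` is hypothetical and this is not Zabbix code):

```python
def max_deleted_window_hours(housekeeping_frequency, hours_since_last_cycle=None):
    """Upper bound, in hours, of outdated data removed per item in one cycle."""
    if not 0 <= housekeeping_frequency <= 24:
        raise ValueError("HousekeepingFrequency must be in the range 0-24")
    if housekeeping_frequency > 0:
        # Scheduled mode: at most 4 times the frequency, in hours.
        return 4 * housekeeping_frequency
    if hours_since_last_cycle is None:
        raise ValueError("hours_since_last_cycle is required when the frequency is 0")
    # Manual mode: 4x the time since the last cycle, clamped to [4 hours, 4 days].
    return min(max(4 * hours_since_last_cycle, 4), 4 * 24)

print(max_deleted_window_hours(1))       # 4
print(max_deleted_window_hours(24))      # 96
print(max_deleted_window_hours(0, 0.5))  # 4
print(max_deleted_window_hours(0, 48))   # 96
```

With HousekeepingFrequency=24, each cycle may therefore remove up to 96 hours of outdated data per item, compared with 4 hours at the default of 1.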
Zabbix server: Utilization of lld worker processes over 75%
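Low-level discovery (LLD) workers: the associated LLD manager is started automatically when the LLD worker processes start.
Low-level discovery (LLD) provides a way to automatically create items, triggers, and graphs for different entities on a computer. When a large number of hosts is added at once, the alert above may fire; it can be resolved by increasing the number of workers in line with the resources available on the machine running Zabbix.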
### Option: StartLLDProcessors
# Number of pre-forked instances of low level discovery processors.
#
# Mandatory: no
# Range: 1-100
# Default:
# StartLLDProcessors=2
StartLLDProcessors=6
Zabbix server: Utilization of history syncer processes over 75%
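- history poller: a process that handles calculated checks requiring a database connection
- history syncer: the process that writes history data to the database
When a large number of hosts is added at once, the volume of data being written can surge and process utilization can spike. Adjusting the following three parameters may help. The number of history syncer processes themselves is set by the StartDBSyncers parameter.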
### Option: StartHistoryPollers
# Number of pre-forked instances of history pollers.
# Only required for calculated checks.
# A database connection is required for each history poller instance.
#
# Mandatory: no
# Range: 0-1000
# Default:
# StartHistoryPollers=5
StartHistoryPollers=10
### Option: StartTimers
# Number of pre-forked instances of timers.
# Timers process maintenance periods.
# Only the first timer process handles host maintenance updates. Problem suppression updates are shared
# between all timers.
#
# Mandatory: no
# Range: 1-1000
# Default:
# StartTimers=1
StartTimers=2
### Option: StartEscalators
# Number of pre-forked instances of escalators.
#
# Mandatory: no
# Range: 1-100
# Default:
# StartEscalators=1
StartEscalators=2
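After several edits it helps to list which parameters now differ from their documented defaults. The sketch below relies on the layout used in the snippets above, where a commented `# Name=value` line holds the default and an uncommented `Name=value` line holds the active setting. The function name `diff_from_defaults` is hypothetical, and a real configuration file may contain lines this simple parser does not handle.

```python
import re

DEFAULT_RE = re.compile(r"^\s*#\s*([A-Za-z]+)=(\S+)\s*$")   # e.g. "# StartTimers=1"
ACTIVE_RE = re.compile(r"^\s*([A-Za-z]+)\s*=\s*(\S+)\s*$")  # e.g. "StartTimers=2"

def diff_from_defaults(conf_text):
    """Map each overridden parameter to a (default, active) pair of strings."""
    defaults, active = {}, {}
    for line in conf_text.splitlines():
        m = DEFAULT_RE.match(line)
        if m:
            defaults[m.group(1)] = m.group(2)
            continue
        m = ACTIVE_RE.match(line)
        if m:
            active[m.group(1)] = m.group(2)
    return {k: (defaults.get(k), v) for k, v in active.items() if defaults.get(k) != v}

sample = """\
# Default:
# StartHistoryPollers=5
StartHistoryPollers=10
# Default:
# StartTimers=1
StartTimers=2
# Default:
# StartEscalators=1
StartEscalators=2
"""
print(diff_from_defaults(sample))
# {'StartHistoryPollers': ('5', '10'), 'StartTimers': ('1', '2'), 'StartEscalators': ('1', '2')}
```

To check a real installation, the contents of /etc/zabbix/zabbix_server.conf could be read and passed in as `conf_text`. The server has to be restarted before changes to these parameters take effect.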