Proxmox and Zabbix supervision

lilp

Hi,
I just installed Zabbix and Grafana to monitor my Proxmox server with its VMs and CTs.
I think the information about the VMs and CTs is wrong.
On Proxmox my VMs are stable:
2024-08-19 15_17_47-proxmox - Proxmox Virtual Environment – Brave.png
But in Zabbix, they are in a problem state:

2024-08-19 15_18_43-Zabbix_ Problèmes – Brave.png

I use the Linux template for all my VMs and CTs (they run Debian and Ubuntu).
Is this normal?
Thanks
 
What is the value of the user macro used in the expression?

When you enable operational data in the Problems list, what is shown there?
 
I think the information about the VMs and CTs is wrong.
The first screenshot shows how busy the CPU is (1.26%).
Zabbix looks at the classic "System Load" instead. And Zabbix divides it "for reasons" by the number of CPUs - thus "Load average... (per CPU..."

The SysLoad is a drastically simplified single value to indicate the stress the system is under. On Linux it counts not only tasks that are running or waiting for a CPU, but also tasks in uninterruptible sleep, which usually means they are waiting on disk I/O. You can absolutely create a weird - and probably, but not always, problematic - situation with 1 percent CPU usage and a System Load of 300 or more ;-)
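To make the "per CPU" part concrete, here is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and the threshold of 1.5 is only a placeholder for whatever the user macro in your trigger is actually set to; this is not how Zabbix itself is implemented.

```python
import os


def per_cpu_load(load_avg: float, cpu_count: int) -> float:
    """Divide a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def exceeds_threshold(load_avg: float, cpu_count: int, threshold: float = 1.5) -> bool:
    """Return True if the per-CPU load is above the threshold.

    The default of 1.5 is an assumed placeholder, not a value taken
    from this thread; use the value of your own user macro.
    """
    return per_cpu_load(load_avg, cpu_count) > threshold


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    for label, value in (("1m", one), ("5m", five), ("15m", fifteen)):
        print(f"{label}: load={value:.2f} per_cpu={per_cpu_load(value, cpus):.2f} "
              f"problem={exceeds_threshold(value, cpus)}")
```

Running it inside the guest that raises the alert shows the same three averages the Problems list displays as operational data, so you can compare them with the CPU percentage shown by Proxmox.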

In my personal world high SysLoad most often comes from slow storage. One of my datastores (@home) uses rotating rust. During backup (which means reading the whole shebang) the System Load goes up to 35 - which in this case seems to be absolutely normal :-(

A better metric for the load of a Linux system is the "Pressure Stall Information" (PSI): https://www.kernel.org/doc/html/latest/accounting/psi.html - but the problematic "SysLoad" will nevertheless stay around for a while...
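If you want to look at PSI yourself, the kernel exposes it as small text files under /proc/pressure (cpu, io, memory). The following Python sketch parses that format; the helper name and the sample values are made up for illustration, and the files only exist if the kernel was built and booted with PSI enabled.

```python
from pathlib import Path

# Made-up sample in the format described by the kernel PSI documentation.
SAMPLE = (
    "some avg10=0.12 avg60=0.05 avg300=0.01 total=123456\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)


def parse_psi(text: str) -> dict:
    """Parse PSI text into {"some": {...}, "full": {...}} with float values."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Each field looks like "avg10=0.12"; split it into key and value.
        result[kind] = {key: float(value)
                        for key, value in (field.split("=", 1) for field in fields)}
    return result


if __name__ == "__main__":
    for resource in ("cpu", "io", "memory"):
        path = Path("/proc/pressure") / resource
        try:
            print(resource, parse_psi(path.read_text()))
        except OSError:
            # PSI is not available or not readable on this system.
            print(resource, "PSI not available")
```

The avg10, avg60 and avg300 values are percentages of time in which tasks were stalled on that resource over the last 10, 60 and 300 seconds, so a high io value during a backup would point at slow storage rather than at the CPU.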
 
Where can I find this information?
Above the Problems list there is a filter tab that you can use to show operational data in the list. The problem you showed is configured to show the 1min, 5min and 15min load averages as operational data. Or you can just look at the separate item graphs for the 1min, 5min and 15min values to see how they behave.

As already mentioned above, CPU utilization and system load are not necessarily directly comparable. Check whether the graphs created by the Linux template give a more detailed view of the CPU load and usage.
 
I followed this tutorial.
It seems to work for one of the two CTs I tested.
But for the other one, I still get alerts about Load Average (5m avg):
brave_Ayf7kbjjle.png
brave_ARBHJ7lyaG.png
 
