PVE GUI Reporting higher CPU usage than VM is reporting

Tmanok

Hi Everyone,

Running a standard Debian VM with a web server and qemu-guest-agent, I'm wondering why PVE reports ~6-8% CPU usage when the VM is idle, while the VM itself reports 0.1-1.8%. Here are two screenshots showing the PVE GUI alongside top and htop inside the VM at the same time to better illustrate this:
[Attachment: PVE vs Top.png]

[Attachment: PVE vs htop.png]

I've left the system alone for a long period of time and redirected all traffic away from it. I've also rebooted and checked other processes. What first caught my eye was a failing service: with it running, PVE was showing 15% usage (which alarmed me, because this VM was supposed to be idle) while the VM itself was showing around 4-8%. So there is quite a disparity here, and it has persisted even after I removed all the CPU load that is visible to me.

The RAM usage also seems off (even when counting the buffers and caches highlighted by htop). free -mh reports 110MiB used, 10MiB shared, and 197MiB in buffers/caches, which is in line with htop. That accounts for ~317MiB of memory usage, so why is PVE showing 1,187.84MiB (1.16GiB) in use?
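For anyone wanting to redo the arithmetic above, here is a minimal shell sketch. The helper names sum_mib and mib_to_gib are made up for illustration, and the numbers are simply the ones quoted in this post; it only checks the sums and the unit conversion, it does not explain the gap.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PVE or procps); they only redo the
# arithmetic quoted in the post.

# Add up any number of MiB values given as arguments.
sum_mib() {
    awk 'BEGIN { t = 0; for (i = 1; i < ARGC; i++) t += ARGV[i]; printf "%g\n", t }' "$@"
}

# Convert a MiB value to GiB, rounded to two decimals.
mib_to_gib() {
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m / 1024 }'
}

# Guest view: used + shared + buffers/caches, as reported by free -mh.
guest=$(sum_mib 110 10 197)
echo "Guest total: ${guest} MiB ($(mib_to_gib "$guest") GiB)"

# Host view: the figure shown in the PVE GUI.
echo "PVE GUI:     1187.84 MiB ($(mib_to_gib 1187.84) GiB)"
```

Swapping in fresh values from free -m and the GUI makes it easy to track whether the gap changes after a reboot or a full stop/start.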

Thanks everyone. The cluster version is 7.0-10, and although I have not noticed many other issues like this, it is troubling to think that there could be some incorrect reporting on my systems.
 
Quick update:
Earlier I was issuing a "reboot" command from within the VM. This time I issued a "shutdown now", and when it came back up (thanks, HA), the memory reporting was accurate.

However, PVE is still reporting 5-8% CPU usage, while internally the Debian guest begs to differ at 0.3%.
Thanks,
 
What does top on the Proxmox host show for that VM?

top -p $(pgrep -f "kvm -id (VMID)")

In my experience, you will always see that extra bit of CPU usage related to KVM overhead due to interrupt handling and process scheduling. The overhead will be slightly bigger the more sockets/cores are set up for the VM, but I've never seen more than 8% in normal usage.

Anyway, that overhead won't take performance away from your VM, only from the host itself. Try running a 1 socket/2 core VM, install stress-ng, and run it inside the VM with just 1 thread:

stress-ng -c 1

You will see 100% user usage inside the VM, but in the Proxmox host you will get something like 102% to 108% for that kvm process, meaning that the overhead is handled by another core/cpu of the host, allowing the processes inside the VM to fully get all configured CPUs.
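To put a number on that, here is a small sketch under stated assumptions: the 60-second timeout is an arbitrary choice, and kvm_overhead is a made-up helper that just subtracts the guest-side figure from the host-side one. The stress-ng line is left commented out so the snippet runs on a machine without stress-ng installed.

```shell
#!/bin/sh
# Inside the VM: one CPU worker, stopping on its own after 60 seconds
# (the timeout value is an arbitrary choice for this sketch).
# stress-ng --cpu 1 --timeout 60s

# Hypothetical helper: host-side %CPU of the kvm process minus the
# %CPU the guest reports, i.e. the share attributed to KVM overhead.
kvm_overhead() {
    awk -v host="$1" -v guest="$2" 'BEGIN { printf "%.1f\n", host - guest }'
}

# The range quoted above: 102% to 108% on the host vs 100% in the guest.
echo "Low end:  $(kvm_overhead 102 100) percentage points"
echo "High end: $(kvm_overhead 108 100) percentage points"
```

Reading the host-side value from top while the guest-side value comes from top inside the VM at the same moment keeps the comparison fair.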
 
Wise words! These thoughts had slipped my mind.

Unfortunately, watching the resource consumption of the corresponding PID for the VM shows ~13.8-15% CPU usage (or roughly the original 5-8% if you divide by 2 for the number of cores allocated to the VM).
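The division by two can be written out as follows. This is only a sketch: per_vcpu_pct is a made-up helper, and it assumes top on the host is in its default mode, where a process's %CPU is relative to a single core, and that the GUI figure is scaled to the VM's allocated cores. The second assumption is what the comparison above implies, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: scale a per-process %CPU value (relative to one
# core, as top shows it by default) to a share of the VM's vCPUs.
per_vcpu_pct() {
    awk -v pct="$1" -v vcpus="$2" 'BEGIN {
        if (vcpus <= 0) exit 1          # refuse to divide by zero
        printf "%.1f\n", pct / vcpus
    }'
}

# Host-side readings from this post, for a VM with 2 cores allocated.
echo "13.8% on host -> $(per_vcpu_pct 13.8 2)% of the VM's cores"
echo "15.0% on host -> $(per_vcpu_pct 15 2)% of the VM's cores"
```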

Additionally, I'm still about 200MiB of memory over: free -mh shows ~300MiB consumption inside the VM, while the PVE GUI displays 573MiB. I'll run a stress test to see whether the memory and CPU are still inaccurate.
Cheers,


Tmanok
 
