PVE showing high memory usage but VM is not

oah433

Member
Apr 8, 2021
Hi
Simply put I have the following scenario:
I am running PVE 6.3.2 and have a VM with 16 GB of RAM that all of a sudden started consuming all of its RAM, which led to the machine hanging. So we increased the RAM to 32 GB, but nothing really changed. The problem is that the VM is not running any serious workloads. The images below tell it all.

This is the RAM usage as reported from the PVE-Admin panel:

1660122855670.png



The VM runs CentOS and is showing this:
1660122788694.png


So the VM internally is reporting 1.59 GB of RAM usage while the PVE panel is showing 17 GB of RAM usage. How can I find out what is hogging the RAM, and does anyone have an idea on how to start debugging it?


Thx.
 

Attachments

  • 1660122730131.png (79.9 KB)
Please look at the yellow bar in htop. There are a ton of threads about supposedly wrong RAM usage.
It's just the cache. The guest caches data, shows it as cached, and the host doesn't know about it. Drop the caches and check the usage afterwards.
 
Please look at the yellow bar in htop. There are a ton of threads about supposedly wrong RAM usage.
It's just the cache. The guest caches data, shows it as cached, and the host doesn't know about it. Drop the caches and check the usage afterwards.
I just dropped the caches using:

Code:
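# drop the page cache, dentries and inodes (echoing 3 clears all three)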
echo 3 > /proc/sys/vm/drop_caches

and this is what I got (both images below are after dropping the caches)

1660124772671.png

I can see a few gigabytes have been shaved off the RAM usage, but it is still a lot. Any ideas on where to go next? Update to PVE 7 or something similar?

1660124815799.png
 
That your VM is using a lot of RAM for "no reason" could be indicative of another problem with your VM setup, but it is not the root cause here.

By default, through the ballooning driver the VM will continually map new RAM whenever it needs it, and internally it will again "release" the memory when it is no longer in use. However, your host might not actually reclaim that memory until it needs it. By default, that threshold is at 80%.
So, your host should only start reclaiming memory once it hits less than 20% total free memory.
Constantly releasing and reassigning memory to the VM wouldn't make much sense anyway, when the host just doesn't need it at the moment.
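
If you want to check what the balloon is actually doing for a given VM, one way (a minimal sketch; <vmid> is a placeholder for the VM's ID, and it assumes the balloon driver is loaded in the guest) is to query the status on the host:

Code:
# shows the current balloon size; with the balloon driver active, the
# "ballooninfo" section also reports the guest's own free/total memory
qm status <vmid> --verbose

The same information is available as "info balloon" inside the QEMU monitor (qm monitor <vmid>).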

I'm solely going off the impressions I got from more recent posts in the forum here, but I don't think this reporting problem shows up in newer versions. The newer PVE 7+ versions might have changed to take the "true" usage data reported by the ballooning driver into account.
However, please take this with a grain of salt; to be sure, one would have to test it.

Still, updating has a number of other benefits and increases the security of your system. So you should definitely consider doing that, and if you do, feel free to report back whether the problem persists.
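
If you do go the upgrade route, Proxmox ships a checklist script for the 6-to-7 jump; a minimal sketch of the first step (the checker comes with the latest PVE 6.4 packages, so update within 6.x first, then run it before following the official upgrade guide):

Code:
# report potential problems before upgrading from PVE 6 to PVE 7
pve6to7 --full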
 
And as far as I understand it, the host won't reclaim RAM the guest isn't using when ballooning is enabled but the minimum RAM is not lower than the maximum RAM, right?
 
I just tested this a little and the "Minimum Memory" setting does not interfere with the memory reclaiming by the host when it is equal to the assigned RAM. When the host uses more than 80% memory and the VM does not use it, it will still be reclaimed, independent of the minimum.

From what I've seen, the minimum setting allows the host (or QEMU, perhaps) to dynamically change the amount of RAM the machine currently has available. This means that when the host is at capacity, it will slowly decrease the amount, approaching the minimum setting, and when the host is mostly free, it will increase the amount again up to the maximum.
The magical 80% threshold seems to apply here as well.
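
For reference, both values can also be set from the CLI; a minimal sketch (the VM ID 100 and the sizes are just examples):

Code:
# maximum memory 32 GiB, ballooning minimum/target 16 GiB (values in MiB)
qm set 100 --memory 32768 --balloon 16384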
 
Last edited:
Yes, but the problem is that in two years I have never seen the KVM process release its reserved RAM, not even with the node's RAM usage above 90%, no matter whether the guest is using that RAM or not, and with ballooning enabled. I'm referring to the "RES" column in htop, where the value for a KVM process only ever grows and never shrinks without stopping the VM.
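
One way to watch this over time (a sketch; on PVE the QEMU processes typically show up as "kvm", and ps reports RSS in KiB):

Code:
# print the resident set size of all running kvm processes every 10 seconds, largest first
watch -n 10 'ps -C kvm -o pid,rss,vsz,args --sort=-rss'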
 
Last edited:
That your VM is using a lot of RAM for "no reason" could be indicative of another problem with your VM setup, but it is not the root cause here.

By default, through the ballooning driver the VM will continually map new RAM whenever it needs it, and internally it will again "release" the memory when it is no longer in use. However, your host might not actually reclaim that memory until it needs it. By default, that threshold is at 80%.
So, your host should only start reclaiming memory once it hits less than 20% total free memory.
Constantly releasing and reassigning memory to the VM wouldn't make much sense anyway, when the host just doesn't need it at the moment.

I'm solely going off the impressions I got from more recent posts in the forum here, but I don't think this reporting problem shows up in newer versions. The newer PVE 7+ versions might have changed to take the "true" usage data reported by the ballooning driver into account.
However, please take this with a grain of salt; to be sure, one would have to test it.

Still, updating has a number of other benefits and increases the security of your system. So you should definitely consider doing that, and if you do, feel free to report back whether the problem persists.
I got your point. I will go with the update option and report back the progress.

Thx a ton.
 
  • Like
Reactions: datschlatscher
I gave it yet another quick spin while monitoring the memory usage, and here the memory usage does drop, both in the virtual machine and on the host, as reported for the Qemu process in the RES column in htop.

One thing which took me off-guard though, was that the calculation for "total" RAM also takes the Swap space into account. So if the RAM is at nearly 100%, but the overall usage is still less than 80% of RAM size + Swap size, then the Qemu process, and subsequently the VM, will not free any of its reserved memory, even though in the GUI it kind of looks like it should.
However, there still might be other things influencing this. So there might be another reason or unknown interaction for why VM processes do not release memory in your case.
 
Last edited:
One thing which took me off-guard though, was that the calculation for "total" RAM also takes the Swap space into account. So if the RAM is at nearly 100%, but the overall usage is still less than 80% of RAM size + Swap size, then the Qemu process, and subsequently the VM, will not free any of its reserved memory, even though in the GUI it kind of looks like it should.
However, there still might be other things influencing this. So there might be another reason or unknown interaction for why VM processes do not release memory in your case.
Ah, that's interesting and could be a problem. My node has 64 GB RAM + 64 GB swap. Let's say I'm at 59 of 64 GB RAM and 1 of 64 GB swap usage; for PVE that would be only 60 of 128 GB memory used, so PVE won't reclaim the unused RAM because, from its point of view, it is only at around 50% and not at 95%?
Because the RAM stealing by ballooning still starts when RAM utilization gets above 80%, and not only when RAM+swap utilization is over 80%.
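
To put that assumed calculation into a command (this just mirrors the interpretation described above, it is not confirmed behaviour):

Code:
# combined RAM+swap utilization in percent (sums the "total" and "used" columns of the Mem: and Swap: rows)
free -b | awk '/^Mem:|^Swap:/ {total+=$2; used+=$3} END {printf "%.1f%%\n", used/total*100}'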
 
Hello, I am also trying to understand my RAM usage. I assigned 16 GB to a VM without ballooning, and in the PVE panel the VM permanently shows 92% utilization, but htop and free -h inside the guest don't.
1736943101784.png
1736943057615.png
1736943024103.png
 
  • Like
Reactions: lethargos
Hi everyone,

I seem to be in the same boat as the ones above. I've disabled ballooning, and the memory usage inside my VM is low, but within PVE I'm at a constant 95-99%. From some basic experimenting: if I do a reboot, Proxmox still shows high memory usage; only if I do a shutdown and start does the memory usage reset, and then it slowly climbs back up.

My htop and free both show little memory usage as well. I can't tell if this is just a reporting glitch in Proxmox not being able to distinguish between used and released memory, or if there is something wrong with my VMs. I have 3 VMs, all clones of one another initially, but I added additional services to each, with the memory being 32 GB, 48 GB, and 32 GB. For both 32 GB VMs I seem to be having the issue where over 24 hours the memory usage slowly climbs until it maxes out; the 48 GB one doesn't have much on it, but I've never had issues with that one so far.

Any suggestions on how to tackle this?
 
Hi
Simply put I have the following scenario:
I am running PVE 6.3.2 and have a VM with 16 GB of RAM that all of a sudden started consuming all of its RAM, which led to the machine hanging. So we increased the RAM to 32 GB, but nothing really changed. The problem is that the VM is not running any serious workloads. The images below tell it all.

This is the RAM usage as reported from the PVE-Admin panel:

View attachment 39847



The VM runs CentOS and is showing this:
View attachment 39846


So the VM internally is reporting 1.59 GB of RAM usage while the PVE panel is showing 17 GB of RAM usage. How can I find out what is hogging the RAM, and does anyone have an idea on how to start debugging it?


Thx.
What's reported by "free -m"?
 
Are you using ZFS?
Yeah I am, mirrored!
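
In case the ZFS question was aimed at ARC usage on the host, a quick way to check how much RAM the ARC currently holds (a sketch, run on the PVE host):

Code:
# current ZFS ARC size in bytes (the ARC counts as used memory on the host, not inside the VMs)
awk '/^size / {print $3}' /proc/spl/kstat/zfs/arcstats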

What's reported by "free -m"?
On my PVE host and the 3 VMs. The weird part is that 102 and 103 are clones of 101; I set 101 up initially and then cloned it once everything was set up (changed the hostname, the IP, and the Docker containers running on it). The 95%+ usage sometimes rotates: sometimes it's all 3, sometimes it's just one of them. I have cAdvisor running for metrics in Grafana, and that is reporting normal usage the whole time.
1740983648137.png
  • ID 101
    • 1740983692769.png
    • 1740983770959.png
  • ID 102
    • 1740983702141.png
    • 1740983796152.png
  • ID 103
    • 1740983710139.png
    • 1740983821441.png
 
Yeah I am, mirrored!


On my PVE host and the 3 VMs. The weird part is that 102 and 103 are clones of 101; I set 101 up initially and then cloned it once everything was set up (changed the hostname, the IP, and the Docker containers running on it). The 95%+ usage sometimes rotates: sometimes it's all 3, sometimes it's just one of them. I have cAdvisor running for metrics in Grafana, and that is reporting normal usage the whole time.
View attachment 83161

Also, please post a "top" output showing the top 10 processes sorted by memory (Shift+M).
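
A non-interactive equivalent that is easy to paste here (a sketch):

Code:
# top 10 processes by resident memory, plus the header line
ps aux --sort=-rss | head -n 11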
 
Last edited:
Yeah I am, mirrored!


On my PVE host and the 3 VMs. The weird part is that 102 and 103 are clones of 101; I set 101 up initially and then cloned it once everything was set up (changed the hostname, the IP, and the Docker containers running on it). The 95%+ usage sometimes rotates: sometimes it's all 3, sometimes it's just one of them. I have cAdvisor running for metrics in Grafana, and that is reporting normal usage the whole time.
View attachment 83161

Looking at the numbers, they are fine. The OS uses RAM for the filesystem cache. The buffer/cache portions of memory will be freed up on demand when other applications need them.

In-OS memory monitoring is the meaningful metric; the hypervisor-level view is limited to active pages.

If the workloads are active (in production), you may even shrink the VM's RAM assignment (look at the "available" column and decide how close to 0 you are comfortable with).
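
As a concrete sketch of that (the 16 GiB value is only an example; pick it based on the "available" numbers you actually see inside the guest):

Code:
# inside the guest: check how much memory is really still available
free -h
# on the PVE host: lower the VM's assigned memory (in MiB); the change typically applies after a VM shutdown/start
qm set <vmid> --memory 16384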

An unrelated note, but this immediately catches my eye:

Install the QEMU agent!

https://pve.proxmox.com/wiki/Qemu-guest-agent
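
The short version of that wiki page (a sketch; <vmid> is a placeholder, and the guest commands assume a Debian/Ubuntu-based guest):

Code:
# inside the guest:
apt install qemu-guest-agent
systemctl enable --now qemu-guest-agent
# on the PVE host, enable the agent option for the VM (active after the next full stop/start of the VM):
qm set <vmid> --agent enabled=1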
 
Last edited:
  • Like
Reactions: Johannes S