RAM Bloat Question

cshill

Member
May 8, 2024
Hi Proxmox Community,
I ran into a similar question when I was using some Windows 11 VMs and discovered that the ballooning option can cause a problem: it was maxing out the RAM shown on the Proxmox GUI screen for the VM, but not inside the Windows VM. If ballooning is enabled for Windows and you don't load the VirtIO drivers, RAM can max out in the Proxmox GUI even when the VM is doing nothing and its RAM is set to 24GB.

This question has to do with the overall RAM consumption of a node. I currently have a PowerEdge R420 with 48GB of RAM. It has 9 VMs, none of which are running, a ZFS pool of 3 disks, and continuous replication to other servers. The first node, my PowerEdge, is sitting at 25/47 GB consumption, i.e. 53% RAM usage, while not running any VMs. The documentation says that ZFS will consume RAM at a rate of a 2GB base plus 1GB per TB of storage. That would be 2 + 12 = 14GB of RAM consumption here, but I'm not sure if this is a generalized guideline or if it's hardcoded. htop shows the system consuming 4GB, with orange being cache. free -m shows 26,171 MB of used RAM, which is in line with what Proxmox is telling me, but buff/cache is only 4,473 MB.

ZFS pool RAM consumption: 14GB
Proxmox OS: 4GB
Cache: 4.4GB

I'm assuming the constant caching comes from replication and disk updates. That adds up to 22.4GB, so I'm still missing some RAM usage somewhere.
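If part of that missing RAM is the ZFS ARC, which as far as I understand free does not count under buff/cache, I should be able to check it with something like this (assuming a standard ZFS on Linux setup):

Code:
grep '^size' /proc/spl/kstat/zfs/arcstats

or with arc_summary for a more readable breakdown.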

Overall I'm trying to understand why this is consuming so much RAM. I intend to build some servers in the future and want to spec them appropriately, but it feels like I'm heading toward 200+GB of RAM just to service a couple of quick VMs.

If this is the norm, does anyone know at what rate I need to increase RAM per GB of replicated data?



 

I rebooted the server afterwards and it's back down to a reasonable amount of RAM consumed.
 
I ran into a similar question when I was using some Windows 11 VMs and discovered that the ballooning option can cause a problem: it was maxing out the RAM shown on the Proxmox GUI screen for the VM, but not inside the Windows VM. If ballooning is enabled for Windows and you don't load the VirtIO drivers, RAM can max out in the Proxmox GUI even when the VM is doing nothing and its RAM is set to 24GB.
Windows is lying to you about RAM usage. It counts RAM used for caching as free RAM and not as used RAM. See https://www.linuxatemyram.com for the difference between "free", "used" and "available" RAM. The article is about Linux, but the same applies to Windows.
The difference between having the guest agent enabled or disabled is that without it, PVE will show you the real amount of "used" RAM, based on the RAM used by the QEMU process virtualizing your VM. With the guest agent, PVE will show you what Windows is (falsely) reporting.

The first node, my PowerEdge, is sitting at 25/47 GB consumption, i.e. 53% RAM usage, while not running any VMs. The documentation says that ZFS will consume RAM at a rate of a 2GB base plus 1GB per TB of storage.
By default, ZFS used to cache with up to 50% of the host's RAM; the new default is up to 10% of the host's RAM.
What you are referring to is a rule of thumb for manually limiting the ARC size: https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
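For reference, that wiki page essentially boils down to setting the zfs_arc_max module parameter (the 8GiB value below is just an example, not a recommendation). At runtime:

Code:
echo "$[8 * 1024*1024*1024]" > /sys/module/zfs/parameters/zfs_arc_max

To make it persistent, put the following into /etc/modprobe.d/zfs.conf and run update-initramfs -u -k all (the initramfs rebuild matters when the root filesystem is on ZFS):

Code:
options zfs zfs_arc_max=8589934592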
 
Windows is lying to you about RAM usage. It counts RAM used for caching as free RAM and not as used RAM.
When I experimented with Windows 11, I gave it 24GB of RAM once and the system consumed it all according to the Proxmox summary for the VM. It sounds like it's more of a communication issue between Proxmox and the VM than Windows misreporting how much RAM is free or available. It seems that Proxmox by default will give the Windows machine the full 24GB of RAM, and if the Windows VM only uses 3GB, it will keep the other 21GB allocated regardless of whether it is actually being used for caching. As I said, installing the proper drivers has mitigated this issue.
With the guest agent, PVE will show you what Windows is (falsely) reporting.
Are you saying NOT to install the guest agent for Windows only, or for all VMs? And are you pinning this miscommunication on the guest agent rather than on the missing VirtIO drivers?
By default, ZFS used to cache with up to 50% of the host's RAM; the new default is up to 10% of the host's RAM.
What you are referring to is a rule of thumb for manually limiting the ARC size: https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
I looked at limiting the ARC, but if the new cap is 16GB of RAM usage, I'm not sure that will be the limiting factor. I'm preparing for a situation where a server may hold upwards of 160TB of data, and I want to spec it appropriately. Following the 2GB-plus-1GB-per-TB rule of thumb, that would be roughly 2 + 160 = 162GB for the ARC alone, so the 40-50GB of RAM I'm currently setting aside for ZFS doesn't follow that recommendation. But to be fair, that is an insane amount of RAM just for a file and volume manager. Maybe it will make more sense to go small instead of these massive server projects.
 
Are you saying NOT to install the guest agent for Windows only, or for all VMs? And are you pinning this miscommunication on the guest agent rather than on the missing VirtIO drivers?
You should still use the guest agent, otherwise you won't get consistent snapshot-mode backups. You just have to keep in mind that the host will need considerably more RAM to run that VM than what the web UI or the Windows Task Manager tells you. If you want to know how much RAM a VM is really using, run htop on the PVE host and look at the "RES" column of the kvm process that runs your Windows VM.
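For example, something along these lines should list the resident memory (RSS, in KiB) of every running VM, assuming the QEMU processes show up under the name kvm as they do on a standard PVE install:

Code:
ps -C kvm -o pid,rss,args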

When I experimented with Windows 11, I gave it 24GB of RAM once and the system consumed it all according to the Proxmox summary for the VM. It sounds like it's more of a communication issue between Proxmox and the VM than Windows misreporting how much RAM is free or available. It seems that Proxmox by default will give the Windows machine the full 24GB of RAM, and if the Windows VM only uses 3GB, it will keep the other 21GB allocated regardless of whether it is actually being used for caching. As I said, installing the proper drivers has mitigated this issue.
When not using PCI passthrough (with passthrough the full RAM is allocated at start because of DMA), PVE will dynamically add more RAM to the VM, so it starts low and grows as Windows needs more RAM. Releasing allocated RAM isn't that easy; PVE will try, but usage usually stays well above what it should be. For that there is ballooning, where the PVE host will reclaim RAM from the VM so the VM is forced to free up RAM to avoid crashing, which should drop caches first. But PVE won't care whether the guest OS needs that RAM or not. It will reclaim it anyway until "Min RAM" is reached, even if that means the guest OS has to swap out or kill important processes once there is no more cache left to drop.
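Those two knobs are the "Memory" and "Minimum memory" values of the VM; on the CLI that would be something like the following, with 100 as a placeholder VMID:

Code:
qm set 100 --memory 24576 --balloon 4096

This gives the VM up to 24GB but lets the host balloon it down to 4GB; setting --balloon 0 disables the ballooning device entirely.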
I looked at limiting the ARC, but if the new cap is 16GB of RAM usage, I'm not sure that will be the limiting factor. I'm preparing for a situation where a server may hold upwards of 160TB of data, and I want to spec it appropriately. Following the 2GB-plus-1GB-per-TB rule of thumb, that would be roughly 2 + 160 = 162GB for the ARC alone, so the 40-50GB of RAM I'm currently setting aside for ZFS doesn't follow that recommendation. But to be fair, that is an insane amount of RAM just for a file and volume manager. Maybe it will make more sense to go small instead of these massive server projects.
It also depends on the type of storage. With NVMe SSDs the ARC size isn't that important (some people even disable ARC data caching and only cache metadata when using very fast NVMes). With HDDs you usually want lots of RAM for the ARC for better performance. You could have a look at the output of arc_summary, in particular the hit rates. Those should be high, and if they aren't, you might want to increase your ARC size.
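For example (the exact labels differ a bit between arc_summary versions):

Code:
arc_summary | grep -i 'hit ratio'

If the hit ratio on an HDD pool is low, a larger ARC will usually help.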
 
It also depends on the type of storage. With NVMe SSDs the ARC size isn't that important (some people even disable ARC data caching and only cache metadata when using very fast NVMes). With HDDs you usually want lots of RAM for the ARC for better performance. You could have a look at the output of arc_summary, in particular the hit rates. Those should be high, and if they aren't, you might want to increase your ARC size.
I was just thinking about this, as the RAM bloat I'm experiencing has to do with the HDDs forcing cache usage because they can't write fast enough. The problem is that while a slowdown of the HDDs makes sense in theory, since I have VMs, replication, and backups running, the cache stays very high even after I shut down the VMs and give the machine plenty of time. I gave it plenty of time to finish replicating the data.
 
