High VM Memory Usage On Newer PVE Versions?

phil2987

I manage a decent-sized Proxmox private cloud for a client, consisting of two sites:

Site A (Prod):
PVE 7.4-3
250 VMs (Windows 10, 6GB RAM allocated)
3x Dell R630 w/768GB RAM each

Site B (DR):
PVE 8.3.4 (latest as of Mar 2025)
0 VMs
3x Dell R630 w/768GB RAM each

Recently, we migrated all VMs from Site A to Site B for a DR exercise using Proxmox Backup Server, and it worked very well (aside from some slowness with PBS, but that's another thread). The issue is:

At Site A, Host #1 has around 110 VMs on it and is using around 369GB RAM. Looking at the VM status, I see that most of the VMs are using less than half of the allocated 6GB RAM, so I assume that the VirtIO balloon driver is releasing the RAM back to PVE.

After the VMs from Site A Host 1 were copied to Site B Host 1, the memory usage doubled to over 680GB used at Site B, and then the swap filled up. Looking at the status of the individual VMs, I again see very low RAM usage from the VMs themselves, but significantly increased host memory usage.

Why would the same number of VMs on the same hardware use double the RAM on PVE 8.3.4 vs. the older 7.4-3? The only thing I can think of is that I might need to update the VirtIO drivers in each VM... does this make sense?
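
For reference, this is one way to spot-check the balloon state from the CLI (a rough sketch; VMID 101 is just an example):

root@vhost1:~# qm config 101 | grep -E 'memory|balloon'   # confirm ballooning isn't disabled (balloon: 0)
root@vhost1:~# qm monitor 101
qm> info balloon                                          # balloon size as reported by the guest driver
qm> quit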
 
I again see very low RAM usage from the VMs themselves, but significantly increased host memory usage.
What underlying storage are you using?

If it's ZFS, you might need to limit the ZFS ARC memory.
That's a common thing; there are loads of threads about this topic.
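
If it is local ZFS, a minimal sketch of capping the ARC (the 8GiB value is only an example, size it for your host):

root@vhost1:~# echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf   # 8 GiB cap at module load
root@vhost1:~# update-initramfs -u -k all                                             # so the limit survives reboots
root@vhost1:~# echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max               # apply immediately, no reboot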
 
The storage is on a separate dedicated server running TrueNAS. The Proxmox hosts have no local storage (other than the boot drives).
 
From one of the hosts:

root@vhost1:~# cat /sys/module/zfs/parameters/zfs_arc_max
0
root@vhost1:~# cat /sys/module/zfs/parameters/zfs_arc_min
0
 
What is the state of ksm on both? Is ksmtuned configured the same on both?
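
Something like this on both hosts would show whether KSM is even active and how much it is actually deduplicating (just the stock kernel's KSM counters in sysfs):

root@vhost1:~# systemctl status ksmtuned                # is the tuning daemon running?
root@vhost1:~# cat /sys/kernel/mm/ksm/run               # 1 = KSM scanning is enabled
root@vhost1:~# cat /sys/kernel/mm/ksm/pages_sharing     # shared pages; multiply by page size (usually 4K) for saved RAM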

EDIT: This is on the "breaking changes" list from the Proxmox 7 -> 8 upgrade notes:
  • Kernels based on 6.2 have a degraded Kernel Samepage Merging (KSM) performance on multi-socket NUMA systems.
    • Depending on the workload this can result in a significant amount of memory that is not deduplicated anymore.
    • This issue went unnoticed for a few kernel releases, making a clean backport of the fixes made for 6.5 hard to do without some general fall-out.
    Until we either find a targeted fix for our kernel, or change the default kernel to a 6.5 based kernel (planned for 2023'Q4), the current recommendation is to keep your multi-socket NUMA systems that rely on KSM on Proxmox VE 7 with its 5.15 based kernel.
I don't know whatever happened with this (I don't personally use the Proxmox kernel, partially for reasons just like this - my PVE 8.3.4 kernel is Debian's stock 6.1), but it does read like what you're seeing.
 
Crap....ok...thank you so much for that information -- it does sound exactly like what's happening.

I think I may have to add a new host to the cluster based on Bookworm and play Musical Chairs with the VMs, eventually re-installing the entire cluster using Bookworm.

Thanks again; this is definitely the issue.
 
I think I may have to add a new host to the cluster based on Bookworm and play Musical Chairs with the VMs, eventually re-installing the entire cluster using Bookworm.
I'm aware of that documentation. I ran into it myself.

Check this thread where I first questioned and then worked around this limitation. It's actually not too hard to convince PVE to stop demanding its own kernel. It will be slightly more complicated for you since I suspect you will need ZFS, but that is available in Debian's contrib repository.
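
Roughly what it boiled down to on my node -- treat the package names as a sketch from memory and try it on a non-production box first:

root@vhost1:~# apt install linux-image-amd64 linux-headers-amd64   # Debian's stock kernel metapackages
root@vhost1:~# apt install zfs-dkms zfsutils-linux                 # ZFS from Debian contrib, only if you actually need it
# then make sure the Debian kernel is the GRUB default before touching the proxmox-kernel packages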
 
EDIT: This is on the "breaking changes" list from the Proxmox 7 -> 8 upgrade notes:
I find it unlikely this is relevant to the OP's issue. His site A is running on 7.4-3 smoothly (I don't know which kernel), his site B is running 8.3.4, so kernel 6.8.x, and as the linked notes state, this KSM issue is not relevant as of kernel 6.5.
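
For what it's worth, a quick check on a node at each site would remove the guesswork about which kernels are actually in play:

root@vhost1:~# uname -r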
 
Is there any documentation that the issue was fixed in 6.5?

I thought about it, and about installing it on top of Debian, but using the non-PVE kernel is a little bit "hacky". I can't risk instability with this many VMs and an SLA in place. I think I will need to simply add another host so I have more resources, and I can't go back to 7.4.
 
Is there any documentation that the issue was fixed in 6.5?
The already linked release note clearly states:
  • This issue went unnoticed for a few kernel releases, making a clean backport of the fixes made for 6.5 hard to do without some general fall-out.
so they did not backport it to the previous kernel (6.2) - but it was inherently fixed (by Linux) in the 6.5 kernel. A line later, you will see the same thing:
or change the default kernel to a 6.5 based kernel (planned for 2023'Q4),
So from kernel 6.5.x it is fixed (inherently); we are now on 6.8.x.

I didn't search extensively on general Linux kernel changes, but found this note in the Red Hat docs, chapter 7, KSM:
Note

Starting in Red Hat Enterprise Linux 6.5, KSM is NUMA aware. This allows it to take NUMA locality into account while coalescing pages, thus preventing performance drops related to pages being moved to a remote node. Red Hat recommends avoiding cross-node memory merging when KSM is in use. If KSM is in use, change the /sys/kernel/mm/ksm/merge_across_nodes tunable to 0 to avoid merging pages across NUMA nodes. Kernel memory accounting statistics can eventually contradict each other after large amounts of cross-node merging. As such, numad can become confused after the KSM daemon merges large amounts of memory. If your system has a large amount of free memory, you may achieve higher performance by turning off and disabling the KSM daemon. Refer to the Red Hat Enterprise Linux Performance Tuning Guide for more information on NUMA.

I imagine this is the fix, alluded to in the above Proxmox docs.
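
If anyone wants to try that tunable on a multi-socket PVE node, the knob is in sysfs; note the kernel only lets you change it while no pages are merged (a sketch, not something I have tested on 8.x myself):

root@vhost1:~# echo 2 > /sys/kernel/mm/ksm/run                  # stop KSM and unmerge all shared pages first
root@vhost1:~# echo 0 > /sys/kernel/mm/ksm/merge_across_nodes   # only merge pages within the same NUMA node
root@vhost1:~# echo 1 > /sys/kernel/mm/ksm/run                  # start KSM again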
 
I appreciate everyone's input. Ultimately, here is the scenario: I have an SLA with a client, and in the old environment, 3 hosts were enough to sustain a host failure -- there was enough RAM with 3x768GB to keep all VMs up if a host failed. In the new 8.3.4 environment, because memory usage is now almost double, I am in trouble if a host fails. For this reason, I am building another host with 768GB RAM, to be delivered to the datacenter tomorrow, which will allow me to sustain a host failure. Once I connect this host, I will run some tests with KSM and report back.

VA1DER, I appreciate the knowledge share, but the fact that you are basing it on a single host with a few VMs makes it inapplicable to this scenario. It could very well be that I am the only person in the world running 250 production VMs on Proxmox, but I find it hard to believe. The facts are:

On PVE 7.4-3, 110 VMs (Win 10, 6GB allocated) on a single host with 768GB RAM used around 380GB RAM.

On PVE 8.3.4, with the same specs, THE SAME 110 VMs on a single host with 768GB RAM use almost 700GB RAM, and the host starts swapping.

There is no local ZFS -- all VMs are stored on TrueNAS via NFS over 10GbE, backed by NVMe drives.

I have to add a new host to the 8.3.4 cluster in order to be able to sustain a possible host failure. It is unfortunate, but it is what it is. Once the host is in place I will report back, but as it stands right now, I can't risk playing around with random settings on hosts because if something breaks, I am screwed. Also, installing PVE on top of Debian and keeping the old kernel is risky.

I've been doing this for a LONG time and I am still stupid -- I always assume that the newer version of something is better than the old. I should have just stayed with 7.4-x on the new cluster as well, but it's too late. I will add another host to the cluster tomorrow.
 
I don't see any reference to it being fixed -- just people arguing about whether it has been fixed or not. I am attaching a screenshot of a direct comparison between the same 94 VMs currently running in my environment on one of my hosts. The old host is 7.4-3 and the new host is 8.3.4. With the same number of identical VMs running, the memory usage in the new environment is more than 2x.

Simply put, there is NO WAY that this has been fixed. The VMs on the old host initially used 500+GB when first booted, then "settled in" overnight. I powered them on last night specifically to take the screenshot. The VMs in the new environment have been up for about a week. Notice the KSM and SWAP values on both.

I am contemplating scheduling downtime and rebuilding the new cluster with 7.4-x.

[Screenshot: old_vs_new_mem_usage.png]
 
KSM only merges after RAM is more than 80% used.

The 1st host filled more than 80% of its RAM during first boot, so KSM merged a lot of RAM.

The other host didn't fully fill its RAM during first boot, so KSM only merged a small amount.

It's expected for PVE (and OSes in general) to use all available RAM.
If you need more free RAM, the KSM 80% threshold can be adjusted.
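
For example, the 80% figure comes from the KSM_THRES_COEF setting in /etc/ksmtuned.conf (default 20, i.e. start merging when less than 20% of RAM is free). Uncomment and adjust it, e.g.:

KSM_THRES_COEF=30          # start merging once free RAM drops below 30% instead of 20%

then restart the daemon:

root@vhost1:~# systemctl restart ksmtuned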
 
Strange how it is fine on all 3 hosts in the old environment, but OK, I will migrate 20 more VMs to the new host and report back.
 
I migrated 20 more VMs to this host... I really hope KSM kicks in soon. FYI, I never saw this high memory usage or swap usage in the 7.4 environment.

[Screenshot: new_mem_usage.png]
 
The KSM problem was with the 6.2 kernel on multi-socket systems; it was fixed in late 2023 with kernel 6.5 => topic

Here with PVE 8.1 and the 6.8 kernel; will update this weekend.
Single-socket node, running a single Windows VM, after 4h uptime:
[Screenshot: 1742664723879.png]
 
So, after two hours here's what I noticed:

1. /etc/ksmtuned.conf is the same on both hosts -- everything is commented out. No changes from default.
2. KSM kicked in after RAM was 80% full and slowly started lowering the amount of used RAM. HOWEVER, it seems that once the RAM usage dropped to 80%, KSM completely stopped. Also, swap usage remains high. This is different behavior than with 7.4. After 2 hours, I see the below. KSM has stopped trying to lower the memory below 80%.

[Screenshot: new_mem_usage2.png]
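
For reference, ksmtuned's own debug log should show why it parks at the 80% line (these two settings are already in the stock ksmtuned.conf, just commented out):

# in /etc/ksmtuned.conf, uncomment:
LOGFILE=/var/log/ksmtuned
DEBUG=1
root@vhost1:~# systemctl restart ksmtuned
root@vhost1:~# tail -f /var/log/ksmtuned     # logs each decision ksmtuned makes about starting/stopping KSM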
 
Below is a direct comparison between what is currently running in the old environment versus the new. This is 114 VMs (Windows 10, 6GB allocated), taken a few minutes ago:

[Screenshot: old_vs_new_mem_usage2.png]