AMD Inception fixes cause QEMU/KVM memory leak

liquidrory

Aug 11, 2023
Hi, I have a Proxmox 7 installation that had been going strong until just this morning. From what I can gather, any running QEMU process causes **very** fast memory allocation on the host until, eventually, everything is OOM-killed. The PVE node is itself a virtual machine. For the purposes of this thread, "host" will refer to the hypervisor in which PVE is running, not PVE itself.

Here's what I know:

- On 1 August, the PVE node's packages were updated, and the host rebooted. No problems, and no memory leak.
- This morning, 11 August, there were no packages on PVE to be updated (so, nothing in that regard changed from 1 August to 11 August), and the host was rebooted.
- Upon starting up, *something* begins to allocate ~100-200MB/second of memory until the machine crashes.
- In the fleeting seconds of usability I had on each boot, I manually disabled every container and VM set to start when the PVE node boots.
- The leak stopped happening as soon as no QEMU/KVM guests were set to start with the PVE node.
- Starting any QEMU/KVM guest, any at all, causes the memory leak to occur. The leak continues until the guest is stopped.
- The amount of memory consumed is not dependent on the amount of memory assigned to the guest. For example, a guest assigned only 512MB of memory can still consume all 16GB of memory on the PVE node.
- The memory is not returned when the guest stops.
- Nothing in htop or ps reveals what is using all of this memory (see the sketch after this list).
- Rolling back to an earlier PVE kernel revision does not solve the problem.
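
In case it helps anyone reproducing this, here's roughly how I've been confirming that the memory disappears kernel-side rather than into any process (standard procps/slabtop tools, nothing Proxmox-specific; just a sketch of what I check):

Code:
# Overall memory plus kernel slab usage; SUnreclaim growing fast points at a kernel-side leak:
watch -n1 "grep -E 'MemFree|MemAvailable|Slab|SUnreclaim' /proc/meminfo"
# Largest kernel slab caches, sorted by cache size (run once; look for a fast grower):
slabtop -o -s c | head -n 15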

I'm ready to post logs, version numbers, etc. just let me know what you need. I did some cursory searching on both Google and this forum but it looks like I'm the first person to notice this problem.
 
I've determined that this is likely a problem with the recent emergency hotfixes for the AMD Inception vulnerability. Taking the host back to the kernel version immediately before those fixes allows everything to work again.

Additionally, the bug affects all nested virtualization, not just on Proxmox, though it only leaks memory on Proxmox. On bare Debian 12, a single CPU core is maxed out and the guest fails to start.
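
If you want to check whether a given kernel actually carries these fixes: kernels with the SRSO/Inception patches expose a new sysfs entry, and as far as I can tell the file simply doesn't exist on unpatched kernels:

Code:
# Only present on kernels carrying the SRSO/Inception patches:
cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
# Running kernel version, for comparison:
uname -r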
 
Hi,
can you try booting the older kernel and see if the problem goes away? EDIT: Sorry, I just saw that's already mentioned, but please share the exact kernel versions. Please also share the output of `pveversion -v` and the VM configuration `qm config <ID> --current`. I might have missed something, but I don't think there are any fixes for AMD Inception in the Proxmox kernel yet; I only see one for Zenbleed. Do you mean that one?
 
Hi, I have a Proxmox 7 installation that had been going strong until just this morning. From what I can gather, any running QEMU process causes **very** fast memory allocation on the host until, eventually, everything is OOM-killed.
So this time "host" still refers to the Proxmox VE VM? Or do you really mean a QEMU process on the host?
The PVE node is itself a virtual machine. For the purposes of this thread, "host" will refer to the hypervisor in which PVE is running, not PVE itself.
What hypervisor are you using to run the Proxmox VE VM? What kernel version does the hypervisor have?
 
The bug occurs with both Proxmox and libvirt on Debian 12 as the intermediate hypervisors. I was able to narrow it down to kernel 5.10.189 (and .190 released just a few hours later) and another person has confirmed it occurs on the 6.4 branch as well. The kernel version of the intermediate Proxmox hypervisor is 5.15.108-1-pve. I'm running it on QEMU/KVM, through libvirt.

I apologize for any confusion in my original post; the second sentence of the post does not follow the rule where "host" always refers to the outer host. In this case, it refers to the memory leak occurring on Proxmox as the intermediate hypervisor.

Here is the output of `pveversion -v` on the intermediate Proxmox hypervisor:

Code:
proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-4
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
 
The bug occurs with both Proxmox and libvirt on Debian 12 as the intermediate hypervisors. I was able to narrow it down to kernel 5.10.189 (and .190 released just a few hours later) and another person has confirmed it occurs on the 6.4 branch as well. The kernel version of the intermediate Proxmox hypervisor is 5.15.108-1-pve. I'm running it on QEMU/KVM, through libvirt.
So you applied the AMD Inception fixes on the host kernel and then the issue started happening?
Here is the output of `pveversion -v` on the intermediate Proxmox hypervisor:

Code:
proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve)
And with the previous kernel, i.e. 5.15.107-2-pve, it works?
 
So you applied the AMD Inception fixes on the host kernel and then the issue started happening?

And with the previous kernel, i.e. 5.15.107-2-pve, it works?

Applying the AMD Inception fixes on the outer host breaks virtualization inside any guests running on it, yes.

Earlier Proxmox kernels do not fix this problem, but earlier outer-host kernels do. For example, Proxmox works fine and is able to run nested guests if the host it is running on is on kernel 5.10.188.
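
As a possible workaround (untested on my side), the SRSO patchset also added a spec_rstack_overflow= kernel command-line option, so it should be possible to keep the new outer-host kernel but switch the mitigation off, at the cost of leaving that host unmitigated:

Code:
# In /etc/default/grub on the outer Debian host (leaves the host open to Inception!):
GRUB_CMDLINE_LINUX_DEFAULT="quiet spec_rstack_overflow=off"
# Then regenerate the GRUB config and reboot:
update-grub && reboot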
 
We cannot influence what kind of host kernel you are running, and if the issue is not present in older versions, it sounds like it lies with the host kernel. It would be best to report the issue to wherever you got the host kernel from.
 
That's the plan; I made this thread before I figured it out. It's also possible that someone on your team has better access to kernel developers than I do: I'm just some random, while you guys are maintainers of a rather important project that might frequently be used to perform nested virtualization.
 
That's the plan; I made this thread before I figured it out. It's also possible that someone on your team has better access to kernel developers than I do: I'm just some random, while you guys are maintainers of a rather important project that might frequently be used to perform nested virtualization.
No worries. Unfortunately, we only have limited time, so we need to focus on our own kernels. The AMD Inception fixes will most likely be backported via the Ubuntu tree, and we will be sure to debug the issue you reported before releasing the kernel if it affects us too.
 
@fiona @Stoiko Ivanov Just my five cents, but the SRSO kernel fix acts as a replacement for the microcode update on CPUs prior to Zen 3.
I didn't quite understand what "intermediate hypervisors" and "outer hosts" are; I guess this is nested virtualization.
But unless a recent Debian kernel with the SRSO fixes runs directly on the metal, this might be quite an edge case.

Meanwhile, it's known that there are some issues with the first SRSO patchset released in upstream kernels.
There will be new upstream releases with SRSO fixes and hopefully more awareness and public testing:
https://www.phoronix.com/review/amd-inception-benchmarks

@liquidrory For testing the impact of the SRSO fixes in Proxmox, we need to wait for a new PVE kernel first.
Nested virtualization is quite a complex QA case and needs the first VM layer tested before going deeper.
It would be very nice if you could install a new PVE kernel on bare metal once it is rebased on a fixed Ubuntu one :)
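
(If you do test layer by layer, a quick convenience one-liner to see the mitigation state on each layer is to dump all the sysfs vulnerability files at once:)

Code:
# Prints each vulnerability file with its mitigation status:
grep -r . /sys/devices/system/cpu/vulnerabilities/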
 
Any news? Is it safe to upgrade?
The Proxmox VE kernel does not yet have the Inception fixes applied; that will still take a bit. A kernel with the Downfall fixes (6.2.16-10) is currently available in the pvetest repository. In any case, you should also install the microcode updates for your CPU.
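
On Debian-based systems that is usually just the amd64-microcode package, assuming the non-free (bullseye) or non-free-firmware (bookworm) component is enabled in your APT sources; for example:

Code:
apt update && apt install amd64-microcode
# After a reboot, check which microcode revision was loaded:
journalctl -k | grep -i microcode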
 
@TheMrg For Zen 2 and below you need to wait for a new kernel, or swap in Zen 3 CPUs to apply the microcode from Debian.
The upcoming 6.5 kernel will have the latest patchset for the SRSO fixes, which will quite likely be backported to 5.15 LTS.
Upstream 5.15.126 has the initial fixes, but they first need to land in the Ubuntu kernel before a new PVE kernel can follow.
Security supply chain for Epyc <= Zen 2: upstream Linux kernel devs > Canonical kernel team > our PVE kernel heroes.
 