Nested virtualisation, Proxmox nodes crash

Zlobniy Shurik

New Member
Dec 4, 2025
In my home lab I use nested virtualisation.

Host machine (Fedora 43, kernel 6.18.2, Virtual Machine Manager, KVM):
  • Ryzen 9900X / Asus Prime X670-P-CSM
  • 192 GB DDR5 ECC RAM
  • dedicated 500 GB Samsung 870 SATA SSD (ext4) - for Proxmox VE node system images (qcow2)
  • dedicated KINGSTON SKC3000S1024G NVMe (1 TB, ext4) - as shared storage for the Proxmox VE nodes (through virtiofs)

VMs:
  • Proxmox node N1 (10 vCPU / 64 GB) - a bunch of Linux VMs
  • Proxmox node N2 (10 vCPU / 64 GB) - Linux VMs and Windows VMs
  • Proxmox node N3 (2 vCPU / 8 GB) - quorum node only (no VMs at all)
  • Proxmox Backup Server
  • some Linux VMs

The problem is that my virtual machines running Proxmox VE nodes experience intermittent kernel crashes.
There's no pattern. They can work for weeks without problems, or they can freeze twice a day.
Interestingly, node N3, with no payload at all, can freeze too - definitely not as often as nodes N1 & N2, but it happens.

From Fedora's point of view there is no problem with the Proxmox VE VMs - I can still manage the hung nodes and see their pretty pink kernel crash screens :)
I can forcefully restart a VM and the node will run again.

The Proxmox Backup Server VM and the Linux VMs hosted directly on Fedora are rock solid.

It's definitely not a hardware problem: the same config ran on my old server (Ryzen 5900X/X470/128GB ECC DDR4) and had the same problem there.
Initially that config worked perfectly (Proxmox VE 6.xx - Proxmox VE 7.yy?). But at some point my nodes started to freeze.

Any suggestions on how to fix this freezing issue?
 
There have been serious issues reported with some kernel versions on Ryzen CPUs. I don't remember the specifics, but check that your BIOS is up to date, and maybe try another kernel inside Proxmox. You don't specify the versions there, so maybe try more recent kernels...
 
Hi,
there is no 6.18.2 kernel released in the Proxmox VE repositories. The latest "official" kernel right now is 6.17.9-1 in the pve-test repository, so please try with kernels from Proxmox VE repositories first rather than third-party ones.
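
For reference, a sketch of how switching kernels looks on a node (this assumes the pve-test repository is enabled; the proxmox-kernel-6.17 meta-package name follows the usual naming scheme, so double-check it with the search first):
Code:
# on the Proxmox VE node, after enabling the pve-test repository
apt update
apt search proxmox-kernel | grep '^proxmox-kernel'   # list available kernel meta-packages
apt install proxmox-kernel-6.17                      # assumed meta-package name, verify above
reboot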
 
Hi,
there is no 6.18.2 kernel released in the Proxmox VE repositories. The latest "official" kernel right now is 6.17.9-1 in the pve-test repository, so please try with kernels from Proxmox VE repositories first rather than third-party ones.
He is running Fedora with kernel 6.18.2 as the host.
All the Proxmox nodes are virtualized, and he is not stating which kernels those are running.
 
I can't pinpoint which Proxmox kernel is causing the issue. I've experienced freezes with both 6.14.xx and 6.17.yy kernels.
Perhaps the last working Proxmox kernels were 6.11.zz or even 6.8.xx.
I realize my description isn't very detailed.
I was hoping my problem was a known one and someone could give me general advice, like "don't use virtioXXX (it's buggy on AMD)" or "try disabling YYY."
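
One way I could test the "last working kernel" theory is pinning an older kernel with proxmox-boot-tool (a sketch; the version below is a placeholder, the real string comes from the list output):
Code:
# on a Proxmox VE node: list installed kernels, then pin one as the default
proxmox-boot-tool kernel list
proxmox-boot-tool kernel pin 6.8.12-1-pve   # placeholder version, pick one from the list
reboot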
 
He is running Fedora with kernel 6.18.2 as the host.
All the Proxmox nodes are virtualized, and he is not stating which kernels those are running.
Ah, sorry. Yes, I didn't read properly and assumed the host would be Proxmox, because that is by far the most common scenario.

@Zlobniy Shurik any hints in the host or Proxmox node system logs/journal around the time the issue happens?
 
From the host's perspective, everything is fine. No suspicious events in the logs. All virtual machines are running (even the frozen Proxmox node).

I can see the pink screen of the frozen Proxmox VM in Virtual Machine Manager, and I can forcefully reboot or shut down the Proxmox VM. IMHO, the problem is in the Proxmox VM itself.

I'm not sure about the Proxmox node... I tried to find anything suspicious in the Proxmox logs, but to no avail. I think I'm just looking in the wrong places.

Which Proxmox node logs should I check next time?
 
What you can try is connecting via SSH from the host and letting journalctl -f run until a freeze happens.
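
A minimal sketch of that (the pve-n1 hostname and the log file name are placeholders):
Code:
# stream the node's journal from the Fedora host and keep a local copy,
# so the last messages before a freeze survive outside the frozen VM
ssh root@pve-n1 'journalctl -f' | tee pve-n1-journal.log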

What QEMU CPU type do you currently have for the VMs? host? Maybe you could try a different QEMU CPU type for the virtual machines (a model similar to the physical CPU and with the +svm CPU flag for virtualization).
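
Since the VMs are managed via Virtual Machine Manager/libvirt, that change could look like this (the VM name pve-n1 and the EPYC-Milan model are assumptions; pick whichever named model is closest to your physical CPU):
Code:
# switch the VM from 'host' passthrough to a named CPU model,
# explicitly requiring AMD-V (svm) so nested KVM keeps working
virt-xml pve-n1 --edit --cpu EPYC-Milan,+svm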

Do you have latest BIOS updates/CPU microcode installed on the host?
 
The VMs with Proxmox use the `host` vCPU type, and the VM-in-VM guests use `host` too.

BIOS is from October 2025 (there's a newer version from December; it seems I missed it).

CPU microcode:
amd-ucode-firmware.noarch 20260110-1.fc43

I'll try the trick with 'journalctl -f' through SSH. I hope I'll find something useful in the logs after the next crash :)
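
While I'm waiting for the next crash, I'll also run a few host-side sanity checks (these are the standard kvm_amd sysfs paths on the Fedora host):
Code:
# confirm nested virtualization is enabled in kvm_amd
cat /sys/module/kvm_amd/parameters/nested   # expect 1 or Y
# check whether AVIC is enabled (its behaviour with nested guests has varied across kernels)
cat /sys/module/kvm_amd/parameters/avic
# running microcode revision as seen by the kernel
grep -m1 microcode /proc/cpuinfo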
 
Are those PVE nodes running under Libvirt? If so, try running those PVE nodes under PVE tooling using this container:
Code:
docker run --detach -it --name pve-1 --hostname pve-1 \
    -p 2222:22 -p 3128:3128 -p 8006:8006 \
    --restart unless-stopped  \
    --cgroupns=private --cap-add ALL \
    --security-opt seccomp=unconfined \
    --security-opt apparmor=unconfined \
    --security-opt systempaths=unconfined \
    --device-cgroup-rule "a *:* rwm" \
    -v /dev/vfio:/dev/vfio \
    -v /usr/lib/modules:/usr/lib/modules:ro \
    -v /sys/kernel/security:/sys/kernel/security \
    -v ./VM-Backup:/var/lib/vz/dump \
    -v ./ISOs:/var/lib/vz/template/iso \
    --env PASSWORD=123 \
    ghcr.io/longqt-sea/proxmox-ve

More details: https://github.com/LongQT-sea/containerized-proxmox
 
Are those PVE nodes running under Libvirt? If so, try running those PVE nodes under PVE tooling using this container:

More details: https://github.com/LongQT-sea/containerized-proxmox
An interesting project, but...
My main goal is to test configuration changes on a virtualized copy of the Proxmox cluster from my work.
Therefore, I need the same Proxmox kernels, the same network topology, the same configuration files, etc.
Hosting my home virtual machines on Proxmox is a secondary goal.
That's why I prefer full virtual machines, where I have full access to their internal components.
 