Proxmox 8 Breaks Nested Virtualization


I have ESXi nested in Proxmox, and everything was working fine on 7. After I upgraded to 8, guest VMs will not start. I rebuilt a node and made sure nested virtualization was enabled, but I get the same result. Does anyone else have this issue?
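
For what it's worth, the way I usually check that nesting is enabled on the host is the KVM module parameter (which module applies depends on whether the host CPU is Intel or AMD):

Code:
# Intel host: should print Y (or 1)
cat /sys/module/kvm_intel/parameters/nested
# AMD host: should print Y (or 1)
cat /sys/module/kvm_amd/parameters/nested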
 
Does anything show up when you run journalctl -f and then try to start the VM?
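
One way to do that, with <VMID> being the ID of the affected guest:

Code:
# terminal 1: follow the system journal
journalctl -f
# terminal 2, while the journal is streaming: start the VM
qm start <VMID>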
 
Do you get any output when starting the VM from the terminal with qm start <VMID>?

Also, what output do you get when you double-click the task (shown at the bottom of the web GUI)?

What command does qm showcmd <VMID> --pretty produce?
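
The task log from the GUI is also stored as a plain file on the node, so it can be grabbed from the shell as well (a rough sketch; the path layout and the qmstart naming follow the usual UPID scheme and may differ between versions):

Code:
# VM start task logs from the last 30 minutes
find /var/log/pve/tasks -type f -name '*qmstart*' -mmin -30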
 
The ESXi VM itself starts fine, no problems. It's the ESXi guest VMs that won't start. They seem to start successfully for a second and are then killed. There is no log in vSphere as far as I can tell, but I may be looking in the wrong spot (see the log note after the command output below).

Here is the qm showcmd output:

root@pve6:~# qm showcmd 104 --pretty
/usr/bin/kvm \
-id 104 \
-name 'ESXi-New1,debug-threads=on' \
-no-shutdown \
-chardev 'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server=on,wait=off' \
-mon 'chardev=qmp,mode=control' \
-chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
-mon 'chardev=qmp-event,mode=control' \
-pidfile /var/run/qemu-server/104.pid \
-daemonize \
-smbios 'type=1,uuid=d5410033-a260-4c2b-8df8-3daf605e8d56' \
-smp '12,sockets=1,cores=12,maxcpus=12' \
-nodefaults \
-boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
-vnc 'unix:/var/run/qemu-server/104.vnc,password=on' \
-cpu host,+kvm_pv_eoi,+kvm_pv_unhalt \
-m 65536 \
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
-device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
-device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \
-device 'vmgenid,guid=8a45fa5f-9e98-4ae4-9f42-cc5096e1ef4b' \
-device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
-device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
-device 'VGA,id=vga,bus=pci.0,addr=0x2' \
-device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
-iscsi 'initiator-name=iqn.1993-08.org.debian:01:249996e83273' \
-drive 'file=/dev/pve/vm-104-disk-0,if=none,id=drive-ide0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
-device 'ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100' \
-netdev 'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' \
-device 'vmxnet3,mac=9E:D4:CB:73:6F:0D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=101' \
-machine 'type=pc+pve0'
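
On the vSphere log question above: ESXi keeps a per-VM vmware.log in the VM's directory on the datastore, which is usually the best place to look when a nested guest dies right after power-on. A quick way to check (the datastore and VM directory names below are placeholders):

Code:
# on the ESXi host (with SSH enabled): tail the nested guest's log
tail -n 50 /vmfs/volumes/datastore1/GuestVM/vmware.log
# the host-side kernel log can also show why a VM was killed
tail -n 50 /var/log/vmkernel.log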
 
Hello, which CPU type is used by the VM that spawns the nested VMs? You can check this via

Code:
cat /etc/pve/qemu-server/GUEST_ID.conf  | grep cpu

It should read cpu: host (among other things); you can edit the file manually or via the UI.
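
If it is not set to host yet, the CPU type can also be changed from the CLI instead of editing the file (GUEST_ID being the numeric VM ID, as above):

Code:
qm set GUEST_ID --cpu host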
 
The CPU type was set to host. I upgraded and it didn't work, then reinstalled Proxmox 8 and set the CPU to host (among other things) just like before, and still no guest VM on the nested ESXi would start. I went back to 7 and everything works great now.
 
This problem may be related to a bogus check for nested FLUSHBYASID [1]. Can you try reverting to kernel 5.15 and see if that allows you to boot the VM?

Code:
proxmox-boot-tool kernel pin 5.15.108-2-pve --next-boot

The --next-boot option pins that kernel only for the next boot; omit it to switch permanently to the 5.15.108-2-pve kernel.

[1]: https://lore.kernel.org/all/ZPjjy94x2BDIitOo@google.com/
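
For reference, if you pin permanently and want to go back later, the pin can be removed again; listing the registered kernels first also confirms that 5.15.108-2-pve is still installed (worth double-checking these subcommands on your proxmox-boot-tool version):

Code:
# kernels currently registered with proxmox-boot-tool
proxmox-boot-tool kernel list
# remove a permanent pin again
proxmox-boot-tool kernel unpin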
 
Is there a fix, or maybe a kernel update/downgrade, that I can implement?
Downgrading to kernel 5.15 should work, but 5.15 is not officially supported with PVE 8. You can still use PVE 7, as we still support 5.15 there. As for a fix for kernel 6.2, we are working on it, but we can't make any promises as to when it may land.
 
I'm having the same issue after upgrading from Proxmox 7.4 to 8, with nested virtualization using VMware Workstation 16 in an Ubuntu virtual machine on an AMD EPYC 7543P, and the VM set to 'host' for the CPU type. Setting the kernel back to the previous version fixes it for now.
 
Would this fix allow nested ESXi 8.0 virtualization? (ESXi 7.0 works fine with both Proxmox 7 and 8.)
According to VMware, there might be an issue with the VHV implementation in Proxmox on an AMD processor.
Nested ESXi 8.0 virtualization on Proxmox is not working with either kernel 5.15 or 6.2.
 
Would this fix allow nested ESXi 8.0 virtualization? (ESXi 7.0 works fine with both Proxmox 7 and 8.)
Likely yes, but I couldn't test that yet, so I can't give you any guarantees. If you test it and it doesn't work, please post any error messages you see.

According to VMware, there might be an issue with the VHV implementation in Proxmox on an AMD processor.
Nested ESXi 8.0 virtualization on Proxmox is not working with either kernel 5.15 or 6.2.
The cause is already known: as mentioned before, the CPU flag FLUSHBYASID is not properly exposed by KVM. A bogus check was merged upstream and needs to be reverted, along with some other changes. We are currently working on preparing patches for this. For more information you can follow the discussion on the kernel mailing list [1].

[1]: https://lore.kernel.org/all/ZPjjy94x2BDIitOo@google.com/
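
As a quick host-side sanity check on AMD, the flushbyasid SVM feature shows up in /proc/cpuinfo, so you can at least confirm the hardware advertises it (the bug is about KVM not exposing it to the nested level, so this check alone won't show the problem):

Code:
# prints "flushbyasid" once if the host CPU advertises the SVM feature
grep -o -m1 flushbyasid /proc/cpuinfo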
 
Just an update: patches have since been submitted upstream [1]. I've tested them and they work; you can find our backport here [2]. It might still be a little while until they are released with a new version of our kernel, though.

[1]: https://lore.kernel.org/all/20231018194104.1896415-1-seanjc@google.com/
[2]: https://lists.proxmox.com/pipermail/pve-devel/2023-October/059540.html

How would I go about installing this fix in Proxmox? Do I need to download and apply a patch somehow? Are there any docs explaining how to do this?
 
It's included in proxmox-kernel-6.2.16-19, which has been available on the no-subscription repository for about two days.

So, to get that fix, you can simply update to that package and then reboot.
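
Roughly, the update itself is just the normal package upgrade followed by a reboot (a sketch assuming the no-subscription repository is already configured):

Code:
apt update
apt full-upgrade   # pulls in proxmox-kernel-6.2.16-19 along with the other pending updates
reboot
# after the reboot, confirm the running kernel version
uname -r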
 
Hello friend, a question: will this patch go upstream into the general Linux kernel, or is it only a fix for the Proxmox kernel?
 