Proxmox in a virtual machine with nested virtualization: CPU# stuck for 22s!

klusht

New Member
Sep 26, 2020
Hi,

I wish to use Proxmox for development/testing purposes, to simulate cloud clusters. I use a Windows host and run Proxmox in a virtual machine (VMware Player) with nested virtualization.

Guest Proxmox VM configuration:
- processors: 16 (from an i7-6950X)
- virtualization engine: Virtualize Intel VT-x/EPT or AMD-V/RVI, Virtualize CPU performance counters, Virtualize IOMMU (I/O memory management unit)
- memory: 64 GB (max)
- Proxmox installed from ISO v6.1-3


Nested VMs in Proxmox:
CentOS 7, processors: 1 socket, 4 cores; memory: 4 GB



When I start VMs in Proxmox, I randomly receive:

] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [systemd-udevd:622]

I also see this while the VMs are running, with different process names but the same "CPU#N stuck for 22s!" message.
I tried different settings in VMware Player, but nothing helped.

I assume this has to do with the VMware hypervisor, but is there anything I can try from the Proxmox side?

Regards,
 

Attachments

  • Screen1.png
I've been seeing many of these as well, but I just assumed it was my old CPU complaining.
Commenting mostly to watch this conversation.
 
First of all, regarding:
"Commenting mostly to watch this conversation."
There is a "Watch" button right above the first post ;)

@klusht Could you please run the following commands in Proxmox VE and just copy & paste the output in a [code][/code] section? See also the Nested Virtualization Wiki article.

Code:
pveversion -v
egrep '(vmx|svm)' --color=always /proc/cpuinfo
cat /etc/pve/qemu-server/105.conf

And an upgrade should not hurt:
Code:
apt update
apt full-upgrade
 
I remember testing a different version as well, and this issue was still present, randomly.
I would be really happy to understand whether this is something I can fix or whether it is more related to my CPU.
Any hint would be highly appreciated.

Here are the outputs:


root@pve:~# pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.10-1-pve)
pve-manager: 6.1-3 (running version: 6.1-3/37248ce6)
pve-kernel-5.3: 6.0-12
pve-kernel-helper: 6.0-12
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-2
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.1-2
pve-container: 3.0-14
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191002-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-2
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2

root@pve:~# egrep '(vmx|svm)' --color=always /proc/cpuinfo
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid rdseed adx smap xsaveopt arat flush_l1d arch_capabilities
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid rdseed adx smap xsaveopt arat flush_l1d arch_capabilities
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid rdseed adx smap xsaveopt arat flush_l1d arch_capabilities
[...]

root@pve:~# cat /etc/pve/qemu-server/105.conf
#web-cluster master-node
autostart: 1
bootdisk: scsi0
cores: 4
ide2: none,media=cdrom
memory: 4096
name: Copy-of-base-centos7
net0: virtio=52:B2:CC:6D:85:DA,bridge=vmbr0,firewall=0
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-105-disk-0,size=32G
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=84301d15-77a1-4a93-835d-d3efa44503bb
sockets: 1
vmgenid: 7b2999af-48a9-47c3-bf30-5e7666e95249
 
Have you tried a different virtualization platform?
VirtualBox is an alternative solution you could try. Nested virtualization can be problematic in my experience. Because VMware Player is a hosted hypervisor (it runs on top of another OS), you might run into bottlenecks.

Aside from that: what is the hardware you are running the nested Proxmox on?
 
My hardware:
CPU: i7-6950X ( 10 cores 20 Threads )
RAM: 128GB 8x CORSAIR CMK16GX4M1A2400C14 2133
MB: Asrock Fatal1ty X99 Professional Gaming i7

Unfortunately VMware is the only option at the moment, as VirtualBox states for nested virtualization: "This feature is only available on host systems that use an AMD CPU".
After a few searches it turns out this can be enabled for Intel hosts as well via the command line. I will try that out.
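For reference, the command I found looks roughly like this (just a sketch; "proxmox-pve" is a placeholder for the actual VirtualBox VM name, and the VM has to be powered off):

Code:
VBoxManage modifyvm "proxmox-pve" --nested-hw-virt on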
 
Your problem starts with your VM configuration.
Assigning 16 cores to the PVE VM is way too much!
You only have 10 physical cores, so you put a lot of pressure on the host. So much pressure that it will pause your PVE guest.
Use 6 vCPUs for the PVE host and 2 for the nested VM, and I think the behaviour should go away. See the sketch below.
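For the nested VM, something like this should do it (a sketch, assuming it keeps the ID 105 from the config you posted; the PVE VM's own vCPU count has to be changed in VMware Player's settings, not here):

Code:
qm set 105 --sockets 1 --cores 2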
 
hmm
I need to read a bit more about CPU virtualization.
VMware shows 18 cores as assignable, from which I chose 16, thinking that should not be a problem.
Also, I do not understand why there is still a problem when, for example, only one nested VM with 2 cores is running (I saw the error for that setup as well).

Everything I have been able to read so far mentioned that I need to leave some cores for the host machine, otherwise the host can throttle or freeze; it did not say a VM would be blocked from running entirely. Also, I did a lot of tests with many cores assigned to a (non-nested) VM and never received a "stuck" error.

A first test of just starting 7 nested VMs with my default 1 socket / 4 cores per nested VM (with the PVE VM set to 6 cores) did not show the error.

I would be really interested to read a bit more about nested virtualization.
Any recommendations?
 
I have to admit Intel marketing has done its job right.

Your first misconception is that you have 20 processors. Or cores. You have 20 threads. Not cores.
So technically you can't access more than 10 physical cores. Everything else is marketing. The hyper-threads are threads, not processors. This can help, but most of the time your performance gain will be less than 20% compared to a real core, because 2 threads share one physical processor/core.

From that, your second misconception arises: you trust VMware. VMware Player in turn trusts your OS (because it is not a bare-metal hypervisor).
Your Windows system tells you that you have 20 processors. Guess what: you don't!

Now consider your poor PVE virtual machine. Hypervisors like PVE, Hyper-V and ESXi expect to run on hardware, so they expect to control the hardware. If you assign 16 cores to the VM (which, by the way, you do not actually have), the hypervisor thinks it has 16 cores.
But it gets even worse, because the hypervisor has no clue that half of these "cores" are actually threads. There is no concept of threads in VMware Player and Workstation, so this is not exposed.

And after all that you place VMs inside a nested hypervisor with all the overheads etc. What are you expecting?

To use a real-world analogy: what you are doing is going to the bank to get some cash, but instead of checking your balance first, you just keep asking for money. Sooner or later you will have withdrawn all your money, yet the bank will keep handing you more, until you reach the point where you realize that you have not only spent all your money, oh no, you owe the bank money!
 
Interesting analogy :) thanks for sharing.
I admit I have not read about virtualization in depth, which is why I was asking for some good recommendations.

I was aware that the 16 are not all real cores, but as long as they were shown in the hypervisor I was playing along, mostly because I thought the hypervisor would take care of scheduling them accordingly.
Also, there is the problem that a single nested virtual machine with 2 cores still receives the "stuck" error, although there are at least 4 real cores to work with. This suggests that if a nested VM is assigned a logical core, the issue appears.

I get the overhead, speed is not an issue in most cases.

What I don't get is at what level the CPU is "stuck", and why the decision is to block processing entirely instead of throttling its cycles.

Also I really need to understand process management in virtualization.

Thanks for the support. If you have some good resources on this subject, please share them.
 
...was playing along mostly because I thought the hypervisor would take care of scheduling them accordingly.

Which hypervisor?
PVE does not know about threads because they are abstracted away by VMware Player.
And VMware Player relies on Windows because it is not a bare-metal hypervisor.
Windows scheduling can have its own issues because, in my experience, it does not take HT threads into account and delegates the HT/physical-core scheduling to the hardware.

I have written my advice above. No clue if it helps, but it should.
 
You can run lscpu in Proxmox VE to check exactly what cores or threads it sees. Additionally, you can try setting the CPU type of your CentOS VM to "host". For example:
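A minimal sketch, assuming your CentOS VM is still ID 105 as in the config above:

Code:
lscpu
qm set 105 --cpu host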
 
