[SOLVED] Nested lab setup problem with PVE as VM with its L3 VMs stalling soon after launch.

elimus_

Member
Aug 26, 2017
I am trying to stabilize a lab setup with PVE running in a nested configuration on
my Threadripper host. For some reason, if I launch a VM on this PVE VM with
hardware virtualization enabled, it just stalls. The PVE VM and the L3 VM freeze at
almost the same time.

So far I haven't been able to find a solution. Has anyone using nested setups
encountered something similar?

Tried:
* Increasing kernel parameters (no luck):
  * kernel.hung_task_timeout_secs = 300
  * kernel.watchdog_thresh = 60
* Updating the BIOS to the newest AGESA
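
To confirm that the two kernel parameters above actually took effect, their current values can be read back from `/proc/sys`. This is only a minimal sketch, assuming a Linux system; the helper names `sysctl_path` and `read_sysctl` are made up for illustration and are not part of any tool mentioned here.

```python
from pathlib import Path
from typing import Optional


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name (e.g. kernel.watchdog_thresh) to its /proc/sys file."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the current value as text, or None if the entry cannot be read."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # The two parameters listed in the post above.
    for key in ("kernel.hung_task_timeout_secs", "kernel.watchdog_thresh"):
        print(key, "=", read_sysctl(key))
```

Running it on the PVE VM after changing the settings should print 300 and 60 if the values were applied; `None` means the entry is not available on that kernel.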

Haven't yet tried:
* Using an older kernel on the base OS
* Possibly digging through the power state settings in the BIOS
  ^ Left this for now, as I had not seen any nesting issues with default BIOS
  settings on my older Intel, Ryzen Zen and Zen+ hosts.
* Switching to Xen as the hypervisor on my base OS?

###
# Base host Manjaro (w/ latest updates):
- AGESA: 1.1.0.2 (noted only because I saw some nested issues tied to the AGESA version)
- Kernel: 5.7.9
- libvirt: 6.4.0
- Nested conf active

# Host -> VM PVE
- VM Proxmox VE 6.2 (w/ latest updates):
- SVM flag present

# Host -> VM PVE -> VM
- VM CentOS 7/Ubuntu 20.04.x
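
The "Nested conf active" and "SVM flag present" points above can be double-checked with a small script. This is a rough sketch, assuming an AMD host where the `kvm_amd` module exposes its `nested` parameter under `/sys/module` and where CPU flags are listed in `/proc/cpuinfo`; the function names are hypothetical.

```python
from pathlib import Path


def nested_enabled(param_file: str = "/sys/module/kvm_amd/parameters/nested") -> bool:
    """True if the kvm_amd 'nested' parameter reads as enabled (1 or Y)."""
    try:
        value = Path(param_file).read_text().strip()
    except OSError:
        # Module not loaded or file not readable: treat as not enabled.
        return False
    return value in ("1", "Y", "y")


def has_cpu_flag(flag: str, cpuinfo: str = "/proc/cpuinfo") -> bool:
    """True if the first 'flags' line in cpuinfo lists the given flag."""
    try:
        text = Path(cpuinfo).read_text()
    except OSError:
        return False
    for line in text.splitlines():
        if line.startswith("flags"):
            _, _, rest = line.partition(":")
            return flag in rest.split()
    return False


if __name__ == "__main__":
    print("kvm_amd nested enabled:", nested_enabled())
    print("svm flag visible:", has_cpu_flag("svm"))
```

The first check is meant for the base host, the second for inside the PVE VM, where the `svm` flag has to be visible before PVE can offer hardware virtualization to its own guests.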
 

Attachment: pve-kvm-nested-crash.log (8.7 KB)
Hi,

what gen TR do you have?
I work here with a TR 2920X without problems.
Also, what do you use as a CPU type for the VMs?
 
what gen TR do you have?
I work here with a TR 2920X without problems.

I have the exact same model. At least now I know that it should work. Thanks for
the comment; it gives me hope that I should be able to get this working.

I will try the v5.8 kernel that was just released for Manjaro. If that fails, the
next step is to experiment with older kernel versions and check whether nesting
works there...

Also, what do you use as a CPU type for the VMs?

So far I have tried: host, EPYC, kvm64, qemu64
 
Well, it turns out that it really was the kernel version.

- 5.8.0-2 -- Results in an almost instant PVE restart as soon as the L3 VM starts to load its kernel...
- 5.7.14-1 -- Stalls as originally mentioned.
- 5.4.57-1 -- SUCCESS - stable nested setup
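
For anyone with a similar setup, the results above can be turned into a quick check of the base host kernel. This is only a sketch: the table holds nothing beyond the versions reported in this thread (including the original 5.7.9), it says nothing about other kernels or hardware, and the names `OBSERVED`, `parse_kernel` and `observed_result` are made up for illustration.

```python
import platform
import re
from typing import Tuple

# Base host kernel versions reported in this thread and what happened with each.
OBSERVED = {
    (5, 8, 0): "PVE VM restarts when the L3 VM loads its kernel",
    (5, 7, 14): "PVE VM and L3 VM stall",
    (5, 7, 9): "PVE VM and L3 VM stall",
    (5, 4, 57): "stable nested setup",
}


def parse_kernel(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a release string such as '5.4.57-1-MANJARO'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: " + release)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def observed_result(release: str) -> str:
    """Look up the result reported in this thread for exactly this kernel version."""
    return OBSERVED.get(parse_kernel(release), "untested in this thread")


if __name__ == "__main__":
    running = platform.release()
    print(running, "->", observed_result(running))
```

The lookup matches exact versions on purpose, so that a kernel nobody tested here is reported as untested rather than guessed at from its series.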
 
