Windows Server 2019: Critical Structure Corruption BSOD with nested virtualization

May 10, 2021
I'm using Proxmox 6.4 with the 5.11 kernel for a Windows Server virtual machine that runs a particular piece of software that relies on VirtualBox.

Nested virtualization works fine; the problem is that after a while (about 40 minutes) I get a BSOD with the message "critical structure corruption".
I installed all the VirtIO drivers and tried various hardware combinations for the guest.

This is my current configuration:
Code:
args: -cpu host,+svm,-hypervisor
balloon: 0
boot: order=scsi0;net0
cores: 8
cpu: host
ide3: Storage:iso/virtio-win-0.1.185.iso,media=cdrom,size=402812K
machine: pc-i440fx-5.2
memory: 24576
name: PCT
net0: virtio=12:72:B8:E1:36:F5,bridge=vmbr0
numa: 0
onboot: 1
ostype: win10
parent: preupdate
scsi0: Storage:100/vm-100-disk-0.qcow2,discard=on,size=200G
scsihw: virtio-scsi-pci
smbios1: uuid=9d108134-c6e2-484e-bac7-96450cb9e73a
sockets: 1
vga: qxl
vmgenid: b010dcf6-a55b-4bc3-816a-2b691b028d0e

And this is the BSOD:
Code:
==================================================
Dump File         : 052221-4328-01.dmp
Crash Time        : 22/05/2021 16:07:03
Bug Check String  :
Bug Check Code    : 0x00000109
Parameter 1       : a3a0265e`3eb8d6de
Parameter 2       : b3b732e4`913af5e1
Parameter 3       : 00000000`00000176
Parameter 4       : 00000000`00000007
Caused By Driver  : ntoskrnl.exe
Caused By Address : ntoskrnl.exe+1b73b0
File Description  :
Product Name      :
Company           :
File Version      :
Processor         : x64
Crash Address     : ntoskrnl.exe+1b73b0
Stack Address 1   :
Stack Address 2   :
Stack Address 3   :
Computer Name     :
Full Path         : C:\Windows\Minidump\052221-4328-01.dmp
Processors Count  : 8
Major Version     : 15
Minor Version     : 17763
Dump File Size    : 555.132
Dump File Time    : 22/05/2021 16:07:42
==================================================

How can I debug such a thing?
I also tried other versions of Windows Server, with the same result.
The strange thing is that I have an old Windows 7 VM, created on another KVM server, that runs the same software correctly and never crashes.
 
Hi,

does the BSOD only appear when you run VirtualBox? Do other Windows Server VMs run correctly?

Your problem looks related to this:
https://docs.microsoft.com/en-us/tr...ce/stop-error-0x109-on-vmware-virtual-machine
What we can do is try to replicate that CPUID mask. Could you test if adding the parameters -ss,-rdtscp to args changes something?

Code:
qm set <vmid> -args '-cpu host,+svm,-hypervisor,-ss,-rdtscp'

I think their mask deactivates one of those features (see also Wikipedia or some CPUID explorer). You have to stop & start (or live migrate) the VM to apply the new settings.
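To verify that the mask is actually applied after the restart, you can inspect the KVM command line that PVE generates; a quick check (assuming VMID 100 as in your config):
Code:
qm showcmd 100 | grep -oE -- '-cpu [^ ]+'
The output should then contain host,+svm,-hypervisor,-ss,-rdtscp.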

Running Coreinfo in the Windows VM is not much work and gives a really good overview of the available CPU flags. Using the kd tool mentioned in the Microsoft documentation is more difficult. For a Windows 10 VM I had to install their SDK and then place LiveKd into the installation folder C:\Program Files (x86)\Windows Kits\10\Debuggers\x64.
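For example, a minimal way to run it from an elevated command prompt (assuming you unpacked the Sysinternals download to C:\Tools; depending on the version the 64-bit binary may be called Coreinfo.exe or Coreinfo64.exe):
Code:
cd C:\Tools
Coreinfo.exe -accepteula -v
The -v switch limits the output to the virtualization-related features, which are the interesting ones here.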

It would be great if you could also post from your PVE host
Code:
cat /proc/cpuinfo
and the output of Coreinfo in the VM.
 
Hi Dominic,
Thanks for your interest and support.

does the BSOD only appear when you run VirtualBox? Do other Windows Server VMs run correctly?
I confirm that the BSOD happens only when I use VirtualBox; otherwise the VM is stable. I have other Windows Server VMs with no problems.

Could you test if adding the parameters -ss,-rdtscp to args changes something?
I tried; unfortunately nothing changed, and the BSOD appeared anyway.

It would be great if you could also post from your PVE host
Code:
cat /proc/cpuinfo
Here it is:
Code:
processor       : 31
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 49
model name      : AMD EPYC 7302 16-Core Processor
stepping        : 0
microcode       : 0x830104d
cpu MHz         : 1500.000
cache size      : 512 KB
physical id     : 0
siblings        : 32
core id         : 29
cpu cores       : 16
apicid          : 59
initial apicid  : 59
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibrs ibpb stibp vmmcall sev_es fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 5988.77
TLB size        : 3072 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

and the output of Coreinfo in the VM.
Code:
Coreinfo v3.52 - Dump information on system CPU and memory topology
Copyright (C) 2008-2021 Mark Russinovich
Sysinternals - www.sysinternals.com


AMD EPYC 7302 16-Core Processor
AMD64 Family 23 Model 49 Stepping 0, AuthenticAMD
Microcode signature: 00000000
HTT             *       Multicore
CET             -       Supports Control Flow Enforcement Technology
Kernel CET      -       Kernel-mode CET Enabled
User CET        -       User-mode CET Allowed
HYPERVISOR      -       Hypervisor is present
VMX             -       Supports Intel hardware-assisted virtualization
SVM             *       Supports AMD hardware-assisted virtualization
X64             *       Supports 64-bit mode

SMX             -       Supports Intel trusted execution
SKINIT          -       Supports AMD SKINIT
SGX             -       Supports Intel SGX

NX              *       Supports no-execute page protection
SMEP            *       Supports Supervisor Mode Execution Prevention
SMAP            *       Supports Supervisor Mode Access Prevention
PAGE1GB         *       Supports 1 GB large pages
PAE             *       Supports > 32-bit physical addresses
PAT             *       Supports Page Attribute Table
PSE             *       Supports 4 MB pages
PSE36           *       Supports > 32-bit address 4 MB pages
PGE             *       Supports global bit in page tables
SS              -       Supports bus snooping for cache operations
VME             *       Supports Virtual-8086 mode
RDWRFSGSBASE    *       Supports direct GS/FS base access

FPU             *       Implements i387 floating point instructions
MMX             *       Supports MMX instruction set
MMXEXT          *       Implements AMD MMX extensions
3DNOW           -       Supports 3DNow! instructions
3DNOWEXT        -       Supports 3DNow! extension instructions
SSE             *       Supports Streaming SIMD Extensions
SSE2            *       Supports Streaming SIMD Extensions 2
SSE3            *       Supports Streaming SIMD Extensions 3
SSSE3           *       Supports Supplemental SIMD Extensions 3
SSE4a           *       Supports Streaming SIMD Extensions 4a
SSE4.1          *       Supports Streaming SIMD Extensions 4.1
SSE4.2          *       Supports Streaming SIMD Extensions 4.2

AES             *       Supports AES extensions
AVX             *       Supports AVX instruction extensions
AVX2            *       Supports AVX2 instruction extensions
AVX-512-F       -       Supports AVX-512 Foundation instructions
AVX-512-DQ      -       Supports AVX-512 double and quadword instructions
AVX-512-IFMA    -       Supports AVX-512 integer Fused multiply-add instructions
AVX-512-PF      -       Supports AVX-512 prefetch instructions
AVX-512-ER      -       Supports AVX-512 exponential and reciprocal instructions
AVX-512-CD      -       Supports AVX-512 conflict detection instructions
AVX-512-BW      -       Supports AVX-512 byte and word instructions
AVX-512-VL      -       Supports AVX-512 vector length instructions
FMA             *       Supports FMA extensions using YMM state
MSR             *       Implements RDMSR/WRMSR instructions
MTRR            *       Supports Memory Type Range Registers
XSAVE           *       Supports XSAVE/XRSTOR instructions
OSXSAVE         *       Supports XSETBV/XGETBV instructions
RDRAND          *       Supports RDRAND instruction
RDSEED          *       Supports RDSEED instruction

CMOV            *       Supports CMOVcc instruction
CLFSH           *       Supports CLFLUSH instruction
CX8             *       Supports compare and exchange 8-byte instructions
CX16            *       Supports CMPXCHG16B instruction
BMI1            *       Supports bit manipulation extensions 1
BMI2            *       Supports bit manipulation extensions 2
ADX             *       Supports ADCX/ADOX instructions
DCA             -       Supports prefetch from memory-mapped device
F16C            *       Supports half-precision instruction
FXSR            *       Supports FXSAVE/FXSTOR instructions
FFXSR           *       Supports optimized FXSAVE/FSRSTOR instruction
MONITOR         -       Supports MONITOR and MWAIT instructions
MOVBE           *       Supports MOVBE instruction
ERMSB           -       Supports Enhanced REP MOVSB/STOSB
PCLMULDQ        *       Supports PCLMULDQ instruction
POPCNT          *       Supports POPCNT instruction
LZCNT           *       Supports LZCNT instruction
SEP             *       Supports fast system call instructions
LAHF-SAHF       *       Supports LAHF/SAHF instructions in 64-bit mode
HLE             -       Supports Hardware Lock Elision instructions
RTM             -       Supports Restricted Transactional Memory instructions

DE              *       Supports I/O breakpoints including CR4.DE
DTES64          -       Can write history of 64-bit branch addresses
DS              -       Implements memory-resident debug buffer
DS-CPL          -       Supports Debug Store feature with CPL
PCID            -       Supports PCIDs and settable CR4.PCIDE
INVPCID         -       Supports INVPCID instruction
PDCM            -       Supports Performance Capabilities MSR
RDTSCP          -       Supports RDTSCP instruction
TSC             *       Supports RDTSC instruction
TSC-DEADLINE    *       Local APIC supports one-shot deadline timer
TSC-INVARIANT   -       TSC runs at constant rate
xTPR            -       Supports disabling task priority messages

EIST            -       Supports Enhanced Intel Speedstep
ACPI            -       Implements MSR for power management
TM              -       Implements thermal monitor circuitry
TM2             -       Implements Thermal Monitor 2 control
APIC            *       Implements software-accessible local APIC
x2APIC          *       Supports x2APIC

CNXT-ID         -       L1 data cache mode adaptive or BIOS

MCE             *       Supports Machine Check, INT18 and CR4.MCE
MCA             *       Implements Machine Check Architecture
PBE             -       Supports use of FERR#/PBE# pin

PSN             -       Implements 96-bit processor serial number

PREFETCHW       *       Supports PREFETCHW instruction

Maximum implemented CPUID leaves: 00000010 (Basic), 8000001F (Extended).
Maximum implemented address width: 48 bits (virtual), 40 bits (physical).

Processor signature: 00830F10

Logical to Physical Processor Map:
*-------  Physical Processor 0
-*------  Physical Processor 1
--*-----  Physical Processor 2
---*----  Physical Processor 3
----*---  Physical Processor 4
-----*--  Physical Processor 5
------*-  Physical Processor 6
-------*  Physical Processor 7

Logical Processor to Socket Map:
********  Socket 0

Logical Processor to NUMA Node Map:
********  NUMA Node 0

No NUMA nodes.

Logical Processor to Cache Map:
*-------  Data Cache          0, Level 1,   64 KB, Assoc   2, LineSize  64
*-------  Instruction Cache   0, Level 1,   64 KB, Assoc   2, LineSize  64
*-------  Unified Cache       0, Level 2,  512 KB, Assoc  16, LineSize  64
*-------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64
-*------  Data Cache          1, Level 1,   64 KB, Assoc   2, LineSize  64
-*------  Instruction Cache   1, Level 1,   64 KB, Assoc   2, LineSize  64
-*------  Unified Cache       2, Level 2,  512 KB, Assoc  16, LineSize  64
-*------  Unified Cache       3, Level 3,   16 MB, Assoc  16, LineSize  64
--*-----  Data Cache          2, Level 1,   64 KB, Assoc   2, LineSize  64
--*-----  Instruction Cache   2, Level 1,   64 KB, Assoc   2, LineSize  64
--*-----  Unified Cache       4, Level 2,  512 KB, Assoc  16, LineSize  64
--*-----  Unified Cache       5, Level 3,   16 MB, Assoc  16, LineSize  64
---*----  Data Cache          3, Level 1,   64 KB, Assoc   2, LineSize  64
---*----  Instruction Cache   3, Level 1,   64 KB, Assoc   2, LineSize  64
---*----  Unified Cache       6, Level 2,  512 KB, Assoc  16, LineSize  64
---*----  Unified Cache       7, Level 3,   16 MB, Assoc  16, LineSize  64
----*---  Data Cache          4, Level 1,   64 KB, Assoc   2, LineSize  64
----*---  Instruction Cache   4, Level 1,   64 KB, Assoc   2, LineSize  64
----*---  Unified Cache       8, Level 2,  512 KB, Assoc  16, LineSize  64
----*---  Unified Cache       9, Level 3,   16 MB, Assoc  16, LineSize  64
-----*--  Data Cache          5, Level 1,   64 KB, Assoc   2, LineSize  64
-----*--  Instruction Cache   5, Level 1,   64 KB, Assoc   2, LineSize  64
-----*--  Unified Cache      10, Level 2,  512 KB, Assoc  16, LineSize  64
-----*--  Unified Cache      11, Level 3,   16 MB, Assoc  16, LineSize  64
------*-  Data Cache          6, Level 1,   64 KB, Assoc   2, LineSize  64
------*-  Instruction Cache   6, Level 1,   64 KB, Assoc   2, LineSize  64
------*-  Unified Cache      12, Level 2,  512 KB, Assoc  16, LineSize  64
------*-  Unified Cache      13, Level 3,   16 MB, Assoc  16, LineSize  64
-------*  Data Cache          7, Level 1,   64 KB, Assoc   2, LineSize  64
-------*  Instruction Cache   7, Level 1,   64 KB, Assoc   2, LineSize  64
-------*  Unified Cache      14, Level 2,  512 KB, Assoc  16, LineSize  64
-------*  Unified Cache      15, Level 3,   16 MB, Assoc  16, LineSize  64

Logical Processor to Group Map:
********  Group 0

Let me know if you think of anything! Currently I don't know what else to try.
 
Is there any chance that you could install your software into a VM directly on Proxmox VE instead of VirtualBox?
 
Is there any chance that you could install your software into a VM directly on Proxmox VE instead of VirtualBox?
Unfortunately not; it is Windows software that uses VirtualBox to virtualize development PLCs, and it creates the VMs internally. Given the information I sent, do you have any ideas?
 
I just removed the "args" part from the wiki guide about nested virtualization.
Using only "cpu: host" without "args" might be worth a try. Or another kernel. But apart from trying different kernels and changing CPU flags there is not much that we can do in PVE, I think.
 
I just removed the "args" part from the wiki guide about nested virtualization.
Using only "cpu: host" without "args" might be worth a try. Or another kernel. But apart from trying different kernels and changing CPU flags there is not much that we can do in PVE, I think.
I will try without args. How can I try other kernels? And what could cause my problem, an incompatibility between kernel and CPU?
 
One note: the output of Coreinfo seems to be wrong with regard to the vmx/svm flags. A Debian guest (booted from a live ISO) displayed vmx to me in /proc/cpuinfo (I have a Xeon here to test), while Coreinfo in the Windows 10 VM displayed it as off - even though the VM configuration was the same (up to the live ISO).

With cpu host the svm flag should be passed to your Windows VM automatically, even if Coreinfo doesn't show it. Therefore, adding +svm with args should change nothing.
With cpu kvm64 the svm flag will not be passed to your Windows VM (unless you add +svm with args).
This should explain why
virtualization just didn't work

To try another kernel, you can use
Code:
apt search pve-kernel
which should give you for example pve-kernel-5.4. Then apt install <pve-kernel...> and reboot.
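For example (assuming the 5.4 series appears in the search results; the exact package names may differ on your setup):
Code:
apt install pve-kernel-5.4
reboot
# after the reboot, check which kernel is running
uname -r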

Does your VirtualBox make use of Hyper-V (IIRC this can be configured)? The Microsoft requirements for nested Hyper-V state:
  • An Intel processor with VT-x and EPT technology -- nesting is currently Intel-only. [0]
[0] https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-guide/nested-virtualization
 
Afterwards you may need to try turning off "KVM hardware virtualization" on the Windows Server guest (it's in the VM options). It may be slow, but... what if it works?
With cpu kvm64 the svm flag will not be passed to your Windows VM (unless you add +svm with args).

I'm trying this approach; with these args the machine goes into a boot loop. Is it possible to enable svm with cpu kvm64?
args: -cpu kvm64,+svm
 
I was able to start an Alpine Linux VM with
Code:
args: -cpu kvm64,+vmx
so generally using VMs with this combination should work. I guess the same holds for +svm.
However, my Windows VM did not start correctly either.

Note: It might be convenient to declare custom CPU types instead of using args.
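As a sketch of what such a custom type could look like (an untested example; see the PVE documentation on custom CPU types for the exact syntax), an entry in /etc/pve/virtual-guest/cpu-models.conf:
Code:
cpu-model: kvm64-svm
    flags +svm
    reported-model kvm64
could then be referenced in the VM config with cpu: custom-kvm64-svm.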
 
I was able to start an Alpine Linux VM with
Code:
args: -cpu kvm64,+vmx
so generally using VMs with this combination should work. I guess the same holds for +svm.
However, my Windows VM did not start correctly either.

Note: It might be convenient to declare custom CPU types instead of using args.
Unfortunately I have to confirm that with +svm I get a boot loop; I tried with both an existing and a fresh installation of Windows Server.
 
@Dominic, after various other research I came across this comment and thought it might be my case:

Most likely the VM trying to access an unsupported MSR. You can check if this is the error by running "dmesg -wH" and observing the output upon the crash. Adding "echo 1 > /sys/module/kvm/parameters/ignore_msrs" to the top of your startup script should fix the issue.

However, dmesg does not show any errors. How is this possible? Shouldn't the error show up somewhere?
Do you have any ideas? I'm still stuck on this problem. Thanks!
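For anyone who wants to test that suggestion: ignore_msrs is a standard KVM module parameter (nothing Proxmox specific) and can be set like this:
Code:
# set at runtime
echo 1 > /sys/module/kvm/parameters/ignore_msrs
# make it persistent across reboots
echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf
# verify
cat /sys/module/kvm/parameters/ignore_msrs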

 
Hello! I'm facing the same issue. I tried a number of machines, all with AMD Ryzen 5 3600 CPUs.
The nesting scenario Proxmox 7 > Windows Server > VirtualBox works, but a BSOD happens at a random time,
even if VirtualBox is already closed but was running previously after boot.
I may get a BSOD some minutes after VirtualBox starts, or only after hours of work.
If VirtualBox was not started, the same Windows VMs work without an issue.
The same nested virtualization scenario works fine on systems with Intel CPUs,
so the problem is definitely AMD related.

I have tested nested virtualization Proxmox 7 > Proxmox 7 > Windows Server on the same AMD systems and have not noticed any issues.

I have also tried VMware ESXi 7 on the same AMD machines with the same scenario
(ESXi 7 > Windows Server > VirtualBox) and confirmed that it works absolutely stably,
so the problem is AMD+Proxmox related.

I'm not sure if CPU flags are the cause, but here is a comparison (the host flags first, then the flags visible inside the Proxmox guest, then inside the ESXi guest):

Host flags:
Code:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb stibp vmmcall sev_es fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca

Proxmox guest flags:
Code:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw perfctr_core ssbd ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr wbnoinvd arat npt nrip_save umip rdpid arch_capabilities

ESXi guest flags:
Code:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero wbnoinvd arat npt svm_lock nrip_save vmcb_clean flushbyasid decodeassists umip rdpid overflow_recov succor

The summary comparison of Proxmox and ESXi guest CPU flags:
Proxmox guest unique flags compared to ESXi, present on host:
stibp xsaveerptr

Proxmox guest unique flags compared to ESXi, not present on host:
tsc_known_freq tsc_deadline_timer tsc_adjust arch_capabilities

ESXi guest unique flags compared to Proxmox, present on host:
constant_tsc nonstop_tsc extapic topoext svm_lock nrip_save vmcb_clean flushbyasid decodeassists overflow_recov succor

ESXi guest unique flags compared to Proxmox, not present on host:
tsc_reliable

I have also tried various CPU types: host, kvm64 +svm, EPYC +svm, and the result was the same BSOD.
I tried tweaking args with -tsc_adjust and observed different behaviour from the Windows VM: it hangs with near 100% dedicated VM CPU usage, but no BSOD.
I tried tweaking args with -tsc_deadline_timer, but that option doesn't work; the VM can't start, with the following error:
"kvm: Property 'host-x86_64-cpu.tsc_deadline_timer' not found
TASK ERROR: start failed: QEMU exited with code 1"
What is the correct option to remove tsc_deadline_timer from the guest CPU flags?
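Judging by QEMU's CPU feature naming, I suspect the correct spelling is tsc-deadline rather than tsc_deadline_timer, so something like this might work (I have not verified it):
Code:
args: -cpu host,+svm,-hypervisor,-tsc-deadline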

Maybe we are missing some CPU flags on the Proxmox guest that are present on the ESXi guest?
Or is the issue not CPU flag related at all?
 
Please let me know if you have any updates! I will try some of the flags you indicated.
 
I tried with -tsc_adjust and I got the same BSOD.

@Dominic, can we get some help?
 
I decided to try the above scenario (Linux QEMU-KVM > Windows Server > VirtualBox) with different QEMU and Linux kernel versions,
on the same AMD Ryzen 5 3600 host.

Installed Debian 11 from scratch; the QEMU version is 5.2 and the Linux kernel is 5.10.0-8-amd64.
The BSOD on the Windows Server guest running VirtualBox is present.

Updated QEMU to version 6.1 from source code.
The BSOD is also present.

Updated the Linux kernel to version 5.10.69 from source code with make olddefconfig.
The BSOD is still present.

Updated the Linux kernel to version 5.14.8, also with make olddefconfig.
The Windows Server guest has now been running with nested VirtualBox for more than 24 hours without a BSOD,
so it looks like Linux kernel 5.14.8 already contains a fix for the above problem.

I will also try to update the Linux kernel to 5.14.8 on the Proxmox 7 host.
Should I expect any problems with Proxmox functionality in this case?

I'm still looking for a better and simpler solution.
If the fix applies only to the kvm_amd module, maybe it is possible to install a custom fixed kvm_amd module on Proxmox 7.
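As a starting point, it is at least easy to check which kvm_amd module is currently loaded and whether nesting and nested paging are enabled (standard module/sysfs queries, nothing Proxmox specific):
Code:
modinfo kvm_amd | grep -E '^(filename|vermagic)'
cat /sys/module/kvm_amd/parameters/nested
cat /sys/module/kvm_amd/parameters/npt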
 
