Hi dear Proxmox community,
I am seeing the same issue with both a Ryzen 7950X3D and a Ryzen 7800X3D, on B650E and X670E motherboards; the behaviour is identical in both environments.
As long as I use the CPU type "host", my Windows 11 Pro VM bluescreens when I open Device Manager and click "Scan for hardware changes". Sometimes Windows triggers the scan by itself when installing certain software or drivers, and I get the same bluescreen.
If I use the CPU type "x86-64-v4", I can do whatever I want and there is no crash.
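For anyone who wants to reproduce the comparison, the CPU type can be switched from the Proxmox host shell with `qm set`. A minimal sketch, assuming a VM ID of 100 (only a placeholder) and a hypothetical helper name `cpu_switch_cmd`; it prints the command instead of running it, so it can be checked first:

```shell
# Print (do not run) the qm command that sets the CPU type of a VM.
# cpu_switch_cmd is a hypothetical helper; 100 below is a placeholder VM ID.
cpu_switch_cmd() {
    vmid="$1"
    cputype="$2"
    printf 'qm set %s --cpu %s\n' "$vmid" "$cputype"
}

cpu_switch_cmd 100 x86-64-v4   # the type that runs stable
cpu_switch_cmd 100 host        # the type that bluescreens
```

As far as I know, the new CPU type only takes effect after the VM has been fully shut down and started again, not after a reboot from inside the guest.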
I updated the GRUB config with this line: GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt", since those parameters should help with CPU reset bugs. I even tried changing the BIOS boot type to "CSM from UEFI". Both have the same effect: they keep the VM stable when I scan for hardware changes, but not forever...
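To rule out that the parameters were simply not applied, the running kernel's command line can be compared against what was configured. A rough sketch, assuming the host boots via GRUB (so `update-grub` plus a reboot applies the change); `cmdline_has` is a hypothetical helper name:

```shell
# Succeed only if every given parameter appears as a whole word
# in the kernel command line passed as the first argument.
cmdline_has() {
    cmdline="$1"
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;      # found, check the next one
            *) return 1 ;;        # missing
        esac
    done
    return 0
}

# Example with a literal string; on the host, pass "$(cat /proc/cmdline)" instead.
if cmdline_has "quiet amd_iommu=on iommu=pt" amd_iommu=on iommu=pt; then
    echo "IOMMU parameters are active"
fi
```

On hosts that boot via systemd-boot instead of GRUB, the parameters live in /etc/kernel/cmdline and are applied with `proxmox-boot-tool refresh`, so the check against /proc/cmdline is the part that works in both cases.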
If I start software such as PassMark (for CPU benchmarks) or MSI Afterburner, basically anything that accesses the CPU sensors, it shows a CPU temperature of "0 degrees". If I then scan for hardware changes again, I get the bluescreen, even if I close the software first. I noticed that with the CPU type "x86-64-v4", any of these tools just shows "N/A" instead of a temperature, and then scanning for hardware changes does not crash.
I have tried a lot of things but could not get a stable VM with the CPU type "host" on either of these two CPUs. Does anyone have a suggestion? I would very much appreciate any advice!
I have all the latest drivers installed, the latest Proxmox version with the latest kernel (6.8.12-8), and amd-microcode installed as well.
Maybe this helps; these are the flags of my CPU:
root@prox:/etc/default# cat /proc/cpuinfo | grep flags | head -n 1
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d
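As a sanity check on the flags above: the x86-64-v4 level is defined by the AVX-512 subsets F, BW, CD, DQ and VL on top of the v3 level. A small sketch that lists which of those are missing from a flags line; `missing_flags` is a hypothetical helper name, and the sample string is shortened from the output above:

```shell
# Print every wanted flag that is absent from the flags line ($1).
missing_flags() {
    flags="$1"
    shift
    for want in "$@"; do
        case " $flags " in
            *" $want "*) ;;                   # present, nothing to report
            *) printf '%s\n' "$want" ;;       # absent, print it
        esac
    done
}

# Shortened sample; on the host use: "$(grep -m1 '^flags' /proc/cpuinfo)"
sample="flags : fpu avx2 avx512f avx512dq avx512cd avx512bw avx512vl"
missing_flags "$sample" avx512f avx512bw avx512cd avx512dq avx512vl
```

All five flags appear in the full flags line above, so the host does cover the AVX-512 part of x86-64-v4 and the check prints nothing.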
Updated to kernel 6.11.11-1 - problem still there.
Best Regards,
Dean