Linux/x86 5.0.21-4-pve Kernel & Linux VM (pvetesting repo)

dale

After upgrading to the latest pvetesting repo, Linux VMs constantly reboot when using the 5.0.21-4-pve kernel [Windows & CTs - OK].
Linux/x86 5.0.21-3-pve - OK.

root@b11:~# pveversion -v
proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)
pve-manager: 6.0-11 (running version: 6.0-11/2140ef37)
pve-kernel-helper: 6.0-11
pve-kernel-5.0: 6.0-10
pve-kernel-5.0.21-4-pve: 5.0.21-8
pve-kernel-5.0.21-3-pve: 5.0.21-7
ceph: 14.2.4-pve1
ceph-fuse: 14.2.4-pve1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-3
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-6
libpve-guest-common-perl: 3.0-2
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
openvswitch-switch: 2.10.0+2018.08.28+git.8ca7c82b7d+ds1-12
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-10
pve-docs: 6.0-8
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-4
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.1-4
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-13
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1

wbr, dale.
 
After upgrading to the latest pvetesting repo, Linux VMs constantly reboot when using the 5.0.21-4-pve kernel [Windows & CTs - OK].
Linux/x86 5.0.21-3-pve - OK.


What does the VM's syslog show?
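
For example (the VMID 100 below is only a placeholder), the guest's console can be watched from the host while it reboots, and the previous boot's kernel log can be read inside the guest:

qm terminal 100        # host side: attach to the VM's serial console (a serial port must be configured on the VM)
journalctl -b -1 -k    # guest side: kernel messages from the previous boot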
 
Oh, and can you maybe try to boot some live-ISO image of a Linux distribution (Ubuntu, Fedora, ...)?
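
Attaching an ISO to an existing VM could look like this (VMID, storage and ISO file name are placeholders, not from this thread):

qm set 100 --ide2 local:iso/ubuntu-19.10-desktop-amd64.iso,media=cdrom
qm set 100 --boot d    # boot from CD-ROM first (legacy boot-order syntax)
qm start 100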

Are you using ZFS as backing storage for those VMs?
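
For reference, the configured storages and any local ZFS pools can be listed on the host with:

pvesm status    # all configured storages and their types
zpool status    # health and layout of local ZFS pools, if any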
 
The virtual machine is running Windows Server 2019 with the latest updates. The virtual machine is located on ZFS (mirrored on two HDDs); the kvm64 CPU type is used. A backup on the other storage did not work either. Now I'm trying to run Linux in a container and reinstall Windows in a VM.
 
So, could it be that you ran into the ZFS FPU/SIMD issue ( https://forum.proxmox.com/threads/z...pve2-causes-fpu-corruption.58627/#post-270421 ) and only the reboot (and thus restart of the VM) made the issues surface?
I.e., a similar guess to the one I made at: https://forum.proxmox.com/threads/a...amage-on-all-vms-with-uefi.59688/#post-275724

Bad stuff, I know, but I would really like to verify that the new 5.0.21-4 kernel has no grave issues on some special set of HW; I mean, we tested common Intel/AMD setups with different VMs/CTs and backing storages (Ceph, ZFS, ext4, XFS) as always - but there can always be some HW/workload combination which wasn't available to check.
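
One host-side sanity check (module/parameter paths as in ZFS 0.8 - an assumption about this setup): confirm which kernel is actually running and which fletcher-4 checksum implementation ZFS selected, since the SIMD variants are the ones involved in that issue:

uname -r
cat /sys/module/zcommon/parameters/zfs_fletcher_4_impl    # e.g. [fastest] scalar superscalar sse2 avx2 ...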

Now I'm trying to run Linux in a container and reinstall Windows in a VM.

Thanks, please report back with the results.
 
But on the other hand, it would be really weird if the guest FS did all writes with FPU/SIMD instructions.
Normally, the aforementioned issue mostly affected compute programs that use the FPU heavily.
What filesystems are the VMs using? Do they use compression/checksumming themselves?
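
Inside a guest, a quick way to check this (standard util-linux tools, nothing Proxmox-specific):

lsblk -f                # filesystem type per block device
findmnt -t btrfs,zfs    # mounts of checksumming filesystems; empty output if none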
 
Linux in a container runs without problems. But the Windows machine refuses to work in any way, on any storage. It started after today's update; reboots don't help. NTFS is used for Windows. When trying a fresh installation, Windows hits a blue screen of death; the installation does not even reach the VirtIO SCSI driver selection. More on the use of compression later.
 
Hmm, strange - the original poster said that Windows & CTs work for them, but Linux VMs do not.
I can do both here; at least a Windows 10 installation works fine on both an Intel setup and an AMD EPYC one.
I can re-check a Windows Server 2019 installation, to be sure, though.
 
It ran successfully on ZFS for over two years. I am now trying to move it to a local LVM-Thin drive without any encryption.
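
As a sketch (VMID, disk and storage names are placeholders for whatever this setup uses), a disk can be moved to LVM-Thin online with:

qm move_disk 100 scsi0 local-lvm    # add '--delete 1' to drop the source image afterwards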
 
I tried reinstalling Proxmox with the 5.0.15-1 kernel, but that doesn't help either. Windows machines no longer start under any conditions. A mystery! I have no idea.
 
I tried reinstalling Proxmox with the 5.0.15-1 kernel, but that doesn't help either. Windows machines no longer start under any conditions. A mystery! I have no idea.

That all sounds like there's something completely different going on. If you booted an older kernel (e.g., 5.0.21-3) and it still does not work, it probably has nothing to do with the 5.0.21-4 kernel. Also, everyone else has issues only with specific Linux VMs on older Intel HW, as far as I can tell. As you have issues with just Windows, I'd guess that it's something else; sadly, I'm not too sure what it could be. I'd recommend opening a new thread for that.
 
The VM reboots at the earliest stage with the message:

Physical KASLR disabled: no suitable memory region!

hardware info attached

root@b12:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.0.21-4-pve root=UUID=e48d586c-8f58-4b03-a081-cdcf4776c83c ro console=tty0 console=ttyS0,115200n8 nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier

rebooted with:

root@b12:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.0.21-4-pve root=UUID=e48d586c-8f58-4b03-a081-cdcf4776c83c ro console=tty0 console=ttyS0,115200n8

no changes.
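
Given the KASLR message above, one further diagnostic - my assumption, not something verified in this thread - would be to boot the affected guest kernel once with randomization disabled, by editing the kernel line at the guest's boot loader (press 'e' in GRUB) and appending:

nokaslr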

booting from ISO image: proxmox-ve_6.0-1.iso

selecting "Install Proxmox VE"

..

Loading proxmox installer
Loading initrd

reboot
 


I can confirm that this is a "hardware-dependent" problem.

Another system with Intel(R) Xeon(R) Gold 6144 CPU @ 3.50GHz - Ok.
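
For anyone comparing affected and unaffected hosts, the relevant CPU details can be collected with:

lscpu | grep -E 'Model name|Stepping|Flags'
dmesg | grep -i microcode    # microcode revision in use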
 
