CentOS 7-based VMs won't boot anymore

AllanM

Hello Proxmox Devs & Users!

Interesting problem this week. I went to reboot our Security Onion nodes and ran into some problems...


At VM boot:

[attached screenshot: Screenshot 2023-05-08 190024.png]

Then, after a while of waiting:

[attached screenshot: Screenshot 2023-05-08 185812.png]

Updated our cluster last week:

Code:
proxmox-ve: 7.4-1 (running kernel: 5.15.104-1-pve)
pve-manager: 7.4-3 (running version: 7.4-3/9002ab8a)
pve-kernel-5.15: 7.4-1
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.4: 6.4-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.15.104-1-pve: 5.15.104-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.60-2-pve: 5.15.60-2
pve-kernel-5.15.60-1-pve: 5.15.60-1
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.15.30-2-pve: 5.15.30-3
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-5-pve: 5.13.19-13
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.4.128-1-pve: 5.4.128-1
pve-kernel-5.4.124-1-pve: 5.4.124-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.4.114-1-pve: 5.4.114-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.4.44-1-pve: 5.4.44-1
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.3.13-3-pve: 5.3.13-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph: 17.2.5-pve1
ceph-fuse: 17.2.5-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-4
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.5
libpve-storage-perl: 7.4-2
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.1-1
proxmox-backup-file-restore: 2.4.1-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.6.5
pve-cluster: 7.3-3
pve-container: 4.4-3
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-2
pve-firewall: 4.3-1
pve-firmware: 3.6-4
pve-ha-manager: 3.6.0
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1


Did something change in Proxmox recently that would impact CentOS 7 / Security Onion virtio support? We're running SO 2.3.240. All VMs in the distributed SO installation now refuse to boot with previously working settings. If I switch the boot disks to SATA they boot, but networking is lost. It appears that no virtio devices are being mapped through anymore.
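In case it helps anyone debugging the same thing: the quickest way I know to compare a broken guest against a still-working one is `qm config`. The VMIDs below are placeholders, and the machine-version pin is only a hedged guess at a workaround (rolling the virtual hardware back to the pre-upgrade QEMU version), not a confirmed fix. The commands are echoed dry-run; remove the leading `echo` to actually run them:

```shell
# Dry-run sketch: VMIDs 101/102 are placeholders, not real guests.
BROKEN=101
WORKING=102

# Compare the scsi*/virtio*/net*/machine lines between the two configs.
DIFF_CMD="qm config $BROKEN"
echo "$DIFF_CMD"
echo "qm config $WORKING"

# Hedged workaround attempt: pin the guest's machine type to the QEMU 7.1
# version that was running before the cluster upgrade.
PIN_CMD="qm set $BROKEN --machine pc-i440fx-7.1"
echo "$PIN_CMD"
```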

Other VMs (a blend of Debian, Ubuntu, Windows 10, and Server 2019) boot and reboot fine.

Any help would be very much appreciated!


Regards,
-Eric
 
I was experimenting with various possible causes....

Switched the VM's CPU type from "host" to "EPYC-Rome", which is what these servers are (EPYC 7402P CPUs), and got the following error:

Code:
()
/dev/rbd32
/dev/rbd33
swtpm_setup: Not overwriting existing state file.
kvm: warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaves [bit 3]
kvm: Host doesn't support requested features
stopping swtpm instance (pid 210930) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1

Switched the CPU type to "EPYC" and the VM boots now.

I can't quite put my finger on it, but it appears the "host" CPU option is currently broken on EPYC-Rome platforms. Probably an issue with the kernel version Proxmox ships.
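For anyone hitting this on more than a couple of guests, the same change can be applied from the CLI instead of the GUI. A dry-run sketch (VMID 101 is a placeholder; remove the leading `echo` to actually apply it):

```shell
# Dry-run sketch: switch a VM's CPU type from host/EPYC-Rome to EPYC.
# VMID 101 is a placeholder; drop the leading 'echo' to run for real.
VMID=101
SET_CMD="qm set $VMID --cpu EPYC"
echo "$SET_CMD"
# A CPU-type change only takes effect after a full stop/start of the
# QEMU process, not a reboot from inside the guest.
echo "qm stop $VMID && qm start $VMID"
```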
 
Bumping this. The "EPYC-Rome" CPU type does not work on EPYC-Rome host hardware, and some VMs set to "host" are still failing to boot.

I waited for a round of updates to see if this would self-resolve, but the latest enterprise-repo kernel was installed yesterday and the issue persists.
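One interim option, since the node still has a long list of older kernels installed (see the pveversion output in my first post): pin a kernel from before the problem started. A dry-run sketch, assuming your boot entries are managed by proxmox-boot-tool; 5.15.85-1-pve is just an example picked from the installed list, not a version I've verified as good:

```shell
# Dry-run sketch: pin an older installed kernel until a fixed one ships.
# 5.15.85-1-pve is only an example from the installed-kernel list, not a
# verified-good version; drop the leading 'echo' to run for real.
OLD_KERNEL="5.15.85-1-pve"
PIN_CMD="proxmox-boot-tool kernel pin $OLD_KERNEL"
echo "$PIN_CMD"
echo "proxmox-boot-tool kernel unpin   # revert once a fixed kernel ships"
```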
 
Switched the VM's CPU type from "host" to "EPYC-Rome", which is what these servers are (EPYC 7402P CPUs), and got the following error:
Code:
kvm: warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaves [bit 3]
"EPYC-ROME" option does not work on EPYC-ROME host hardware.

See:
 
Aha!

Thanks Neobin!

I wonder if I'm observing two separate issues here, or whether they're related? I believe both cropped up around the same time.

When "host" is selected for these CentOS 7 VMs, I get the virtio mapping errors during the VM's boot sequence; when "EPYC-Rome" is selected, I get the xsaves error. Related?
 
