VM Failures after enabling CPU Virtualization on new MB

JayB21

New Member
Apr 12, 2025
I have 10 VMs that have been running fine in Proxmox 8.4.14 for about 8 months.

I just swapped motherboards to an ASUS B550-Plus AC-HES, as I needed more PCI slots for an additional NIC.

My new system is populated by a Ryzen 7 5900X, 64 GB RAM, 2x 1 TB NVMe drives, and 2 XZSNET 10G network cards with the Intel X540 chip.

When I booted up the first time after the swap, things loaded fine and I was able to bring up the web GUI, where I saw that the VMs errored with "CPU virtualization not enabled" (because I forgot to enable it).

However, when I went into the BIOS, enabled SVM Mode, and restarted, I get the GRUB screen, and then when it tries to boot into Proxmox I get failures such as "Detected aborted journal", EXT4 remounting read-only, and "EXT4-fs error ext4_reserve_inode_write:5792: IO failure, Remounting filesystem read-only", at which point it just halts and won't continue to boot.

I can't even reset the system without a power cycle.

I just updated the BIOS to the latest non-beta version (January 2025), but no luck.

So I can either boot fine without virtualization (but then can't run any VMs), or not boot at all when virtualization is enabled.

Any thoughts on what's going on?
 
Hi. My first thought would be "Do I have backups?" :)

The second one would be booting from some "live CD" or "recovery CD" and trying to fsck this filesystem :)
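
If it comes to that, a rough sketch from the live/rescue shell could look like this; the device path is an assumption based on a default Proxmox LVM install, so adjust it to the actual layout:

  # activate the LVM volume groups so the root LV becomes visible
  vgchange -ay
  # force a check of the ext4 root filesystem (adjust the device to match your setup)
  fsck.ext4 -f /dev/mapper/pve-root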
 
My guess would be that you have VMs with PCI(e) passthrough that start automatically. However, the PCI IDs of various devices are different on the new motherboard, and the IOMMU groups are different as well. Therefore, the wrong devices, or devices that are needed by the Proxmox host (like an NVMe SSD), are taken away from the host when starting a VM.

Disable IOMMU temporarily (in the boot menu or the motherboard BIOS) so that VMs with PCI(e) passthrough cannot start, and make sure they don't start automatically.
Then sort out the mess of the new PCI IDs and check the IOMMU groups that come with a change in motherboard (and/or a different motherboard BIOS version).
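
Roughly, something like this from the host shell; the VM ID below is only an example:

  # see which VM configs contain PCI(e) passthrough entries
  grep -l hostpci /etc/pve/qemu-server/*.conf
  # disable autostart for such a VM (example VM ID)
  qm set 101 --onboot 0
  # compare the new PCI IDs against the hostpciX entries in those configs
  lspci -nn

If the host won't even get that far, IOMMU can be switched off for one boot by pressing e at the GRUB menu and appending amd_iommu=off to the linux line (or by disabling IOMMU in the BIOS, as mentioned above).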
 

Progress!
Yes, I had 2 VMs with PCI passthrough enabled (pfSense VMs). I turned off start-on-boot and was able to get to the web UI with a normal boot with virtualization enabled.

None of the other VMs started, but the error message makes sense: "bridge 'vmbr0' does not exist", which is correct, as vmbr0 was tied to the onboard NIC from the old MB, which doesn't exist anymore.
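
For the record, the fix should boil down to pointing the bridge stanza in /etc/network/interfaces at one of the new NIC names; the address, gateway and port below are placeholders rather than my actual values:

  auto vmbr0
  iface vmbr0 inet static
          address 192.168.1.10/24
          gateway 192.168.1.1
          bridge-ports enp7s0f0
          bridge-stp off
          bridge-fd 0

After editing, ifreload -a (ifupdown2 is the Proxmox default) or a reboot applies the change.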

I'll work at fixing that and report back to either summarize or ask a few more things.

Thanks.
 
My VMs with virtual NICs are working fine now.

So far, no luck bringing both of my VMs with PCI passthrough back online. One works; both won't.

In Proxmox I'm showing:

enp7s0f0 192.168.1.254
enp7s0f1 192.168.1.253
enp9s0f0 192.168.1.252
enp9s0f1 192.168.1.251
vmbr0 192.168.1.250

(I'm using enp7s0f0 as the management interface IP)

AFAIK, these are in separate IOMMU groups (7 & 9?).
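
One way to confirm that (assuming the standard sysfs layout) is to list every device per IOMMU group:

  for g in /sys/kernel/iommu_groups/*; do
      echo "IOMMU group ${g##*/}:"
      for d in "$g"/devices/*; do
          echo "    $(lspci -nns "${d##*/}")"
      done
  done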

When I assign a single enp device to a VM via PCI passthrough, I can start it.

As soon as I choose any of the other enp devices for the second PCI passthrough and try to start that VM, it crashes Proxmox/Debian.

Either VM will start but not both.

I've tried both VMs on the same NIC (these are dual-port 10Gb cards):

VM1 enp7s0f0
VM2 enp7s0f1

and on different NICs:
VM1 enp7s0f0
VM2 enp9s0f1

I've moved the NICs around to different slots, so I've also tried using combinations of

enp5s0f0 & enp5s0f1
enp6s0f0 & enp6s0f1

No variation works, but my reading seems to indicate this should work.
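
For reference, the passthrough entries in the VM configs look roughly like this; the VM IDs and PCI addresses are examples rather than my actual values, and pcie=1 only applies when the VM uses the q35 machine type:

  # /etc/pve/qemu-server/101.conf (example ID for pfSense VM 1)
  hostpci0: 0000:09:00.0,pcie=1

  # /etc/pve/qemu-server/102.conf (example ID for pfSense VM 2)
  hostpci0: 0000:09:00.1,pcie=1

The same can be set from the CLI with, e.g., qm set 101 -hostpci0 0000:09:00.0,pcie=1.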



Any ideas ?
 
I didn't show the whole list, but one card is in group 15 and the other card is in groups 16 & 17. There are other things in group 15, but the devices in 16 & 17 are the only things in those groups.

The MB has 5 slots: NIC1 in slot 1, nothing in slot 2, NIC2 in slot 3, the video card in slot 4 (I planned on removing it once the setup is fixed), and nothing in slot 5, plus 2 M.2 slots, each with a 1 TB NVMe.
 

Attachments

  • PXL_20251101_211456618.jpg (366.7 KB)
  • PXL_20251101_211512637.jpg (334.4 KB)
Please show text in CODE-tags instead of photos.

You cannot pass through the 0000:07.* devices (the enp7s* NICs) because they share their IOMMU group with other devices that are important to the Proxmox host (and that's what is causing the crashes). Passthrough of the 0000:09.* devices (the enp9s* NICs) might work (but that also depends on the devices themselves and whether they reset properly and don't lie about DMA). IOMMU groups are explained on the Proxmox Wiki (and the links on that page): https://pve.proxmox.com/wiki/PCI_Passthrough#Verify_IOMMU_isolation
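
A quick way to see exactly what shares a group with those NICs (the group number and PCI address below are examples; use the ones from your own listing):

  # everything in the IOMMU group that holds one port of the card
  ls /sys/kernel/iommu_groups/15/devices/
  # details for a specific device, including the kernel driver that owns it
  lspci -nnks 0000:07:00.0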