PCI Passthrough crashes system when launching VM

andreimx

New Member
May 2, 2024
Hello, I am trying to pass through a PCIe card (LSI SAS 2008) on my motherboard (PRIME B450-PLUS).

I have enabled the settings in the BIOS. The VM I am trying to boot runs the latest Debian, and I have also disabled memory ballooning.

I add the PCI device in the hardware settings of the VM. Up to that point everything works well on Proxmox. As soon as I launch the VM with the PCI device attached, the whole Proxmox server crashes. Here is what I get on the console; I have also attached an image where it is easier to read (https://gyazo.com/cfef26dcbc32c247f0b266574efed998):
Code:
[  267.706698] tap104i0: left allmulticast mode
[  267.706737] vmbr0: port 5(tap104i0)
[  328.884857] VFIO - User Level meta-driver version: 0.3
[  328.891149] r8169 0000:03:00.0 enp3s0: Link is Down
[  328.899044] vmbr0: port 1(enp3s0)
[  328.899222] r8169 0000:03:00.0 enp3s0 (unregistering): left allmulticast mode
[  328.899238] r8169 0000:03:00.0 enp3s0 (unregistering): left promiscuous mode
[  328.899252] vmbr0: port 1(enp3s0)
[  328.942315] xhci_hcd 0000:01:00.0:
[  328.942332] usb usb2: USB disconnect, device number 1
[  328.942652] xhci_hcd 0000:01:00.0: USB bus 2 deregistered
[  328.942677] xhci_hcd 0000:01:00.0: remove, state 4
[  328.942687] usb usb1: USB disconnect, device number 1
[  328.943511] xhci_hcd 0000:01:00.0: USB bus
[  328.944812] mptsas_cm0: sending message unit reset
[  328.946336] mptsas_cm0: message unit reset: SUCCESS
[  328.994316] zio pool=rpool vdev=/dev/disk/by-id/ata-Samsung_SSD_850_EVO_120GB_S21UNXAG834075V-part error=5 type=1 offset=270396 size=8192 flags=721601
[  329.005774] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  329.007350] ata1.00: Entering standby power mode
[  329.530759] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[  329.532264] ata2.00: Entering standby power mode
[  329.568628] zio pool=rpool vdev=/dev/disk/by-id/ata-Samsung_SSD_840_EVO_120GB_S1D7NEADB02976M-parti error=5 type=1 offset=270336 size=8192 flags=721601
[  329.570579] WARNING: Pool 'rpool' has encountered an uncorrectable I/O failure and has been suspended.
[  329.570579]
[  329.571477] WARNING: Pool 'rpool' has encountered an uncorrectable I/O failure and has been suspended.
[  329.571477]
[  329.571775] sd 4:0:0:0: [sdc] Synchronizing SCSI cache
[  329.574653] ata5.00: Entering standby power mode
[  330.668776] sd 5:0:0:0: [sdd] Synchronizing SCSI cache
[  330.669550] ata6.00: Entering standby power mode
[  330.944255] WARNING: Pool 'rpool' has encountered an uncorrectable I/O failure and has been suspended.

Any help / advice is very helpful. Thank you!
 
Check your IOMMU groups (and read up on what they are for), because it looks like some devices that are essential to Proxmox share an IOMMU group with the device you are passing through to the VM. On B450 motherboards, passthrough only works for the one PCIe x16 slot and the one M.2 PCIe x4 slot that are connected directly to the CPU.
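For anyone who wants to check this without pasting a long listing, here is a minimal sketch in Python that reads the groups from /sys/kernel/iommu_groups (the standard Linux sysfs location). The function names iommu_groups and group_mates are made up for this example, and the PCI address in the usage note is only an illustration, not the address of the HBA in this thread.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses inside it.

    Returns an empty dict when the directory is missing, which usually
    means the IOMMU is not enabled.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return groups


def group_mates(groups, address):
    """Return the other devices in the same group as `address`.

    An empty list means the device is alone in its group; None means the
    address was not found in any group.
    """
    for members in groups.values():
        if address in members:
            return [member for member in members if member != address]
    return None


if __name__ == "__main__":
    # Print one line per group, lowest group number first.
    for number, members in sorted(iommu_groups().items()):
        print(f"group {number}: {' '.join(members)}")
```

Calling group_mates(iommu_groups(), "0000:05:00.0") with the address that lspci reports for the card (the address here is hypothetical) shows what else would be taken away from the host. If the list contains the network controller or the disk controller of rpool, starting the VM will take those down with it.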
 
Thank you very much for the answer. I looked a bit into it and here is the output of the IOMMU groups and devices:
https://ctxt.io/2/AAAoHldzFg
 
Sorry, but I'm not going to click unknown links. You cannot (securely) share devices in the same group between VMs and/or the Proxmox host. Is your device in the PCIe x16 slot connected to the CPU? Is it alone in the IOMMU group, or is the drive controller of your rpool also in that group? Maybe you can check yourself instead of asking me to read it for you?
 
Sorry, it was too large to include in the reply. It is in group 8, together with several others (PCI Express Gigabit Ethernet Controller, [AMD] 400 Series Chipset PCIe Port (rev 01), [AMD] 400 Series Chipset PCIe Bridge...). It is in the x4/x2 slot. When I mount it in the x16 slot, the portal is not accessible and I cannot SSH into the server either.
 
It is in group 8, together with several others (PCI Express Gigabit Ethernet Controller, [AMD] 400 Series Chipset PCIe Port (rev 01), [AMD] 400 Series Chipset PCIe Bridge...). It is in the x4/x2 slot.
That's your problem then (which is quite common with Ryzen except X570). The way to fix this is indeed to move the PCIe device to the primary PCIe x16 slot.
When I mount it in the x16 slot, the portal is not accessible and I cannot SSH into the server either.
The name of the network device depends on the PCI ID of the network controller, which can change when you add/enable/remove/disable/move PCI(e) devices. Lots of threads about this on the forum. Use ip a to find the new network device name and adjust /etc/network/interfaces accordingly.
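To spot a stale name quickly, something like the following Python sketch could compare the bridge ports named in /etc/network/interfaces with the interfaces the kernel currently exposes under /sys/class/net. The helper names bridge_ports and stale_ports are invented for this example, and the parser only handles the simple iface / bridge-ports layout of a default Proxmox install, so treat it as a starting point rather than a complete parser.

```python
import os


def bridge_ports(config_text):
    """Return {bridge name: [port names]} from ifupdown-style config text."""
    ports = {}
    current = None
    for raw in config_text.splitlines():
        # Drop comments, then split the remainder into words.
        words = raw.split("#", 1)[0].split()
        if not words:
            continue
        if words[0] == "iface" and len(words) > 1:
            current = words[1]
        elif words[0] in ("bridge-ports", "bridge_ports") and current:
            ports[current] = [w for w in words[1:] if w != "none"]
    return ports


def stale_ports(config_text, present):
    """Return {bridge: [ports that are not in `present`]} for affected bridges."""
    present = set(present)
    stale = {}
    for bridge, members in bridge_ports(config_text).items():
        missing = [port for port in members if port not in present]
        if missing:
            stale[bridge] = missing
    return stale


if __name__ == "__main__":
    with open("/etc/network/interfaces") as handle:
        text = handle.read()
    # /sys/class/net holds one entry per interface the kernel knows about.
    for bridge, missing in stale_ports(text, os.listdir("/sys/class/net")).items():
        print(f"{bridge}: configured port(s) not present: {', '.join(missing)}")
```

If this reports, say, enp3s0 as missing under vmbr0, the name shown by ip a for the same network card is the one to put on the bridge-ports line (and on the matching iface line) before restarting networking or rebooting.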
 