Hi. I've been banging my head against this problem for a few days now.
I've got this HBA:
Code:
# lspci -nnkv -d 1000:005d
61:00.0 RAID bus controller [0104]: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] [1000:005d] (rev 02)
Subsystem: Dell PERC H730P Adapter [1028:1f42]
Flags: bus master, fast devsel, latency 0, IRQ 255, NUMA node 0, IOMMU group 4
I/O ports at 8000 [size=256]
Memory at b8d00000 (64-bit, non-prefetchable) [size=64K]
Memory at b8c00000 (64-bit, non-prefetchable) [size=1M]
Expansion ROM at <ignored> [disabled]
Capabilities: [50] Power Management version 3
Capabilities: [68] Express Endpoint, MSI 00
Capabilities: [a8] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [c0] MSI-X: Enable- Count=97 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [1e0] Secondary PCI Express
Capabilities: [1c0] Power Budgeting <?>
Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
Kernel driver in use: vfio-pci
Kernel modules: megaraid_sas
and I'd like to pass it through to my Linux guest. I'm doing all my testing with no media connected to the HBA. I've read https://pve.proxmox.com/wiki/PCI_Passthrough
My IOMMU groups are all properly split; no group number is shared between devices. I've got the megaraid_sas module blacklisted and, as you can see above, vfio-pci loaded in its place.
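For reference, the blacklist/override itself is the standard modprobe.d approach (a sketch; the filename is my own choice, and the ids value comes from the lspci output above):

Code:
# /etc/modprobe.d/vfio.conf
blacklist megaraid_sas
options vfio-pci ids=1000:005d
softdep megaraid_sas pre: vfio-pci

followed by update-initramfs -u -k all and a reboot.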
After playing with settings (the PCI-Express checkbox, the ROM-Bar checkbox, CPU type, vfio_iommu_type1.allow_unsafe_interrupts, and many other things), I finally got my VM to boot with the HBA passed through to the (Arch Linux) guest OS, verified with lspci showing the HBA inside the guest. But it turns out this only works maybe 1 out of 5 times when I shut down and restart the VM. The other 4 out of 5 times, the whole Proxmox server crashes instantly. When it crashes there are no errors, just a hard lockup of the Proxmox host: nothing in the journal, nothing in dmesg, nothing on kvm stderr. Nothing.
The "working" (maybe 1 in 5 vm boots) qemu config looks like this:
Code:
balloon: 0
bios: ovmf
boot: order=ide2
cores: 1
cpu: IvyBridge
efidisk0: local-zfs:vm-103-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:61:00
ide2: local:iso/archlinux-2023.07.01-x86_64.iso,media=cdrom,size=832844K
machine: q35
memory: 2048
meta: creation-qemu=8.0.2,ctime=1690173855
name: archbase2
numa: 0
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=714c8cd5-e224-4562-846d-1737a1944c64
sockets: 1
vmgenid: 7a0bc34d-13c9-4918-a9d9-75190cd58f2f
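For completeness, the PCI-Express and ROM-Bar checkboxes I mentioned correspond to hostpci0 variants like these (a sketch of the kinds of lines I cycled through, not an exact history):

Code:
hostpci0: 0000:61:00,pcie=1
hostpci0: 0000:61:00,pcie=1,rombar=0

and the allow_unsafe_interrupts experiment was a module option on the host:

Code:
options vfio_iommu_type1 allow_unsafe_interrupts=1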
When I remove the
hostpci0: 0000:61:00
passthrough line from the qemu config, the VM, as well as the rest of PVE, works as expected.

What can make a PCI passthrough setup work as expected on some VM boots, but cause the host OS to lock up hard on others, with no configuration changes in between?