[SOLVED] Working VM now halts at "Booting from Hard Disk" with LSI HBA Passthrough

alafrosty

New Member
Jun 20, 2022
Thanks for reading. I'm new to Proxmox (about a week in). I had a good setup, with everything working exactly as expected, and I was playing with various Proxmox features when my NAS (openmediavault) VM suddenly stopped working on July 2, 2022. The VM was set up with my LSI HBA passed through to the guest OS. A lot of installing and updating was going on across various VMs and CTs, so I have no idea what set it off. I tried a lot of troubleshooting, with no luck. The VM BIOS appears to POST, and the LSI HBA BIOS runs and finds all the drives, but the VM halts as soon as it attempts to "boot from Hard Disk" with the PCIe device passed through. If I remove the PCI device from the hardware list, the VM boots perfectly. When it halts, the VM's memory usage runs up to the maximum and one CPU core is pegged. I suspect a runaway process is consuming all the RAM, but I have no idea how to troubleshoot it.

I created a VM with a bare-bones Debian install and got it running without the PCIe passthrough, then attempted to add the device. That also caused the exact same behavior. The BIOS says that it's booting, but then the GRUB message does not appear.

I decided that I must've blown something up while attempting to get some recalcitrant PCI NICs to do PCI passthrough, so I reinstalled Proxmox from scratch. It now behaves exactly as it did right before the reinstall (i.e. broken).

Any suggestions on how to do additional debug or what to fix would be very helpful, please. No hardware changes at all. Some BIOS changes were going on, but I don't think that there were any between when it was working and when it stopped working.

Other probably irrelevant details
• I've tried installing the VM both with and without the HBA attached at initial boot. It will boot the first time with the HBA and do the install, but on the first attempt to boot into GRUB the memory usage shoots up to almost 100% a couple of seconds after starting and one of the CPUs gets maxed out (25% CPU usage with 4 cores).
• VM kern.log has no entries for the stalled boots.
• I've tried a lot of different combinations of settings in the Hardware and Options lists, and the only one that makes a difference is removing the HBA controller (i.e. adding or removing the second GbE device to/from the VM makes no difference: the VM boots when the HBA isn't attached and doesn't boot when it is).
• I have a hunch that some recent update has broken the functionality, but I can't figure out where.
Code:
# dmesg | grep -e DMAR -e IOMMU
[    0.000000] ACPI: DMAR 0x00000000BF75E0D0 000128 (v01 AMI    OEMDMAR  00000001 MSFT 00000097)
[    0.000000] ACPI: Reserving DMAR table memory at [mem 0xbf75e0d0-0xbf75e1f7]
[    0.000000] DMAR: IOMMU enabled


Code:
# cat /etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
…
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt transparent_hugepage=always pcie_port_pm=off nofb vfio-pci.ids=1000:0064"
GRUB_CMDLINE_LINUX=""
Code:
# find /sys/kernel/iommu_groups/ -type l|grep 37
/sys/kernel/iommu_groups/37/devices/0000:04:00.0

Code:
04:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
        Subsystem: Broadcom / LSI SAS 9201-16i
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas

Code:
# uname -r
5.15.30-2-pve
 
Code:
# lspci -k
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
        Subsystem: Super Micro Computer Inc 5520 I/O Hub to ESI Port
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
        Kernel driver in use: pcieport
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
        Kernel driver in use: pcieport
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 22)
        Kernel driver in use: pcieport
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
        Kernel driver in use: pcieport
00:0d.0 Host bridge: Intel Corporation Device 343a (rev 22)
00:0d.1 Host bridge: Intel Corporation Device 343b (rev 22)
00:0d.2 Host bridge: Intel Corporation Device 343c (rev 22)
00:0d.3 Host bridge: Intel Corporation Device 343d (rev 22)
00:0d.4 Host bridge: Intel Corporation 7500/5520/5500/X58 Physical Layer Port 0 (rev 22)
00:0d.5 Host bridge: Intel Corporation 7500/5520/5500 Physical Layer Port 1 (rev 22)
00:0d.6 Host bridge: Intel Corporation Device 341a (rev 22)
00:0e.0 Host bridge: Intel Corporation Device 341c (rev 22)
00:0e.1 Host bridge: Intel Corporation Device 341d (rev 22)
00:0e.2 Host bridge: Intel Corporation Device 341e (rev 22)
00:0e.4 Host bridge: Intel Corporation Device 3439 (rev 22)
00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22)
00:14.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub System Management Registers (rev 22)
        Kernel driver in use: i7core_edac
        Kernel modules: i7core_edac
00:14.1 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Throttle Registers (rev 22)
        Kernel modules: i5500_temp
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
        Subsystem: Super Micro Computer Inc 5520/5500/X58 Chipset QuickData Technology Device
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB UHCI Controller
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci_hcd
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB UHCI Controller
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci_hcd
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB UHCI Controller
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci_hcd
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB2 EHCI Controller
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
        Kernel driver in use: pcieport
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB UHCI Controller
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci_hcd
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB UHCI Controller
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci_hcd
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB UHCI Controller
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci_hcd
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) USB2 EHCI Controller
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
        Subsystem: Super Micro Computer Inc 82801JIR (ICH10R) LPC Interface Controller
        Kernel driver in use: lpc_ich
        Kernel modules: lpc_ich
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) SATA AHCI Controller
        Kernel driver in use: ahci
        Kernel modules: ahci
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
        Subsystem: Super Micro Computer Inc 82801JI (ICH10 Family) SMBus Controller
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
01:03.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)
        Subsystem: Super Micro Computer Inc MGA G200eW WPCM450
        Kernel driver in use: mgag200
        Kernel modules: matroxfb_base, mgag200
01:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL8169/8110 Family PCI Gigabit Ethernet NIC
        Kernel driver in use: r8169
        Kernel modules: r8169
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
        Subsystem: TP-LINK Technologies Co., Ltd. TG-3468 Gigabit PCI Express Network Adapter
        Kernel driver in use: r8169
        Kernel modules: r8169
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
        Subsystem: TP-LINK Technologies Co., Ltd. TG-3468 Gigabit PCI Express Network Adapter
        Kernel driver in use: vfio-pci
        Kernel modules: r8169
04:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
        Subsystem: Broadcom / LSI SAS 9201-16i
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
        Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
        Kernel driver in use: nvme
        Kernel modules: nvme
06:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
        Subsystem: Super Micro Computer Inc 82576 Gigabit Network Connection
        Kernel driver in use: igb
        Kernel modules: igb
06:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
        Subsystem: Super Micro Computer Inc 82576 Gigabit Network Connection
        Kernel driver in use: igb
        Kernel modules: igb
fe:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers (rev 02)
        Subsystem: Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers
… <+ a bunch more identical Xeon 5600 Series items>


Code:
# lspci -kv
04:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
        Subsystem: Broadcom / LSI SAS 9201-16i
        Flags: bus master, fast devsel, latency 0, IRQ 29, IOMMU group 37
        I/O ports at d000 [size=256]
        Memory at fac3c000 (64-bit, non-prefetchable) [size=16K]
        Memory at fac40000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at fac80000 [disabled] [size=512K]
        Capabilities: [50] Power Management version 3
        Capabilities: [68] Express Endpoint, MSI 00
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable- Count=15 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [138] Power Budgeting <?>
        Capabilities: [150] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas
Code:
# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1000:0064
softdep mpt3sas pre:vfio-pci

Code:
# cat /etc/pve/qemu-server/201.conf
agent: 1
boot: order=scsi0
cores: 2
hostpci0: 0000:03:00.0
hostpci1: 0000:04:00.0
ide2: none,media=cdrom
machine: q35
memory: 16536
meta: creation-qemu=6.2.0,ctime=1656899310
name: NAS
net0: virtio=52:8B:52:B5:19:95,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-zfs:vm-201-disk-0,discard=on,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=79485b4d-cb80-4d73-91c4-8a992eb3b12b
sockets: 2
tablet: 0
vmgenid: f0b286b0-2bc9-4efc-bafd-2973ec91a842

Tail of the host syslog from the start of the VM boot until it stalls.
Code:
# qm start 201
# tail -50 /var/log/syslog
…
Jul  3 23:49:07 xxxxxx qmeventd[20381]: Starting cleanup for 201
Jul  3 23:49:07 xxxxxx qmeventd[20381]: Finished cleanup for 201
Jul  3 23:49:07 xxxxxx systemd[1]: 201.scope: Succeeded.
Jul  3 23:49:07 xxxxxx systemd[1]: 201.scope: Consumed 57.863s CPU time.
Jul  3 23:54:31 xxxxxx qm[22697]: <root@pam> starting task UPID:xxxxxx:000058AA:00036AAF:62C28EA7:qmstart:201:root@pam:
Jul  3 23:54:31 xxxxxx qm[22698]: start VM 201: UPID:xxxxxx:000058AA:00036AAF:62C28EA7:qmstart:201:root@pam:
Jul  3 23:54:32 xxxxxx systemd[1]: Started 201.scope.
Jul  3 23:54:32 xxxxxx systemd-udevd[22711]: Using default interface naming scheme 'v247'.
Jul  3 23:54:32 xxxxxx systemd-udevd[22711]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul  3 23:54:33 xxxxxx kernel: [ 2241.803136] device tap201i0 entered promiscuous mode
Jul  3 23:54:33 xxxxxx systemd-udevd[22711]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul  3 23:54:33 xxxxxx systemd-udevd[22711]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul  3 23:54:33 xxxxxx systemd-udevd[22714]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul  3 23:54:33 xxxxxx systemd-udevd[22714]: Using default interface naming scheme 'v247'.
Jul  3 23:54:33 xxxxxx kernel: [ 2241.836899] vmbr0: port 2(fwpr201p0) entered blocking state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.836905] vmbr0: port 2(fwpr201p0) entered disabled state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.836986] device fwpr201p0 entered promiscuous mode
Jul  3 23:54:33 xxxxxx kernel: [ 2241.837036] vmbr0: port 2(fwpr201p0) entered blocking state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.837039] vmbr0: port 2(fwpr201p0) entered forwarding state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.842490] fwbr201i0: port 1(fwln201i0) entered blocking state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.842496] fwbr201i0: port 1(fwln201i0) entered disabled state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.842680] device fwln201i0 entered promiscuous mode
Jul  3 23:54:33 xxxxxx kernel: [ 2241.842764] fwbr201i0: port 1(fwln201i0) entered blocking state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.842767] fwbr201i0: port 1(fwln201i0) entered forwarding state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.848756] fwbr201i0: port 2(tap201i0) entered blocking state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.848762] fwbr201i0: port 2(tap201i0) entered disabled state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.848877] fwbr201i0: port 2(tap201i0) entered blocking state
Jul  3 23:54:33 xxxxxx kernel: [ 2241.848880] fwbr201i0: port 2(tap201i0) entered forwarding state
Jul  3 23:54:39 xxxxxx pvedaemon[2893]: VM 201 qmp command failed - VM 201 qmp command 'query-proxmox-support' failed - got timeout
Jul  3 23:54:40 xxxxxx qm[22697]: <root@pam> end task UPID:xxxxxx:000058AA:00036AAF:62C28EA7:qmstart:201:root@pam: OK
Jul  3 23:54:40 xxxxxx pvedaemon[2893]: <root@pam> starting task UPID:xxxxxx:0000596F:00036E28:62C28EB0:vncproxy:201:root@pam:
Jul  3 23:54:40 xxxxxx pvedaemon[22895]: starting vnc proxy UPID:xxxxxx:0000596F:00036E28:62C28EB0:vncproxy:201:root@pam:
Jul  3 23:55:13 xxxxxx pveproxy[2934]: worker exit
Jul  3 23:55:13 xxxxxx pveproxy[2933]: worker 2934 finished
Jul  3 23:55:13 xxxxxx pveproxy[2933]: starting 1 worker(s)
Jul  3 23:55:13 xxxxxx pveproxy[2933]: worker 23133 started
 
And the solution, after 2 days, is:
HARDWARE: PCI DEVICE
All Functions: Off
ROM-Bar: Off
PCI-Express: On
Primary GPU: Off

But the magic bullet was definitely "ROM-Bar: Off"
 
Sorry, but what is ROM-Bar?

From what I understand, ROM-Bar refers to the PCIe card's expansion ROM "Base Address Register." When the option is enabled, the card's firmware ROM is exposed to the VM and runs during the VM's boot process. When it is disabled, the card's firmware does not run during the VM's POST. I've had to disable it to get HBAs to pass through and work in the VM.

DataCenter -> Node -> ### VM (xxxxx) -> Hardware -> PCI Device -> <EDIT>
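
In the VM's config file, these GUI switches end up as options on the `hostpci` line. A sketch of what that might look like, assuming VM 201 and the HBA at 0000:04:00.0 from the first post (adjust both for your system); `rombar=0` corresponds to "ROM-Bar: Off" and `pcie=1` to "PCI-Express: On", which needs the q35 machine type:

```
hostpci1: 0000:04:00.0,pcie=1,rombar=0
```

The same should be settable from the host shell with `qm set 201 --hostpci1 0000:04:00.0,pcie=1,rombar=0`.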

1657041735361.png
 
And the LSI 9201 controller bit the dust with an IOC fault less than a month after I got it working. I picked up an LSI 9300-16i as a replacement, and it works fine. Interestingly, where the LSI 9201-16i had a single 16-port controller on it, the 9300-16i design is two 8-port controllers mounted on one board, and Proxmox detects them as two separate 8-drive devices (see below).

Code:
# lspci |grep LSI
05:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
07:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

These newer 12G controllers are considerably cheaper than a replacement 9201, but I did need to buy new cables for it. The 9201 requires male SFF-8087 multi-lane cables, whereas the 9300 requires SFF-8643 multi-lane cables that break out to either SAS or SATA connections. At the same time, I also picked up an 8-port SATA PCIe card (ASMedia), and I'm successfully passing both PCI devices through to the VM that needs them for file sharing.

Code:
# lspci -v -s 02:00
02:00.0 SATA controller: ASMedia Technology Inc. Device 1064 (rev 02) (prog-if 01 [AHCI 1.0])
        Subsystem: ZyDAS Technology Corp. Device 2116
        Flags: bus master, fast devsel, latency 0, IRQ 87, IOMMU group 42
        Memory at fa77c000 (32-bit, non-prefetchable) [size=8K]
        Memory at fa77e000 (32-bit, non-prefetchable) [size=8K]
        Expansion ROM at fa780000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [130] Secondary PCI Express
        Kernel driver in use: vfio-pci
        Kernel modules: ahci
 
Thank you. I have been trying to pass through an LSI 9207-8i these days.
I tried running openmediavault in an LXC container, but OMV couldn't read the SMART data there, and I didn't know how to import the ZFS pool, so I abandoned OMV on LXC.
After that I tried OMV in a VM, the same way I used it on ESXi. I have now tried your method, but PVE reboots after the VM has been running for a few minutes.
I took a screenshot before PVE rebooted.
Screenshot_20230529_225206.png

Please forgive my poor English.
 
Code:
# lspci | grep LSI
02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)

Code:
# lspci -v -s 02:00
02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
        Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA
        Physical Slot: 3
        Flags: bus master, fast devsel, latency 0, IRQ 26, NUMA node 0, IOMMU group 37
        I/O ports at e000 [size=256]
        Memory at fb240000 (64-bit, non-prefetchable) [size=64K]
        Memory at fb200000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at fb100000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 3
        Capabilities: [68] Express Endpoint, MSI 00
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=16 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [1e0] Secondary PCI Express
        Capabilities: [1c0] Power Budgeting <?>
        Capabilities: [190] Dynamic Power Allocation <?>
        Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
 
Is your HBA in its own IOMMU group?
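
One way to check on the host is to list every device that shares a group with the HBA. A minimal sketch, assuming the usual sysfs layout under /sys/kernel/iommu_groups; the function name `iommu_group_members` and its optional second argument (an alternative sysfs root, only there to make it testable) are made up for illustration.

```shell
#!/bin/sh
# List the PCI addresses that share an IOMMU group with the given device.
# $1: full PCI address, e.g. 0000:04:00.0
# $2: sysfs root (hypothetical test hook; defaults to /sys)
iommu_group_members() {
    dev="$1"
    root="${2:-/sys}"
    for path in "$root"/kernel/iommu_groups/*/devices/"$dev"; do
        # The glob stays unexpanded if the device is in no group.
        [ -e "$path" ] || return 1
        ls "$(dirname "$path")"
    done
}

# Example: iommu_group_members 0000:04:00.0
```

If the only address printed is the HBA itself, the card is alone in its group.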
Sorry, I've been busy with work recently.
A few days ago, I tried again, and I found one problem.
In /etc/modprobe.d/pve-blacklist.conf, the softdep line must not be missing any spaces.
Code:
softdep some_name pre: vfio-pci
I was missing a space between "pre:" and "vfio-pci".
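
A quick way to look for this kind of typo is to search the modprobe configuration for "pre:" or "post:" immediately followed by a non-space character. A minimal sketch; the function name `find_softdep_typos` is made up, and it only flags this one spacing mistake, nothing else.

```shell
#!/bin/sh
# Print softdep lines where "pre:" or "post:" is not followed by whitespace.
# Arguments: one or more modprobe config files.
# Exit status is non-zero when nothing suspicious is found (grep semantics).
find_softdep_typos() {
    grep -nE '^[[:space:]]*softdep[[:space:]].*(pre|post):[^[:space:]]' "$@"
}

# Example: find_softdep_typos /etc/modprobe.d/*.conf
```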
Now the VM seems to be working fine, at least until I was woken by the sound of an IPMI alert. That's because it's summer now :)
 
Amazing, thanks for your support! Working

Do you know how to properly test that passthrough is working as expected? I've been using it for years, but I replaced the PCIe device and I'm wondering how to verify that everything is set up correctly...
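
One host-side check, as a rough sketch rather than a full test: confirm which kernel driver currently owns the device, the same thing the first post reads off `lspci -k`. The helper name `driver_in_use` is made up; it only parses `lspci -k` style text from stdin.

```shell
#!/bin/sh
# Print the "Kernel driver in use" for one device, reading `lspci -k` style
# text on stdin. $1: bus address as printed by lspci, e.g. 04:00.0
driver_in_use() {
    awk -v addr="$1" '
        /^[0-9a-f]/ { current = $1 }     # device header line starts at column 0
        current == addr && /Kernel driver in use:/ {
            print $NF                    # driver name is the last field
        }
    '
}

# Example on the host: lspci -k | driver_in_use 04:00.0
```

For a passed-through device this would be expected to print vfio-pci on the host once the VM is running (or earlier, if the ID is listed in vfio.conf), while inside the guest the same check should show the normal driver, e.g. mpt3sas for these HBAs.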
 
