Onboard SATA controller PCI Passthrough

Groovy6727

New Member
May 2, 2025
I am following Proxmox documentation
https://pve.proxmox.com/wiki/PCI_Passthrough
https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_pci_passthrough
and trying to pass through the onboard SATA controller to an OMV VM.

IOMMU is enabled in the BIOS/UEFI, and the SATA controller is the only device in IOMMU group #14.
If I remove the PCI passthrough from the VM, everything works like a charm; with the SATA controller passed through, the host becomes unresponsive and a hard reset is required.
Please help me understand what changes need to be implemented for this PCI passthrough to work.

Host info:
Code:
OS: Proxmox VE 8.2.2 x86_64
Host: 11A4000HGE ThinkCentre M75q-1
Kernel: 6.8.4-2-pve
CPU: AMD Ryzen 5 PRO 3400GE w/ Radeon Vega Graphics (8) @ 3.300GHz
GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series

SATA controller info:
Code:
~# lspci -v
05:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 61) (prog-if 01 [AHCI 1.0])
    Subsystem: Lenovo FCH SATA Controller [AHCI mode]
    Flags: bus master, fast devsel, latency 0, IRQ 255, IOMMU group 14
    Memory at fcc00000 (32-bit, non-prefetchable) [size=2K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
    Capabilities: [64] Express Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
    Capabilities: [d0] SATA HBA v1.0
    Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150] Advanced Error Reporting
    Capabilities: [270] Secondary PCI Express
    Capabilities: [2a0] Access Control Services
    Kernel driver in use: vfio-pci
    Kernel modules: ahci

Host config:
Code:
~# cat /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt vfio-pci.ids=1022:7901"
GRUB_CMDLINE_LINUX=""

~# cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

~# cat /etc/modprobe.d/pve-blacklist.conf
blacklist nvidiafb
blacklist ahci

~# cat /etc/modprobe.d/vfio_iommu.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1

~# cat /etc/modprobe.d/vfio-pci.conf
options vfio-pci ids=1022:7901

Code:
~# update-grub && update-initramfs -u -k all && reboot now
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.8.4-2-pve
Found initrd image: /boot/initrd.img-6.8.4-2-pve
Found memtest86+ 64bit EFI image: /boot/memtest86+x64.efi
Adding boot menu entry for UEFI Firmware Settings ...
done
update-initramfs: Generating /boot/initrd.img-6.8.4-2-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
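
After the reboot, a quick sanity check that the stub actually claimed the controller (consistent with the lspci -v output above):
Code:
~# lspci -nnk -s 05:00.0 | grep -i driver
	Kernel driver in use: vfio-pci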

Checking if PCIe passthrough is possible:
Code:
~# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi -e remapping
[    0.066674] AMD-Vi: Unknown option - 'on'
[    0.159073] AMD-Vi: Using global IVHD EFR:0x4f77ef22294ada, EFR2:0x0
[    0.439331] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.440703] AMD-Vi: Extended features (0x4f77ef22294ada, 0x0): PPR NX GT IA GA PC GA_vAPIC
[    0.440716] AMD-Vi: Interrupt remapping enabled
[    0.440872] AMD-Vi: Virtual APIC enabled
[    0.441003] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
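
Side note: the "AMD-Vi: Unknown option - 'on'" line above suggests the kernel does not actually recognize amd_iommu=on as a value; as far as I know, the AMD IOMMU is enabled by default on recent kernels, so the cmdline could presumably be trimmed to something like this sketch:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt vfio-pci.ids=1022:7901"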

VM config:
Code:
~# cat /etc/pve/qemu-server/100.conf
balloon: 0
bios: ovmf
boot: order=scsi0;net0
cores: 6
cpu: x86-64-v2-AES
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:05:00.0
machine: q35,viommu=intel
memory: 24576
meta: creation-qemu=8.1.5,ctime=1745780999
name: OMVvm
net0: virtio=BC:24:11:74:A4:F5,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-1,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=c8365e26-85ba-431b-9a16-2f398e5f1ea4
sockets: 1
tags: openmediavault
tpmstate0: local-lvm:vm-100-disk-2,size=4M,version=v2.0
vmgenid: e8aeb4ed-def7-41e9-8ec9-b965f8647795
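
For reference, the hostpci0 line above is what the equivalent CLI call would produce (a sketch of the command, not taken from the original session):
Code:
~# qm set 100 --hostpci0 0000:05:00.0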

Kernel log after starting the VM:
Code:
~# echo "starting VM with PCI PT" > /dev/kmsg && dmesg -wH
[May 2 19:06] starting VM with PCI PT
[May 2 19:07] tap100i0: entered promiscuous mode
[  +0.083950] vmbr0: port 2(fwpr100p0) entered blocking state
[  +0.000008] vmbr0: port 2(fwpr100p0) entered disabled state
[  +0.000028] fwpr100p0: entered allmulticast mode
[  +0.000064] fwpr100p0: entered promiscuous mode
[  +0.000054] vmbr0: port 2(fwpr100p0) entered blocking state
[  +0.000003] vmbr0: port 2(fwpr100p0) entered forwarding state
[  +0.015343] fwbr100i0: port 1(fwln100i0) entered blocking state
[  +0.000006] fwbr100i0: port 1(fwln100i0) entered disabled state
[  +0.000019] fwln100i0: entered allmulticast mode
[  +0.000061] fwln100i0: entered promiscuous mode
[  +0.000053] fwbr100i0: port 1(fwln100i0) entered blocking state
[  +0.000002] fwbr100i0: port 1(fwln100i0) entered forwarding state
[  +0.014036] fwbr100i0: port 2(tap100i0) entered blocking state
[  +0.000006] fwbr100i0: port 2(tap100i0) entered disabled state
[  +0.000019] tap100i0: entered allmulticast mode
[  +0.000116] fwbr100i0: port 2(tap100i0) entered blocking state
[  +0.000003] fwbr100i0: port 2(tap100i0) entered forwarding state
 
Please share your IOMMU groups

Bash:
#!/bin/bash
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done;
done;
Where is the OS booted from? Hopefully not from a disk connected to that SATA controller.
 
The ThinkCentre M75q-1 has one NVMe drive and one SATA drive. Proxmox boots off the NVMe drive, where a volume for VM data is located as well.
Code:
# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
nvme0n1                      259:0    0 238.5G  0 disk
├─nvme0n1p1                  259:1    0  1007K  0 part
├─nvme0n1p2                  259:2    0     1G  0 part /boot/efi
└─nvme0n1p3                  259:3    0 237.5G  0 part
  ├─pve-swap                 252:0    0     8G  0 lvm  [SWAP]
  ├─pve-root                 252:1    0  69.4G  0 lvm  /
  ├─pve-data_tmeta           252:2    0   1.4G  0 lvm 
  │ └─pve-data-tpool         252:4    0 141.2G  0 lvm 
  │   ├─pve-data             252:5    0 141.2G  1 lvm 
  │   ├─pve-vm--100--disk--0 252:6    0     4M  0 lvm 
  │   ├─pve-vm--100--disk--1 252:7    0    64G  0 lvm 
  │   └─pve-vm--100--disk--2 252:8    0     4M  0 lvm 
  └─pve-data_tdata           252:3    0 141.2G  0 lvm 
    └─pve-data-tpool         252:4    0 141.2G  0 lvm 
      ├─pve-data             252:5    0 141.2G  1 lvm 
      ├─pve-vm--100--disk--0 252:6    0     4M  0 lvm 
      ├─pve-vm--100--disk--1 252:7    0    64G  0 lvm 
      └─pve-vm--100--disk--2 252:8    0     4M  0 lvm

I am trying to create an OMV NAS in a VM and pass through the entire SATA controller to it.

This is the output of the bash script:
Code:
# ./debug_script.sh
IOMMU Group 0:
    00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 1:
    00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU Group 2:
    00:01.5 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU Group 3:
    00:01.6 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU Group 4:
    00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 5:
    00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus A [1022:15db]
IOMMU Group 6:
    00:08.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus B [1022:15dc]
IOMMU Group 7:
    00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
    00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 8:
    00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0 [1022:15e8]
    00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1 [1022:15e9]
    00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2 [1022:15ea]
    00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3 [1022:15eb]
    00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4 [1022:15ec]
    00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5 [1022:15ed]
    00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6 [1022:15ee]
    00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7 [1022:15ef]
IOMMU Group 9:
    01:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Black SN750 / PC SN730 NVMe SSD [15b7:5006]
IOMMU Group 10:
    02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
IOMMU Group 11:
    03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0e)
    03:00.1 Serial controller [0700]: Realtek Semiconductor Co., Ltd. RTL8111xP UART #1 [10ec:816a] (rev 0e)
    03:00.2 Serial controller [0700]: Realtek Semiconductor Co., Ltd. RTL8111xP UART #2 [10ec:816b] (rev 0e)
    03:00.3 IPMI Interface [0c07]: Realtek Semiconductor Co., Ltd. RTL8111xP IPMI interface [10ec:816c] (rev 0e)
    03:00.4 USB controller [0c03]: Realtek Semiconductor Co., Ltd. RTL811x EHCI host controller [10ec:816d] (rev 0e)
IOMMU Group 12:
    04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Picasso/Raven 2 [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15d8] (rev da)
IOMMU Group 13:
    04:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller [1002:15de]
    04:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
    04:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e0]
    04:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e1]
    04:00.5 Multimedia controller [0480]: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor [1022:15e2]
    04:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller [1022:15e3]
IOMMU Group 14:
    05:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 61)
 
Sorry that this is not helpful, but at first glance everything related to this problem looks okay to me. I fail to see an obvious issue.
 
I looked at other posts on this forum dealing with onboard PCI passthrough and, as an act of desperation, tried a few more things:

on the host:

I added one more config file
Code:
~# cat /etc/modprobe.d/ahci.conf
softdep ahci pre: vfio-pci
but it didn't help. I would have been surprised if it had, since lspci was already showing the vfio-pci driver in use for the SATA controller.

and tried adding "pcie_acs_override" to the kernel cmdline (even though the controller is already alone in group #14, so the override should not be needed)
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt vfio-pci.ids=1022:7901 pcie_acs_override=downstream"

on the VM:

I played with different combinations of
Code:
pcie=1
rombar=0
viommu=none
viommu=virtio
None of the above changes made the SATA controller functional in the VM; each time I started the VM, the host became unresponsive and a hard reset was required.
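
For completeness, those options end up as config lines like this (a sketch; the exact combinations varied per attempt, and viommu=none simply means leaving viommu off the machine line):
Code:
hostpci0: 0000:05:00.0,pcie=1,rombar=0
machine: q35,viommu=virtio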

I did, however, manage to obtain two new messages from the kernel log after starting the VM.

#1
Code:
kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable

#2
Code:
[May 4 11:07] starting VM with PCI PT
[  +6.764744] tap100i0: entered promiscuous mode
[  +0.084568] vmbr0: port 2(fwpr100p0) entered blocking state
[  +0.000008] vmbr0: port 2(fwpr100p0) entered disabled state
[  +0.000029] fwpr100p0: entered allmulticast mode
[  +0.000064] fwpr100p0: entered promiscuous mode
[  +0.000060] vmbr0: port 2(fwpr100p0) entered blocking state
[  +0.000003] vmbr0: port 2(fwpr100p0) entered forwarding state
[  +0.014884] fwbr100i0: port 1(fwln100i0) entered blocking state
[  +0.000006] fwbr100i0: port 1(fwln100i0) entered disabled state
[  +0.000020] fwln100i0: entered allmulticast mode
[  +0.000060] fwln100i0: entered promiscuous mode
[  +0.000055] fwbr100i0: port 1(fwln100i0) entered blocking state
[  +0.000003] fwbr100i0: port 1(fwln100i0) entered forwarding state
[  +0.014770] fwbr100i0: port 2(tap100i0) entered blocking state
[  +0.000006] fwbr100i0: port 2(tap100i0) entered disabled state
[  +0.000012] tap100i0: entered allmulticast mode
[  +0.000086] fwbr100i0: port 2(tap100i0) entered blocking state
[  +0.000003] fwbr100i0: port 2(tap100i0) entered forwarding state
[May 4 11:08] watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [kworker/6:4:529]
[  +0.000019] Modules linked in: veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables bonding tls softdog sunrpc nfnetlink_log binfmt_misc nfnetlink intel_rapl_msr snd_sof_amd_acp63 intel_rapl_common snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir edac_mce_amd snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof kvm_amd snd_sof_utils amdgpu snd_soc_core kvm snd_compress snd_hda_codec_realtek snd_hda_codec_generic ac97_bus snd_pcm_dmaengine snd_hda_codec_hdmi crct10dif_pclmul polyval_clmulni polyval_generic amdxcp ghash_clmulni_intel drm_exec gpu_sched sha256_ssse3 drm_buddy snd_pci_ps sha1_ssse3 drm_suballoc_helper snd_hda_intel drm_ttm_helper snd_rpl_pci_acp6x snd_intel_dspcfg aesni_intel snd_acp_pci snd_intel_sdw_acpi ttm crypto_simd snd_acp_legacy_common snd_hda_codec cryptd snd_pci_acp6x drm_display_helper snd_pci_acp5x snd_hda_core snd_hwdep rapl snd_rn_pci_acp3x snd_pcm cec think_lmi snd_acp_config snd_soc_acpi snd_timer rc_core
[  +0.000109]  firmware_attributes_class wmi_bmof pcspkr serio_raw snd ipmi_devintf i2c_algo_bit soundcore snd_pci_acp3x ccp ipmi_msghandler k10temp mac_hid zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap vfio_pci vfio_pci_core irqbypass vfio_iommu_type1 vfio iommufd efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c xhci_pci nvme xhci_pci_renesas crc32_pclmul psmouse xhci_hcd ehci_pci r8169 nvme_core i2c_piix4 ehci_hcd realtek nvme_auth video wmi
[  +0.000079] CPU: 6 PID: 529 Comm: kworker/6:4 Tainted: P           O       6.8.4-2-pve #1
[  +0.000005] Hardware name: LENOVO 11A4000HGE/3151, BIOS M2FKT33A 06/21/2023
[  +0.000003] Workqueue: events netstamp_clear
[  +0.000010] RIP: 0010:smp_call_function_many_cond+0x136/0x500
[  +0.000009] Code: 63 d0 e8 3d 3e 5d 00 3b 05 37 9b 38 02 73 25 48 63 d0 49 8b 37 48 03 34 d5 e0 ac 8a b9 8b 56 08 83 e2 01 74 0a f3 90 8b 4e 08 <83> e1 01 75 f6 83 c0 01 eb c1 48 83 c4 48 5b 41 5c 41 5d 41 5e 41
[  +0.000003] RSP: 0018:ffffa7014147bca0 EFLAGS: 00000202
[  +0.000004] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000011
[  +0.000003] RDX: 0000000000000001 RSI: ffff8ec2ee23dce0 RDI: 0000000000000000
[  +0.000002] RBP: ffffa7014147bd10 R08: 0000000000000000 R09: 0000000000000000
[  +0.000002] R10: ffff8ebc00907d08 R11: 0000000000000000 R12: ffff8ec2ee535e80
[  +0.000002] R13: 0000000000000001 R14: 0000000000000006 R15: ffff8ec2ee535e80
[  +0.000002] FS:  0000000000000000(0000) GS:ffff8ec2ee500000(0000) knlGS:0000000000000000
[  +0.000003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000002] CR2: 0000713c12e3f020 CR3: 0000000183e36000 CR4: 00000000003506f0
[  +0.000003] Call Trace:
[  +0.000004]  <IRQ>
[  +0.000004]  ? show_regs+0x6d/0x80
[  +0.000008]  ? watchdog_timer_fn+0x206/0x290
[  +0.000006]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  +0.000004]  ? __hrtimer_run_queues+0x108/0x280
[  +0.000004]  ? srso_return_thunk+0x5/0x5f
[  +0.000009]  ? hrtimer_interrupt+0xf6/0x250
[  +0.000006]  ? __sysvec_apic_timer_interrupt+0x51/0x150
[  +0.000005]  ? sysvec_apic_timer_interrupt+0x8d/0xd0
[  +0.000005]  </IRQ>
[  +0.000001]  <TASK>
[  +0.000003]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[  +0.000010]  ? smp_call_function_many_cond+0x136/0x500
[  +0.000005]  ? __pfx_do_sync_core+0x10/0x10
[  +0.000007]  on_each_cpu_cond_mask+0x24/0x60
[  +0.000004]  text_poke_bp_batch+0xbe/0x300
[  +0.000006]  text_poke_finish+0x1f/0x40
[  +0.000003]  arch_jump_label_transform_apply+0x1a/0x30
[  +0.000004]  __jump_label_update+0xf4/0x140
[  +0.000007]  jump_label_update+0xae/0x120
[  +0.000004]  static_key_enable_cpuslocked+0x87/0xb0
[  +0.000005]  static_key_enable+0x1a/0x30
[  +0.000004]  netstamp_clear+0x2d/0x50
[  +0.000003]  process_one_work+0x16d/0x350
[  +0.000008]  worker_thread+0x306/0x440
[  +0.000006]  ? __pfx_worker_thread+0x10/0x10
[  +0.000003]  kthread+0xf2/0x120
[  +0.000005]  ? __pfx_kthread+0x10/0x10
[  +0.000004]  ret_from_fork+0x47/0x70
[  +0.000004]  ? __pfx_kthread+0x10/0x10
[  +0.000003]  ret_from_fork_asm+0x1b/0x30
[  +0.000009]  </TASK>
[May 4 11:09] watchdog: BUG: soft lockup - CPU#6 stuck for 52s! [kworker/6:4:529]
[  +0.000016] Modules linked in: veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables bonding tls softdog sunrpc nfnetlink_log binfmt_misc nfnetlink intel_rapl_msr snd_sof_amd_acp63 intel_rapl_common snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir edac_mce_amd snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof kvm_amd snd_sof_utils amdgpu snd_soc_core kvm snd_compress snd_hda_codec_realtek snd_hda_codec_generic ac97_bus snd_pcm_dmaengine snd_hda_codec_hdmi crct10dif_pclmul polyval_clmulni polyval_generic amdxcp ghash_clmulni_intel drm_exec gpu_sched sha256_ssse3 drm_buddy snd_pci_ps sha1_ssse3 drm_suballoc_helper snd_hda_intel drm_ttm_helper snd_rpl_pci_acp6x snd_intel_dspcfg aesni_intel snd_acp_pci snd_intel_sdw_acpi ttm crypto_simd snd_acp_legacy_common snd_hda_codec cryptd snd_pci_acp6x drm_display_helper snd_pci_acp5x snd_hda_core snd_hwdep rapl snd_rn_pci_acp3x snd_pcm cec think_lmi snd_acp_config snd_soc_acpi snd_timer rc_core
[  +0.000107]  firmware_attributes_class wmi_bmof pcspkr serio_raw snd ipmi_devintf i2c_algo_bit soundcore snd_pci_acp3x ccp ipmi_msghandler k10temp mac_hid zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap vfio_pci vfio_pci_core irqbypass vfio_iommu_type1 vfio iommufd efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c xhci_pci nvme xhci_pci_renesas crc32_pclmul psmouse xhci_hcd ehci_pci r8169 nvme_core i2c_piix4 ehci_hcd realtek nvme_auth video wmi
[  +0.000075] CPU: 6 PID: 529 Comm: kworker/6:4 Tainted: P           O L     6.8.4-2-pve #1
[  +0.000004] Hardware name: LENOVO 11A4000HGE/3151, BIOS M2FKT33A 06/21/2023
[  +0.000003] Workqueue: events netstamp_clear
[  +0.000008] RIP: 0010:smp_call_function_many_cond+0x133/0x500
[  +0.000007] Code: 7f 08 48 63 d0 e8 3d 3e 5d 00 3b 05 37 9b 38 02 73 25 48 63 d0 49 8b 37 48 03 34 d5 e0 ac 8a b9 8b 56 08 83 e2 01 74 0a f3 90 <8b> 4e 08 83 e1 01 75 f6 83 c0 01 eb c1 48 83 c4 48 5b 41 5c 41 5d
[  +0.000003] RSP: 0018:ffffa7014147bca0 EFLAGS: 00000202
[  +0.000003] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000001
[  +0.000002] RDX: 0000000000000001 RSI: ffff8ec2ee23dce0 RDI: 0000000000000000
[  +0.000003] RBP: ffffa7014147bd10 R08: 0000000000000000 R09: 0000000000000000
[  +0.000002] R10: ffff8ebc00907d08 R11: 0000000000000000 R12: ffff8ec2ee535e80
[  +0.000002] R13: 0000000000000001 R14: 0000000000000006 R15: ffff8ec2ee535e80
[  +0.000002] FS:  0000000000000000(0000) GS:ffff8ec2ee500000(0000) knlGS:0000000000000000
[  +0.000003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000003] CR2: 0000713c12e3f020 CR3: 0000000183e36000 CR4: 00000000003506f0
[  +0.000002] Call Trace:
[  +0.000003]  <IRQ>
[  +0.000004]  ? show_regs+0x6d/0x80
[  +0.000005]  ? watchdog_timer_fn+0x206/0x290
[  +0.000005]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  +0.000004]  ? __hrtimer_run_queues+0x108/0x280
[  +0.000005]  ? srso_return_thunk+0x5/0x5f
[  +0.000007]  ? hrtimer_interrupt+0xf6/0x250
[  +0.000005]  ? __sysvec_apic_timer_interrupt+0x51/0x150
[  +0.000005]  ? sysvec_apic_timer_interrupt+0x8d/0xd0
[  +0.000004]  </IRQ>
[  +0.000002]  <TASK>
[  +0.000003]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[  +0.000008]  ? smp_call_function_many_cond+0x133/0x500
[  +0.000006]  ? __pfx_do_sync_core+0x10/0x10
[  +0.000007]  on_each_cpu_cond_mask+0x24/0x60
[  +0.000003]  text_poke_bp_batch+0xbe/0x300
[  +0.000007]  text_poke_finish+0x1f/0x40
[  +0.000004]  arch_jump_label_transform_apply+0x1a/0x30
[  +0.000004]  __jump_label_update+0xf4/0x140
[  +0.000007]  jump_label_update+0xae/0x120
[  +0.000004]  static_key_enable_cpuslocked+0x87/0xb0
[  +0.000005]  static_key_enable+0x1a/0x30
[  +0.000004]  netstamp_clear+0x2d/0x50
[  +0.000003]  process_one_work+0x16d/0x350
[  +0.000007]  worker_thread+0x306/0x440
[  +0.000006]  ? __pfx_worker_thread+0x10/0x10
[  +0.000004]  kthread+0xf2/0x120
[  +0.000005]  ? __pfx_kthread+0x10/0x10
[  +0.000004]  ret_from_fork+0x47/0x70
[  +0.000003]  ? __pfx_kthread+0x10/0x10
[  +0.000004]  ret_from_fork_asm+0x1b/0x30
[  +0.000008]  </TASK>

Neither of the above is reliably reproducible, and #2 appeared only once.

My basic understanding of the 2nd error is that CPU #6 can't handle the interrupt generated by the device because it is stuck in a soft-lockup state. This explains why the machine is unresponsive and requires a hard reset, but unfortunately it doesn't bring me any closer to a functional PCI passthrough. Please let me know if you see anything more interesting in here.

As a last point, I tried checking whether this is a firmware-related issue, as I found some other errors when viewing the kernel log, but everything seems to be up to date.
Code:
~# fwupdmgr update
Devices with no available firmware updates:
 • System Firmware
 • UEFI dbx
 • WDC PC SN730 SDBQNTY-256G-1001
No updatable devices

I also removed the HDD to see if I could pass through the controller alone, without any devices attached to it, but this was also unsuccessful and didn't show any interesting log messages.

After two days of dealing with this, I came to the conclusion that it is not possible, but please feel free to let me know if I am wrong, as I would like this to work. :)
 
I run an OMV NAS on Proxmox on my N100 mini PC. I am using USB drives that I pass through individually to OMV using this command: qm set 100 -scsi1 /dev/disk/by-id/ata-TEAM_T2532TB_XXX,serial=myserial001

I change scsi1 to scsi2, scsi3, etc. for each drive I add (I use 4), and I increment the serial number as well.
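
For example (a sketch; the disk IDs and serials below are hypothetical placeholders):
Code:
qm set 100 -scsi1 /dev/disk/by-id/ata-TEAM_T2532TB_AAA,serial=myserial001
qm set 100 -scsi2 /dev/disk/by-id/ata-TEAM_T2532TB_BBB,serial=myserial002
qm set 100 -scsi3 /dev/disk/by-id/ata-TEAM_T2532TB_CCC,serial=myserial003
qm set 100 -scsi4 /dev/disk/by-id/ata-TEAM_T2532TB_DDD,serial=myserial004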

OMV advises against using USB drives in a RAID array, so I don't do it. I have tried it, though, and it seemed to work fine for me: I used mdadm to create a mirror and then formatted the file system with BTRFS, and it ran fine for months. I eventually tore it down since this box is just a backup destination for Proxmox, TrueNAS and Synology, and I rotate those backups constantly (keeping the last 14), so I am not concerned with bit rot.

Depending on your needs, you may be better off just passing through the drives instead of the controller. OMV will work just fine at that level, though you will lose SMART reporting. As long as you stick to EXT4 or BTRFS, it doesn't need full control of the drives the way ZFS does. In the past I have even created MDADM mirrors with passed-through drives not connected by USB, and set up BTRFS with snapshotting, scrubs, etc.

Just note that this only works if you assign each drive you pass through a unique serial number. Also note that I am not using BTRFS to create the RAID array: I follow the Synology model, create my desired array with MDADM, and then simply format the file system with BTRFS. I am not sure if MD is available out of the box with OMV or if it is a module I added from OMV Extras.
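
A minimal sketch of that layout (device names are hypothetical; substitute the disks you actually passed through):
Code:
# two-disk mdadm mirror with btrfs formatted on top of it
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
mkfs.btrfs -L backups /dev/md0
mount /dev/md0 /srv/backups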
 
Perhaps the hardware does not support VFIO, which isn't unusual for gaming/desktop-grade hardware.
 
Please share your IOMMU groups

Bash:
#!/bin/bash
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done;
done;
Where is the OS booted from? Hopefully not from a disk connected to that SATA controller.
Thanks for this script.

The output is easier to read than the one in the PVE wiki. :)
 
Thanks for all the support; this does indeed seem to be a hardware issue. For whatever reason, the ThinkCentre M75q-1 does not like VFIO.
I tested the same config on my Xeon server and was able to pass through the onboard SATA controller without any issues.

VM config:
Code:
~# cat /etc/pve/qemu-server/100.conf
bios: ovmf
boot: order=scsi0;net0
cores: 4
cpu: x86-64-v2-AES
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:00:1f.2
machine: q35,viommu=intel
memory: 2048
meta: creation-qemu=9.2.0,ctime=1746527963
name: OMVVM
net0: virtio=BC:24:11:DA:CF:19,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-1,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=9496dcc2-54a5-4ff5-a4ae-328b293362ca
sockets: 1
vmgenid: c01acf97-6053-4899-b02a-e3d931607493

host SATA controller:
Code:
~# lspci -nn | grep SATA
00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 05)

vm SATA controller:
Code:
~# lspci -nn | grep SATA
00:1f.2 SATA controller [0106]: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] [8086:2922] (rev 02)
06:10.0 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 05)
 
Curious to know: if you pass through the controller, how is Proxmox running? Is it running on an M.2 drive with its own single IOMMU group?
 
Curious to know: if you pass through the controller, how is Proxmox running? Is it running on an M.2 drive with its own single IOMMU group?
Proxmox is installed on the NVMe drive, which is in IOMMU group 9; I am trying to pass through the SATA controller, which is in IOMMU group 14.
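
For anyone checking their own layout, the group membership can also be read straight from sysfs (a sketch consistent with the script output above):
Code:
~# basename $(readlink /sys/bus/pci/devices/0000:01:00.0/iommu_group)   # NVMe -> 9
~# basename $(readlink /sys/bus/pci/devices/0000:05:00.0/iommu_group)   # SATA -> 14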