[SOLVED] NVME Passthrough : Issues blacklisting specific device (can't unbind from host)

Giovanni

I'm trying to permanently pass through a Samsung 980 NVMe M.2 disk to VMs. The problem is that my host node keeps binding this device to the 'nvme' driver instead of vfio-pci for passthrough.

I have followed all of the documentation and recommended steps, yet I can't seem to unbind this specific device from the 'nvme' driver. Any help? I can't blacklist the 'nvme' driver entirely on this Proxmox host, since the PVE boot disk is also an NVMe disk.

Unfortunately https://pve.proxmox.com/wiki/PCI(e)_Passthrough doesn't cover my scenario, where I must keep the 'nvme' driver usable, and since the other way of doing it isn't working, I'm stuck. Hopefully someone has a workaround for me?

See below.

Code:
root@pgn:~# cat /proc/cmdline
initrd=\EFI\proxmox\5.19.7-1-pve\initrd.img-5.19.7-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt pcie_aspm=force pcie_aspm.policy=powersupersave vfio-pci.ids=144d:a801
# lspci -nnk
02:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E16 PCIe4 NVMe Controller [1987:5016] (rev 01)
    Subsystem: Phison Electronics Corporation E16 PCIe4 NVMe Controller [1987:5016]
    Kernel driver in use: nvme
    Kernel modules: nvme
03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 [144d:a802] (rev 01)
    Subsystem: Samsung Electronics Co Ltd PM963 2.5" NVMe PCIe SSD [144d:a801]
    Kernel driver in use: nvme
    Kernel modules: nvme
 
Try moving the early binding to vfio-pci (vfio-pci.ids=144d:a801, which I think should be a802, since lspci shows [144d:a802] as the device ID and a801 only as the subsystem ID) to a file (that ends with .conf) in the /etc/modprobe.d/ directory. You need to make sure vfio-pci is loaded before the actual driver, which can be done with a softdep (which is not mentioned in the Proxmox documentation). Don't forget to run update-initramfs -u and reboot afterwards. For example /etc/modprobe.d/vfio.conf:
options vfio-pci ids=144d:a802
softdep nvme pre: vfio-pci
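As a rough sketch of the steps above (not an official Proxmox procedure), the two lines can be generated with a small shell function. The name vfio_conf is made up for this example; the IDs and the driver name are the ones from this thread.

```shell
#!/bin/sh
# Sketch only: vfio_conf is a hypothetical helper, not a Proxmox tool.
# It prints a modprobe.d snippet that hands the given vendor:device IDs
# to vfio-pci and makes the named driver load after vfio-pci.
vfio_conf() {
    ids=$1      # comma-separated vendor:device IDs, e.g. 144d:a802
    driver=$2   # driver that would otherwise claim the device, e.g. nvme
    printf 'options vfio-pci ids=%s\n' "$ids"
    printf 'softdep %s pre: vfio-pci\n' "$driver"
}

# Preview the two lines for the Samsung controller from this thread.
vfio_conf 144d:a802 nvme
```

To apply it, run as root something like `vfio_conf 144d:a802 nvme > /etc/modprobe.d/vfio.conf`, then `update-initramfs -u -k all`, and reboot. Keep in mind that matching by ID hands every device with that vendor:device ID to vfio-pci, so this only works when the boot disk has a different ID, as it does here (1987:5016).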
 
oh wow, this is it! thank you so much.

Code:
02:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E16 PCIe4 NVMe Controller [1987:5016] (rev 01)
    Subsystem: Phison Electronics Corporation E16 PCIe4 NVMe Controller [1987:5016]
    Kernel driver in use: nvme
    Kernel modules: nvme
03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 [144d:a802] (rev 01)
    Subsystem: Samsung Electronics Co Ltd PM963 2.5" NVMe PCIe SSD [144d:a801]
    Kernel driver in use: vfio-pci
    Kernel modules: nvme
04:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:0097] (rev 02)
    Subsystem: Broadcom / LSI SAS9300-8i [1000:30e0]
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas
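To check the binding of a single device without reading the whole lspci listing, the sysfs "driver" symlink can be resolved directly. This is only a sketch: driver_of is a made-up name, and the second argument exists only so the function can be pointed at a different directory for testing.

```shell
#!/bin/sh
# Sketch only: driver_of is a hypothetical helper.
# Prints the driver bound to a PCI device by resolving its sysfs "driver"
# symlink, or "none" if no driver is bound.
driver_of() {
    addr=$1                            # full address, e.g. 0000:03:00.0
    root=${2:-/sys/bus/pci/devices}    # overridable, mainly for testing
    if [ -L "$root/$addr/driver" ]; then
        basename "$(readlink "$root/$addr/driver")"
    else
        echo none
    fi
}
```

On the host from the listing above, `driver_of 0000:03:00.0` would be expected to print vfio-pci, and `driver_of 0000:02:00.0` to print nvme.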
 
I'm trying to pass an NVMe drive through to my Unraid server. The problem is that multiple devices are in the same IOMMU group.

Code:
IOMMU group 0 0000:00:02.0 VGA compatible controller [0300]: Intel Corporation AlderLake-S GT1 [8086:4680] (rev 0c)
IOMMU group 10 0000:00:1a.0 PCI bridge [0604]: Intel Corporation Alder Lake-S PCH PCI Express Root Port #25 [8086:7ac8] (rev 11)
IOMMU group 11 0000:00:1b.0 PCI bridge [0604]: Intel Corporation Device [8086:7ac0] (rev 11)
IOMMU group 12 0000:00:1c.0 PCI bridge [0604]: Intel Corporation Alder Lake-S PCH PCI Express Root Port #1 [8086:7ab8] (rev 11)
IOMMU group 13 0000:00:1c.2 PCI bridge [0604]: Intel Corporation Device [8086:7aba] (rev 11)
IOMMU group 14 0000:00:1c.4 PCI bridge [0604]: Intel Corporation Alder Lake-S PCH PCI Express Root Port #5 [8086:7abc] (rev 11)
IOMMU group 15 0000:00:1d.0 PCI bridge [0604]: Intel Corporation Alder Lake-S PCH PCI Express Root Port #9 [8086:7ab0] (rev 11)
IOMMU group 16 0000:00:1f.0 ISA bridge [0601]: Intel Corporation Z690 Chipset LPC/eSPI Controller [8086:7a84] (rev 11)
IOMMU group 16 0000:00:1f.3 Audio device [0403]: Intel Corporation Alder Lake-S HD Audio Controller [8086:7ad0] (rev 11)
IOMMU group 16 0000:00:1f.4 SMBus [0c05]: Intel Corporation Alder Lake-S PCH SMBus Controller [8086:7aa3] (rev 11)
IOMMU group 16 0000:00:1f.5 Serial bus controller [0c80]: Intel Corporation Alder Lake-S PCH SPI Controller [8086:7aa4] (rev 11)
IOMMU group 17 0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P5000] [10de:1bb0] (rev a1)
IOMMU group 17 0000:01:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
IOMMU group 18 0000:05:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 19 0000:06:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] [1000:0064] (rev 02)
IOMMU group 1 0000:00:00.0 Host bridge [0600]: Intel Corporation 12th Gen Core Processor Host Bridge/DRAM Registers [8086:4660] (rev 02)
IOMMU group 20 0000:07:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
IOMMU group 2 0000:00:01.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 [8086:460d] (rev 02)
IOMMU group 3 0000:00:06.0 System peripheral [0880]: Intel Corporation RST VMD Managed Controller [8086:09ab]
IOMMU group 4 0000:00:0a.0 Signal processing controller [1180]: Intel Corporation Platform Monitoring Technology [8086:467d] (rev 01)
IOMMU group 5 0000:00:0e.0 RAID bus controller [0104]: Intel Corporation Volume Management Device NVMe RAID Controller [8086:467f]
IOMMU group 5 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #0 [8086:464d] (rev 02)
IOMMU group 5 10000:e0:17.0 SATA controller [0106]: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode] [8086:7ae2] (rev 11)
IOMMU group 5 10000:e0:1d.0 System peripheral [0880]: Intel Corporation RST VMD Managed Controller [8086:09ab]
IOMMU group 5 10000:e0:1d.4 PCI bridge [0604]: Intel Corporation Alder Lake-S PCH PCI Express Root Port #13 [8086:7ab4] (rev 11)
IOMMU group 5 10000:e1:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology P2 NVMe PCIe SSD [c0a9:540a] (rev 01)
IOMMU group 5 10000:e2:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a80c]
IOMMU group 6 0000:00:14.0 USB controller [0c03]: Intel Corporation Alder Lake-S PCH USB 3.2 Gen 2x2 XHCI Controller [8086:7ae0] (rev 11)
IOMMU group 6 0000:00:14.2 RAM memory [0500]: Intel Corporation Alder Lake-S PCH Shared SRAM [8086:7aa7] (rev 11)
IOMMU group 7 0000:00:15.0 Serial bus controller [0c80]: Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #0 [8086:7acc] (rev 11)
IOMMU group 7 0000:00:15.1 Serial bus controller [0c80]: Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #1 [8086:7acd] (rev 11)
IOMMU group 7 0000:00:15.2 Serial bus controller [0c80]: Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #2 [8086:7ace] (rev 11)
IOMMU group 8 0000:00:16.0 Communication controller [0780]: Intel Corporation Alder Lake-S PCH HECI Controller #1 [8086:7ae8] (rev 11)
IOMMU group 9 0000:00:17.0 System peripheral [0880]: Intel Corporation RST VMD Managed Controller [8086:09ab]
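A listing like this comes from /sys/kernel/iommu_groups. As a minimal sketch (the function names are invented for this example, and the optional root argument is only there for testing), the following prints every group/device pair and the devices that share a group with one given address:

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of Proxmox.

# Print one "group address" pair per device found under the sysfs root.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        group=${dev%/devices/*}        # strip "/devices/<address>"
        group=${group##*/}             # keep only the group number
        printf '%s %s\n' "$group" "${dev##*/}"
    done
}

# Print the other devices in the same IOMMU group as the given address.
group_peers() {
    list_iommu_groups "$2" | awk -v a="$1" '
        { grp[$2] = $1 }
        END {
            if (!(a in grp)) exit
            g = grp[a]
            for (d in grp) if (d != a && grp[d] == g) print d
        }' | sort
}
```

For example, `group_peers 10000:e2:00.0` on this machine should list the other group 5 members, including the SATA controller at 10000:e0:17.0. Everything it prints has to go to the VM together with the device, or stay on the host together with it.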

Both remapping and IOMMU are available on my system.
My NVMe controllers and my SATA controller are in the same IOMMU group,
so when I pass an NVMe through, it breaks Proxmox, because its disk is on the SATA controller.
I tried putting the following line in /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream"

Code:
root@evee:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

 vfio
 vfio_iommu_type1
 vfio_pci 
 vfio_pci.ids=144d:a80c,c0a9:540a
 vfio_virqfd

root@evee:~# cat /etc/modprobe.d/vfio.conf 
options vfio-pci ids=144d:a80c,c0a9:540a
softdep nvme pre: vfio-pci

root@evee:~# cat /etc/modprobe.d/vfio_immo_options.conf 
options vfio_iommu_type1 allow_unsafe_interrupts=1
Code:
root@evee:~# lspci -nnk
.........

10000:e1:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology P2 NVMe PCIe SSD [c0a9:540a] (rev 01)
    Subsystem: Micron/Crucial Technology P2 NVMe PCIe SSD [c0a9:5021]
    Kernel driver in use: vfio-pci
    Kernel modules: nvme
10000:e2:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a80c]
    Subsystem: Samsung Electronics Co Ltd Device [144d:a801]
    Kernel driver in use: vfio-pci
    Kernel modules: nvme

But they all remain in IOMMU group 5.

If I attach the NVMe to the Unraid VM, Proxmox loses access to its main OS disk, as it resides in the same group.

Any suggestions?
 
Try putting the devices in other PCIe or M.2 slots, as the IOMMU groups are determined by the physical motherboard (and BIOS). Or try another motherboard with a different chipset that is known to have good groups.

If you don't care that your VM can read all of the Proxmox host memory (and therefore that of all other VMs, and steal passwords etc.), you can try to make Proxmox ignore the IOMMU groups with pcie_acs_override=downstream,multifunction (sometimes nothing changes until you use multifunction as well). You can check whether it is active with cat /proc/cmdline, and you should then see more groups. There are no guarantees, and Proxmox may crash when you start the VM.
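The /proc/cmdline check can be scripted. This is a sketch under the assumption that the running kernel accepts pcie_acs_override at all; has_kernel_param is a made-up helper name, and its second argument is only there so a command line can be passed in for testing.

```shell
#!/bin/sh
# Sketch only: has_kernel_param is a hypothetical helper.
# Succeeds if the given parameter (with or without "=value") appears
# as a whole word on the kernel command line.
has_kernel_param() {
    name=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    for word in $cmdline; do
        case $word in
            "$name"|"$name"=*) return 0 ;;
        esac
    done
    return 1
}

# Usage on the host:
#   has_kernel_param pcie_acs_override && echo active || echo "not active"
#   ls /sys/kernel/iommu_groups | wc -l    # group count, compare before/after
```

If the parameter does not show up after a reboot, the boot configuration was probably not regenerated: on GRUB-booted hosts that is update-grub after editing /etc/default/grub, while Proxmox hosts booting via systemd-boot take their options from /etc/kernel/cmdline followed by proxmox-boot-tool refresh.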

There are lots of threads about this on this forum, but they are sometimes hard to find. Why did you reactivate this old and solved thread?
 
