Can't mount NVME Drive Using vfio-pci Driver

cbridgeman

New Member
Nov 25, 2024
I installed a new NVMe drive, but I cannot access it from Proxmox. It does not show up when I run lsblk.

After much troubleshooting, I think this is because the drive is bound to the vfio-pci driver. I can see it listed in a VM's hardware section under "PCI Devices," but I am not currently passing this drive through to any VM.

I am using GPU passthrough and I also pass the PCI USB controller through to my main Windows VM.

I have tried unbinding the device with

echo -n "0000:0c:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind

which changes the drive's entry in the lspci -v output below from "Kernel driver in use: vfio-pci / Kernel modules: nvme" to just "Kernel modules: nvme". But when I then try to bind it to the NVMe driver with

echo -n "0000:0c:00.0" > /sys/bus/pci/drivers/nvme/bind

I get the error "-bash: echo: write error: Device or resource busy".
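
From what I have read, a plain bind can fail like this when something immediately re-claims the device for vfio-pci or a per-device driver_override pin is still in place; that is my assumption here, not something I have confirmed. The sequence that is usually suggested for handing a device back to its native driver is roughly:

echo -n "0000:0c:00.0" > /sys/bus/pci/devices/0000:0c:00.0/driver/unbind   # detach from whichever driver currently holds it
echo "" > /sys/bus/pci/devices/0000:0c:00.0/driver_override                # clear any per-device driver pin, if one is set
echo -n "0000:0c:00.0" > /sys/bus/pci/drivers_probe                        # let the kernel re-probe the device; nvme should then claim it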

When I reboot the PVE host, the lspci -v output (below) returns to its original state.
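
Since the vfio-pci binding comes back after every reboot, something must be claiming the device at boot. I am not sure which mechanism applies in my case, but these are the places I know to check (the device address is mine; adjust as needed):

cat /proc/cmdline                                      # vfio-pci.ids=... passed on the kernel command line?
grep -r vfio /etc/modprobe.d/ /etc/modules-load.d/     # options vfio-pci ids=... in a modprobe config file?
cat /sys/bus/pci/devices/0000:0c:00.0/driver_override  # a per-device driver pin?
grep -r hostpci /etc/pve/qemu-server/                  # a VM config that references this device?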

0c:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9C1a (DRAM-less) (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller PM9C1a (DRAM-less)
Flags: fast devsel, IRQ 24, IOMMU group 20
Memory at a0200000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/16 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable- Count=17 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [168] Secondary PCI Express
Capabilities: [188] Physical Layer 16.0 GT/s <?>
Capabilities: [1ac] Lane Margining at the Receiver <?>
Capabilities: [1c4] Extended Capability ID 0x2a
Capabilities: [1e8] Latency Tolerance Reporting
Capabilities: [1f0] L1 PM Substates
Capabilities: [374] Data Link Feature <?>
Kernel driver in use: vfio-pci
Kernel modules: nvme
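
lspci reports the drive in IOMMU group 20. My understanding (which may be wrong) is that if the drive shares a group with a device that is passed through (for example the USB controller), the whole group gets bound to vfio-pci when that VM starts. Listing the group members should confirm or rule this out:

for dev in /sys/kernel/iommu_groups/20/devices/*; do lspci -nns "${dev##*/}"; done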


Any help would be much appreciated.
 
Additional info. The drive in question is nvme2:

root@proxmox2:/sys/bus# dmesg -H | grep -i nvme
[ +0.003784] nvme 0000:02:00.0: platform quirk: setting simple suspend
[ +0.000001] nvme 0000:05:00.0: platform quirk: setting simple suspend
[ +0.000025] nvme 0000:0c:00.0: platform quirk: setting simple suspend
[ +0.000033] nvme nvme0: pci function 0000:05:00.0
[ +0.000004] nvme nvme1: pci function 0000:02:00.0
[ +0.000018] nvme nvme2: pci function 0000:0c:00.0
[ +0.000249] nvme nvme0: missing or invalid SUBNQN field.
[ +0.009699] nvme nvme2: Shutdown timeout set to 10 seconds
[ +0.011413] nvme nvme2: allocated 64 MiB host memory buffer.
[ +0.005560] nvme nvme0: allocated 32 MiB host memory buffer.
[ +0.022199] nvme nvme1: 32/0/0 default/read/poll queues
[ +0.002394] nvme nvme2: 16/0/0 default/read/poll queues
[ +0.002277] nvme1n1: p1 p2 p3
[ +0.003765] nvme2n1: p1
[ +0.001748] nvme nvme0: 8/0/0 default/read/poll queues
 
Additional info:

If I set amd_iommu=off in GRUB, I am able to see the disk and add it to various VMs. When I re-enable the IOMMU, it disappears again and shows the same issues I described in the original post.
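
For reference, this is approximately the change I make in /etc/default/grub to test with the IOMMU disabled, followed by update-grub and a reboot (only the relevant line is shown; the rest of the line will differ on your system):

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=off"

# apply and reboot
update-grub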