Host Crash With NVMe Passthrough

dizzydre21

Member
Apr 10, 2023
Hi folks,

I am having an issue with my host machine crashing whenever I try to pass through 4 NVMe SSDs. All four of them are WD CL SN720 model drives.

I'm on an ASRock ROMED8-2T (EPYC Milan) motherboard, and the NVMe drives are all in separate IOMMU groups, with the slot they're in bifurcated to x4x4x4x4. The host is running Proxmox VE 9.
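
For reference, this is roughly how I confirmed that each drive sits in its own group (a standard sysfs walk, nothing specific to my board):
Code:
# List every PCI device together with its IOMMU group number
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU group %s: ' "$n"
    lspci -nns "${d##*/}"
done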

I have the problematic NVMe drives in the second PCIe slot from the CPU. I have four U.2 SSDs in the first slot that are also being passed through, and those work without issue. I need all of them passed through because they're for a TrueNAS VM.

Can anyone point me in the right direction? I'm on mobile currently, but I'm happy to provide logs and additional detail when I can SSH into the machine.
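
Once I'm back at a terminal, I'll pull the kernel log from the boot that crashed, roughly like this:
Code:
# Kernel messages from the previous boot (the one that crashed)
journalctl -k -b -1
# Anything IOMMU/VFIO related from the current boot
dmesg | grep -iE 'iommu|vfio'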
 
I have some additional details in case anyone is able to help.

I have a modprobe softdep set up, and vfio-pci is the driver bound to all of the M.2 drives I want to pass through. The host crashes regardless of whether nvme or vfio-pci is the driver in use, though.
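
For completeness, my modprobe config looks roughly like this (the ID below is only an example; the actual vendor:device pairs come from lspci -nn for each drive):
Code:
# /etc/modprobe.d/vfio.conf
# Make sure vfio-pci loads before the nvme driver can claim the drives
softdep nvme pre: vfio-pci
# Example ID only; replace with the vendor:device IDs reported by lspci -nn
options vfio-pci ids=15b7:5002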

I tried the four M.2 NVMe drives in the top slot, where the four U.2 drives had been working with passthrough, and the crashes still occurred. I then tried passing through just one M.2 drive at a time, and the host still crashed. I also tried one of the M.2 drives in the first onboard M.2 slot on the motherboard; the host still crashed.

Does anyone have suggestions on where to look?
 
Hi! I have the same issue :/ The host crashes when I start a VM that has the NVMe passed through as a PCI device.

I'm trying to pass through the NVMe PCI device with:
Code:
qm set 100 -hostpci0 0001:ae:00.0,pcie=1,rombar=0

I can see the NVMe drive is bound to the vfio-pci kernel driver, and I have iommu=on in my kernel boot options.
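
For comparison, this is roughly how I checked the binding and boot options (the address is the same one used in the qm set command above):
Code:
# Should report "Kernel driver in use: vfio-pci" for the drive
lspci -nnk -s 0001:ae:00.0
# Confirm the IOMMU flags actually reached the kernel
# (typically intel_iommu=on or amd_iommu=on, optionally iommu=pt)
cat /proc/cmdline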

@dizzydre21 were you able to work around the issue in the meantime?
 
I never got it to work with the WD drives. I ended up purchasing some Intel U.2 drives and they work flawlessly in the same PCIe slot. I didn't even need to pre-load VFIO for them.
 
Thanks for your quick reply. In my case the drives are factory-installed Dell DC NVMe ISE PE8110 RI U.2 960GB units in a Dell R670 server.

It's interesting to read that they might be the cause of the crashes! I never considered the drive to be the issue. I'll look into that.
 
FWIW, I've used the WD drives in other machines and they've been fine for normal use (ZFS mirror). It seems like some drives just don't play nicely with VFIO passthrough. Maybe it's a drive firmware issue.
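
If anyone wants to compare notes, the firmware revision is easy to read off with the standard tools (device names will differ per system):
Code:
# Model, serial and firmware revision for every NVMe device
nvme list
# Firmware version plus SMART/health data for a single drive
smartctl -a /dev/nvme0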