NVMe SN850 Passthrough shows UNCLAIMED / Controller Error

dotnetted

Nov 8, 2022
Hey all!

I'm attempting to pass SN850 NVMe drives (on an ASUS Hyper M.2 expansion card) through to TrueNAS, Ubuntu, and Windows VMs.

The drives show as UNCLAIMED in lshw inside the TrueNAS and Ubuntu test VMs.
Inside a Windows test VM, Event Viewer shows "The driver detected a controller error on \Device\RaidPort2" with event ID 11.

Host: ROG Zenith II Extreme Alpha - 3970X - 128G DDR4
> On-board SN850 2TB (0000:4d:00.0)
> NVME PCIe Expansion Card: ASUS Hyper M.2 Gen 4
> > SN850 2TB x 4 (0000:{4e|4f|50|51}:00.0)

(The on-board NVMe drive shows the same issue as the ones on the expansion card.)

What should my next steps be for debugging? Thanks!

Bash:
host (proxmox-7.2-3) $ lshw
...
*-pci:4
  ...
  product: Starship/Matisse GPP Bridge
  capabilities: pci pm pciexpress msi ht normal_decode bus_master cap_list
  ...
  *-storage
    ...
    product: WD Black SN850
    configuration: driver=vfio-pci latency=0
    ...
...

Bash:
vm (truenas-scale-22.02.4) $ grep -iE "(nvme|pci)" /var/log/syslog
  ...
  nvme nvme0: pci function 0000:00:10.0
  nvme nvme0: Removing after probe failure status: -19
  ...
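
A note on the log line above: the `-19` is a negated Linux errno, and errno 19 is ENODEV ("No such device"). Below is a minimal sketch for pulling such probe failures out of a syslog stream and naming the code. The function names `errno_name` and `probe_failures` are made up for illustration, and the table covers only a handful of errno values.

```shell
# Translate a (possibly negated) Linux errno number into its symbolic name.
# Only a few common values are listed; anything else prints the raw number.
errno_name() {
  case "${1#-}" in
    5)  echo EIO ;;
    12) echo ENOMEM ;;
    16) echo EBUSY ;;
    19) echo ENODEV ;;
    22) echo EINVAL ;;
    *)  echo "errno-${1#-}" ;;
  esac
}

# Read syslog lines on stdin and print "<device> <errno name>" for every
# "Removing after probe failure" message emitted by the nvme driver.
probe_failures() {
  sed -n 's/.*nvme \(nvme[0-9][0-9]*\): Removing after probe failure status: \(-\{0,1\}[0-9][0-9]*\).*/\1 \2/p' |
    while read -r dev status; do
      echo "$dev $(errno_name "$status")"
    done
}
```

Inside the VM this could be run as `grep -i nvme /var/log/syslog | probe_failures`. It only names the error; it does not say why the controller stopped responding during probe.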

Bash:
vm (truenas-scale-22.02.4) $ lshw
*-storage:[1-2] UNCLAIMED
     ...
     product: WD Black SN850
     capabilities: storage pm msi msix pciexpress nvm_express cap_list fb
     configuration: depth=32 latency=0 mode=1034x768 visual=truecolor xres=1024 yres=768
     ...

Additional Host Information

0000:4d:00.0 - On-board NVME (has the same problem as the ones on the adapter)
0000:{4e|4f|50|51}:00.0 - NVMEs on ASUS Hyper M.2 expansion card
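
For anyone wanting to check whether one of these addresses shares its IOMMU group with other devices, here is a small sketch that reads the standard sysfs layout under `/sys/kernel/iommu_groups`. The function name `iommu_peers` and its optional second argument (an alternate sysfs root, handy for testing) are my own invention, not part of any tool.

```shell
# Print every PCI address in the same IOMMU group as the given device.
# $1: PCI address such as 0000:4e:00.0
# $2: sysfs root (hypothetical parameter, defaults to /sys)
iommu_peers() {
  addr=$1
  sysfs=${2:-/sys}
  for dir in "$sysfs"/kernel/iommu_groups/*/devices; do
    if [ -e "$dir/$addr" ]; then
      # Each entry in this directory is one device in the group.
      ls "$dir"
      return 0
    fi
  done
  echo "no IOMMU group found for $addr" >&2
  return 1
}
```

On the host, `iommu_peers 0000:4e:00.0` printing only `0000:4e:00.0` would mean the drive is alone in its group.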

Bash:
apt install -y jq

root@pve:~# dmidecode -t 2 | grep Product | cut -d: -f2 | xargs
ROG ZENITH II EXTREME ALPHA

root@pve:~# lshw -c storage -json \
  | jq -c '.[] | {businfo, claimed, logicalname, product, driver: .configuration.driver}' \
  | while read -r json
  do
    # businfo looks like "pci@0000:4d:00.0"; ${BUS:4} strips the "pci@" prefix
    BUS=$(jq -r '.businfo' <<< "$json")
    # First block device behind this controller, empty if the host has none
    DEVICE=$(ls /dev/disk/by-path/pci-"${BUS:4}"-* 1> /dev/null 2>&1 && readlink -f /dev/disk/by-path/pci-"${BUS:4}"* | head -n1)
    IOMMU_GROUP=$(find /sys/kernel/iommu_groups/*/devices/"${BUS:4}" -type l | cut -d/ -f5)
    jq -c --arg device "${DEVICE}" --argjson iommu_group "${IOMMU_GROUP:-null}" '. + { device: $device, iommu_group: $iommu_group }' <<< "$json"
  done

{"businfo":"pci@0000:21:00.0","claimed":true,"logicalname":null,"product":"WD Black SN850","driver":"nvme","device":"/dev/nvme0n1","iommu_group":27}
{"businfo":"pci@0000:45:00.0","claimed":true,"logicalname":null,"product":"ASM1062 Serial ATA Controller","driver":"ahci","device":"","iommu_group":61}
{"businfo":"pci@0000:46:00.0","claimed":true,"logicalname":null,"product":"ASM1062 Serial ATA Controller","driver":"ahci","device":"","iommu_group":62}
{"businfo":"pci@0000:4a:00.0","claimed":true,"logicalname":null,"product":"FCH SATA Controller [AHCI mode]","driver":"ahci","device":"","iommu_group":58}
{"businfo":"pci@0000:4b:00.0","claimed":true,"logicalname":"scsi9","product":"FCH SATA Controller [AHCI mode]","driver":"ahci","device":"/dev/sda","iommu_group":59}
{"businfo":"pci@0000:4d:00.0","claimed":true,"logicalname":null,"product":"WD Black SN850","driver":"vfio-pci","device":"","iommu_group":66}
{"businfo":"pci@0000:4e:00.0","claimed":true,"logicalname":null,"product":"WD Black SN850","driver":"vfio-pci","device":"","iommu_group":67}
{"businfo":"pci@0000:4f:00.0","claimed":true,"logicalname":null,"product":"WD Black SN850","driver":"vfio-pci","device":"","iommu_group":68}
{"businfo":"pci@0000:50:00.0","claimed":true,"logicalname":null,"product":"WD Black SN850","driver":"vfio-pci","device":"","iommu_group":69}
{"businfo":"pci@0000:51:00.0","claimed":true,"logicalname":null,"product":"WD Black SN850","driver":"vfio-pci","device":"","iommu_group":70}

# TrueNAS Test VM
root@pve:~# cat /etc/pve/qemu-server/100.conf
boot: order=scsi0;ide2;net0
cores: 4
hostpci0: 0000:4e:00
hostpci1: 0000:4f:00
hostpci2: 0000:50:00
ide2: none,media=cdrom
memory: 12288
meta: creation-qemu=6.2.0,ctime=1667776972
name: truenas-scale
net0: [redacted]
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-0,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=[redacted]
sockets: 1
vmgenid: [redacted]

# Windows 11 Test VM
root@pve:~# cat /etc/pve/qemu-server/104.conf
bios: ovmf
boot: order=ide0;ide2;net0
cores: 12
efidisk0: local-lvm:vm-104-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:4d:00.0
ide0: local-lvm:vm-104-disk-1,size=32G
ide2: local:iso/Win11_22H2_English_x64v1.iso,media=cdrom,size=5427180K
machine: pc-q35-6.2
memory: 16384
meta: creation-qemu=6.2.0,ctime=1667804699
name: win-11
net0: [redacted]
numa: 0
ostype: win11
scsihw: virtio-scsi-pci
smbios1: uuid=[redacted]
sockets: 1
tpmstate0: local-lvm:vm-104-disk-2,size=4M,version=v2.0
vmgenid: [redacted]
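
To compare the two VM configs above at a glance, here is a rough sketch that prints only the lines relevant to passthrough: the `hostpci*` entries plus `bios` and `machine`. The function name `passthrough_summary` is hypothetical. When `bios` or `machine` is unset, Proxmox falls back to its defaults (SeaBIOS and the i440fx machine type). Whether that difference matters for this failure is not something I've established.

```shell
# Summarise the passthrough-related settings of a Proxmox VM config file.
# $1: path to a config such as /etc/pve/qemu-server/100.conf
passthrough_summary() {
  awk -F': ' '
    # Remember firmware and machine type if the config sets them.
    $1 == "bios" || $1 == "machine" { seen[$1] = $2 }
    # Print each passed-through PCI device in file order.
    $1 ~ /^hostpci[0-9]+$/ { print $1 "=" $2 }
    END {
      print "bios=" (("bios" in seen) ? seen["bios"] : "(unset)")
      print "machine=" (("machine" in seen) ? seen["machine"] : "(unset)")
    }
  ' "$1"
}
```

Running it against `100.conf` and `104.conf` shows that the TrueNAS VM leaves both `bios` and `machine` unset, while the Windows VM uses `ovmf` and `pc-q35-6.2`.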
 