NVMe Passthrough Issue on Ubuntu VM (Micron T700 SSD Fails to Initialize)

hkdemiralp
Nov 23, 2024
Hello,

I’m encountering an issue with NVMe passthrough on my Ubuntu Desktop VM in Proxmox. The problem started recently: the passed-through NVMe controller (a Micron/Crucial T700 SSD) fails to initialize inside the VM, and boot drops to the initramfs prompt. This same setup worked fine before, and my Windows 11 VM, which also uses NVMe passthrough (Samsung 990 Pro SSD), continues to work without issues. I suspect this is related to the Proxmox passthrough or IOMMU configuration, but I can’t pinpoint the cause. I am new to Proxmox and have searched extensively without solving it on my own, so any help is appreciated. Are there additional configurations or diagnostics I can try? Below are the details of my setup and the troubleshooting steps I’ve taken so far:
  1. The Ubuntu VM does not boot and drops into the initramfs shell.
  2. Booting the VM from a live Ubuntu ISO also fails: the live system does not recognize the NVMe controller either.
  3. Booting the same Ubuntu installation bare-metal (bypassing Proxmox) works perfectly, so the disk and controller themselves are functional.
  4. Upgrading the Proxmox kernel from 6.8.12-4-pve to 6.11.0-1-pve did not resolve the issue.
  5. Adding the following to /etc/default/grub on the Proxmox host did not help either: pci=nomsi nvme_core.default_ps_max_latency_us=0 nvme_core.io_timeout=30
  6. Removing GPU passthrough from the Ubuntu VM changed nothing.
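In case it helps with diagnosis, here is a small host-side sketch (using only the standard sysfs layout, nothing specific to my machine) that lists every IOMMU group and its members. If the T700 shares a group with a bridge or another device that is not passed through, the group cannot be cleanly assigned to the VM:

```shell
# Run on the Proxmox host: enumerate IOMMU groups and the devices in each.
# A passed-through device should ideally be alone in its group (or grouped
# only with other devices assigned to the same VM).
for group in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${group##*/}:"
    for dev in "$group"/devices/*; do
        lspci -nns "${dev##*/}"
    done
done
```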

Environment Information

Hardware

  • Motherboard: MSI X670E
  • CPU: AMD Ryzen 9 7950X3D 16-Core Processor
  • NVMe Devices:
    • Micron/Crucial T700 SSD (used for the Ubuntu VM)
    • Samsung 990 Pro SSD (used for the Windows 11 VM)

Proxmox Host

  • Proxmox Version: 8.3
  • Kernel: 6.11.0-1-pve
  • QEMU emulator version: 9.0.2 (pve-qemu-kvm_9.0.2-4)

Passthrough Configuration

Devices specified in /etc/modprobe.d/vfio.conf:
Code:
options vfio-pci ids=10de:2684,10de:22ba,c0a9:5419,1022:43f7,1b21:3241

Device mappings:
  • GPU Passthrough: NVIDIA RTX 4090 (10de:2684) with Audio Controller (10de:22ba)
  • NVMe Passthrough: Micron/Crucial T700 SSD (c0a9:5419)
  • USB Controllers: AMD USB 3.2 Controllers (1022:43f7), ASMedia ASM3241 (1b21:3241)
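To confirm the vfio.conf IDs actually take effect at boot, this host-side check shows which kernel driver currently owns the T700 (I'm assuming its PCI address here; substitute the address reported by `lspci -nn | grep c0a9` on your host):

```shell
# On the Proxmox host: find the T700's PCI address, then show its bound driver.
# "Kernel driver in use: vfio-pci" means the device is reserved for passthrough;
# "nvme" means the host claimed it first and the VM won't get a clean device.
lspci -nn | grep c0a9
lspci -nnk -s 0000:02:00.0          # replace 0000:02:00.0 with your address
# The same information is available directly from sysfs:
readlink /sys/bus/pci/devices/0000:02:00.0/driver
```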
GRUB configuration on the host:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pci=nomsi nvme_core.default_ps_max_latency_us=0 nvme_core.io_timeout=30"
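For completeness, edits to /etc/default/grub only take effect after the boot configuration is regenerated and the host is rebooted; a quick way to verify the parameters were actually applied:

```shell
# Regenerate the boot configuration (on Proxmox, proxmox-boot-tool refresh
# covers both GRUB and systemd-boot setups), then reboot the host.
proxmox-boot-tool refresh
# After the reboot, confirm the running kernel really got the new parameters:
grep -o 'amd_iommu=on\|iommu=pt\|pci=nomsi' /proc/cmdline
```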

Ubuntu VM qemu configuration
Code:
cat /etc/pve/qemu-server/100.conf
agent: 1
bios: ovmf
boot: order=ide2;hostpci0;net0
cores: 8
cpu: host
efidisk0: local-btrfs:100/vm-100-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:02:00,pcie=1,rombar=0
hostpci1: 0000:10:00,pcie=1,rombar=0
hostpci2: 0000:12:00,pcie=1,rombar=0
ide2: none,media=cdrom
machine: q35,viommu=virtio
memory: 80000
meta: creation-qemu=9.0.2,ctime=1729457387
name: achibuntu
net0: virtio=BC:24:11:22:07:5F,bridge=vmbr0,firewall=1
numa: 1
ostype: l26
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=245fadff-0de6-4564-9dc6-9b4d256f5007
sockets: 1
vga: virtio
vmgenid: 6988ea9b-147d-4f84-83a4-c6e2aeddf6f8

Ubuntu VM Information

  1. Device Details:
    Code:
    sudo lspci -nn
    01:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology T700 NVMe PCIe SSD [c0a9:5419]
  2. dmesg error output:
    Code:
    sudo dmesg | grep nvme
    [    0.839065] nvme 0000:01:00.0: Adding to iommu group 11
    [    0.841911] nvme nvme0: pci function 0000:01:00.0
    [   62.596278] nvme nvme0: I/O tag 4 (0004) QID 0 timeout, disable controller
    [   62.598393] nvme nvme0: Device not ready; aborting shutdown, CSTS=0x5
    [   62.606297] nvme nvme0: Identify Controller failed (-4)
    [   62.612397] nvme: probe of 0000:01:00.0 failed with error -5
  3. Modules Loaded:
    Code:
    sudo lsmod
    Module                  Size  Used by
    nvme                   61440  0
    nvme_core             208896  1 nvme
    virtio_gpu             94208  1
    ahci                   49152  1
    xhci_pci               24576  0
    libahci                53248  1 ahci
    virtio_dma_buf         12288  1 virtio_gpu
    xhci_pci_renesas       20480  1 xhci_pci
    nvme_auth              28672  1 nvme_core
  4. Ubuntu VM is stuck in initramfs prompt:
    Code:
    BusyBox v1.36.1  (Ubuntu 1:1.36.1-6ubuntu3.1) built-in shell (ash)
    (initramfs) blkid
    (initramfs) exit
    Gave up waiting for root file system device. Common problems:
     - Boot args (cat /proc/cmdline)
        - Check rootdelay= (did the system wait enough?)
     - Missing modules (cat /proc/modules; ls /dev)
    ...
    ALERT! /dev/disk/by-uuid/longIDxxxxxx does not exist. Dropping to a shell!
    (initramfs)_
 
Have you tried pcie_aspm=off and pcie_port_pm=off as suggested by here & here ?

Note: I have no (personal) experience with your situation.
 
Hi @gfngfn256, thanks for your suggestion. I just added those parameters to the GRUB configuration on the host, ran proxmox-boot-tool refresh, and rebooted, but nothing changed.
The GRUB line has become crazy long now :)
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pci=nomsi nvme_core.default_ps_max_latency_us=0 nvme_core.io_timeout=30 pcie_aspm=off pcie_port_pm=off"
For your second link, I am not sure whether I need to patch some source code or whether just adding those values to the GRUB file is enough.
 
I notice you tried a newer kernel. Since your problem started recently, I'd go the opposite route and try an older kernel instead. Pin your system to an older one and see if that helps.

Sorry I can't really be of any more help - but as I said, I don't have personal experience with your specific problem.
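If it helps, pinning is done with proxmox-boot-tool (available on recent Proxmox releases); a sketch of what that would look like, assuming an older kernel such as 6.8.12-4-pve is still installed:

```shell
# List the kernels proxmox-boot-tool knows about, then pin one so the host
# keeps booting it until the pin is removed.
proxmox-boot-tool kernel list
proxmox-boot-tool kernel pin 6.8.12-4-pve
# Later, to return to booting the default (newest) kernel:
proxmox-boot-tool kernel unpin
```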
 
Thanks for your time, mate. I tried all of this with the stable kernel 6.8.12-4-pve first; when it did not work, I tried the newer one. I accidentally booted the Ubuntu VM's disk bare-metal and was surprised to see that it works fine without Proxmox. Now I wonder if the Windows VM would work bare-metal as well. Until I find a fix for the Proxmox NVMe initialization problem, I can have a little fun, I guess :)