[SOLVED] GPU in VM can't be claimed

parkervcp

Member
Mar 7, 2022
I have an Nvidia T600 in my host that I am trying to pass through to a VM. The VM can see the card but hits an error when trying to initialize it.

This is the error in the kernel log inside the VM, which runs Rocky Linux.
Code:
[    0.134047] pci 0000:01:00.0: [10de:1fb1] type 00 class 0x030000
[    0.145011] pci 0000:01:00.0: reg 0x10: [mem 0xc1000000-0xc1ffffff]
[    0.155011] pci 0000:01:00.0: reg 0x14: [mem 0x800000000-0x80fffffff 64bit pref]
[    0.166011] pci 0000:01:00.0: reg 0x1c: [mem 0x810000000-0x811ffffff 64bit pref]
[    0.177010] pci 0000:01:00.0: reg 0x24: [io  0xa000-0xa07f]
[    0.188017] pci 0000:01:00.0: reg 0x30: [mem 0xfff80000-0xffffffff pref]
[    0.188073] pci 0000:01:00.0: Max Payload Size set to 128 (was 256, max 256)
[    0.188409] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[    0.188701] pci 0000:01:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x8 link at 0000:00:1c.0 (capable of 126.016 Gb/s with 8.0 GT/s PCIe x16 link)
[    0.278087] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.278091] pci 0000:01:00.0: vgaarb: bridge control possible
[    0.334433] pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window
[    0.334450] pci 0000:01:00.0: BAR 6: assigned [mem 0xc2080000-0xc20fffff pref]
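
In the Linux kernel's numbering, BAR 6 is the expansion ROM resource (the `reg 0x30` line in the log). As a quick sanity check, the range the guest could not claim can be converted to a size and compared with the `Expansion ROM` entry in the `lspci` listing. This is only a sketch; `bar_size_kib` is a made-up helper name, not an existing tool.

```shell
# Size of a PCI memory range in KiB, given its first and last address.
# Both addresses are inclusive, hence the +1.
bar_size_kib() {
    echo $(( ($2 - $1 + 1) / 1024 ))
}

# BAR 6 (expansion ROM) range from the VM's kernel log
bar_size_kib 0xfff80000 0xffffffff      # prints 512
# Prefetchable range from the "reg 0x14" line
bar_size_kib 0x800000000 0x80fffffff    # prints 262144 (256 MiB)
```

Both results match the sizes `lspci` reports (512K and 256M), so the guest sees the right BAR sizes; only the initial ROM address falls outside the bridge window.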

The GPU in `lspci`
Code:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1fb1] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device [17aa:1488]
        Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 13
        Memory at 91000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        Memory at 4010000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 4000 [size=128]
        Expansion ROM at 92080000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Physical Resizable BAR
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau

01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
        Subsystem: Lenovo Device [17aa:1488]
        Flags: bus master, fast devsel, latency 0, IRQ 17, IOMMU group 14
        Memory at 92000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel


This should be all I need to do to enable VFIO and get passthrough working:

Update the CMDLINE
Code:
sed -i 's|GRUB_CMDLINE_LINUX_DEFAULT="quiet"|GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream"|g' /etc/default/grub
update-grub
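
After a reboot it may be worth confirming that the new parameters actually reached the running kernel. A minimal sketch, assuming a POSIX shell; `cmdline_has_all` is a hypothetical helper, and the sample string is the one from the `sed` command above rather than real `/proc/cmdline` output.

```shell
# Succeed if every required parameter appears as a whole word
# in the given kernel command line string.
cmdline_has_all() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;                             # found, check the next one
            *) echo "missing: $param"; return 1 ;;
        esac
    done
    echo "all present"
}

# Sample string only; on the host, pass "$(cat /proc/cmdline)" instead.
cmdline_has_all "quiet intel_iommu=on iommu=pt pcie_acs_override=downstream" \
    intel_iommu=on iommu=pt
```

Matching on whole words avoids a false positive when one parameter is a prefix of another.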

Set up VFIO modules
Code:
echo 'vfio' >> /etc/modules
echo 'vfio_iommu_type1' >> /etc/modules
echo 'vfio_pci' >> /etc/modules
echo 'vfio_virqfd' >> /etc/modules
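
One caveat with plain `>>` is that running the commands again adds duplicate lines. A possible variant that appends each module only once is sketched below; `append_once` is a made-up helper name, and the demo writes to a temporary file rather than `/etc/modules`.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a temporary file; on the host the target would be /etc/modules.
demo_file=$(mktemp)
for mod in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
    append_once "$mod" "$demo_file"
    append_once "$mod" "$demo_file"   # second call is a no-op
done
cat "$demo_file"
rm -f "$demo_file"
```

The `-x` flag matters here: without it, an existing `vfio_pci` line would also match `vfio` and the shorter name would never be added.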

Blacklist all the drivers
Code:
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidiafb" >> /etc/modprobe.d/blacklist.conf
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf

Add my device IDs to the vfio config
Code:
echo "options vfio-pci ids=10de:1fb1,10de:10fa " > /etc/modprobe.d/vfio.conf
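
Instead of typing the IDs by hand, they could be pulled out of `lspci -nn` output. A rough sketch, assuming `grep` supports `-o` (GNU and BSD grep do); `pci_ids_csv` is a hypothetical helper, and the two sample lines are copied from the `lspci` listing above.

```shell
# Build a comma-separated "vendor:device" list from `lspci -nn` style lines on stdin.
# Only bracketed xxxx:xxxx pairs are kept, so class codes like [0300] are skipped.
pci_ids_csv() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | paste -sd, -
}

# Sample device lines; on the host, `lspci -nn -s 01:00` should print lines like these.
sample='01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1fb1] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)'

printf 'options vfio-pci ids=%s\n' "$(printf '%s\n' "$sample" | pci_ids_csv)"
```

Feed it only the one-line-per-device output: verbose output also carries `Subsystem` IDs such as `[17aa:1488]`, which would be picked up too.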

My current qm config
Code:
qm config 106
bios: ovmf
boot: order=sata0;ide2;net0
cores: 2
efidisk0: disk-storage:106/vm-106-disk-1.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:01:00,pcie=1
ide2: none,media=cdrom
machine: q35
memory: 4096
meta: creation-qemu=6.1.1,ctime=1648305796
name: faramir
net0: virtio=a2:f7:9e:26:9b:52,bridge=vmbr0,firewall=1
net1: virtio=42:e7:0f:97:6d:19,bridge=vmbr2,firewall=1
numa: 0
ostype: l26
sata0: disk-storage:106/vm-106-disk-0.qcow2,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=9ce852ec-d600-46fb-a8f6-649b85503719
sockets: 1
vmgenid: 7cbbd862-421e-433b-b617-ab1646f284bf
 
So the answer is to set up the VM using SeaBIOS and not use OVMF. OVMF seems to cause the issue with not being able to claim the BAR.

I still use UEFI and the GPU is working now.
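
For anyone applying the same fix, the firmware a VM is configured with can be read from its `qm config` output before and after the change. This is a sketch under one assumption: that a missing `bios:` line means the Proxmox default, SeaBIOS, is in effect. `vm_bios` is a made-up helper name.

```shell
# Print the firmware type from `qm config` style text on stdin.
# Assumption: when no "bios:" line is present, the default (seabios) applies.
vm_bios() {
    awk -F': ' '$1 == "bios" { found = $2 } END { print (found == "" ? "seabios" : found) }'
}

# Sample lines from the config above; on the host: qm config 106 | vm_bios
printf 'bios: ovmf\nmachine: q35\n' | vm_bios   # prints ovmf

# The switch itself would be made on the Proxmox host with:
#   qm set 106 --bios seabios
```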