GPU passthrough: still having problems

Ecoll

Active Member
Jan 31, 2019
Hi everybody,

I know there are plenty of posts on this subject, but I can't seem to solve my problem.

My system:
HP DL380 G7
2 CPUs: Xeon X5687 (4 cores / 8 threads, 3.6 GHz)
48 GB DDR3
AMD FirePro W2100

I am trying to pass the FirePro through to a Windows 10 VM.

Output of qm config ID:
Code:
balloon: 0
bios: ovmf
bootdisk: sata0
cores: 16
cpu: host
description: unused0%3A local-lvm%3Avm-110-disk-0
efidisk0: local-lvm:vm-400-disk-0,size=4M
hostpci0: 08:00.0,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: q35
memory: 49152
name: Windows
net0: e1000=96:BC:36:7D:DF:C3,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
sata0: local-lvm:vm-400-disk-1,size=64G
sata1: local-lvm:vm-400-disk-2,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=e6a84121-8202-412c-a01a-d5e0035adf72
sockets: 1
vmgenid: c6451afa-43b5-4b23-8f3b-2b5816f51d9f

I followed this wiki page:
https://pve.proxmox.com/wiki/Pci_passthrough

When I run dmesg | grep 'remapping', I get:
Code:
[    0.388593] DMAR-IR: This system BIOS has enabled interrupt remapping
               on a chipset that contains an erratum making that
               feature unstable.  To maintain system stability
               interrupt remapping is being disabled.  Please
               contact your BIOS vendor for an update

When I run find /sys/kernel/iommu_groups/ -type l, I get:
Code:
/sys/kernel/iommu_groups/17/devices/0000:00:1f.2
/sys/kernel/iommu_groups/17/devices/0000:00:1f.0
/sys/kernel/iommu_groups/7/devices/0000:00:07.0
/sys/kernel/iommu_groups/25/devices/0000:3e:06.3
/sys/kernel/iommu_groups/25/devices/0000:3e:06.1
/sys/kernel/iommu_groups/25/devices/0000:3e:06.2
/sys/kernel/iommu_groups/25/devices/0000:3e:06.0
/sys/kernel/iommu_groups/15/devices/0000:00:1d.3
/sys/kernel/iommu_groups/15/devices/0000:00:1d.1
/sys/kernel/iommu_groups/15/devices/0000:00:1d.2
/sys/kernel/iommu_groups/15/devices/0000:00:1d.0
/sys/kernel/iommu_groups/15/devices/0000:00:1d.7
/sys/kernel/iommu_groups/5/devices/0000:00:05.0
/sys/kernel/iommu_groups/23/devices/0000:3e:04.2
/sys/kernel/iommu_groups/23/devices/0000:3e:04.0
/sys/kernel/iommu_groups/23/devices/0000:3e:04.3
/sys/kernel/iommu_groups/23/devices/0000:3e:04.1
/sys/kernel/iommu_groups/13/devices/0000:00:14.1
/sys/kernel/iommu_groups/13/devices/0000:00:14.2
/sys/kernel/iommu_groups/13/devices/0000:00:14.0
/sys/kernel/iommu_groups/31/devices/0000:3f:06.1
/sys/kernel/iommu_groups/31/devices/0000:3f:06.2
/sys/kernel/iommu_groups/31/devices/0000:3f:06.0
/sys/kernel/iommu_groups/31/devices/0000:3f:06.3
/sys/kernel/iommu_groups/3/devices/0000:00:03.0
/sys/kernel/iommu_groups/21/devices/0000:3e:02.5
/sys/kernel/iommu_groups/21/devices/0000:3e:02.3
/sys/kernel/iommu_groups/21/devices/0000:3e:02.1
/sys/kernel/iommu_groups/21/devices/0000:3e:02.4
/sys/kernel/iommu_groups/21/devices/0000:3e:02.2
/sys/kernel/iommu_groups/21/devices/0000:3e:02.0
/sys/kernel/iommu_groups/11/devices/0000:00:0d.0
/sys/kernel/iommu_groups/11/devices/0000:00:0d.5
/sys/kernel/iommu_groups/11/devices/0000:00:0d.3
/sys/kernel/iommu_groups/11/devices/0000:00:0d.1
/sys/kernel/iommu_groups/11/devices/0000:00:0d.6
/sys/kernel/iommu_groups/11/devices/0000:00:0d.4
/sys/kernel/iommu_groups/11/devices/0000:00:0d.2
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/28/devices/0000:3f:03.4
/sys/kernel/iommu_groups/28/devices/0000:3f:03.2
/sys/kernel/iommu_groups/28/devices/0000:3f:03.0
/sys/kernel/iommu_groups/28/devices/0000:3f:03.1
/sys/kernel/iommu_groups/18/devices/0000:05:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:08.0
/sys/kernel/iommu_groups/26/devices/0000:3f:00.0
/sys/kernel/iommu_groups/26/devices/0000:3f:00.1
/sys/kernel/iommu_groups/16/devices/0000:01:03.0
/sys/kernel/iommu_groups/16/devices/0000:00:1e.0
/sys/kernel/iommu_groups/6/devices/0000:00:06.0
/sys/kernel/iommu_groups/24/devices/0000:3e:05.0
/sys/kernel/iommu_groups/24/devices/0000:3e:05.3
/sys/kernel/iommu_groups/24/devices/0000:3e:05.1
/sys/kernel/iommu_groups/24/devices/0000:3e:05.2
/sys/kernel/iommu_groups/14/devices/0000:03:00.0
/sys/kernel/iommu_groups/14/devices/0000:00:1c.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.2
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/14/devices/0000:04:00.1
/sys/kernel/iommu_groups/14/devices/0000:03:00.1
/sys/kernel/iommu_groups/14/devices/0000:00:1c.4
/sys/kernel/iommu_groups/14/devices/0000:04:00.0
/sys/kernel/iommu_groups/14/devices/0000:00:1c.2
/sys/kernel/iommu_groups/14/devices/0000:02:00.4
/sys/kernel/iommu_groups/4/devices/0000:00:04.0
/sys/kernel/iommu_groups/22/devices/0000:3e:03.1
/sys/kernel/iommu_groups/22/devices/0000:3e:03.4
/sys/kernel/iommu_groups/22/devices/0000:3e:03.2
/sys/kernel/iommu_groups/22/devices/0000:3e:03.0
/sys/kernel/iommu_groups/12/devices/0000:00:0e.3
/sys/kernel/iommu_groups/12/devices/0000:00:0e.1
/sys/kernel/iommu_groups/12/devices/0000:00:0e.4
/sys/kernel/iommu_groups/12/devices/0000:00:0e.2
/sys/kernel/iommu_groups/12/devices/0000:00:0e.0
/sys/kernel/iommu_groups/30/devices/0000:3f:05.3
/sys/kernel/iommu_groups/30/devices/0000:3f:05.1
/sys/kernel/iommu_groups/30/devices/0000:3f:05.2
/sys/kernel/iommu_groups/30/devices/0000:3f:05.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/20/devices/0000:3e:00.0
/sys/kernel/iommu_groups/20/devices/0000:3e:00.1
/sys/kernel/iommu_groups/10/devices/0000:00:0a.0
/sys/kernel/iommu_groups/29/devices/0000:3f:04.2
/sys/kernel/iommu_groups/29/devices/0000:3f:04.0
/sys/kernel/iommu_groups/29/devices/0000:3f:04.3
/sys/kernel/iommu_groups/29/devices/0000:3f:04.1
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/19/devices/0000:08:00.0 <- firepro W2100
/sys/kernel/iommu_groups/19/devices/0000:08:00.1
/sys/kernel/iommu_groups/9/devices/0000:00:09.0
/sys/kernel/iommu_groups/27/devices/0000:3f:02.5
/sys/kernel/iommu_groups/27/devices/0000:3f:02.3
/sys/kernel/iommu_groups/27/devices/0000:3f:02.1
/sys/kernel/iommu_groups/27/devices/0000:3f:02.4
/sys/kernel/iommu_groups/27/devices/0000:3f:02.2
/sys/kernel/iommu_groups/27/devices/0000:3f:02.0
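
A long listing like the one above is easier to read when it is filtered down to one device. The following is a minimal sketch, not an official Proxmox tool: the function name `iommu_group_members` and the `IOMMU_ROOT` override are made up for illustration, and the only thing taken from the thread is the sysfs layout shown in the listing.

```shell
#!/bin/sh
# Sketch: print the IOMMU group of one PCI device and every device sharing it.
# IOMMU_ROOT is a hypothetical override so the function can be pointed at a
# test directory; on a real host it defaults to /sys/kernel/iommu_groups.

iommu_group_members() {
    addr="$1"                                   # full address, e.g. 0000:08:00.0
    root="${IOMMU_ROOT:-/sys/kernel/iommu_groups}"
    for link in "$root"/*/devices/"$addr"; do
        # Skip the unexpanded glob pattern when nothing matched.
        [ -e "$link" ] || [ -L "$link" ] || continue
        group_dir="${link%/devices/*}"          # e.g. .../iommu_groups/19
        echo "group ${group_dir##*/}"
        for dev in "$group_dir"/devices/*; do
            echo "${dev##*/}"
        done
        return 0
    done
    echo "no IOMMU group found for $addr" >&2
    return 1
}
```

Called as `iommu_group_members 0000:08:00.0`, it prints the group number followed by the addresses in that group, which makes it quick to see whether the GPU shares its group with anything other than its own functions.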

And the output of dmesg | grep -e DMAR -e IOMMU:
Code:
[    0.007175] ACPI: DMAR 0x00000000CF62FE80 000168 (v01 HP     ProLiant 00000001 \xd2?   0000162E)
[    0.196656] DMAR: IOMMU enabled
[    0.388593] DMAR-IR: This system BIOS has enabled interrupt remapping
[    1.395177] DMAR: Host address width 39
[    1.395178] DMAR: DRHD base: 0x000000d7ffe000 flags: 0x1
[    1.395201] DMAR: dmar0: reg_base_addr d7ffe000 ver 1:0 cap c90780106f0462 ecap f0207e
[    1.395202] DMAR: RMRR base: 0x000000cf7fc000 end: 0x000000cf7fdfff
[    1.395203] DMAR: RMRR base: 0x000000cf7f5000 end: 0x000000cf7fafff
[    1.395204] DMAR: RMRR base: 0x000000cf63e000 end: 0x000000cf63ffff
[    1.395205] DMAR: ATSR flags: 0x0
[    1.395498] DMAR: dmar0: Using Queued invalidation
[    1.408137] DMAR: Intel(R) Virtualization Technology for Directed I/O
[   10.116356] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[   10.116357] AMD-Vi: AMD IOMMUv2 functionality not available on this system
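
The two facts that matter in these kernel messages are whether the IOMMU was enabled and whether interrupt remapping was switched off. Here is a small sketch that reduces a dmesg dump to those two answers. The function name `iommu_summary` is hypothetical, and the match strings are copied from the messages quoted in this thread, so other kernels or platforms may word them differently.

```shell
#!/bin/sh
# Sketch: summarise IOMMU state from kernel messages read on stdin.
# The patterns are the exact phrases seen in this thread's dmesg output.

iommu_summary() {
    awk '/DMAR: IOMMU enabled/                   { iommu = 1 }
         /interrupt remapping is being disabled/ { ir_off = 1 }
         END {
             print "iommu_enabled=" (iommu ? "yes" : "no")
             print "interrupt_remapping_disabled=" (ir_off ? "yes" : "no")
         }'
}
```

It is meant to be fed the full log, for example `dmesg | iommu_summary`, because the phrase about remapping being disabled sits on a continuation line of the DMAR-IR message.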

Now I start the VM with the command from qm showcmd ID:
Code:
/usr/bin/kvm -id 400 -name Windows -chardev 'socket,id=qmp,path=/var/run/qemu-server/400.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/400.pid -daemonize -smbios 'type=1,uuid=e6a84121-8202-412c-a01a-d5e0035adf72' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/pve/vm-400-disk-0' -smp '16,sockets=1,cores=16,maxcpus=16' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/400.vnc,password -no-hpet -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vendor_id=proxmox,hv_vpindex,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt' -m 49152 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=c6451afa-43b5-4b23-8f3b-2b5816f51d9f' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:08:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:685c1e4fedc9' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' -drive 'file=/dev/pve/vm-400-disk-1,if=none,id=drive-sata0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' -drive 'file=/dev/pve/vm-400-disk-2,if=none,id=drive-sata1,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ahci0.1,drive=drive-sata1,id=sata1' -netdev 'type=tap,id=net0,ifname=tap400i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=96:BC:36:7D:DF:C3,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -rtc 'driftfix=slew,base=localtime' -machine 'type=q35+pve0' -global 'kvm-pit.lost_tick_policy=discard'

and the result is:
Code:
kvm: -device vfio-pci,host=0000:08:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 0000:08:00.0: failed to open /dev/vfio/19: No such file or directory

Can somebody help me?
 
lspci -v -s 08:00.0
Code:
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland GL [FirePro W2100] (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Oland GL [FirePro W2100]
        Physical Slot: 1
        Flags: fast devsel, IRQ 10
        Memory at e0000000 (64-bit, prefetchable) [disabled] [size=256M]
        Memory at fbfc0000 (64-bit, non-prefetchable) [disabled] [size=256K]
        I/O ports at 5000 [disabled] [size=256]
        [virtual] Expansion ROM at fbf00000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [200] #15
        Capabilities: [270] #19
        Kernel modules: radeon, amdgpu
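
Note that the output above has a "Kernel modules" line but no "Kernel driver in use" line: the first lists modules that could claim the card, the second names the driver actually bound to it. The same information can be read straight from sysfs. This is only a sketch: `bound_driver` and the `PCI_SYSFS` override are hypothetical names, and it assumes the usual layout where each device directory has a `driver` symlink once a driver is bound.

```shell
#!/bin/sh
# Sketch: report which kernel driver (if any) is bound to a PCI device.
# PCI_SYSFS is a hypothetical override for testing; the default is the
# standard sysfs location.

bound_driver() {
    addr="$1"                                   # full address, e.g. 0000:08:00.0
    dev="${PCI_SYSFS:-/sys/bus/pci/devices}/$addr"
    if [ -L "$dev/driver" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}
```

For passthrough, `bound_driver 0000:08:00.0` and `bound_driver 0000:08:00.1` would both be expected to print `vfio-pci` before the VM starts.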
 
Hello,

Your GPU has the radeon driver loaded, so I don't think vfio can bind to it.

Have you followed this section of the guide?
Code:
echo "options vfio-pci ids=10de:1381,10de:0fbc" > /etc/modprobe.d/vfio.conf

And have you blacklisted the radeon driver?
Code:
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf

If so, you should try adding these GRUB options:
Code:
 video=efifb:off,vesafb:off
 
Also, please try with UEFI. Most cards work with UEFI, but don't with BIOS.
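
One detail about the vfio.conf line quoted above: the IDs 10de:1381,10de:0fbc are the guide's example for an NVIDIA card (10de is NVIDIA's PCI vendor ID), whereas AMD/ATI devices use vendor ID 1002. The IDs for the card actually being passed through come from `lspci -n`. As a rough sketch, the helper below turns `lspci -n` lines into the matching options line; `vfio_ids_line` is a made-up name, and it assumes the usual `lspci -n` layout where the third field is vendor:device.

```shell
#!/bin/sh
# Sketch: build an "options vfio-pci ids=..." line from `lspci -n` output.
# Reads lines such as "08:00.0 0300: 1002:xxxx" on stdin and joins the
# third field (vendor:device) of every line with commas.

vfio_ids_line() {
    awk '{ ids = ids (ids ? "," : "") $3 }
         END { if (ids) print "options vfio-pci ids=" ids }'
}
```

A possible use is `lspci -n -s 08:00 | vfio_ids_line`, which covers both functions of the card (08:00.0 and 08:00.1); the result can then be reviewed before it is written to /etc/modprobe.d/vfio.conf.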

bios: ovmf
I have already tried both solutions.

Thank you.
I have checked the whole config and added the GRUB options.
After a reboot, lspci -v -s 08:00.0 shows:

Code:
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland GL [FirePro W2100] (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Oland GL [FirePro W2100]
        Physical Slot: 1
        Flags: fast devsel, IRQ 10
        Memory at e0000000 (64-bit, prefetchable) [disabled] [size=256M]
        Memory at fbfc0000 (64-bit, non-prefetchable) [disabled] [size=256K]
        I/O ports at 5000 [disabled] [size=256]
        [virtual] Expansion ROM at fbf00000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [200] #15
        Capabilities: [270] #19
        Kernel driver in use: vfio-pci
        Kernel modules: radeon, amdgpu

But now I get new messages:
dmesg | grep -e DMAR -e IOMMU
Code:
[    0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[    0.006584] ACPI: DMAR 0x00000000CF62FE80 000168 (v01 HP     ProLiant 00000001 \xd2?   0000162E)
[    0.196639] DMAR: IOMMU enabled
[    0.386734] DMAR-IR: This system BIOS has enabled interrupt remapping
[    1.353991] DMAR: Host address width 39
[    1.353993] DMAR: DRHD base: 0x000000d7ffe000 flags: 0x1
[    1.354008] DMAR: dmar0: reg_base_addr d7ffe000 ver 1:0 cap c90780106f0462 ecap f0207e
[    1.354009] DMAR: RMRR base: 0x000000cf7fc000 end: 0x000000cf7fdfff
[    1.354009] DMAR: RMRR base: 0x000000cf7f5000 end: 0x000000cf7fafff
[    1.354010] DMAR: RMRR base: 0x000000cf63e000 end: 0x000000cf63ffff
[    1.354011] DMAR: ATSR flags: 0x0
[    1.354289] DMAR: dmar0: Using Queued invalidation
[    1.368037] DMAR: Intel(R) Virtualization Technology for Directed I/O
[   10.087329] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[   10.087331] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[ 3058.155053] vfio-pci 0000:08:00.1: DMAR: Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.
[ 3058.155065] vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
[ 3083.175705] vfio-pci 0000:08:00.1: DMAR: Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.
[ 3083.175718] vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
[ 3290.597677] vfio-pci 0000:08:00.1: DMAR: Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.
[ 3290.597690] vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
[ 3301.877433] vfio-pci 0000:08:00.1: DMAR: Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.
[ 3301.877445] vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
[ 3320.508995] vfio-pci 0000:08:00.1: DMAR: Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.
[ 3320.509008] vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
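
The last kernel message names its own workaround: the `allow_unsafe_interrupts` parameter of the `vfio_iommu_type1` module. A minimal sketch of setting it persistently follows. The function name and the `MODPROBE_DIR` override are hypothetical (the override exists only so the function can be tried against a scratch directory), and the file name is a convention rather than a requirement. As the parameter name says, the kernel considers this unsafe, so it trades isolation for a working setup on hardware without interrupt remapping.

```shell
#!/bin/sh
# Sketch: persist the module parameter that the kernel message suggests.
# MODPROBE_DIR is a hypothetical override for dry runs; the real location
# is /etc/modprobe.d and writing there requires root.

enable_unsafe_interrupts() {
    dir="${MODPROBE_DIR:-/etc/modprobe.d}"
    conf="$dir/iommu_unsafe_interrupts.conf"
    echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > "$conf" \
        && echo "wrote $conf"
}
```

The option only takes effect the next time the module is loaded, so a reboot (or reloading the vfio modules) is needed after running `enable_unsafe_interrupts`.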

I forgot to mention that I also applied this patch:
https://forum.proxmox.com/threads/c...ntel-iommu-driver-to-remove-rmrr-check.36374/
 
Take care: in the lspci output for your GPU I can see
Code:
       Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+

That could lead to a Code 43 error.
Check other posts for how to enable MSI in a Windows guest VM.
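
To keep an eye on the MSI line mentioned above without reading the whole lspci dump, a small filter can help. This is only a sketch with a hypothetical function name, `msi_state`; it looks for the "MSI: Enable+" or "MSI: Enable-" flag exactly as lspci prints it. The value seen on the host may differ between a stopped and a running VM, so it is probably most useful to check while the guest is up.

```shell
#!/bin/sh
# Sketch: report the MSI enable flag from `lspci -v` output read on stdin.
# Prints "enabled", "disabled", or "unknown" if no MSI capability line is seen.

msi_state() {
    awk '/MSI: Enable\+/ { print "enabled";  found = 1; exit }
         /MSI: Enable-/  { print "disabled"; found = 1; exit }
         END { if (!found) print "unknown" }'
}
```

For example, `lspci -v -s 08:00.0 | msi_state` would print `disabled` for the output quoted in this thread.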
 
