Multi GPU Passthrough - 4G decoding error?

p.lakis

I can pass through four individual Tesla P100s to four separate VMs, but when I combine more than one card into a single VM I get the following error when running dmesg | grep NVRM inside the guest.

One of the four cards works; with any number above one, the output below is produced.

Code:
admin@gpu-host:~$ dmesg | grep NVRM
[    4.550588] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:02:00.0)
[    4.550589] NVRM: The system BIOS may have misconfigured your GPU.
[    4.550843] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:03:00.0)
[    4.550844] NVRM: The system BIOS may have misconfigured your GPU.
[    4.551092] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:04:00.0)
[    4.551093] NVRM: The system BIOS may have misconfigured your GPU.
[    4.551108] NVRM: The NVIDIA probe routine failed for 3 device(s).
[    4.551109] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  410.78  Sat Nov 10 22:09:04 CST 2018 (using threaded interrupts)

I have seen this on systems that can't decode above 4G. My VMID.conf is below.

Code:
agent: 1
bios: ovmf
bootdisk: scsi0
cores: 12
cpu: host
efidisk0: vm1028gq:vm-122-disk-1,size=128K
hostpci0: 05:00,pcie=1,x-vga=on
hostpci1: 06:00,pcie=1,x-vga=on
hostpci2: 84:00,pcie=1,x-vga=on
hostpci3: 85:00,pcie=1,x-vga=on
hugepages: 2
ide2: iso:iso/ubuntu-16.04.4-desktop-amd64.iso,media=cdrom
machine: q35
memory: 131072
name: U16.04-Tensor-Box
net0: virtio=DE:FC:7F:0B:27:04,bridge=vmbr1
numa: 1
ostype: l26
scsi0: vm1028gq:vm-122-disk-2,cache=writethrough,size=200G
scsihw: virtio-scsi-pci
smbios1: uuid=65d62e28-3f97-430b-be89-68567bc9fc2b
sockets: 2
args: -machine pc,max-ram-below-4g=4G

I have tried 1G, 2G, and 4G. All return the same NVRM errors. Is there something I'm missing?

Thanks
 
I guess your 'args' line has no effect, since you specify q35 (so the machine option gets overwritten by us again).
There is an open bug for this: https://bugzilla.proxmox.com/show_bug.cgi?id=1267

Until the bug I mentioned is fixed, you can get the QEMU command line with 'qm showcmd ID --pretty', add ',max-ram-below-4g=X' to the "-machine 'type=q35'" part, and execute that by hand.
Alternatively, did you try without q35? (Of course you then have to remove pcie=1 and x-vga=on.)
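
A minimal sketch of that manual workaround (the VMID 152 and the output path are just placeholders):

Code:
# dump the generated QEMU command line to a file (152 is only an example VMID)
qm showcmd 152 --pretty > /root/vm152-start.sh

# edit the file by hand and extend the machine option, e.g.
#   -machine 'type=q35,max-ram-below-4g=1G' \

# then launch the VM by running the edited command directly instead of 'qm start 152'
bash /root/vm152-start.sh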
 
In my VM config I added the following:

Code:
machine: q35,max-ram-below-4g=1G

This in turn gives the following QEMU command:

Code:
# qm showcmd 152 --pretty
/usr/bin/kvm \
  -id 152 \
  -name Base-Window10 \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/152.qmp,server,nowait' \
  -mon 'chardev=qmp,mode=control' \
  -pidfile /var/run/qemu-server/152.pid \
  -daemonize \
  -smbios 'type=1,uuid=c5b75794-9838-4a19-91c2-0ca588bcd49b' \
  -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' \
  -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=/dev/zvol/vm1028pool/vm-152-disk-2' \
  -smp '16,sockets=2,cores=8,maxcpus=16' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vga none \
  -nographic \
  -no-hpet \
  -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=proxmox,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,kvm=off' \
  -m 64512 \
  -object 'memory-backend-ram,id=ram-node0,size=32256M' \
  -numa 'node,nodeid=0,cpus=0-7,memdev=ram-node0' \
  -object 'memory-backend-ram,id=ram-node1,size=32256M' \
  -numa 'node,nodeid=1,cpus=8-15,memdev=ram-node1' \
  -readconfig /usr/share/qemu-server/pve-q35.cfg \
  -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
  -device 'vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0' \
  -device 'vfio-pci,host=06:00.0,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0' \
  -chardev 'socket,path=/var/run/qemu-server/152.qga,server,nowait,id=qga0' \
  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:6b9a7558eb39' \
  -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.149.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
  -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
  -drive 'file=/mnt/pve/iso/template/iso/SW_DVD9_Win_Pro_Ent_Edu_N_10_1803_64BIT_English_-4_MLF_X21-87129.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
  -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
  -drive 'file=/dev/zvol/vm1028pool/vm-152-disk-1,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
  -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap152i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' \
  -device 'e1000,mac=DE:D6:56:85:F1:53,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
  -rtc 'driftfix=slew,base=localtime' \
  -machine 'type=q35,max-ram-below-4g=1G' \
  -global 'kvm-pit.lost_tick_policy=discard'

Still no luck getting the resources to the VM for the GPUs.

Is there a way in QEMU to define the following?
  • Above 4G Decoding = Enabled
  • MMIOH Base = 256G
  • MMIO High Size = 128G
This is how the BIOS is set up on the host system. If we could pass these settings over to QEMU, I'm confident we could get it to work.
I'm struggling to find any documentation on MMIOH and QEMU.

EDIT:
I want to add that this issue only occurs with these Tesla chips. Prior testing on a GeForce system allowed all four cards to be passed through without issue.

Thanks
 
Does it work when you pass the 4 cards to 4 different VMs?

[ 4.550589] NVRM: The system BIOS may have misconfigured your GPU.

Since this line appears on the host, maybe the host BIOS really is not configuring the system correctly...
 
Yes, I can create 4 VMs with a single GPU each, all running concurrently.

That line appears inside the VM, which boots with the OVMF BIOS.
If we could emulate the MMIO area in QEMU, that could overcome this.
 
Did you manage to find any solution for passing through 4x Tesla on the OVMF BIOS?
Based on NVIDIA's KB, it seems like we need to be very specific with the BIOS settings:
https://nvidia.custhelp.com/app/answers/detail/a_id/4119/~/incorrect-bios-settings-on-a-server-when-used-with-a-hypervisor-can-cause-mmio
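
For reference, the MMIO space each card needs can be read from its 64-bit BAR sizes on the host (a quick sketch; 05:00.0 is just the hostpci0 address from the config earlier in the thread):

Code:
# on the Proxmox host: list the memory regions (BARs) of one of the Teslas
lspci -vv -s 05:00.0 | grep -i region
# a 16 GB Tesla P100 typically reports a 16G 64-bit prefetchable BAR,
# which is why the host BIOS needs Above 4G Decoding and a large MMIO-high window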
 
I registered here specifically to provide future users with a resolution for this problem that seemed to escape me on some simple searches.

The issue when passing multiple GPUs through to a Q35 KVM machine with PCI passthrough using OVMF appears to be a lack of addressable PCI memory space. Adding pci=realloc to your GRUB boot command line can assist with reallocating 32-bit memory and bring the second device "up", but any attempt to actually use the device fails, as memory allocation on the 64-bit bus fails.

I was able to resolve this with a -global flag passed via qm set.

Bash:
qm set VMID -args '-global q35-pcihost.pci-hole64-size=2048G'

It appears the default hole size, which from what I can find is 1024G, is insufficient to successfully address two NVIDIA cards used in this way. In my specific instance this was a Tesla K80, which is physically a single card but presents itself as two GPUs with two separate PCI addresses. Your mileage may vary.
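
A quick way to sanity-check the change (a sketch; the VMID and the guest PCI address are placeholders):

Code:
# on the host: confirm the flag ends up on the generated command line
qm showcmd VMID --pretty | grep pci-hole64-size

# inside the guest: check that the 64-bit BARs were actually assigned
# (01:00.0 is only an example address for the passed-through card)
lspci -vv -s 01:00.0 | grep -i region
dmesg | grep NVRM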
 
Would you mind detailing the complete set of configuration changes you had to make to your system to have 2x Tesla K80s working with Proxmox? I am banging my head against the wall trying to solve this problem. Thanks.
 
Apologies for the delay here, I missed this one.

I had to make two changes. The first was to modify /etc/default/grub:

Code:
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/rl-swap rd.lvm.lv=rl/root rd.lvm.lv=rl/swap pci=realloc rd.driver.blacklist=nouveau"
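
For completeness, after editing /etc/default/grub the bootloader configuration has to be regenerated and the host rebooted before the new kernel command line takes effect (a minimal sketch, assuming a standard GRUB setup on the host):

Code:
update-grub
reboot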

The second was, as described above,
Code:
qm set VMID -args '-global q35-pcihost.pci-hole64-size=2048G'

Hopefully that's helpful.
 
