PCIe passthrough error on segmented hardware

lacathegreat

Active Member
Oct 10, 2019
Dear Proxmox mates!

We have decided to install Proxmox 6, to gain experience for the future, on an old HPC machine, an SGI UV2000. The machine has 12 nodes in two hardware partitions (one of 8 nodes and one of 4 nodes, which can boot independently), and each node has its own PCI bus that is separate from, but hierarchically coherent with, those of the other nodes. PCI addresses on this machine therefore have the form xxxx:xx:xx.x, whereas a simple PC motherboard uses the xx:xx.x scheme.
Everything installed well (host config, VM config, etc.), but one complicated issue emerged while creating a Windows 10 VM, and we cannot do anything about it: a GPU card fitted in a PCIe slot on node "B" collides with an NVMe SSD card situated in a PCIe slot on node "C". The PCIe passthrough selection dialog on the VM config panel showed a confused PCI layout.
The problem is that the /etc/pve/qemu-server/vm.conf file supports less of the PCI address than is necessary: "hostpci0: 01:00.0,pcie=1" instead of "0002:01:00.0" (bus:slot.function versus domain:bus:slot.function), and the vm.conf scheme does not allow the entry to be extended with the proper address. The NVMe SSD card sits in a different segment, but on a PCI bus with the same number: "0004:01:00.0". This confuses Proxmox: both cards seem to have the same PCI address, but they do not.
The GPU card is at "0002:01:00.0" and the NVMe SSD card is at "0004:01:00.0", but Proxmox does not handle the domain part of the PCI address.
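To make the collision concrete, here is a minimal Python sketch (not Proxmox code) that drops the domain from a list of full addresses and reports which short addresses become ambiguous. The function names and the third device, "0000:03:00.0", are made up for illustration; the other two addresses are the GPU and the NVMe card described above.

```python
import re
from collections import defaultdict

# Full form is domain:bus:slot.function, e.g. 0002:01:00.0; the domain is optional.
PCI_RE = re.compile(
    r"(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):(?P<slot>[0-9a-fA-F]{2})\.(?P<func>[0-7])"
)

def split_address(addr):
    """Return (domain, 'bus:slot.func'); a missing domain is treated as 0000."""
    m = PCI_RE.fullmatch(addr)
    if m is None:
        raise ValueError(f"not a PCI address: {addr!r}")
    domain = (m.group("domain") or "0000").lower()
    short = f"{m.group('bus')}:{m.group('slot')}.{m.group('func')}".lower()
    return domain, short

def find_collisions(addresses):
    """Group full addresses by their domain-less form and keep the ambiguous ones."""
    by_short = defaultdict(list)
    for addr in addresses:
        _, short = split_address(addr)
        by_short[short].append(addr)
    return {short: full for short, full in by_short.items() if len(full) > 1}

# GPU on node B, NVMe on node C, plus one hypothetical unrelated device.
devices = ["0002:01:00.0", "0004:01:00.0", "0000:03:00.0"]
print(find_collisions(devices))
# → {'01:00.0': ['0002:01:00.0', '0004:01:00.0']}
```

Any tool that stores only the bus:slot.function part sees the first two devices as one.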

For example, this issue does not exist on CentOS, which uses the common libvirt-based qemu-kvm; there, the Windows vm.xml config file includes the full PCI address:
<source>
<address domain='0x0002' bus='0x01' slot='0x00' function='0x0'/>
</source>
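As a rough illustration of how the four parts of the Linux-style address map onto the attributes in the snippet above, the following Python sketch builds the same element from a full address string. The helper name libvirt_source is hypothetical, and the sketch only formats text; it does not talk to libvirt.

```python
import xml.etree.ElementTree as ET

def libvirt_source(addr):
    """Turn 'domain:bus:slot.function' into a <source><address .../></source> string."""
    domain, bus, rest = addr.split(":")
    slot, func = rest.split(".")
    source = ET.Element("source")
    # Each part is written as a hex literal, as in the vm.xml snippet above.
    ET.SubElement(source, "address", {
        "domain": f"0x{domain}",
        "bus": f"0x{bus}",
        "slot": f"0x{slot}",
        "function": f"0x{func}",
    })
    return ET.tostring(source, encoding="unicode")

print(libvirt_source("0002:01:00.0"))
```

The point is that the domain has its own attribute here, so the GPU and the NVMe card can never be mixed up.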

In short: how can we avoid this issue? How can we properly configure a Proxmox Windows 10 VM with the full-length PCI address (domain:bus:slot.function), so that it avoids collisions with PCI cards in other HPC nodes?

Best Regards
 
sadly our code does not handle this currently, but as a current workaround, you can put the cards into the 'args' field of the vm config; there you have to manually state the qemu commandline parameters
(you can view them for a vm with 'qm showcmd ID --pretty' and adapt accordingly)

as for a general solution, please open an enhancement request (https://bugzilla.proxmox.com), but i can make no promises if/when this will be implemented (depending on how complicated it is in our codebase)
 
Dear Dominic,

Thanks for your quick response!
I think your suggestion sounds good!
Did you mean something like this in the qm.conf file: "args: -device vfio-pci,host=0002:01:00.0,multifunction=on" ?
(Sorry, but I can't find precise documentation on the KVM args, only vague allusions to them.)
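For reference, a small Python sketch that assembles such an 'args' line from a full PCI address, refusing the short form so the domain cannot be dropped by accident. The helper name build_args_line is hypothetical, the device options are simply the ones from the line quoted above, and nothing in this thread confirms that this exact line works on the UV2000; compare it with the output of 'qm showcmd ID --pretty' before using it.

```python
import re

# Require the full domain:bus:slot.function form, e.g. 0002:01:00.0.
FULL_PCI_RE = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")

def build_args_line(addr, multifunction=True):
    """Return an 'args:' config line passing the device via vfio-pci (unverified sketch)."""
    if not FULL_PCI_RE.fullmatch(addr):
        raise ValueError(f"expected domain:bus:slot.function, got {addr!r}")
    opts = ["vfio-pci", f"host={addr}"]
    if multifunction:
        opts.append("multifunction=on")
    return "args: -device " + ",".join(opts)

print(build_args_line("0002:01:00.0"))
# → args: -device vfio-pci,host=0002:01:00.0,multifunction=on
```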

Btw, implementing this in your source code would be a good thing, because Proxmox seems to be a more configurable and manageable solution than its competitors. I will open a request on your Bugzilla!

Best regards,
László