Issues with multiple GPU passthrough when adding to VMs

alcatail

New Member
Apr 8, 2021
I followed the guide https://pve.proxmox.com/wiki/Pci_passthrough with success and can map a GTX 1060 to a Windows 10 machine. The issue is that I have two identical GTX 1060s, and whichever card is installed in the highest slot dictates which one works. By swapping them between PCI slots I was able to see both cards work, one at a time (one works, the other fails). I can only ever get one card to work, but Proxmox does see both cards and the slot references are correct.

The error presented for the card in the lower slot number is
Stopped: Start failed: QEMU exited with code 1
and it lists the PID of the process.

Could someone shed some light on this with any ideas?
I have been trying to work it out for two days, but I can't find it in the threads and Google is turning up nothing helpful.

It is not the dreaded Code 43; when a card shows up, it works perfectly. It is as if the process sees both cards as one spread across two slots, and only the card in the lower slot number is allowed to boot in the VM.

I have built an awesome gaming rig in a VM: an Intel i7 10700K on an MSI Z490 motherboard, running 16 GB of RAM.
The idea is to split the cards up across different VMs. Load-wise the 10700K eats it up and plays Forza 4 flawlessly, and I also mine on another VM and run a FreeNAS. So far I have built the miner and gaming-rig VMs, and I am aware that each video card can only be used by one VM at a time.

The miner uses very few resources when running.

One thing to note with the gaming rig: when the sound was crackling, I threw a heap of CPU at it and that fixed all issues. You wouldn't know the game was running inside a VM.

Edit: using Proxmox 6.3.

any help appreciated,

kind regards

Mick


Edit in reply to Ramalama:

1. for a in /sys/kernel/iommu_groups/*; do find $a -type l; done | sort --version-sort

/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/2/devices/0000:00:12.0
/sys/kernel/iommu_groups/3/devices/0000:00:14.0
/sys/kernel/iommu_groups/3/devices/0000:00:14.2
/sys/kernel/iommu_groups/4/devices/0000:00:16.0
/sys/kernel/iommu_groups/5/devices/0000:00:17.0
/sys/kernel/iommu_groups/6/devices/0000:00:1c.0
/sys/kernel/iommu_groups/6/devices/0000:00:1c.3
/sys/kernel/iommu_groups/6/devices/0000:00:1c.4
/sys/kernel/iommu_groups/6/devices/0000:01:00.0
/sys/kernel/iommu_groups/6/devices/0000:01:00.1
/sys/kernel/iommu_groups/6/devices/0000:03:00.0
/sys/kernel/iommu_groups/6/devices/0000:03:00.1
/sys/kernel/iommu_groups/7/devices/0000:00:1f.0
/sys/kernel/iommu_groups/7/devices/0000:00:1f.3
/sys/kernel/iommu_groups/7/devices/0000:00:1f.4
/sys/kernel/iommu_groups/7/devices/0000:00:1f.5
/sys/kernel/iommu_groups/7/devices/0000:00:1f.6
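Worth noting in the listing above: both GTX 1060s (0000:01:00.x and 0000:03:00.x) sit in group 6 together with the 00:1c.x PCI bridges, and VFIO can only hand a whole IOMMU group to one VM at a time — which lines up with the "failed to open /dev/vfio/6: Device or resource busy" error further down. A quick way to flag groups that hold more than one device (a sketch using the same sysfs paths and standard awk):

```shell
# Flag IOMMU groups containing more than one device -- devices sharing a
# group can only be passed through to a VM as a single unit.
for a in /sys/kernel/iommu_groups/*; do find "$a" -type l; done |
awk -F/ '{count[$5]++; devs[$5] = devs[$5] " " $7}
     END {for (g in count) if (count[g] > 1) print "group " g ":" devs[g]}'
```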


2. Edits so far: I have followed the guides stock standard and have the card working flawlessly on its own: https://pve.proxmox.com/wiki/Pci_passthrough and https://pve.proxmox.com/wiki/Pci_passthrough#GPU_Passthrough

The issue is related to setups with more than one card.


4. Proxmox version: 6.3


The error code thrown is:
kvm: -device vfio-pci,host=0000:03:00.0,id=hostpci0.0,bus=pci.0,addr=0x10.0,multifunction=on: vfio 0000:03:00.0: failed to open /dev/vfio/6: Device or resource busy

but it is actually the other card that is in use, not this one.
 
You probably need to blacklist your card with a GRUB cmdline edit.
It simply sounds like one card gets briefly accessed during boot, or before boot.

If that happens, it's about 99% certain that the passthrough itself will work, but the guest VM will show an error in Device Manager, boot to a black screen, etc.
In your case QEMU fails, so you could have a different issue.
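A minimal sketch of that kind of cmdline edit, assuming GRUB and an Intel board — the 10de:xxxx IDs below are placeholders, to be replaced with the real vendor:device pairs for the GPU and its HDMI audio function as shown by `lspci -nn`:

```shell
# /etc/default/grub -- bind both GTX 1060 functions (GPU + HDMI audio) to
# vfio-pci before any host driver can claim them.
# NOTE: 10de:1c03,10de:10f1 are placeholder IDs; read the real ones
# from `lspci -nn` on your host.
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio-pci.ids=10de:1c03,10de:10f1"
```

After editing, run `update-grub` and reboot; `lspci -nnk` should then show `vfio-pci` as the kernel driver in use for both cards.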

In any case, you need to provide more info, basically 4 things:
1. for a in /sys/kernel/iommu_groups/*; do find $a -type l; done | sort --version-sort
2. What you exactly edited so far.
3. journalctl -b
4. Proxmox version.

Cheers

Edit: you can redirect those commands with > anyfile.txt at the end to make it easier to copy the output here, or upload the files.
 
Thanks for the reply. I will be back home in the morning and will post the info you asked for.
I have followed the general guide to a tee, and with just one card in the machine it works perfectly.
Also, it takes the card with the lower slot number; I found a post where the same thing was happening with Ethernet NICs, and the behaviour was identical.

Is there somewhere I can spell out the PCI passthrough in the VM config file? I currently do it as per the guide, using q35 and the OVMF BIOS.
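For reference, the passthrough can be spelled out directly in the VM's config file under /etc/pve/qemu-server/. A sketch, assuming a placeholder VM ID of 100 and the PCI address from the listing above:

```shell
# /etc/pve/qemu-server/100.conf -- passthrough-relevant lines only
# (100 is a placeholder VM ID; the address comes from this host's listing)
bios: ovmf
machine: q35
# Omitting the function suffix (.0) passes all functions of the device,
# i.e. the GPU plus its HDMI audio:
hostpci0: 0000:01:00,pcie=1,x-vga=1
```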
 
