Help setting up Multi GPU Passthrough

jbarry14

New Member
May 16, 2024
I am new to Proxmox and I am trying to set up GPU passthrough for a Windows 10 VM and a macOS VM.
Machine Specs:
Proxmox 8.2.2
CPU: i9-12900KS
RAM: Corsair Dominator Platinum 64GB DDR5
MOBO: Gigabyte Aorus Master
GPU 1: Sapphire RX6950XT Nitro+ Pure
GPU 2: Sapphire RX6900XT Nitro+ SE

I was able to pass through the 6900 XT successfully. It works perfectly in the Windows 10 VM, but gets stuck on the Apple logo in the macOS VM. I keep getting errors when trying to pass through the 6950 XT to either VM. I have followed many guides and done a lot of research, but I can't find a solution for the 6950 XT. I would like to use the 6900 XT for the macOS VM, so that I do not need to spoof the device ID, and the 6950 XT for the Windows 10 VM. Is it possible to set this up so that Proxmox uses the iGPU and the other two GPUs go to the VMs? I would appreciate any help with this. Thanks in advance.

Win10 Config File
Code:
root@Proxmox:~# cat /etc/pve/qemu-server/101.conf
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'
balloon: 0
bios: ovmf
boot: order=virtio0;ide0;ide2;net0
cores: 8
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:03:00,pcie=1,x-vga=1
ide0: local:iso/virtio-win-0.1.248.iso,media=cdrom,size=715188K
ide2: local:iso/Windows.iso,media=cdrom,size=4779200K
machine: q35
memory: 8192
meta: creation-qemu=8.1.5,ctime=1715615951
name: Windows10
net0: virtio=BC:24:11:AF:E9:52,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsihw: virtio-scsi-single
smbios1: uuid=11159ac2-26ef-406f-802d-330abf45c4f7
sockets: 1
tpmstate0: local-lvm:vm-101-disk-1,size=4M,version=v2.0
usb0: host=1ea7:0064
usb1: host=0c45:760b
vga: virtio
virtio0: local-lvm:vm-101-disk-2,iothread=1,size=64G
vmgenid: 40c3b7b3-013f-430c-933a-b647c2b70d27

MacOS VM Config
Code:
root@Proxmox:~#  cat /etc/pve/qemu-server/1400.conf
args: -device isa-applesmc,osk="ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc" -smbios type=2 -device qemu-xhci -device usb-kbd -device usb-tablet -global nec-usb-xhci.msi=off -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off -cpu host,vendor=GenuineIntel,+invtsc,+hypervisor,kvm=on,vmware-cpuid-freq=on
balloon: 0
bios: ovmf
boot: order=virtio0;net0
cores: 10
cpu: host
efidisk0: local-lvm:vm-1400-disk-0,efitype=4m,size=4M
hostpci0: 0000:08:00,pcie=1,x-vga=1
machine: q35
memory: 24576
meta: creation-qemu=8.1.5,ctime=1714664464
name: MacOSSonoma
net0: vmxnet3=BC:24:11:90:25:B0,bridge=vmbr0,firewall=1
numa: 0
ostype: other
scsihw: virtio-scsi-pci
smbios1: uuid=a641f28c-ad0e-43ea-81f6-6be3b82e14f3
sockets: 1
usb0: host=1ea7:0064,usb3=1
usb1: host=0c45:760b,usb3=1
vga: none
virtio0: local-lvm:vm-1400-disk-1,cache=unsafe,iothread=1,size=60G
vmgenid: 119e34c9-bd42-4034-8f74-cf2212e86cac

This is the system log when trying to start the Windows 10 VM with the 6950 XT:
Code:
May 16 09:35:43 Proxmox systemd[1]: Started 101.scope.
May 16 09:35:44 Proxmox kernel: tap101i0: entered promiscuous mode
May 16 09:35:44 Proxmox kernel: vmbr0: port 2(fwpr101p0) entered blocking state
May 16 09:35:44 Proxmox kernel: vmbr0: port 2(fwpr101p0) entered disabled state
May 16 09:35:44 Proxmox kernel: fwpr101p0: entered allmulticast mode
May 16 09:35:44 Proxmox kernel: fwpr101p0: entered promiscuous mode
May 16 09:35:44 Proxmox kernel: vmbr0: port 2(fwpr101p0) entered blocking state
May 16 09:35:44 Proxmox kernel: vmbr0: port 2(fwpr101p0) entered forwarding state
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 1(fwln101i0) entered blocking state
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 1(fwln101i0) entered disabled state
May 16 09:35:44 Proxmox kernel: fwln101i0: entered allmulticast mode
May 16 09:35:44 Proxmox kernel: fwln101i0: entered promiscuous mode
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 1(fwln101i0) entered blocking state
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 1(fwln101i0) entered forwarding state
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 2(tap101i0) entered blocking state
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 2(tap101i0) entered disabled state
May 16 09:35:44 Proxmox kernel: tap101i0: entered allmulticast mode
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 2(tap101i0) entered blocking state
May 16 09:35:44 Proxmox kernel: fwbr101i0: port 2(tap101i0) entered forwarding state
May 16 09:35:45 Proxmox kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 16 09:35:45 Proxmox kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 16 09:35:45 Proxmox kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 16 09:35:45 Proxmox kernel: pcieport 0000:00:01.0: AER: Uncorrectable (Non-Fatal) error message received from 0000:02:00.0
May 16 09:35:45 Proxmox kernel: tap101i0: left allmulticast mode
May 16 09:35:45 Proxmox kernel: fwbr101i0: port 2(tap101i0) entered disabled state
May 16 09:35:45 Proxmox kernel: fwbr101i0: port 1(fwln101i0) entered disabled state
May 16 09:35:45 Proxmox kernel: vmbr0: port 2(fwpr101p0) entered disabled state
May 16 09:35:45 Proxmox kernel: fwln101i0 (unregistering): left allmulticast mode
May 16 09:35:45 Proxmox kernel: fwln101i0 (unregistering): left promiscuous mode
May 16 09:35:45 Proxmox kernel: fwbr101i0: port 1(fwln101i0) entered disabled state
May 16 09:35:45 Proxmox kernel: fwpr101p0 (unregistering): left allmulticast mode
May 16 09:35:45 Proxmox kernel: fwpr101p0 (unregistering): left promiscuous mode
May 16 09:35:45 Proxmox kernel: vmbr0: port 2(fwpr101p0) entered disabled state
May 16 09:35:45 Proxmox pvedaemon[2981]: stopping swtpm instance (pid 2989) due to QEMU startup error
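
Most of the log above is bridge and tap interface setup for the VM's network device. A small filter makes the relevant lines easier to spot. This is only a sketch: `filter_passthrough_errors` is a made-up name, and the patterns are simply the strings that appear in this log.

```shell
#!/bin/sh
# Hypothetical helper: keep only the log lines that relate to the
# passthrough failure. The patterns come from the messages in this log
# and are not an exhaustive list of VFIO errors.
filter_passthrough_errors() {
    grep -E 'vfio-pci|AER:|QEMU startup error'
}

# Example usage on the host (not run here):
#   journalctl -b | filter_passthrough_errors
```

Run against this log, it keeps the three `D3cold` lines for 0000:03:00.0, the AER message from 0000:02:00.0, and the swtpm shutdown line.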

This is the output from the task viewer:
Code:
swtpm_setup: Not overwriting existing state file.
kvm: -device vfio-pci,host=0000:03:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,rombar=0,multifunction=on: vfio 0000:03:00.0: error getting device from group 16: No such device
Verify all devices in group 16 are bound to vfio-<bus> or pci-stub and not already in use
stopping swtpm instance (pid 2989) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1
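
The error asks to verify which driver each device in group 16 is bound to. One way to check is to walk the group's `devices` directory in sysfs and print the driver symlink target for each entry. This is a sketch: `list_iommu_group` is a hypothetical helper, and its optional second argument (an alternate sysfs root) exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# Hypothetical helper: print "<pci-address> <driver>" for every device
# in the given IOMMU group. The second argument defaults to /sys and is
# only there so the function can be exercised off-host.
list_iommu_group() {
    group="$1"
    root="${2:-/sys}"
    for dev in "$root/kernel/iommu_groups/$group/devices"/*; do
        # Skip the unexpanded glob when the group does not exist.
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # The driver entry is a symlink; its last path component
            # is the driver name (for example vfio-pci).
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}

# Example usage on the host (not run here):
#   list_iommu_group 16
```

If the function prints nothing for group 16, the group directory is empty or missing.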

lspci -v output:
Code:
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c0) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 14
Memory at 42b00000 (32-bit, non-prefetchable) [size=16K]

Bus: primary=01, secondary=02, subordinate=03, sec-latency=0
I/O behind bridge: 4000-4fff [size=4K] [16-bit]

Memory behind bridge: 42900000-42afffff [size=2M] [32-bit]
Prefetchable memory behind bridge: 60000000-77ffffff [size=384M] [32-bit]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Upstream Port, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] Secondary PCI Express
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [400] Data Link Feature <?>
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport

02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Flags: bus master, fast devsel, latency 0, IRQ 126, IOMMU group 15
Bus: primary=02, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 4000-4fff [size=4K] [16-bit]

Memory behind bridge: 42900000-42afffff [size=2M] [32-bit]
Prefetchable memory behind bridge: 60000000-77ffffff [size=384M] [32-bit]
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Downstream Port (Slot-), MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [c0] Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] Secondary PCI Express
Capabilities: [2a0] Access Control Services
Capabilities: [400] Data Link Feature <?>
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport

03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6950 XT] (rev c0) (prog-if 00 [VGA controller])
Subsystem: Sapphire Technology Limited Navi 21 [Radeon RX 6950 XT]
!!! Unknown header type 7f
Memory at 60000000 (64-bit, prefetchable) [size=256M]

Memory at 70000000 (64-bit, prefetchable) [size=2M]
I/O ports at 4000
Memory at 42900000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 42a00000 [disabled] [size=128K]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu

03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
!!! Unknown header type 7f
Memory at 42a20000 (32-bit, non-prefetchable) [size=16K]

Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel

06:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c0) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 20
Memory at 42800000 (32-bit, non-prefetchable) [size=16K]

Bus: primary=06, secondary=07, subordinate=08, sec-latency=0
I/O behind bridge: 3000-3fff [size=4K] [16-bit]

Memory behind bridge: 42600000-427fffff [size=2M] [32-bit]
Prefetchable memory behind bridge: 78000000-8fffffff [size=384M] [32-bit]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Upstream Port, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] Secondary PCI Express
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [370] L1 PM Substates
Capabilities: [400] Data Link Feature <?>
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport

07:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Flags: bus master, fast devsel, latency 0, IRQ 127, IOMMU group 21
Bus: primary=07, secondary=08, subordinate=08, sec-latency=0
I/O behind bridge: 3000-3fff [size=4K] [16-bit]

Memory behind bridge: 42600000-427fffff [size=2M] [32-bit]
Prefetchable memory behind bridge: 78000000-8fffffff [size=384M] [32-bit]
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Downstream Port (Slot-), MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [c0] Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] Secondary PCI Express
Capabilities: [2a0] Access Control Services
Capabilities: [400] Data Link Feature <?>
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport

08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c0) (prog-if 00 [VGA controller])
Subsystem: Sapphire Technology Limited Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
Flags: bus master, fast devsel, latency 0, IRQ 11, IOMMU group 22
Memory at 80000000 (64-bit, prefetchable) [size=256M]

Memory at 78000000 (64-bit, prefetchable) [size=2M]
I/O ports at 3000
Memory at 42600000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 42700000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Physical Resizable BAR
Capabilities: [240] Power Budgeting <?>
Capabilities: [270] Secondary PCI Express
Capabilities: [2a0] Access Control Services
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel modules: amdgpu

08:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Flags: bus master, fast devsel, latency 0, IRQ 10, IOMMU group 23
Memory at 42720000 (32-bit, non-prefetchable) [size=16K]

Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [2a0] Access Control Services
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel



/etc/modprobe.d/vfio.conf:
Code:
options vfio-pci ids=: 1002:73a5, 1002:73bf disable_vga=1 diable_idle_d3=1
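
Three things in the line above do not match the usual modprobe `options` syntax: the colon after `ids=`, the spaces after the commas (module options are whitespace-separated `name=value` pairs), and `diable_idle_d3` (the vfio-pci parameter is spelled `disable_idle_d3`). Below is a minimal sketch that generates a well-formed line, assuming the intent was to bind both GPU IDs listed above. `vfio_options_line` is a hypothetical helper, not a Proxmox tool.

```shell
#!/bin/sh
# Hypothetical helper: join vendor:device IDs with commas (no spaces)
# and print a vfio-pci options line with the same two flags the
# original line was trying to set.
vfio_options_line() {
    ids=""
    for id in "$@"; do
        if [ -z "$ids" ]; then
            ids="$id"
        else
            ids="$ids,$id"
        fi
    done
    printf 'options vfio-pci ids=%s disable_vga=1 disable_idle_d3=1\n' "$ids"
}

# IDs copied from the vfio.conf line above.
vfio_options_line 1002:73a5 1002:73bf
```

Whether this alone clears the D3cold error is untested here. Changes under `/etc/modprobe.d/` normally need `update-initramfs -u -k all` and a reboot before they take effect.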

/etc/modprobe.d/pve-blacklist.conf
Code:
blacklist nvidiafb
blacklist amdgpu
blacklist radeon

/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/default/grub
Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt video=efifb:off"
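
Edits to `/etc/default/grub` only take effect after `update-grub` and a reboot, so it can help to confirm that the running kernel actually received these parameters. Below is a sketch: `has_kernel_param` is a hypothetical helper, and the optional second argument stands in for `/proc/cmdline` when trying it against a test file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the exact parameter appears as a
# whole word on the kernel command line.
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # One parameter per line, then match the whole line literally.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Example usage on the host (not run here):
#   has_kernel_param intel_iommu=on && echo "intel_iommu=on is active"
#   has_kernel_param iommu=pt       && echo "iommu=pt is active"
```

If a parameter set in the GRUB file is missing from `/proc/cmdline`, the host may be booting through a different loader configuration than the one that was edited.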
GRUB_CMDLINE_LINUX=""
 
