[SOLVED] GPU Passthrough, 2 identical GPUs, same VM

Jan 10, 2022
I have 2x Nvidia Titan RTXs that I am trying to pass through to a single VM (Machine Learning).

One is in the primary slot of the motherboard. I can pass through each of them successfully on its own, but cannot pass them through at the same time. Is there some sort of trick to get this to work?
 
What happens when you configure them both and start the VM?
Can you post the VM config and the host logs (journal/dmesg)?
 
The console shows a black screen. Proxmox reports that the VM starts OK (the VM is Windows 11). The VM has RDP enabled, but I cannot ping it or connect to it.

I have about 48 hours of experience with Proxmox. I came from Unraid, so I have some experience with passthrough from there. That being said, here are the logs I think you are asking for:

syslog (1st line is me adding the 2nd gpu):
Code:
Jan 11 09:00:41 pve pvedaemon[1342029]: <root@pam> update VM 102: -hostpci1 0000:4c:00,pcie=1
Jan 11 09:00:52 pve pvedaemon[965337]: start VM 102: UPID:pve:000EBAD9:006F5258:61DDA9B4:qmstart:102:root@pam:
Jan 11 09:00:52 pve pvedaemon[1414260]: <root@pam> starting task UPID:pve:000EBAD9:006F5258:61DDA9B4:qmstart:102:root@pam:
Jan 11 09:00:52 pve systemd[1]: Started 102.scope.
Jan 11 09:00:52 pve systemd-udevd[965351]: Using default interface naming scheme 'v247'.
Jan 11 09:00:52 pve systemd-udevd[965351]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jan 11 09:00:53 pve kernel: device tap102i0 entered promiscuous mode
Jan 11 09:00:53 pve systemd-udevd[965386]: Using default interface naming scheme 'v247'.
Jan 11 09:00:53 pve systemd-udevd[965386]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jan 11 09:00:53 pve systemd-udevd[965386]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jan 11 09:00:53 pve systemd-udevd[965351]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jan 11 09:00:53 pve kernel: fwbr102i0: port 1(fwln102i0) entered blocking state
Jan 11 09:00:53 pve kernel: fwbr102i0: port 1(fwln102i0) entered disabled state
Jan 11 09:00:53 pve kernel: device fwln102i0 entered promiscuous mode
Jan 11 09:00:53 pve kernel: fwbr102i0: port 1(fwln102i0) entered blocking state
Jan 11 09:00:53 pve kernel: fwbr102i0: port 1(fwln102i0) entered forwarding state
Jan 11 09:00:53 pve kernel: vmbr0: port 3(fwpr102p0) entered blocking state
Jan 11 09:00:53 pve kernel: vmbr0: port 3(fwpr102p0) entered disabled state
Jan 11 09:00:53 pve kernel: device fwpr102p0 entered promiscuous mode
Jan 11 09:00:53 pve kernel: vmbr0: port 3(fwpr102p0) entered blocking state
Jan 11 09:00:53 pve kernel: vmbr0: port 3(fwpr102p0) entered forwarding state
Jan 11 09:00:53 pve kernel: fwbr102i0: port 2(tap102i0) entered blocking state
Jan 11 09:00:53 pve kernel: fwbr102i0: port 2(tap102i0) entered disabled state
Jan 11 09:00:53 pve kernel: fwbr102i0: port 2(tap102i0) entered blocking state
Jan 11 09:00:53 pve kernel: fwbr102i0: port 2(tap102i0) entered forwarding state
Jan 11 09:00:54 pve kernel: vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
Jan 11 09:00:54 pve kernel: vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Jan 11 09:00:54 pve kernel: vfio-pci 0000:01:00.0: No more image in the PCI ROM
Jan 11 09:00:54 pve kernel: vfio-pci 0000:4c:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
Jan 11 09:00:54 pve kernel: vfio-pci 0000:4c:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Jan 11 09:00:57 pve pvedaemon[1414260]: <root@pam> end task UPID:pve:000EBAD9:006F5258:61DDA9B4:qmstart:102:root@pam: OK
Jan 11 09:01:00 pve pmxcfs[7466]: [status] notice: received log
Jan 11 09:01:00 pve sshd[965585]: Accepted publickey for root from 192.168.1.43 port 57136 ssh2: RSA SHA256:MPrn1xBLhuz6YqFI5DNrp4iUqZoHq/UPBhTy6q8IG3M
Jan 11 09:01:00 pve sshd[965585]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jan 11 09:01:00 pve systemd-logind[6955]: New session 66 of user root.
Jan 11 09:01:00 pve systemd[1]: Started Session 66 of user root.
Jan 11 09:01:39 pve sshd[965585]: Received disconnect from 192.168.1.43 port 57136:11: disconnected by user
Jan 11 09:01:39 pve sshd[965585]: Disconnected from user root 192.168.1.43 port 57136
Jan 11 09:01:39 pve sshd[965585]: pam_unix(sshd:session): session closed for user root
Jan 11 09:01:39 pve systemd[1]: session-66.scope: Succeeded.
Jan 11 09:01:39 pve systemd-logind[6955]: Session 66 logged out. Waiting for processes to exit.
Jan 11 09:01:39 pve systemd-logind[6955]: Removed session 66.
Jan 11 09:01:39 pve pmxcfs[7466]: [status] notice: received log
Jan 11 09:01:39 pve pmxcfs[7466]: [status] notice: received log
Jan 11 09:01:40 pve sshd[966346]: Accepted publickey for root from 192.168.1.43 port 57140 ssh2: RSA SHA256:MPrn1xBLhuz6YqFI5DNrp4iUqZoHq/UPBhTy6q8IG3M
Jan 11 09:01:40 pve sshd[966346]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jan 11 09:01:40 pve systemd-logind[6955]: New session 67 of user root.
Jan 11 09:01:40 pve systemd[1]: Started Session 67 of user root.
Jan 11 09:01:46 pve sshd[966346]: Received disconnect from 192.168.1.43 port 57140:11: disconnected by user
Jan 11 09:01:46 pve sshd[966346]: Disconnected from user root 192.168.1.43 port 57140
Jan 11 09:01:46 pve sshd[966346]: pam_unix(sshd:session): session closed for user root
Jan 11 09:01:46 pve systemd[1]: session-67.scope: Succeeded.
Jan 11 09:01:46 pve systemd-logind[6955]: Session 67 logged out. Waiting for processes to exit.
Jan 11 09:01:46 pve systemd-logind[6955]: Removed session 67.
Jan 11 09:01:46 pve pmxcfs[7466]: [status] notice: received log

DMESG: identical to the syslog output above.

vm config:
Code:
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'
bios: ovmf
boot: order=ide0;ide2;net0
cores: 12
cpu: host,hidden=1,flags=+pcid
efidisk0: local-zfs:vm-102-disk-4,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:01:00,pcie=1,romfile=titanrtx.bin
hostpci1: 0000:4c:00,pcie=1
ide0: local-zfs:vm-102-disk-3,discard=on,size=500G
ide2: ISO:iso/virtio-win.iso,media=cdrom,size=543390K
machine: q35
memory: 8192
meta: creation-qemu=6.1.0,ctime=1641837389
name: Crypto
net0: virtio=1A:8E:04:BA:E0:D8,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-pci
smbios1: uuid=10e3bda4-1df4-47fc-b36d-7ffb54d2e46e
sockets: 1
tpmstate0: local-zfs:vm-102-disk-5,size=4M,version=v2.0
vmgenid: b06167c7-e2f6-4e8d-af30-9deb11e212fc
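
To compare the two passthrough entries at a glance, here is a minimal Python sketch that splits the `hostpci` lines of a config like the one above into an address plus options. The `parse_hostpci` helper and the `CONFIG` excerpt are illustrative only (the lines are copied from the config above); this is not a Proxmox API, just plain string handling.

```python
# Excerpt copied from the VM config above; only the hostpci lines matter here.
CONFIG = """\
bios: ovmf
hostpci0: 0000:01:00,pcie=1,romfile=titanrtx.bin
hostpci1: 0000:4c:00,pcie=1
machine: q35
"""


def parse_hostpci(config_text):
    """Return {entry_name: {"address": ..., option: value, ...}} for hostpci lines.

    Hypothetical helper for illustration; assumes the address is the first
    comma-separated field, as in the config above.
    """
    devices = {}
    for line in config_text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep or not key.startswith("hostpci"):
            continue
        address, *options = value.split(",")
        entry = {"address": address}
        for opt in options:
            name, _, setting = opt.partition("=")
            entry[name] = setting
        devices[key] = entry
    return devices


devices = parse_hostpci(CONFIG)
for name, entry in devices.items():
    print(name, entry)
```

Run against the config above, this makes the one visible difference between the two entries explicit: only `hostpci0` carries a `romfile` option.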
 
I'm having the same problem. I have 2 RTX 4090s, and either card can be passed through on its own, but when both are passed through the VM doesn't boot and shows "Display output is not active". I tried enabling ROM-Bar for both cards, and also enabling it for only one of them. The VM fails to boot either way.

The CPU is an i7-11700KF.

Has anyone successfully passed through 2 GPUs to one Windows VM?
 

If you have the same cards, you also have the same/identical PCIe IDs.
It's like having the same MAC address twice on a layer 2 network.
 
If you have the same cards, you also have the same/identical PCIe IDs.
It's like having the same MAC address twice on a layer 2 network.

Ahhh. What's interesting is that I can have 2 different VMs running simultaneously, each with one of the GPUs passed through, and both VMs run their GPU OK. However, putting both GPUs into one VM prevents Proxmox from starting it up successfully.

Another interesting scenario: I have another machine with 4 RTX 4090s, and an Ubuntu VM on it is able to start up, access, and run all 4 of them successfully. I'm not sure why 2 GPUs in the i7-11700KF setup are having an issue. Could it possibly be related to Windows?
 
If you have the same cards, you also have the same/identical PCIe IDs.
It depends on what you mean by PCIe ID. The PCI IDs (the bus addresses) are different; in this case 01:00.0 and 04:00.0 (for the VGA function). The device IDs (not shown, but of the form xyzw:abcd when using lspci -nn) are probably identical for (mostly) identical devices, although they can differ by brand, or when the hardware changed over time but the model name did not.
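
To illustrate the distinction, here is a small Python sketch that groups `lspci -nn` style lines by their vendor:device ID. The sample lines are made up for illustration (the addresses match the ones mentioned above, but the IDs and device names were not posted in this thread), and `group_by_device_id` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Hypothetical `lspci -nn` output for two identical cards; not from this thread.
SAMPLE = """\
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation AD102 High Definition Audio Controller [10de:22ba] (rev a1)
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)
04:00.1 Audio device [0403]: NVIDIA Corporation AD102 High Definition Audio Controller [10de:22ba] (rev a1)
"""

# Address at the start of the line, vendor:device ID in the last [xxxx:xxxx] bracket.
LINE_RE = re.compile(r"^(?P<addr>[0-9a-f:.]+)\s.*\[(?P<ids>[0-9a-f]{4}:[0-9a-f]{4})\]")


def group_by_device_id(text):
    """Map each vendor:device ID to the list of PCI addresses that report it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            groups[match.group("ids")].append(match.group("addr"))
    return dict(groups)


groups = group_by_device_id(SAMPLE)
for ids, addrs in groups.items():
    print(ids, "->", ", ".join(addrs))
```

In this sample, both cards report the same vendor:device ID but sit at different addresses, and the `hostpci` entries in a VM config refer to devices by address.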
 
I'm finding out more. If you switch the VM's BIOS from OVMF (UEFI) to SeaBIOS, the virtual machine is able to boot up with both of the GPUs. The GPUs are identical, so having identical cards doesn't seem to be the problem with this setting. Now Windows isn't booting without UEFI. On to the next problem to solve, haha.
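
For anyone who wants to try the same switch from the command line instead of the web UI, here is a minimal Python sketch that only builds the `qm set` command and prints it, so nothing changes until you run it yourself on the host. The VM ID 102 is a placeholder and `bios_switch_command` is a hypothetical helper.

```python
import shlex


def bios_switch_command(vmid, firmware):
    """Build (but do not run) the `qm set` call that changes a VM's firmware."""
    if firmware not in ("seabios", "ovmf"):
        raise ValueError(f"unsupported firmware: {firmware!r}")
    return ["qm", "set", str(vmid), "--bios", firmware]


# Placeholder VM ID; replace with the ID of the VM you want to change.
cmd = bios_switch_command(102, "seabios")
print(shlex.join(cmd))
```

Switching back is the same call with `"ovmf"`, which matters here because the Windows install above does not boot without UEFI.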