Multiple GPU passthrough does not work in a VM

achilless

New Member
Sep 5, 2023
Hello to all.
The problem: when I pass through a single GPU, the VM starts without any issue. After adding a second GPU, the VM does not start; it hangs or goes down somehow, and I do not know what happened. I could not find any log explaining why passthrough of the second GPU fails.
I have already tried a lot of settings, such as GRUB options.
How can I get multi-GPU passthrough working in a VM?
Here are my configurations:


* /etc/default/grub
Code:
root@gpuserver:~# cat /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
#GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""
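Edits to this file only take effect after regenerating the GRUB config (`update-grub`) and rebooting. A quick, non-destructive way to check that the flag really reached the running kernel (a generic sketch, nothing Proxmox-specific):

```shell
# Check whether intel_iommu=on is on the running kernel's command line;
# prints the flag if present, a warning otherwise (safe to run anywhere).
grep -o 'intel_iommu=on' /proc/cmdline || echo 'intel_iommu=on missing from /proc/cmdline'
```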

* dmesg | grep -e DMAR -e IOMMU
Code:
root@gpuserver:~# dmesg | grep -e DMAR -e IOMMU
[    0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[    0.013667] ACPI: DMAR 0x000000007C047628 000138 (v01 A M I  OEMDMAR  00000001 INTL 00000001)
[    0.013687] ACPI: Reserving DMAR table memory at [mem 0x7c047628-0x7c04775f]
[    0.565868] DMAR: IOMMU enabled
[    1.283707] DMAR: Host address width 46
[    1.283708] DMAR: DRHD base: 0x000000fbffe000 flags: 0x0
[    1.283717] DMAR: dmar0: reg_base_addr fbffe000 ver 1:0 cap d2078c106f0466 ecap f020de
[    1.283720] DMAR: DRHD base: 0x000000dfffc000 flags: 0x1
[    1.283725] DMAR: dmar1: reg_base_addr dfffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[    1.283727] DMAR: RMRR base: 0x0000007c652000 end: 0x0000007c660fff
[    1.283730] DMAR: ATSR flags: 0x0
[    1.283732] DMAR: RHSA base: 0x000000fbffe000 proximity domain: 0x0
[    1.283734] DMAR: RHSA base: 0x000000dfffc000 proximity domain: 0x0
[    1.283737] DMAR-IR: IOAPIC id 3 under DRHD base  0xfbffe000 IOMMU 0
[    1.283740] DMAR-IR: IOAPIC id 0 under DRHD base  0xdfffc000 IOMMU 1
[    1.283741] DMAR-IR: IOAPIC id 2 under DRHD base  0xdfffc000 IOMMU 1
[    1.283743] DMAR-IR: HPET id 0 under DRHD base 0xdfffc000
[    1.283745] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    1.284875] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    2.770751] DMAR: No SATC found
[    2.770756] DMAR: dmar0: Using Queued invalidation
[    2.770770] DMAR: dmar1: Using Queued invalidation
[    2.780090] DMAR: Intel(R) Virtualization Technology for Directed I/O
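Since the hang only appears with a second GPU, it is worth confirming that every card really sits in its own IOMMU group; devices that share a group cannot be split between host and guests. A small sketch over the standard sysfs layout (no Proxmox-specific tooling assumed):

```shell
# Print each PCI device together with its IOMMU group number. Every GPU
# (and its audio function) should appear in a group of its own.
found=0
for d in /sys/kernel/iommu_groups/*/devices/*; do
    [ -e "$d" ] || continue
    found=1
    g=${d%/devices/*}   # .../iommu_groups/<N>
    g=${g##*/}          # <N>
    printf 'group %s: %s\n' "$g" "${d##*/}"
done
[ "$found" -eq 1 ] || echo 'no IOMMU groups found (IOMMU disabled, or not run on the host)'
```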

* IOMMU group for all GPU devices
Code:
│ Class    │ Device │ PCI address  │ IOMMU group │ Vendor │ Name                     │ Subsystem device │ Subsystem vendor │
├──────────┼────────┼──────────────┼─────────────┼────────┼──────────────────────────┼──────────────────┼──────────────────┤
│ 0x030000 │ 0x2484 │ 0000:0a:00.0 │          44 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x136e           │ 0x196e           │
│ 0x030000 │ 0x2484 │ 0000:0b:00.0 │          46 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x136e           │ 0x196e           │
│ 0x030000 │ 0x2484 │ 0000:0c:00.0 │          48 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x136e           │ 0x196e           │
│ 0x030000 │ 0x2484 │ 0000:11:00.0 │          57 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x136e           │ 0x196e           │
│ 0x030000 │ 0x2484 │ 0000:12:00.0 │          59 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x136e           │ 0x196e           │
│ 0x030000 │ 0x2484 │ 0000:13:00.0 │          61 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x136e           │ 0x196e           │
│ 0x030000 │ 0x2484 │ 0000:14:00.0 │          63 │ 0x10de │ GA104 [GeForce RTX 3070] │ 0x2484           │ 0x1569           │

*my VM setting
Code:
root@gpuserver:~# cat /etc/pve/qemu-server/100.conf
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 8
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:0a:00,pcie=1
hostpci1: 0000:0b:00,pcie=1
ide2: local:iso/ubuntu-20.04.6-live-server-amd64.iso,media=cdrom,size=1452480K
machine: q35
memory: 16384
meta: creation-qemu=8.0.2,ctime=1706534112
name: GPUServer-1
net0: virtio=86:FB:E6:5C:44:FA,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-1,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=2f79e2b9-51b5-40a0-b63f-28a66a4773a0
sockets: 2
vmgenid: 1cb93cac-d3d2-492a-b6e4-b3ae98744455

* cat /etc/modules
Code:
root@gpuserver:~# cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
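Loading the vfio modules by itself does not make vfio-pci claim the cards at boot. One common approach, not shown in the config above and therefore an assumption about this setup, is to bind by vendor:device ID; since all seven cards report the same 10de:2484 ID, a single line claims them all:

```
# /etc/modprobe.d/vfio.conf  (hypothetical file name; all the 3070s share this ID)
options vfio-pci ids=10de:2484 disable_vga=1
```

After adding it, refresh the initramfs (`update-initramfs -u -k all`) and reboot; `lspci -nnk -s 0a:00.0` should then report `Kernel driver in use: vfio-pci`.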

* lsmod | grep vfio
Code:
root@gpuserver:~# lsmod | grep vfio
vfio_pci               16384  0
vfio_pci_core          94208  1 vfio_pci
irqbypass              16384  2 vfio_pci_core,kvm
vfio_iommu_type1       49152  0
vfio                   57344  3 vfio_pci_core,vfio_iommu_type1,vfio_pci
iommufd                73728  1 vfio

* /etc/modprobe.d/blacklist.conf
Code:
root@gpuserver:~# cat /etc/modprobe.d/blacklist.conf
blacklist nouveau
blacklist nvidia

* dmesg | grep 'remapping'
Code:
root@gpuserver:~# dmesg | grep 'remapping'
[    1.283745] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    1.284875] DMAR-IR: Enabled IRQ remapping in x2apic mode


* /etc/modprobe.d/iommu_unsafe_interrupts.conf
Code:
root@gpuserver:~# cat /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1
 
The problem: when I pass through a single GPU, the VM starts without any issue. After adding a second GPU, the VM does not start; it hangs or goes down somehow, and I do not know what happened. I could not find any log explaining why passthrough of the second GPU fails.
Is there nothing in journalctl (scroll with the arrow keys) from around the time of starting the VM with two GPUs? Can you show the (failed/hanging) start of VM 100 from that log?
 
Is there nothing in journalctl (scroll with the arrow keys) from around the time of starting the VM with two GPUs? Can you show the (failed/hanging) start of VM 100 from that log?

I could not see any error or warning in the journalctl output.
Is there a specific keyword I should look for?
 
Find the time when VM 100 starts (with two GPUs) and show twenty lines or so around it.

Here is the output. The VM was started at 17:03:

Code:
Jan 30 17:03:12 gpuserver systemd[1808]: Reached target default.target - Main User Target.
Jan 30 17:03:12 gpuserver systemd[1808]: Startup finished in 159ms.
Jan 30 17:03:12 gpuserver systemd[1]: Started user@0.service - User Manager for UID 0.
Jan 30 17:03:12 gpuserver systemd[1]: Started session-1.scope - Session 1 of User root.
Jan 30 17:03:12 gpuserver login[1824]: ROOT LOGIN on '/dev/pts/0'
Jan 30 17:03:16 gpuserver kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
Jan 30 17:03:16 gpuserver kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.0: No more image in the PCI ROM
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0a:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.0: enabling device (0000 -> 0003)
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.1: enabling device (0000 -> 0002)
Jan 30 17:03:16 gpuserver kernel: vfio-pci 0000:0b:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
Jan 30 17:03:17 gpuserver pvedaemon[1428]: <root@pam> end task UPID:gpuserver:0000064B:000022A4:65B9019C:qmstart:100:root@pam: OK
Jan 30 17:03:17 gpuserver pvestatd[1401]: status update time (7.850 seconds)
Jan 30 17:03:23 gpuserver kernel: vfio-pci 0000:0a:00.0: No more image in the PCI ROM
Jan 30 17:03:23 gpuserver kernel: vfio-pci 0000:0a:00.0: No more image in the PCI ROM
Jan 30 17:03:51 gpuserver systemd[1]: session-1.scope: Deactivated successfully.
Jan 30 17:03:51 gpuserver systemd-logind[1065]: Session 1 logged out. Waiting for processes to exit.
Jan 30 17:03:51 gpuserver systemd-logind[1065]: Removed session 1.
Jan 30 17:03:51 gpuserver pvedaemon[1430]: <root@pam> end task UPID:gpuserver:00000704:000023E7:65B901A0:vncshell::root@pam: OK
Jan 30 17:03:52 gpuserver pvedaemon[1919]: starting vnc proxy UPID:gpuserver:0000077F:00003388:65B901C8:vncproxy:100:root@pam:
Jan 30 17:03:52 gpuserver pvedaemon[1428]: <root@pam> starting task UPID:gpuserver:0000077F:00003388:65B901C8:vncproxy:100:root@pam:
Jan 30 17:03:55 gpuserver pvedaemon[1428]: <root@pam> end task UPID:gpuserver:0000077F:00003388:65B901C8:vncproxy:100:root@pam: OK
Jan 30 17:03:55 gpuserver pvedaemon[1927]: starting termproxy UPID:gpuserver:00000787:000034B5:65B901CB:vncshell::root@pam:
Jan 30 17:03:55 gpuserver pvedaemon[1428]: <root@pam> starting task UPID:gpuserver:00000787:000034B5:65B901CB:vncshell::root@pam:
Jan 30 17:03:55 gpuserver pvedaemon[1429]: <root@pam> successful auth for user 'root@pam'
Jan 30 17:03:55 gpuserver login[1930]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
Jan 30 17:03:55 gpuserver systemd-logind[1065]: New session 3 of user root.
Jan 30 17:03:55 gpuserver systemd[1]: Started session-3.scope - Session 3 of User root.
Jan 30 17:03:55 gpuserver login[1935]: ROOT LOGIN on '/dev/pts/0'
 
Indeed, not much stands out. No messages about reset issues with any of the GPUs.
Jan 30 17:03:16 gpuserver kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
Jan 30 17:03:16 gpuserver kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
I have not seen this before. Maybe try another PCIe slot?
 
Indeed, not much stands out. No messages about reset issues with any of the GPUs.

I have not seen this before. Maybe try another PCIe slot?

What should I do?
All PCIe slots are already populated with GPUs, and the VM does not work with more than one GPU.
 
I have found the problem. I think this GPU has an issue:
Code:
hostpci0: 0000:0a:00,pcie=1

That is why the VM hangs. When I disable that PCIe slot, I can activate more than one GPU.
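If it helps anyone pinpoint a suspect card or slot: one way is to bisect by attaching the GPUs to the VM one at a time and noting which address hangs the start. A rough sketch using the standard `qm` CLI (addresses taken from the config above; run on the host with the VM stopped, and note that a hanging start may need a hard reset):

```shell
# Hypothetical bisection: try each GPU alone as hostpci0 on VM 100.
# Guarded so the sketch is a no-op when not run on a Proxmox VE host.
if ! command -v qm >/dev/null 2>&1; then
    echo 'qm not found - run this on the Proxmox VE host'
    exit 0
fi
for addr in 0000:0a:00 0000:0b:00 0000:0c:00; do
    qm set 100 --hostpci0 "$addr,pcie=1"
    if qm start 100; then
        echo "$addr: VM started OK"
        qm stop 100
    else
        echo "$addr: VM failed to start"
    fi
done
```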
 
