Hello everyone.
The problem: when I pass through one GPU, the VM starts without any issue. After I add a second GPU, the VM does not start. It either goes down or hangs, and I cannot tell which. I have not found any log that explains why passthrough of the second GPU fails.
I have tried many settings, such as different GRUB options.
How can I get multiple GPUs working in a single VM?
Here is my configuration:
* /etc/default/grub
Code:
root@gpuserver:~# cat /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
#GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""
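The file above only shows what GRUB is configured to pass, not what the running kernel actually received. The DMAR output in this post reports "PCIe ACS overrides enabled", which would normally come from a `pcie_acs_override=` parameter that is not in this file, so the two may differ. A minimal sketch for checking the live command line follows; the helper name `has_kernel_param` is made up for illustration, and the second argument exists only so the function can be pointed at a test file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if an exact parameter is present on the
# kernel command line (default: the running kernel's /proc/cmdline).
has_kernel_param() {
    local param="$1" file="${2:-/proc/cmdline}"
    # One parameter per line, then look for an exact whole-line match
    tr ' ' '\n' < "$file" | grep -qx -- "$param"
}

# Parameters taken from GRUB_CMDLINE_LINUX_DEFAULT in the post
for p in intel_iommu=on iommu=pt; do
    if has_kernel_param "$p"; then
        echo "$p: active"
    else
        echo "$p: missing from running kernel"
    fi
done
```

If a parameter is reported as missing, the GRUB configuration may not have been regenerated, or the host may boot through a different loader than GRUB.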
* dmesg | grep -e DMAR -e IOMMU
Code:
root@gpuserver:~# dmesg | grep -e DMAR -e IOMMU
[ 0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[ 0.013667] ACPI: DMAR 0x000000007C047628 000138 (v01 A M I OEMDMAR 00000001 INTL 00000001)
[ 0.013687] ACPI: Reserving DMAR table memory at [mem 0x7c047628-0x7c04775f]
[ 0.565868] DMAR: IOMMU enabled
[ 1.283707] DMAR: Host address width 46
[ 1.283708] DMAR: DRHD base: 0x000000fbffe000 flags: 0x0
[ 1.283717] DMAR: dmar0: reg_base_addr fbffe000 ver 1:0 cap d2078c106f0466 ecap f020de
[ 1.283720] DMAR: DRHD base: 0x000000dfffc000 flags: 0x1
[ 1.283725] DMAR: dmar1: reg_base_addr dfffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[ 1.283727] DMAR: RMRR base: 0x0000007c652000 end: 0x0000007c660fff
[ 1.283730] DMAR: ATSR flags: 0x0
[ 1.283732] DMAR: RHSA base: 0x000000fbffe000 proximity domain: 0x0
[ 1.283734] DMAR: RHSA base: 0x000000dfffc000 proximity domain: 0x0
[ 1.283737] DMAR-IR: IOAPIC id 3 under DRHD base 0xfbffe000 IOMMU 0
[ 1.283740] DMAR-IR: IOAPIC id 0 under DRHD base 0xdfffc000 IOMMU 1
[ 1.283741] DMAR-IR: IOAPIC id 2 under DRHD base 0xdfffc000 IOMMU 1
[ 1.283743] DMAR-IR: HPET id 0 under DRHD base 0xdfffc000
[ 1.283745] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 1.284875] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 2.770751] DMAR: No SATC found
[ 2.770756] DMAR: dmar0: Using Queued invalidation
[ 2.770770] DMAR: dmar1: Using Queued invalidation
[ 2.780090] DMAR: Intel(R) Virtualization Technology for Directed I/O
* IOMMU group for all GPU devices
Code:
│ class    │ device │ id           │ iommugroup │ vendor │ device_name              │ mdev │ subsystem_device │ subsystem_device_name │ subsystem_vendor
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────┼──────┼──────────────────┼───────────────────────┼─────────────────
│ 0x030000 │ 0x2484 │ 0000:0a:00.0 │ 44         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x136e           │                       │ 0x196e
│ 0x030000 │ 0x2484 │ 0000:0b:00.0 │ 46         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x136e           │                       │ 0x196e
│ 0x030000 │ 0x2484 │ 0000:0c:00.0 │ 48         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x136e           │                       │ 0x196e
│ 0x030000 │ 0x2484 │ 0000:11:00.0 │ 57         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x136e           │                       │ 0x196e
│ 0x030000 │ 0x2484 │ 0000:12:00.0 │ 59         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x136e           │                       │ 0x196e
│ 0x030000 │ 0x2484 │ 0000:13:00.0 │ 61         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x136e           │                       │ 0x196e
│ 0x030000 │ 0x2484 │ 0000:14:00.0 │ 63         │ 0x10de │ GA104 [GeForce RTX 3070] │      │ 0x2484           │                       │ 0x1569
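The table above only lists the VGA function of each card. To see everything that shares a group with a GPU (for example its audio function), the groups can be read straight from sysfs. This is a rough sketch, assuming the standard `/sys/kernel/iommu_groups` layout; the function name `list_iommu_groups` is illustrative, and the optional argument exists only so the function can be run against a test directory.

```shell
#!/usr/bin/env bash
# Print one "group N: <pci address>" line per device in every IOMMU group.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob did not match: no groups exposed
        group="${dev#"$root"/}"        # drop the root prefix
        group="${group%%/*}"           # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

# Version sort keeps group 9 before group 10
list_iommu_groups | sort -V
```

Every device that shares a group with a passed-through GPU has to be handed to the same VM or left unused by the host, so unexpected entries in groups 44 or 46 would be worth a closer look.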
* My VM settings
Code:
root@gpuserver:~# cat /etc/pve/qemu-server/100.conf
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 8
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:0a:00,pcie=1
hostpci1: 0000:0b:00,pcie=1
ide2: local:iso/ubuntu-20.04.6-live-server-amd64.iso,media=cdrom,size=1452480K
machine: q35
memory: 16384
meta: creation-qemu=8.0.2,ctime=1706534112
name: GPUServer-1
net0: virtio=86:FB:E6:5C:44:FA,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-1,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=2f79e2b9-51b5-40a0-b63f-28a66a4773a0
sockets: 2
vmgenid: 1cb93cac-d3d2-492a-b6e4-b3ae98744455
* cat /etc/modules
Code:
root@gpuserver:~# cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
* lsmod | grep vfio
Code:
root@gpuserver:~# lsmod | grep vfio
vfio_pci 16384 0
vfio_pci_core 94208 1 vfio_pci
irqbypass 16384 2 vfio_pci_core,kvm
vfio_iommu_type1 49152 0
vfio 57344 3 vfio_pci_core,vfio_iommu_type1,vfio_pci
iommufd 73728 1 vfio
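Note that `vfio_virqfd` is listed in /etc/modules but does not appear in the lsmod output. As far as I know, newer kernels fold that functionality into the `vfio` module itself, so this is probably harmless, but a small comparison makes the gap explicit. The sketch below assumes the usual one-module-per-line format of /etc/modules; the helper name `missing_modules` and its two path arguments are illustrative only.

```shell
#!/usr/bin/env bash
# Print every module requested in /etc/modules that is not currently loaded.
missing_modules() {
    local wanted="${1:-/etc/modules}" loaded="${2:-/proc/modules}"
    local mod
    # Skip comments and blank lines, take the module name,
    # and normalise dashes to underscores as the kernel does
    grep -Ev '^[[:space:]]*(#|$)' "$wanted" | awk '{print $1}' | sed 's/-/_/g' |
    while read -r mod; do
        # The first column of /proc/modules is the loaded module name
        awk -v m="$mod" '$1 == m {found=1} END {exit !found}' "$loaded" || echo "$mod"
    done
}

missing_modules
```

A module that is built into the kernel or merged into another module will also be reported here, so the output is a list of things to check rather than a list of errors.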
* /etc/modprobe.d/blacklist.conf
Code:
root@gpuserver:~# cat /etc/modprobe.d/blacklist.conf
blacklist nouveau
blacklist nvidia
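Blacklisting only keeps the host drivers away. It does not show which driver, if any, currently holds each card. A sketch for checking that through sysfs follows, using the two addresses from the VM configuration in this post. The helper name `pci_driver` is made up, and the second argument exists only for testing against a fake tree.

```shell
#!/usr/bin/env bash
# Print the name of the kernel driver bound to a PCI device, or "none".
pci_driver() {
    local addr="$1" root="${2:-/sys/bus/pci/devices}"
    local link="$root/$addr/driver"
    if [ -L "$link" ]; then
        # The driver entry is a symlink whose last path component is the driver name
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# The two GPUs assigned as hostpci0 and hostpci1
for addr in 0000:0a:00.0 0000:0b:00.0; do
    printf '%s -> %s\n' "$addr" "$(pci_driver "$addr")"
done
```

While the VM is running, a passed-through GPU would be expected to show `vfio-pci`. With the VM stopped and both host drivers blacklisted, `none` is plausible as well.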
* dmesg | grep 'remapping'
Code:
root@gpuserver:~# dmesg | grep 'remapping'
[ 1.283745] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 1.284875] DMAR-IR: Enabled IRQ remapping in x2apic mode
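Since no log so far explains the failure, the same dmesg approach could be widened and run right after a failed start with both GPUs attached. The following is only a sketch: the pattern is a guess at what might be relevant (VFIO, DMAR and IOMMU messages plus the two GPU addresses from the VM configuration), and the function name `passthrough_log_filter` is made up for illustration.

```shell
#!/usr/bin/env bash
# Keep only kernel log lines that mention VFIO, the IOMMU,
# or the two passed-through GPUs (0000:0a:00 and 0000:0b:00).
passthrough_log_filter() {
    grep -iE 'vfio|dmar|iommu|0000:0[ab]:00'
}

# Show the most recent matching kernel messages
dmesg 2>/dev/null | passthrough_log_filter | tail -n 40
```

Running this once after a successful single-GPU start and once after a failed two-GPU start, then comparing the two outputs, should narrow down which messages belong to the second card.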
* iommu_unsafe_interrupts.conf
Code:
root@gpuserver:~# cat /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1