[SOLVED] Issues with Intel ARC A770M GPU Passthrough on NUC12SNKi72 (vfio-pci not ready after FLR or bus reset)

blebo

Has anyone had success with passing through the Intel ARC A770M (or similar) to a VM (Windows or Linux)?

I think I have vfio binding mostly working. However, when I attempt to start the Windows 11 VM, the kernel reports that the vfio-pci device is not ready after a reset and eventually gives up. This causes the Proxmox task to time out and a CPU core to lock up, and the host freezes completely shortly afterwards.

NOTE: I have managed to get iGPU split passthrough working on a separate Win11 VM (via https://github.com/strongtz/i915-sriov-dkms ), but I was getting the same issue both before and after I applied this. Ideally, I'd like to have the iGPU split via SR-IOV and passed to one or more VMs, and the dGPU passed to a separate designated VM. I'd therefore like to avoid blacklisting i915 outright if I can, but I'm open to alternatives if that's all that's available for now.

Main error found in journalctl -f, which can also be forced with echo 1 > /sys/bus/pci/drivers/vfio-pci/0000\:03\:00.0/reset :

Code:
Jul 15 19:09:07 nuc12 kernel: vfio-pci 0000:03:00.0: not ready 1023ms after FLR; waiting
...
Jul 15 19:10:21 nuc12 kernel: vfio-pci 0000:03:00.0: not ready 65535ms after FLR; giving up
Jul 15 19:10:34 nuc12 kernel: vfio-pci 0000:03:00.0: not ready 1023ms after bus reset; waiting
...
Jul 15 19:11:48 nuc12 kernel: vfio-pci 0000:03:00.0: not ready 65535ms after bus reset; giving up
Jul 15 19:12:12 nuc12 kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 27s! [bash:5279]
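
To check whether a host is hitting the same failure, the kernel messages above can be filtered out of the journal. This is only a minimal sketch that assumes the message format shown in the excerpt; the function name find_reset_failures is made up for illustration.

```shell
#!/bin/bash
# find_reset_failures: hypothetical helper. Reads kernel log lines on stdin and
# prints only the "giving up" reset failures reported for vfio-pci devices.
find_reset_failures() {
    grep -E 'vfio-pci [0-9a-f:.]+: not ready [0-9]+ms after (FLR|bus reset); giving up'
}

# Example usage on the host (kernel messages from the current boot):
#   journalctl -k -b | find_reset_failures
```

The function exits non-zero when nothing matches, so it can also be used in an if statement.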

CONFIGURATION

  • Intel NUC12SNKi72 (NUC12 Enthusiast Kit)
    • CPU: 14 cores (6xP + 8xE, 20 threads) Core i7-12700H (P: 4.7 GHz, E: 3.5 GHz)
    • iGPU: Intel Iris Xe
    • dGPU: Intel ARC A770M, 16 GB
  • proxmox-ve: 8.0.1 (running kernel: 6.2.16-4-pve)
  • pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)

cat /etc/kernel/cmdline :

Code:
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt initcall_blacklist=sysfb_init
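
On a host that boots via /etc/kernel/cmdline, edits only take effect after proxmox-boot-tool refresh and a reboot. A rough way to confirm the running kernel actually received the parameters is sketched below; the helper name cmdline_has and its optional file argument are assumptions for illustration.

```shell
#!/bin/bash
# cmdline_has: hypothetical helper. Succeeds if the given parameter appears as
# a whole word in the kernel command line file (default: /proc/cmdline).
cmdline_has() {
    local param="$1" file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Example usage after editing /etc/kernel/cmdline, refreshing and rebooting:
#   for p in intel_iommu=on iommu=pt initcall_blacklist=sysfb_init; do
#       if cmdline_has "$p"; then echo "$p: active"; else echo "$p: missing"; fi
#   done
```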

cat /etc/modprobe.d/*.conf (alias determined via sysfs - https://null-src.com/posts/qemu-vfio-pci/post.php ):

Code:
#dGPU early alias
alias pci:v00008086d00005690sv00008086sd00003026bc03sc00i00 vfio-pci

#dGPU Audio early alias
alias pci:v00008086d00004F90sv00008086sd00003026bc04sc03i00 vfio-pci

#dGPU + Audio
options vfio-pci ids=8086:5690,8086:4f90 disable_vga=1

# Set up SR-IOV for Intel iGPU (Xe)
options i915 enable_guc=3 max_vfs=7

options kvm ignore_msrs=1 report_ignored_msrs=0

# This file contains a list of modules which are not supported by Proxmox VE

# nvidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
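
The alias strings above come from each device's modalias attribute in sysfs. As a sketch of how such a line could be generated rather than typed by hand (the function name print_vfio_alias and the optional sysfs-root argument are assumptions, not something from the linked article):

```shell
#!/bin/bash
# print_vfio_alias: hypothetical helper. Prints a modprobe alias line that maps
# a PCI device's modalias string to vfio-pci.
# $1 = PCI address (e.g. 0000:03:00.0), $2 = sysfs root (default: /sys).
print_vfio_alias() {
    local bdf="$1" sysfs="${2:-/sys}"
    local modalias
    modalias=$(cat "$sysfs/bus/pci/devices/$bdf/modalias") || return 1
    printf 'alias %s vfio-pci\n' "$modalias"
}

# Example usage (review the output before adding it to a modprobe.d file):
#   print_vfio_alias 0000:03:00.0
#   print_vfio_alias 0000:04:00.0
```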

cat /etc/modules :

Code:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

# Modules required for PCI passthrough
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# Modules required for Intel GVT-g Split
kvmgt

# for 10 Gigabit Networking over Thunderbolt
thunderbolt-net

cat /etc/pve/qemu-server/103.conf (Windows 11 VM103):

Code:
affinity: 0-11
agent: 1
balloon: 8192
bios: ovmf
boot: order=virtio0;net0
cores: 6
cpu: host
efidisk0: local-zfs:vm-103-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:03:00,pcie=1,x-vga=1
machine: pc-q35-8.0
memory: 16384
meta: creation-qemu=8.0.2,ctime=1688796354
name: win11vfio
net0: virtio=56:4D:FF:7A:CE:23,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=72be18e4-d767-4149-a7a9-e054dba3183e
sockets: 1
tpmstate0: local-zfs:vm-103-disk-1,size=4M,version=v2.0
virtio0: local-zfs:vm-103-disk-2,iothread=1,size=256G
vmgenid: 93fc1df6-8f95-4c4d-a68f-9ee1e9e05e99

OTHER OUTPUT:

lspci -nnk :
Code:
00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c)
    DeviceName: GPU
    Subsystem: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:3026]
    Kernel driver in use: i915
    Kernel modules: i915
00:02.1 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c)
    Subsystem: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:3026]
    Kernel driver in use: i915
    Kernel modules: i915
...
03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A770M] [8086:5690] (rev 08)
    DeviceName: dGPU
    Subsystem: Intel Corporation DG2 [Arc A770M] [8086:3026]
    Kernel driver in use: vfio-pci
    Kernel modules: i915
04:00.0 Audio device [0403]: Intel Corporation DG2 Audio Controller [8086:4f90]
    Subsystem: Intel Corporation DG2 Audio Controller [8086:3026]
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel


dmesg | grep -E "IOMMU|DMAR|vfio|vga" :
Code:
[ 0.012942] ACPI: DMAR 0x000000003DC47000 000088 (v02 INTEL NUCxi7A5 00000039 01000013)
[ 0.012975] ACPI: Reserving DMAR table memory at [mem 0x3dc47000-0x3dc47087]
[ 0.061397] DMAR: IOMMU enabled
[ 0.136673] DMAR: Host address width 39
[ 0.136674] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.136678] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[ 0.136681] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.136684] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[ 0.136687] DMAR: RMRR base: 0x0000004b000000 end: 0x0000004f7fffff
[ 0.136689] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.136691] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.136692] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.138328] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 1.223966] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[ 1.629121] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[ 1.629121] pci 0000:00:02.0: vgaarb: bridge control possible
[ 1.629121] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=mem,locks=none
[ 1.629121] pci 0000:03:00.0: vgaarb: setting as boot VGA device (overriding previous)
[ 1.629121] pci 0000:03:00.0: vgaarb: bridge control possible
[ 1.629121] pci 0000:03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 1.629121] vgaarb: loaded
[ 1.647934] DMAR: No ATSR found
[ 1.647936] DMAR: No SATC found
[ 1.647937] DMAR: IOMMU feature fl1gp_support inconsistent
[ 1.647937] DMAR: IOMMU feature pgsel_inv inconsistent
[ 1.647939] DMAR: IOMMU feature nwfs inconsistent
[ 1.647940] DMAR: IOMMU feature dit inconsistent
[ 1.647941] DMAR: IOMMU feature sc_support inconsistent
[ 1.647941] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 1.647943] DMAR: dmar0: Using Queued invalidation
[ 1.647945] DMAR: dmar1: Using Queued invalidation
[ 1.648713] DMAR: Intel(R) Virtualization Technology for Directed I/O
[ 6.208416] vfio-pci 0000:03:00.0: vgaarb: deactivate vga console
[ 6.208421] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 6.208537] vfio_pci: add [8086:5690[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.208634] vfio_pci: add [8086:4f90[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.436014] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=mem
[ 6.436022] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.482836] pci 0000:00:02.1: DMAR: Skip IOMMU disabling for graphics
[ 8.482898] pci 0000:00:02.1: vgaarb: bridge control possible
[ 8.482900] pci 0000:00:02.1: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.482904] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.482907] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.490369] pci 0000:00:02.2: DMAR: Skip IOMMU disabling for graphics
[ 8.490411] pci 0000:00:02.2: vgaarb: bridge control possible
[ 8.490412] pci 0000:00:02.2: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.490416] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.490419] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.490422] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 8.495377] pci 0000:00:02.3: DMAR: Skip IOMMU disabling for graphics
[ 8.495420] pci 0000:00:02.3: vgaarb: bridge control possible
[ 8.495421] pci 0000:00:02.3: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.495424] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.495427] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.495430] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.495433] i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 8.498961] pci 0000:00:02.4: DMAR: Skip IOMMU disabling for graphics
[ 8.498997] pci 0000:00:02.4: vgaarb: bridge control possible
[ 8.498998] pci 0000:00:02.4: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.499001] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.499004] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.499007] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.499010] i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.499013] i915 0000:00:02.3: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 8.501850] pci 0000:00:02.5: DMAR: Skip IOMMU disabling for graphics
[ 8.501883] pci 0000:00:02.5: vgaarb: bridge control possible
[ 8.501884] pci 0000:00:02.5: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.501887] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.501889] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.501892] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.501896] i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.501899] i915 0000:00:02.3: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.501902] i915 0000:00:02.4: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 8.504971] pci 0000:00:02.6: DMAR: Skip IOMMU disabling for graphics
[ 8.505011] pci 0000:00:02.6: vgaarb: bridge control possible
[ 8.505012] pci 0000:00:02.6: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.505015] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.505018] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.505021] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.505024] i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.505027] i915 0000:00:02.3: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.505030] i915 0000:00:02.4: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.505033] i915 0000:00:02.5: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 8.508623] pci 0000:00:02.7: DMAR: Skip IOMMU disabling for graphics
[ 8.508659] pci 0000:00:02.7: vgaarb: bridge control possible
[ 8.508660] pci 0000:00:02.7: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 8.508664] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 8.508666] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.508669] i915 0000:00:02.1: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.508672] i915 0000:00:02.2: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.508676] i915 0000:00:02.3: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.508679] i915 0000:00:02.4: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.508682] i915 0000:00:02.5: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 8.508685] i915 0000:00:02.6: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
 
Looks like Intel ARC GPUs don't reset properly (even though they report support for Function Level Reset). You could check whether another reset method is available with cat /sys/bus/pci/devices/0000:03:00.0/reset_method. To test, write the method you want to try to that same file before starting the VM.
If the GPU does not reset properly, you might be able to use it for a VM (probably once per host reboot) by early binding 8086:5690,8086:4f90 to vfio-pci (which you already did), which makes sure that it is not touched by the actual drivers before starting the VM. You might need a softdep snd_hda_intel pre: vfio-pci and softdep i915 pre: vfio-pci (in /etc/modprobe.d/*.conf) to make sure vfio-pci is loaded first.
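
Both suggestions above can be sketched as small shell helpers. The function names and the SYSFS_ROOT override are assumptions added for illustration; only the reset_method path and the two softdep lines come from the post itself.

```shell
#!/bin/bash
# show_reset_methods: hypothetical helper. Prints the reset methods the kernel
# currently lists for each PCI address given. SYSFS_ROOT (an assumption, handy
# for dry runs) replaces /sys when set.
show_reset_methods() {
    local sysfs="${SYSFS_ROOT:-/sys}" bdf file
    for bdf in "$@"; do
        file="$sysfs/bus/pci/devices/$bdf/reset_method"
        if [ -r "$file" ]; then
            printf '%s: %s\n' "$bdf" "$(cat "$file")"
        else
            printf '%s: no reset_method attribute\n' "$bdf"
        fi
    done
}

# print_softdeps: prints the two softdep lines suggested above, so that vfio-pci
# is loaded before i915 and snd_hda_intel.
print_softdeps() {
    printf 'softdep %s pre: vfio-pci\n' i915 snd_hda_intel
}

# Example usage:
#   show_reset_methods 0000:03:00.0 0000:04:00.0
#   print_softdeps    # redirect into a file of your choice under /etc/modprobe.d/
```

Changes under /etc/modprobe.d usually only take effect at early boot after the initramfs has been rebuilt (update-initramfs -u -k all) and the host rebooted.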
 
Success!! TLDR: add a hook script to clear the reset_methods at VM pre-start

Reading reset_method returned flr bus, the two methods that failed above. However, clearing it completely allowed the VM to boot and see the attached dGPU and its audio controller:

Code:
echo > /sys/bus/pci/devices/0000:03:00.0/reset_method
echo > /sys/bus/pci/devices/0000:04:00.0/reset_method

Run once, the commands above only worked for the first VM start after boot (as suggested above). Subsequent VM starts would lock up on vfio bus resets (but not FLR??), even though reset_method still appeared to be cleared in sysfs.

Subsequent VM starts do appear to succeed if reset_method is cleared again, as above, before each VM start. Hence the hook script listed below.

Observed the following working:
  • Installed latest WHQL ARC drivers
    • GPU-Z detects Resizable BAR as enabled for the ARC, but the Intel driver reports Resizable BAR as not enabled in the VM BIOS (which I think is something still to be developed in QEMU).
  • HDMI output via the Intel dGPU
  • dGPU Audio device detected (but haven't tested)
  • For fun,
    • Passthrough of Wi-Fi (PCI: 0000:00:14.3) and Bluetooth (USB: 8087:0033).
    • YouTube/Edge video decode via dGPU
    • Blender running and using dGPU
    • Hot-plugged wireless USB mouse/keyboard.
    • Ran another Win11 VM using a SR-IOV iGPU VF at the same time.
Note: Due to the above success, I didn't end up applying the softdeps.

hookscript (based on template from https://gist.github.com/kiler129/215e2c8de853209ca429ad5ed40ce128):
Code:
#!/bin/bash
set -o errexit -o pipefail -o nounset

# Located in /var/lib/vz/snippets/intel-dGPU-hookscript.sh or your 'snippets' location.
# Add to VM via: qm set VMID --hookscript local:snippets/intel-dGPU-hookscript.sh

# Do not modify these variables (set by Proxmox when calling the script)
vmId="$1"
runPhase="$2"
echo "Running $runPhase on VM=$vmId"

case "$runPhase" in
    pre-start)
        # Clear the reset methods before each start of the VM, to prevent the PVE host locking up.
        # The flr and bus methods don't work, and they re-appear after reboots.
        # The bus method may still be attempted in subsequent VM starts, even if reset_method was already cleared.
        echo "VM=$vmId - $runPhase : Clearing Intel dGPU and audio device reset_methods."
        echo > /sys/bus/pci/devices/0000:03:00.0/reset_method
        echo > /sys/bus/pci/devices/0000:04:00.0/reset_method

        # Will appear as the following in journalctl:
        #  kernel: vfio-pci 0000:03:00.0: All device reset methods disabled by user
        #  kernel: vfio-pci 0000:04:00.0: All device reset methods disabled by user
        ;;

    post-start)
        # Placeholder.
        echo "VM=$vmId - $runPhase : No Action."
        ;;

    pre-stop)
        # Placeholder.
        echo "VM=$vmId - $runPhase : No Action."
        ;;

    post-stop)
        # Placeholder.
        echo "VM=$vmId - $runPhase : No Action."
        ;;

    *)
        echo "Unknown run phase \"$runPhase\"!"
        ;;
esac
echo "Finished $runPhase on VM=$vmId"
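
The PCI addresses in the hook script are specific to this NUC. A possible variant of the pre-start step that takes the addresses as arguments and skips devices that are not present is sketched here; the function name clear_reset_methods and the SYSFS_ROOT override are assumptions, and it has not been run on the hardware above.

```shell
#!/bin/bash
# clear_reset_methods: hypothetical variant of the pre-start step. Writes an
# empty line to reset_method for every PCI address given, skipping devices
# whose attribute is missing or not writable. SYSFS_ROOT (an assumption, handy
# for dry runs) replaces /sys when set.
clear_reset_methods() {
    local sysfs="${SYSFS_ROOT:-/sys}" bdf file
    for bdf in "$@"; do
        file="$sysfs/bus/pci/devices/$bdf/reset_method"
        if [ -w "$file" ]; then
            echo > "$file"
            echo "Cleared reset_method for $bdf"
        else
            echo "Skipping $bdf: $file is not writable" >&2
        fi
    done
}

# Example usage inside the pre-start branch of a hook script:
#   clear_reset_methods 0000:03:00.0 0000:04:00.0
```

Skipping instead of failing means a missing device does not abort the VM start; whether that is the behaviour you want depends on the setup.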


Working VM config:
Code:
root@nuc12:~# cat /etc/pve/qemu-server/103.conf
affinity: 0-11
agent: 1
balloon: 8192
bios: ovmf
boot: order=virtio0;net0
cores: 6
cpu: host
efidisk0: local-zfs:vm-103-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: local:snippets/intel-dGPU-hookscript.sh
hostpci0: 0000:03:00,pcie=1,x-vga=1
hostpci1: 0000:04:00,pcie=1
hostpci2: 0000:00:14.3,pcie=1
machine: pc-q35-8.0
memory: 16384
meta: creation-qemu=8.0.2,ctime=1688796354
name: win11vfio
net0: virtio=4A:3D:7D:C5:35:B8,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=d421488c-841e-43fa-90d9-33fe4349eb61
sockets: 1
tpmstate0: local-zfs:vm-103-disk-1,size=4M,version=v2.0
usb0: host=1997:2433,usb3=1
usb1: host=8087:0033,usb3=1
virtio0: local-zfs:vm-103-disk-2,iothread=1,size=256G
vmgenid: 1873c979-4993-4627-aa05-20bb0ac28468
 
I had trouble with the driver, but passing the vBIOS and using your script ended up working. Thank you very much!
 
Lifesaver!! I was working on this for two days straight. I got the video passed through with no problems, but the audio driver kept causing the host to crash whenever I shut down or restarted the VM. I used your script to clear reset_method for both devices and it is running beautifully! I'm using an Intel Arc A310.
 
Hello, I am attempting to pass a dGPU A770 through to a VM. I have watched multiple videos and think that I have most of the items complete, such as blacklisting the i915 driver and passing the PCI device to the VM. Upon starting the VM I get the vfio-pci "not ready after FLR" issues. I think that this script can help me, but I don't really understand how to use it. I would like to cross-flash the A770 and turn it into a Flex card, but at the moment I would just like to give a Windows 10 VM a dGPU. The drivers half install, but it always fails on the HDMI firmware or the FLR issues. Can you explain a little better how to implement this script, or point me to another resource?

Here is the output of some of my configuration files.

/etc/kernel/cmdline
initcall_blacklist=sysfb_init



cat /etc/modprobe.d/*.conf
blacklist i915
options vfio_iommu_type1 allow_unsafe_interrupts=1
options kvm ignore_msrs=1
blacklist nvidiafb
options vfio-pci ids=8086:56a0 disable_vga=1
options zfs zfs_arc_max=13485735936


cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd


/etc/pve/qemu-server/102.conf
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;net0
cores: 4
cpu: host
efidisk0: Data:vm-102-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:03:00
machine: pc-q35-8.1
memory: 8192
meta: creation-qemu=8.1.5,ctime=1710729866
name: GPU
net0: virtio=BC:24:11:36:26:AF,bridge=vmbr0
numa: 0
ostype: win10
scsi0: Data:vm-102-disk-1,cache=writethrough,discard=on,iothread=1,size=100G
scsihw: virtio-scsi-single
smbios1: uuid=f245baad-b1e2-4d94-899a-eef37f122dae
sockets: 1
tpmstate0: Data:vm-102-disk-2,size=4M,version=v2.0
vga: none
vmgenid: bbd05077-7677-4c98-b530-2cefb8456bf1

dmesg | grep -E "IOMMU|DMAR|vfio|vga"
[ 2.085031] pci 0000:d0:00.0: vgaarb: setting as boot VGA device
[ 2.085031] pci 0000:d0:00.0: vgaarb: bridge control possible
[ 2.085031] pci 0000:d0:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 2.085031] pci 0000:03:00.0: vgaarb: bridge control possible
[ 2.085031] pci 0000:03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 2.085031] vgaarb: loaded
[ 2.115750] pci 0000:60:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.118462] pci 0000:40:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.120467] pci 0000:20:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.122532] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.125193] pci 0000:60:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.125207] pci 0000:40:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.125219] pci 0000:20:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.125231] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.126139] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 2.126149] perf/amd_iommu: Detected AMD IOMMU #1 (2 banks, 4 counters/bank).
[ 2.126160] perf/amd_iommu: Detected AMD IOMMU #2 (2 banks, 4 counters/bank).
[ 2.126169] perf/amd_iommu: Detected AMD IOMMU #3 (2 banks, 4 counters/bank).
[ 9.495649] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 9.495773] vfio_pci: add [8086:56a0[ffffffff:ffffffff]] class 0x000000/00000000
[ 10.092719] ast 0000:d0:00.0: vgaarb: deactivate vga console
 
Pass through the audio device as well, since it's in its own group (at least mine was), and save the script in the snippets directory: https://forum.proxmox.com/threads/explaining-snippets-feature.53553/
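
For the "how do I use it" part, the steps boil down to copying the script into a snippets directory, making it executable and attaching it to the VM. A rough sketch follows; the helper name install_hookscript is made up, the default directory is the one named in the hook script's own comments, and the storage is assumed to have the snippets content type enabled.

```shell
#!/bin/bash
# install_hookscript: hypothetical helper. Copies a hook script into a snippets
# directory with the executable bit set, then prints the qm command that
# attaches it to the VM (it does not run qm itself).
# $1 = script file, $2 = VM ID, $3 = snippets directory (default: the path of
# the 'local' storage used earlier in the thread).
install_hookscript() {
    local src="$1" vmid="$2" dir="${3:-/var/lib/vz/snippets}"
    mkdir -p "$dir" || return 1
    install -m 0755 "$src" "$dir/" || return 1
    echo "Now run: qm set $vmid --hookscript local:snippets/$(basename "$src")"
}

# Example usage for the VM 102 config posted above:
#   install_hookscript ./intel-dGPU-hookscript.sh 102
```

The PCI addresses inside the script still need to match your own lspci output before the VM is started.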

I dumped the vBIOS and copied the device and vendor IDs too. That is already covered in the PCI passthrough documentation, so you can check there, since it comes directly from the PVE team.

I have this block at the beginning, which covers the entries you are missing. On your system they may be different.

Code:
cpu: host
efidisk0: ceph-storage-vms:vm-107-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: local:snippets/intel-dGPU-hookscript.sh
hostpci0: 0000:03:00,device-id=0x56a5,pcie=1,romfile=vbios.bin,vendor-id=0x8086,x-vga=1
hostpci1: 0000:04:00,device-id=0x4f92,pcie=1,vendor-id=0x8086
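
To see whether the audio function sits in its own IOMMU group on your host, as mentioned at the start of this reply, the groups can be listed from sysfs. A minimal sketch; the function name list_iommu_groups and its optional root argument are assumptions for illustration.

```shell
#!/bin/bash
# list_iommu_groups: hypothetical helper. Prints one "group N: address" line
# per PCI device found under the IOMMU groups directory.
# $1 = groups directory (default: /sys/kernel/iommu_groups).
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}" dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # no groups: the glob stays unexpanded
        group=${dev#"$root"/}          # strip the leading directory
        group=${group%%/*}             # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

# Example usage:
#   list_iommu_groups | grep -E '0000:0[34]:00'
```

An empty result usually means the IOMMU is not enabled on the host.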
 
