Guest crashing host with iGPU passthrough

unionaire

New Member
May 18, 2020
Hi there, my hardware is as follows:
ASRock J3455-ITX with Intel HD Graphics 500 as the IGD
8 GB of RAM
256 GB SSD
The SSD is an NGFF drive connected through a USB3 adapter and serves as the system drive, so I can pass through both onboard SATA controllers to my guest systems.

After reading numerous articles and tips online, I was finally able to pass through the integrated Intel HD 500 GPU using the romfile option in legacy mode (it doesn't support UEFI mode).

The issue I am having now is that I get two errors from the passthrough when I run the guest OS with the following config:

Code:
args: -device vfio-pci,host=00:02.0,bus=pci.0,addr=02.0,romfile=j3455.bin,x-igd-gms=1,x-igd-opregion=on
bios: seabios
boot: c
bootdisk: sata4
cores: 4
cpu: host
hostpci0: 00:0e,rombar=0
hostpci1: 04:00.1,rombar=0
hostpci2: 00:12,rombar=0
hostpci3: 03:00,rombar=0
memory: 2048
name: LibreELEC
numa: 0
ostype: l26
sata4: local-lvm:vm-103-disk-4,size=52M
scsihw: virtio-scsi-pci
smbios1: uuid=b5464d27-df4b-44fe-a580-70c937010812
sockets: 1
unused0: local-lvm:vm-103-disk-2
unused1: local-lvm:vm-103-disk-1
unused3: local-lvm:vm-103-disk-0
vga: none
vmgenid: dadb9056-5b75-43ce-afa2-af6205876563

I applied the ACS patch to the kernel. Here is my running kernel:
pve-manager/6.1-3/37248ce6 (running kernel: 5.4.30-1-pve)

I blacklisted the drivers, disabled the EFI framebuffer in GRUB (because PVE is booted in EFI mode), and forced the VGA and sound devices to load the vfio-pci driver instead of the Intel drivers.
Eventually I was able to pass through the GPU (not perfectly, though), with the following errors:

Code:
kvm: -device vfio-pci,host=00:02.0,bus=pci.0,addr=02.0,romfile=j3455.bin,x-igd-gms=1,x-igd-opregion=on: IGD device 0000:00:02.0 cannot support legacy mode due to existing devices at address 1f.0
kvm: vfio: Cannot reset device 0000:00:12.0, no available reset mechanism.
kvm: vfio: Cannot reset device 0000:00:12.0, no available reset mechanism.

Also, dmesg shows these errors:
Code:
[   81.762510] DMAR: DRHD: handling fault status reg 2
[   81.762523] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[   81.762575] DMAR: DRHD: handling fault status reg 2
[   81.762579] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[   81.762614] DMAR: DRHD: handling fault status reg 2
[   81.762619] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[   81.898342] vfio-pci 0000:00:02.0: vfio_ecap_init: hiding ecap 0x1b@0x100
[   81.901223] vfio-pci 0000:00:02.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x533e
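
A rough way to tally which device the DMAR faults come from, as a sketch only: the function name count_dmar_faults is made up for illustration, and it assumes the fault lines look like the dmesg output above.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints
# "<pci address> <number of DMAR faults>" for each faulting device.
count_dmar_faults() {
    # Keep only the address inside "Request device [..]", then count duplicates.
    sed -n 's/.*Request device \[\([^]]*\)].*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

It can be fed live output with `dmesg | count_dmar_faults`; for the log above it prints `00:02.0 3`.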

I do get a display on my monitor when I run LibreELEC, which means it works, right?

The problem is that if I switch the guest to Synology's DSM with the same setup as above, the system boots and the GPU works (no display, but I can use the device at /dev/dri/renderD128 for decoding work).

Apart from those error logs, everything seems to work, with one problem.

No matter how I shut down or restart the DSM guest (it's Linux-based, right?), the guest OS crashes my PVE host and the host is forced to restart, with no exceptions.

I have limited Linux skills and am new to Proxmox, so I don't know where to look for error logs relating to the crash. Can someone help me?
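
One place to start looking, sketched under a few assumptions: the host uses systemd-journald with its default Storage=auto setting, so logs survive a reboot only if /var/log/journal exists. The function names below are made up for illustration. After a hard crash the last few messages may never reach the disk, so an empty result does not prove that nothing was logged.

```shell
# Hypothetical helper: succeeds if a persistent journal directory exists
# (default path /var/log/journal; another path can be passed for testing).
has_persistent_journal() {
    [ -d "${1:-/var/log/journal}" ]
}

# Hypothetical helper: show kernel messages from the boot before the current
# one, or explain why they are not available.
show_previous_boot_kernel_log() {
    if has_persistent_journal; then
        # -k: kernel messages only, -b -1: the previous boot
        journalctl -k -b -1 --no-pager
    else
        echo "no persistent journal; run: mkdir -p /var/log/journal && systemctl restart systemd-journald" >&2
        return 1
    fi
}
```

Run `show_previous_boot_kernel_log` after the host comes back up; if persistence was not enabled before the crash, enable it first and reproduce the crash once more.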
 
hostpci0: 00:0e,rombar=0
hostpci1: 04:00.1,rombar=0
hostpci2: 00:12,rombar=0
hostpci3: 03:00,rombar=0
Can you post the full output of lspci -k and your IOMMU groups?
 
I did a fresh install of PVE 6.2 with the ACS patch. In 6.2 the error messages are gone when I use the following config, but the host still crashes when I shut down, restart, or qm stop the virtual machine.

Code:
args: -device vfio-pci,host=00:02.0,addr=0x02,romfile=vgarom.bin,x-vga=on,x-igd-opregion=on
bios: seabios
boot: cdn
bootdisk: sata1
cores: 4
cpu: host
hostpci0: 00:0e,rombar=0
hostpci2: 04:00.1
hostpci3: 00:12.0
hostpci4: 03:00.0
memory: 4096
name: DS918
numa: 0
ostype: l26
sata1: local-lvm:vm-100-disk-1,size=52M
scsihw: virtio-scsi-pci
smbios1: uuid=41f33cca-011d-4699-beb4-ea6df94622b7
sockets: 1
vga: none
vmgenid: 37d6caba-9e80-4cfb-8b08-d5d2a245a376

Note that I cannot use the 'hostpciX' config for the GPU because it does not assign the PCI device to a specific bus and device number, and the Intel graphics device is hard-coded to work at a fixed address. Also, with x-vga=on in args I get an external display; without it I don't.


Code:
pve-manager/6.2-4/9824574a (running kernel: 5.4.41-1-pve)

Code:
root@pve:~# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/7/devices/0000:00:13.2
/sys/kernel/iommu_groups/5/devices/0000:00:13.0
/sys/kernel/iommu_groups/13/devices/0000:04:00.0
/sys/kernel/iommu_groups/3/devices/0000:00:0f.0
/sys/kernel/iommu_groups/11/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/8/devices/0000:00:13.3
/sys/kernel/iommu_groups/6/devices/0000:00:13.1
/sys/kernel/iommu_groups/14/devices/0000:04:00.1
/sys/kernel/iommu_groups/4/devices/0000:00:12.0
/sys/kernel/iommu_groups/12/devices/0000:03:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:0e.0
/sys/kernel/iommu_groups/10/devices/0000:00:1f.0
/sys/kernel/iommu_groups/10/devices/0000:00:1f.1
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:15.0
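
The find output above comes back in arbitrary order, which makes shared groups (such as group 10) easy to miss. A minimal sketch for sorting it by group number follows; the function name format_iommu_groups is made up for illustration, and it assumes paths of the form /sys/kernel/iommu_groups/<group>/devices/<address>.

```shell
# Hypothetical helper: reads IOMMU group symlink paths on stdin and prints
# "group <n>: <pci address>", sorted by group number and then by address.
format_iommu_groups() {
    # With "/" as separator, field 5 is the group number and field 7 the device.
    awk -F/ '{ print $5, $7 }' |
        sort -k1,1n -k2,2 |
        while read -r grp dev; do
            printf 'group %s: %s\n' "$grp" "$dev"
        done
}
```

Usage: `find /sys/kernel/iommu_groups/ -type l | format_iommu_groups`.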

Code:
00:00.0 Host bridge: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Host Bridge (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series Host Bridge
00:02.0 VGA compatible controller: Intel Corporation Device 5a85 (rev 0b)
    Subsystem: ASRock Incorporation Device 5a85
    Kernel driver in use: vfio-pci
    Kernel modules: i915
00:0e.0 Audio device: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Audio Cluster (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series Audio Cluster
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel, snd_sof_pci
00:0f.0 Communication controller: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Trusted Execution Engine (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series Trusted Execution Engine
    Kernel driver in use: mei_me
    Kernel modules: mei_me
00:12.0 SATA controller: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series SATA AHCI Controller (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series SATA AHCI Controller
    Kernel driver in use: ahci
    Kernel modules: ahci
00:13.0 PCI bridge: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series PCI Express Port A #1 (rev fb)
    Kernel driver in use: pcieport
00:13.1 PCI bridge: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series PCI Express Port A #2 (rev fb)
    Kernel driver in use: pcieport
00:13.2 PCI bridge: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series PCI Express Port A #3 (rev fb)
    Kernel driver in use: pcieport
00:13.3 PCI bridge: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series PCI Express Port A #4 (rev fb)
    Kernel driver in use: pcieport
00:15.0 USB controller: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series USB xHCI (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series USB xHCI
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci
00:1f.0 ISA bridge: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Low Pin Count Interface (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series Low Pin Count Interface
    Kernel driver in use: lpc_ich
    Kernel modules: lpc_ich
00:1f.1 SMBus: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series SMBus Controller (rev 0b)
    Subsystem: ASRock Incorporation Celeron N3350/Pentium N4200/Atom E3900 Series SMBus Controller
    Kernel driver in use: i801_smbus
    Kernel modules: i2c_i801
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 11)
    Subsystem: ASRock Incorporation Motherboard (one of many)
    Kernel driver in use: r8169
    Kernel modules: r8169
03:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 02)
    Subsystem: ASRock Incorporation Motherboard
    Kernel driver in use: ahci
    Kernel modules: ahci
04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    Subsystem: Intel Corporation 82576 Gigabit Network Connection
    Kernel driver in use: igb
    Kernel modules: igb
04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    Subsystem: Intel Corporation 82576 Gigabit Network Connection
    Kernel driver in use: igb
    Kernel modules: igb
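
To see at a glance which devices were bound to vfio-pci when lspci ran, the listing can be condensed to one line per device. This is only a sketch: the function name drivers_in_use is made up, and it assumes device lines start with a bus:device.function address as in the output above.

```shell
# Hypothetical helper: reads `lspci -k` output on stdin and prints
# "<pci address> <driver>" for every device that has a driver bound.
drivers_in_use() {
    awk '
        # Device header line, e.g. "00:02.0 VGA compatible controller: ..."
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-9a-f] / { dev = $1 }
        # Indented detail line naming the bound driver
        /Kernel driver in use:/ { print dev, $NF }
    '
}
```

Usage: `lspci -k | drivers_in_use`. For the listing above this shows 00:02.0 and 00:0e.0 on vfio-pci, while 00:12.0, 03:00.0 and 04:00.1 are still on ahci and igb.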
 
