IGD passthrough almost working

Vaielab

New Member
Dec 1, 2019
For the last few days I've been trying to do an IGD passthrough. I managed to get a signal to the guest, but the drivers don't seem to load correctly.
This computer has two dedicated graphics cards plus the one integrated into my CPU (i7-7700K).
The computer is my daily workstation and will run 100% from VMs, so I need all three graphics cards passed through to different VMs.
GPU passthrough works with my two dedicated graphics cards, but I'm having some problems with the integrated one.
My guest OS is a fresh Ubuntu install, to which I also pass through the hard drive.

This is my pve version:

Code:
pveversion --verbose
proxmox-ve: 6.0-2 (running kernel: 5.0.21-5-pve)
pve-manager: 6.0-15 (running version: 6.0-15/52b91481)
pve-kernel-helper: 6.0-12
pve-kernel-5.0: 6.0-11
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-12
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.0-9
pve-container: 3.0-14
pve-docs: 6.0-9
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-8
pve-firmware: 3.0-4
pve-ha-manager: 3.0-5
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-1
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-1
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2



And here is my config:
/etc/default/grub:
Code:
...
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream video=efifb:off,vesafb:off"
...

/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
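
A note in case it helps someone reproducing this setup: changes to /etc/modules and the /etc/modprobe.d/*.conf files below only take effect once the initramfs is rebuilt, so after editing them I run the standard Proxmox/Debian commands on the host:

```shell
# rebuild the initramfs for every installed kernel so the vfio modules
# and their modprobe options are present at boot
update-initramfs -u -k all
# after changing /etc/default/grub, regenerate the bootloader config too
update-grub
```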


/etc/modprobe.d/iommu_unsafe_interrupts.conf
Code:
options vfio_iommu_type1 allow_unsafe_interrupts=1


/etc/modprobe.d/kvm.conf
Code:
options kvm ignore_msrs=1
options i915 enable_gvt=1

/etc/modprobe.d/blacklist.conf
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia

blacklist i915


/etc/modprobe.d/vfio.conf
Code:
options vfio-pci ids=10de:1d01,10de:0fb8,10de:1d01,10de:0fb8,8086:5912 disable_vga=1
My IGD is 8086:5912
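
For anyone following along: the vendor:device IDs used in vfio.conf can be read from lspci -nn. A minimal sketch of extracting the ID, using a sample line modelled on this machine's IGD (the sample string is just an illustration, not live output):

```shell
# extract the [vendor:device] ID from an `lspci -nn` line;
# on a live host you would feed it e.g. the output of `lspci -nn -s 00:02.0`
line='00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 630 [8086:5912] (rev 04)'
id=$(printf '%s\n' "$line" | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' | tail -n 1 | tr -d '[]')
echo "$id"   # prints 8086:5912
```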


And here is my /etc/pve/qemu-server/100.conf
Code:
agent: 1
balloon: 0
bios: ovmf
boot: c
bootdisk: virtio2
cores: 8
cpu: host
efidisk0: local-lvm:vm-100-disk-0,size=128K
hostpci0: 00:02,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: q35
memory: 8192
name: Test01
net0: virtio=BA:98:04:AC:50:74,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=93cea300-9d54-405e-bf79-1f72fb49c534
sockets: 1
tablet: 0
usb0: host=045e:07a5
vga: none
virtio2: /dev/disk/by-id/wwn-0x50026b7682bd4af6,size=117220824K
vmgenid: 4e1089ab-25dd-4281-b73c-ed1aa6eec22b


If I start this guest, I see the Ubuntu loading logo for a second, but after that I get weird lines (see the attached file).

If I SSH into this VM and run dmesg, I see multiple errors for the GPU:
Code:
[    6.815365] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[    8.796323] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[   10.812331] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[   10.812414] i915 0000:01:00.0: Resetting chip after gpu hang
[   10.812611] [drm:i915_reset [i915]] *ERROR* GPU recovery failed


I did manage to get a correct screen by creating the file /etc/X11/xorg.conf on the guest with:
Code:
Section "Device"
        Identifier  "intel"
        Driver      "intel"
        BusID       "PCI:1:0:0"
EndSection
Section "Screen"
        Identifier  "intel"
        Device "intel"
EndSection

After rebooting the guest, I still get the Ubuntu logo for a second, then the weird lines for a few seconds, and then I get a normal desktop.
I still get the *ERROR* GPU recovery failed in dmesg.
And if I try to test the graphics card with glxgears or glxinfo, I get this error:
Code:
i965: Failed to submit batchbuffer: Input/output error

And in dmesg I get a lot of:
Code:
[drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A: 0x00000080

Also, in hardinfo, in the Graphics section, instead of the graphics driver I get an (Unknown) message.

Here is the lspci -vv output from the guest:
Code:
01:00.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) (prog-if 00 [VGA controller])
    Subsystem: ASUSTeK Computer Inc. HD Graphics 630
    Physical Slot: 0
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 36
    Region 0: Memory at c0000000 (64-bit, non-prefetchable) [size=16M]
    Region 2: Memory at 800000000 (64-bit, prefetchable) [size=256M]
    Region 4: I/O ports at d000 [size=64]
    Expansion ROM at <ignored> [disabled]
    Capabilities: [40] Vendor Specific Information: Len=0c <?>
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM not supported, Exit Latency L0s <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee02004  Data: 4026
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [100 v0] #00
    Capabilities: [200 v1] Address Translation Service (ATS)
        ATSCap:    Invalidate Queue Depth: 00
        ATSCtl:    Enable+, Smallest Translation Unit: 00
    Capabilities: [300 v1] Page Request Interface (PRI)
        PRICtl: Enable- Reset-
        PRISta: RF- UPRGI- Stopped+
        Page Request Capacity: 00008000, Page Request Allocation: 00000000
    Kernel driver in use: i915
    Kernel modules: i915


Any tips on the missing link that would make everything work?
Thank you
 

Attachments

  • 20191130_213930.jpg (13.3 KB)
Today, while doing some tests, I noticed that when I start a VM with IGD passthrough I get this error on the host:
Code:
[ 1470.342498] DMAR: DRHD: handling fault status reg 3
[ 1470.342502] DMAR: [DMA Read] Request device [00:02.0] fault addr a400a000 [fault reason 05] PTE Write access is not set
[ 1470.342520] DMAR: DRHD: handling fault status reg 3
[ 1470.342521] DMAR: [DMA Read] Request device [00:02.0] fault addr a400b000 [fault reason 05] PTE Write access is not set
[ 1470.342538] DMAR: DRHD: handling fault status reg 3
[ 1470.342540] DMAR: [DMA Read] Request device [00:02.0] fault addr a400c000 [fault reason 05] PTE Write access is not set
[ 1470.342556] DMAR: DRHD: handling fault status reg 3

I tested with my other graphics cards and they don't produce that error message.
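
As an aside, when hunting these down it helps to pull the faulting device and address out of the host dmesg; a minimal parsing sketch using one of the sample lines above (the parsing is my own illustration, not a Proxmox tool):

```shell
# parse a DMAR fault line to extract the requesting device and the fault address
msg='DMAR: [DMA Read] Request device [00:02.0] fault addr a400a000 [fault reason 05] PTE Write access is not set'
dev=$(printf '%s\n' "$msg" | sed -n 's/.*Request device \[\([^]]*\)\].*/\1/p')
addr=$(printf '%s\n' "$msg" | sed -n 's/.*fault addr \([0-9a-f]*\).*/\1/p')
echo "$dev $addr"   # prints 00:02.0 a400a000
```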
 
Try setting iommu=pt. At least for me those errors vanished with that, though I'm still struggling to boot a VM with iGD passthrough. =)
 
Hi there,
Sadly, when I try iommu=pt or intel_iommu=pt I get the "No IOMMU detected, please activate it. See Documentation for further information." error message and can't do any passthrough anymore.
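
For what it's worth, iommu=pt is a modifier rather than a replacement: the IOMMU still has to be enabled with intel_iommu=on, and pt only switches host-owned devices to an identity (passthrough) mapping. So the combined GRUB line would look something like this, based on the config quoted earlier in the thread (untested on this particular box):

```shell
# /etc/default/grub -- keep intel_iommu=on and add iommu=pt alongside it,
# then run update-grub and reboot
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream video=efifb:off,vesafb:off"
```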
 
Hello, any progress on this issue?
I'm facing the same errors here ([drm:gen8_de_irq_handler [i915]] *ERROR* Fault errors on pipe A: 0x00000080) with Intel HD passthrough on an Ubuntu VM.
Thanks, bye.