IGD passthrough almost working

Vaielab

New Member
Dec 1, 2019
For the last few days I have been trying to set up IGD passthrough. I managed to get a video signal from the guest, but the drivers don't seem to load correctly.
This computer has 2 dedicated graphics cards plus the integrated GPU on my CPU (i7-7700K).
It is my daily workstation and will be used 100% from within VMs, so I need all 3 graphics cards passed through to different VMs.
GPU passthrough works with my 2 dedicated graphics cards, but I have some problems with the integrated one.
My guest OS is a fresh Ubuntu install, and I also pass through the hard disk to it.

This is my pve version:

Code:
pveversion --verbose
proxmox-ve: 6.0-2 (running kernel: 5.0.21-5-pve)
pve-manager: 6.0-15 (running version: 6.0-15/52b91481)
pve-kernel-helper: 6.0-12
pve-kernel-5.0: 6.0-11
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-12
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.0-9
pve-container: 3.0-14
pve-docs: 6.0-9
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-8
pve-firmware: 3.0-4
pve-ha-manager: 3.0-5
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-1
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-1
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2


And here is my config:
/etc/default/grub:
Code:
...
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream video=efifb:off,vesafb:off"
...
/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobe.d/iommu_unsafe_interrupts.conf
Code:
options vfio_iommu_type1 allow_unsafe_interrupts=1

/etc/modprobe.d/kvm.conf
Code:
options kvm ignore_msrs=1
options i915 enable_gvt=1
/etc/modprobe.d/blacklist.conf
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia

blacklist i915

/etc/modprobe.d/vfio.conf
Code:
options vfio-pci ids=10de:1d01,10de:0fb8,10de:1d01,10de:0fb8,8086:5912 disable_vga=1
My IGD is 8086:5912
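
The ids= list above names the two NVIDIA IDs twice, presumably because there are two identical cards. Since vfio-pci matches by vendor:device ID, the repeats should be redundant rather than harmful, but that is an assumption and not something tested in this thread. A minimal Python sketch for spotting such repeats (the helper names are made up for illustration):

```python
from collections import Counter


def parse_vfio_ids(line):
    """Return the vendor:device IDs from an 'options vfio-pci ...' line."""
    for token in line.split():
        if token.startswith("ids="):
            return token[len("ids="):].split(",")
    return []


def duplicated_ids(ids):
    """Return the IDs that appear more than once, in first-seen order."""
    counts = Counter(ids)
    return [i for i in dict.fromkeys(ids) if counts[i] > 1]


# Line copied from /etc/modprobe.d/vfio.conf as posted above.
line = ("options vfio-pci ids=10de:1d01,10de:0fb8,10de:1d01,10de:0fb8,"
        "8086:5912 disable_vga=1")
ids = parse_vfio_ids(line)
print("unique IDs:", list(dict.fromkeys(ids)))
print("listed more than once:", duplicated_ids(ids))
```

The IGD ID 8086:5912 appears once, so the integrated GPU is covered by the list either way.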


And here is my /etc/pve/qemu-server/100.conf
Code:
agent: 1
balloon: 0
bios: ovmf
boot: c
bootdisk: virtio2
cores: 8
cpu: host
efidisk0: local-lvm:vm-100-disk-0,size=128K
hostpci0: 00:02,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: q35
memory: 8192
name: Test01
net0: virtio=BA:98:04:AC:50:74,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=93cea300-9d54-405e-bf79-1f72fb49c534
sockets: 1
tablet: 0
usb0: host=045e:07a5
vga: none
virtio2: /dev/disk/by-id/wwn-0x50026b7682bd4af6,size=117220824K
vmgenid: 4e1089ab-25dd-4281-b73c-ed1aa6eec22b
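
To make the passthrough entry in this config easier to read, here is a small Python sketch that splits the hostpci0 value into the host PCI address and its options. It only handles the simple "address,key=value,..." shape seen in this file and is not a parser for the whole Proxmox config format; the function name is hypothetical.

```python
def parse_hostpci(value):
    """Split a hostpci value into the PCI address and its key=value options."""
    address, *opts = value.split(",")
    options = dict(opt.split("=", 1) for opt in opts)
    return address, options


# Line copied from /etc/pve/qemu-server/100.conf as posted above.
conf_line = "hostpci0: 00:02,pcie=1,x-vga=1"
key, value = conf_line.split(": ", 1)
address, options = parse_hostpci(value)
print(key, address, options)
```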

If I start this guest, I see the Ubuntu loading logo for a second, but after that I get weird lines on the screen (see the attached file).

If I SSH into this VM and run dmesg, I see multiple GPU errors:
Code:
[    6.815365] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[    8.796323] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[   10.812331] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[   10.812414] i915 0000:01:00.0: Resetting chip after gpu hang
[   10.812611] [drm:i915_reset [i915]] *ERROR* GPU recovery failed
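
The sequence in this log is three engine resets, then a full chip reset, then a failed recovery. A short Python sketch that summarises such a log is below; it only assumes the message wording shown above and is not tied to any particular kernel version.

```python
import re
from collections import Counter

# Guest dmesg excerpt copied from the post above.
GUEST_DMESG = """\
[    6.815365] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[    8.796323] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[   10.812331] i915 0000:01:00.0: Resetting rcs0 after gpu hang
[   10.812414] i915 0000:01:00.0: Resetting chip after gpu hang
[   10.812611] [drm:i915_reset [i915]] *ERROR* GPU recovery failed
"""

HANG_RE = re.compile(r"Resetting (\S+) after gpu hang")


def summarize_hangs(text):
    """Count resets per target (engine or chip) and note a failed recovery."""
    resets = Counter()
    recovery_failed = False
    for line in text.splitlines():
        match = HANG_RE.search(line)
        if match:
            resets[match.group(1)] += 1
        if "GPU recovery failed" in line:
            recovery_failed = True
    return resets, recovery_failed


resets, recovery_failed = summarize_hangs(GUEST_DMESG)
print(dict(resets), "recovery failed:", recovery_failed)
```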

I did manage to get a correct screen by creating the file /etc/X11/xorg.conf on the guest with this content:
Code:
Section "Device"
        Identifier  "intel"
        Driver      "intel"
        BusID       "PCI:1:0:0"
EndSection
Section "Screen"
        Identifier  "intel"
        Device "intel"
EndSection
After rebooting the guest, I still get the Ubuntu logo for a second, then the weird lines for a few seconds, and then a normal desktop.
I still get the *ERROR* GPU recovery failed message in dmesg.
But if I try to test the graphics card with glxgears or glxinfo, I get this error:
Code:
i965: Failed to submit batchbuffer: Input/output error
And in dmesg I get a lot of:
Code:
[drm:gen8_irq_handler [i915]] *ERROR* Fault errors on pipe A: 0x00000080
Also, in the hardinfo tool, the graphics section shows (Unknown) instead of the graphics driver.

Here is my lspci -vv from the guest
Code:
01:00.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) (prog-if 00 [VGA controller])
    Subsystem: ASUSTeK Computer Inc. HD Graphics 630
    Physical Slot: 0
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 36
    Region 0: Memory at c0000000 (64-bit, non-prefetchable) [size=16M]
    Region 2: Memory at 800000000 (64-bit, prefetchable) [size=256M]
    Region 4: I/O ports at d000 [size=64]
    Expansion ROM at <ignored> [disabled]
    Capabilities: [40] Vendor Specific Information: Len=0c <?>
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM not supported, Exit Latency L0s <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee02004  Data: 4026
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [100 v0] #00
    Capabilities: [200 v1] Address Translation Service (ATS)
        ATSCap:    Invalidate Queue Depth: 00
        ATSCtl:    Enable+, Smallest Translation Unit: 00
    Capabilities: [300 v1] Page Request Interface (PRI)
        PRICtl: Enable- Reset-
        PRISta: RF- UPRGI- Stopped+
        Page Request Capacity: 00008000, Page Request Allocation: 00000000
    Kernel driver in use: i915
    Kernel modules: i915

Any tips on the missing link that will make everything work?
Thank you
 


Vaielab

Today while doing some tests, I saw that when I start a VM with IGD passthrough I get this error on the host:
Code:
[ 1470.342498] DMAR: DRHD: handling fault status reg 3
[ 1470.342502] DMAR: [DMA Read] Request device [00:02.0] fault addr a400a000 [fault reason 05] PTE Write access is not set
[ 1470.342520] DMAR: DRHD: handling fault status reg 3
[ 1470.342521] DMAR: [DMA Read] Request device [00:02.0] fault addr a400b000 [fault reason 05] PTE Write access is not set
[ 1470.342538] DMAR: DRHD: handling fault status reg 3
[ 1470.342540] DMAR: [DMA Read] Request device [00:02.0] fault addr a400c000 [fault reason 05] PTE Write access is not set
[ 1470.342556] DMAR: DRHD: handling fault status reg 3
I also tested with my other graphics cards, and they don't produce that error message.
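
To see what the faults have in common, here is a minimal Python sketch that extracts the device, address and reason from the DMAR lines above. The regular expression only assumes the line format shown in this log.

```python
import re

# Host dmesg excerpt copied from the post above.
HOST_DMESG = """\
[ 1470.342498] DMAR: DRHD: handling fault status reg 3
[ 1470.342502] DMAR: [DMA Read] Request device [00:02.0] fault addr a400a000 [fault reason 05] PTE Write access is not set
[ 1470.342520] DMAR: DRHD: handling fault status reg 3
[ 1470.342521] DMAR: [DMA Read] Request device [00:02.0] fault addr a400b000 [fault reason 05] PTE Write access is not set
[ 1470.342538] DMAR: DRHD: handling fault status reg 3
[ 1470.342540] DMAR: [DMA Read] Request device [00:02.0] fault addr a400c000 [fault reason 05] PTE Write access is not set
[ 1470.342556] DMAR: DRHD: handling fault status reg 3
"""

FAULT_RE = re.compile(
    r"DMAR: \[(?P<op>[^\]]+)\] Request device \[(?P<device>[^\]]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+) \[fault reason (?P<reason>\d+)\] (?P<text>.*)"
)


def parse_dmar_faults(text):
    """Return one dict per DMAR fault line found in the text."""
    faults = []
    for line in text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            faults.append({
                "op": match.group("op"),
                "device": match.group("device"),
                "addr": int(match.group("addr"), 16),
                "reason": int(match.group("reason")),
                "text": match.group("text"),
            })
    return faults


faults = parse_dmar_faults(HOST_DMESG)
devices = {f["device"] for f in faults}
addrs = [f["addr"] for f in faults]
strides = [b - a for a, b in zip(addrs, addrs[1:])]
print("devices:", devices)
print("addresses:", [hex(a) for a in addrs])
print("strides:", [hex(s) for s in strides])
```

In this excerpt every fault comes from 00:02.0 (the IGD) and the addresses step by 0x1000, so the device is reading consecutive 4 KiB pages.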
 

sulx

New Member
Dec 1, 2019
Try setting iommu=pt. At least for me, those errors vanished with that. Though I'm still struggling to boot a VM with iGD passthrough =)
 

Vaielab

Hi there,
Sadly, when I try iommu=pt or intel_iommu=pt, I get the "No IOMMU detected, please activate it.See Documentation for further information." error message and can't do any passthrough anymore.
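
It is not clear from the post whether iommu=pt was added next to intel_iommu=on or replaced it. As far as I know, iommu=pt is a separate kernel parameter that is normally used together with intel_iommu=on, and pt is not a value that intel_iommu= accepts; treat both points as assumptions to check against the kernel documentation. A minimal Python sketch that checks a kernel command line for both flags (function names are made up for illustration):

```python
def parse_cmdline(cmdline):
    """Map each kernel parameter to its value (None for bare flags)."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def iommu_flags(cmdline):
    """Report whether intel_iommu=on and iommu=pt are both present."""
    params = parse_cmdline(cmdline)
    return {
        "intel_iommu_on": params.get("intel_iommu") == "on",
        "passthrough_mode": params.get("iommu") == "pt",
    }


# Command line from /etc/default/grub as posted earlier in the thread.
original = ("quiet intel_iommu=on pcie_acs_override=downstream "
            "video=efifb:off,vesafb:off")
# Hypothetical variants: pt appended vs. pt replacing the 'on' value.
appended = original + " iommu=pt"
replaced = original.replace("intel_iommu=on", "intel_iommu=pt")

for name, cmdline in [("original", original),
                      ("appended", appended),
                      ("replaced", replaced)]:
    print(name, iommu_flags(cmdline))
```

On the host, the same check can be run against the contents of /proc/cmdline after a reboot to confirm which parameters the running kernel actually received.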
 
