"Irq 17: nobody cared" problem with PCI passthrough

jic5760

Member
Nov 10, 2020
Version:
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-libc-dev: 5.4.106-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
openvswitch-switch: 2.10.7+ds1-0+deb10u1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1

Before starting the VM:
Code:
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981 (prog-if 02 [NVM Express])
        Interrupt: pin A routed to IRQ 17
        NUMA node: 0
        Kernel driver in use: nvme

07:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1) (prog-if 00 [VGA controller])
        Interrupt: pin A routed to IRQ 11
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

07:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
        Interrupt: pin B routed to IRQ 10
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel



The 02:00.0 device is on IRQ 17, while the 07:00.0 and 07:00.1 devices are on IRQ 11 and IRQ 10, so none of them share an IRQ.


However, when the VM starts, it dies and the host logs the error shown below:

Code:
[  918.137315] irq 17: nobody cared (try booting with the "irqpoll" option)
[  918.137339] CPU: 2 PID: 1979 Comm: kvm Tainted: P           O      5.4.73-1-pve #1
[  918.137340] Hardware name: Supermicro X10SLL+-F/X10SLL+-F, BIOS 3.3 03/28/2020
[  918.137340] Call Trace:
[  918.137342]  <IRQ>
[  918.137347]  dump_stack+0x6d/0x9a
[  918.137350]  __report_bad_irq+0x3c/0xb6
[  918.137352]  note_interrupt.cold.10+0xb/0x5d
[  918.137353]  handle_irq_event_percpu+0x6f/0x80
[  918.137354]  handle_irq_ev


At this point, the IRQs of the passthrough devices have changed:
Code:
07:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1) (prog-if 00 [VGA controller])
        Interrupt: pin A routed to IRQ 16
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

07:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
        Interrupt: pin B routed to IRQ 17
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

The IRQ of the 07:00.1 device has changed to 17, which collides with the existing 02:00.0 device.
Before the VM was started, each device had a unique IRQ.
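
To make the collision easier to spot, here is a minimal sketch that scans `lspci -v` style text and lists every IRQ claimed by more than one device. The function name `shared_irqs` and the sample string are my own for illustration; the sample assumes 02:00.0 is still on IRQ 17 after the VM starts, and the regex assumes addresses are printed without a PCI domain prefix, as in the output above.

```python
import re
from collections import defaultdict

# Device header line, e.g. "02:00.0 Non-Volatile memory controller: ..."
# Assumption: no PCI domain prefix (plain "bus:dev.fn" form).
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$")
# Interrupt line, e.g. "Interrupt: pin A routed to IRQ 17"
IRQ_RE = re.compile(r"Interrupt: pin [A-D] routed to IRQ (\d+)")


def shared_irqs(lspci_text):
    """Return {irq: [pci addresses]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            # Remember which device the following indented lines belong to.
            current = header.group(1)
            continue
        irq = IRQ_RE.search(line)
        if irq and current is not None:
            by_irq[int(irq.group(1))].append(current)
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}


# Sample input built from the lspci lines in this post (state after VM start).
AFTER_VM_START = """\
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
        Interrupt: pin A routed to IRQ 17
07:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1)
        Interrupt: pin A routed to IRQ 16
07:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
        Interrupt: pin B routed to IRQ 17
"""

if __name__ == "__main__":
    for irq, devs in sorted(shared_irqs(AFTER_VM_START).items()):
        print(f"IRQ {irq} is shared by: {', '.join(devs)}")
```

Running it on the sample prints `IRQ 17 is shared by: 02:00.0, 07:00.1`. On a live host the same function could be fed the text of `lspci -v` captured before and after starting the VM to compare the two states.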