Adaptec/USB IRQ conflict hangs PCI bus

gkovacs

The problem
We have an Adaptec 6805E card in an Intel Q67 / Core i7 server. Whenever a USB device is plugged in or removed (in our case, the data center's KVM-over-IP console), the kernel logs an error and the PCI bus effectively hangs afterwards (everything slows to a crawl). The Adaptec controller and the USB controller appear to share the same IRQ (16).
Only a hard reset solves the problem. Has anyone experienced this?

Others have reported similar issues (not necessarily with an OpenVZ kernel):
http://serverfault.com/questions/165717/apparent-irq-conflict-driving-me-nuts-under-centos
http://forums.opensuse.org/english/...l-option-but-irqpoll-causes-boot-failure.html

There are many boot options related to ACPI and IRQ handling:
https://help.ubuntu.com/community/BootOptions
We can't really test these on a production system. Does anyone know what to do?

The error
Code:
Nov 23 09:16:31 proxmox2 kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Nov 23 09:16:31 proxmox2 kernel: Pid: 17958, comm: apache2 Not tainted 2.6.32-4-pve #1
Nov 23 09:16:31 proxmox2 kernel: Call Trace:
Nov 23 09:16:31 proxmox2 kernel: <IRQ>  [<ffffffff81097bfd>] ? __report_bad_irq+0x30/0x7d
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff81097d4f>] ? note_interrupt+0x105/0x16e
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff810983b4>] ? handle_fasteoi_irq+0x93/0xb5
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff8101333f>] ? handle_irq+0x17/0x1d
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff81012999>] ? do_IRQ+0x57/0xb6
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff81011593>] ? ret_from_intr+0x0/0x11
Nov 23 09:16:31 proxmox2 kernel: <EOI>
Nov 23 09:16:31 proxmox2 kernel: handlers:
Nov 23 09:16:31 proxmox2 kernel: [<ffffffffa00c63a8>] (aac_src_intr_message+0x0/0x108 [aacraid])
Nov 23 09:16:31 proxmox2 kernel: [<ffffffffa0024848>] (usb_hcd_irq+0x0/0x7e [usbcore])
Nov 23 09:16:31 proxmox2 kernel: Disabling IRQ #16
Nov 23 09:16:37 proxmox2 kernel: usb 1-1.6: USB disconnect, address 3
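For anyone comparing their own logs against this one: the "nobody cared" report names the IRQ and lists every handler registered on it, so the sharing can be read straight from the log. Below is a minimal Python sketch that extracts both. The function name and the regular expressions are my own assumptions based on the log format shown above, not part of any kernel tooling.

```python
import re

def parse_nobody_cared(log_text):
    """Return (irq, [(handler_function, module), ...]) or None if no report is found."""
    irq = None
    handlers = []
    for line in log_text.splitlines():
        # The report header, e.g. 'irq 16: nobody cared (...)'
        m = re.search(r"irq (\d+): nobody cared", line)
        if m:
            irq = int(m.group(1))
            continue
        # Handler lines, e.g. '(usb_hcd_irq+0x0/0x7e [usbcore])'
        m = re.search(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)", line)
        if m and irq is not None:
            handlers.append((m.group(1), m.group(2)))
    return (irq, handlers) if irq is not None else None

# Sample lines copied from the log above (timestamps shortened).
SAMPLE_LOG = """\
kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
kernel: handlers:
kernel: [<ffffffffa00c63a8>] (aac_src_intr_message+0x0/0x108 [aacraid])
kernel: [<ffffffffa0024848>] (usb_hcd_irq+0x0/0x7e [usbcore])
kernel: Disabling IRQ #16
"""

report = parse_nobody_cared(SAMPLE_LOG)
print(report)
```

Two handlers from different modules on one IRQ, as here with aacraid and usbcore, means the line is shared.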

The system
Code:
proxmox2:~# pveversion -v
pve-manager: 1.9-26 (pve-manager/1.9/6567)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.9-55+ovzfix-2
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.32-6-pve: 2.6.32-55+ovzfix-1
pve-kernel-2.6.32-7-pve: 2.6.32-55+ovzfix-2
qemu-server: 1.1-32
pve-firmware: 1.0-15
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-3pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.15.0-2
ksm-control-daemon: 1.0-6

PCI device list:
Code:
proxmox2:~# lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation 2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series Chipset Family MEI Controller #1 (rev 04)
00:16.2 IDE interface: Intel Corporation 6 Series Chipset Family IDE-r Controller (rev 04)
00:16.3 Serial controller: Intel Corporation 6 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1d.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation 6 Series Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series Chipset Family SMBus Controller (rev 04)
02:00.0 RAID bus controller: Adaptec Device 028b (rev 01)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)

The two conflicting devices from lspci -vv:
Code:
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
        Subsystem: Intel Corporation Device 200a
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at fbe23000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Debug port: BAR=1 offset=00a0
        Capabilities: [98] PCIe advanced features <?>
        Kernel driver in use: ehci_hcd
        Kernel modules: ehci-hcd

02:00.0 RAID bus controller: Adaptec Device 028b (rev 01)
        Subsystem: Adaptec Device 0201
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at fb800000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at fbc41000 (64-bit, non-prefetchable) [size=2K]
        Region 4: Memory at fbc40000 (32-bit, non-prefetchable) [size=256]
        Expansion ROM at fbc00000 [disabled] [size=256K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                        ClockPM- Suprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [ac] MSI-X: Enable- Mask- TabSize=16
                Vector table: BAR=0 offset=001c2000
                PBA: BAR=0 offset=001c4000
        Capabilities: [100] Advanced Error Reporting <?>
        Kernel driver in use: aacraid
        Kernel modules: aacraid
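The same check can be done on the full `lspci -vv` output instead of picking devices out by eye. This is a rough sketch that groups PCI addresses by the IRQ in their "Interrupt: pin X routed to IRQ N" line. It assumes device header lines start with a `bus:dev.fn` address without a domain prefix, as in the output above; the function name is hypothetical.

```python
import re
from collections import defaultdict

def devices_sharing_irq(lspci_vv_text):
    """Map IRQ number -> list of PCI addresses, keeping only IRQs used by 2+ devices."""
    by_irq = defaultdict(list)
    current = None
    for line in lspci_vv_text.splitlines():
        # A device header starts at column 0 with an address such as '02:00.0'
        if re.match(r"[0-9a-f]{2}:[0-9a-f]{2}\.[0-7] ", line):
            current = line.split(" ", 1)[0]
            continue
        m = re.search(r"Interrupt: pin \w routed to IRQ (\d+)", line)
        if m and current is not None:
            by_irq[int(m.group(1))].append(current)
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}

# Trimmed sample based on the output above.
SAMPLE_LSPCI = """\
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Kernel driver in use: ehci_hcd

02:00.0 RAID bus controller: Adaptec Device 028b (rev 01)
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Kernel driver in use: aacraid
"""

shared = devices_sharing_irq(SAMPLE_LSPCI)
print(shared)
```

In practice the text could come from `subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout`, run as root so that all fields are visible.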

Active interrupt list
Code:
proxmox2:~# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:         99          0          0          0          0          0          0          0  IR-IO-APIC-edge      timer
  1:          2          0          0          0          0          0          0          0  IR-IO-APIC-edge      i8042
  8:          1          0          0          0          0          0          0          0  IR-IO-APIC-edge      rtc0
  9:          0          0          0          0          0          0          0          0  IR-IO-APIC-fasteoi   acpi
 12:          4          0          0          0          0          0          0          0  IR-IO-APIC-edge      i8042
 16:    6782762          0          0          0          0          0          0          0  IR-IO-APIC-fasteoi   aacraid, ehci_hcd:usb1
 23:         88          0          0          0          0          0          0          0  IR-IO-APIC-fasteoi   ehci_hcd:usb2
 24:          0          0          0          0          0          0          0          0  DMAR_MSI-edge      dmar0
 25:          0          0          0          0          0          0          0          0  DMAR_MSI-edge      dmar1
 30:   40696839          0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth1
 31:          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      ahci
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:    4238841    3442513    3262886    3097275    3822337    3062084    3039864    3062109   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
PND:          0          0          0          0          0          0          0          0   Performance pending work
RES:    2091023    2526976    2132682    1876928    2226523    1816845    1427789    1141970   Rescheduling interrupts
CAL:         24         59         65         63         62         61         62         64   Function call interrupts
TLB:     156336     170392     138152     140369     129553     127167     102825      85377   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:         57         57         57         57         57         57         57         57   Machine check polls
ERR:          7
MIS:          0
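To spot shared lines in `/proc/interrupts` without reading the whole table, something like the following Python sketch could be used. It assumes the layout shown above: one count column per CPU, then a single-token controller type (for example `IR-IO-APIC-fasteoi`), then a comma-separated handler list. Newer kernels format the controller column differently, so treat this as illustrative only. The function name is my own.

```python
def shared_irqs(interrupts_text):
    """Return {irq: (total_count, [handlers])} for numbered IRQs with 2+ handlers."""
    lines = interrupts_text.splitlines()
    ncpu = len(lines[0].split())  # the header row has one 'CPUn' token per CPU
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR and similar summary rows
        fields = rest.split()
        counts = [int(x) for x in fields[:ncpu]]
        # fields[ncpu] is the controller type; everything after it names the handlers
        names = " ".join(fields[ncpu + 1:])
        handlers = [n.strip() for n in names.split(",") if n.strip()]
        if len(handlers) > 1:
            shared[int(label)] = (sum(counts), handlers)
    return shared

# Two-CPU sample reduced from the listing above.
SAMPLE_INTERRUPTS = """\
           CPU0       CPU1
 16:    6782762          0  IR-IO-APIC-fasteoi   aacraid, ehci_hcd:usb1
 23:         88          0  IR-IO-APIC-fasteoi   ehci_hcd:usb2
NMI:          0          0   Non-maskable interrupts
"""

result = shared_irqs(SAMPLE_INTERRUPTS)
print(result)
```

On a live system the input would be the contents of `/proc/interrupts`, e.g. `shared_irqs(open("/proc/interrupts").read())`.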