FATAL! BUG in bnx2 module, eth0 got disconnected

webfrank

Hi all,
this evening one of my CISCO UCS200 servers was cut off from the network because of a bug in the bnx2 module:

Sep 12 19:47:17 kvm02 kernel: ------------[ cut here ]------------
Sep 12 19:47:17 kvm02 kernel: WARNING: at net/sched/sch_generic.c:267 dev_watchdog+0xe2/0x194()
Sep 12 19:47:17 kvm02 kernel: Hardware name: R200-1120402W
Sep 12 19:47:17 kvm02 kernel: NETDEV WATCHDOG: eth0 (bnx2): transmit queue 3 timed out
Sep 12 19:47:17 kvm02 kernel: Modules linked in: tun crc32c nfs lockd fscache nfs_acl auth_rpcgss sunrpc kvm_intel kvm vzethdev vznetdev simfs vzrst vzcpt vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_tcpudp xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables x_tables vzevent ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp bonding snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core joydev pcspkr power_meter evdev ac ioatdma button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot usbhid hid usb_storage ata_piix ata_generic libata igb ehci_hcd uhci_hcd dca usbcore nls_base bnx2 mptsas mptscsih mptbase scsi_transport_sas thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Sep 12 19:47:17 kvm02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-4-pve #1
Sep 12 19:47:17 kvm02 kernel: Call Trace:
Sep 12 19:47:17 kvm02 kernel: <IRQ> [<ffffffff8127a242>] ? dev_watchdog+0xe2/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a242>] ? dev_watchdog+0xe2/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8104e21c>] ? warn_slowpath_common+0x77/0xa3
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a160>] ? dev_watchdog+0x0/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8104e2a4>] ? warn_slowpath_fmt+0x51/0x59
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81040ab0>] ? tg_shares_up+0x0/0x259
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81039e61>] ? tg_nop+0x0/0x3
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8103f367>] ? walk_tg_tree+0x5e/0x73
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a134>] ? netif_tx_lock+0x3d/0x69
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81264868>] ? netdev_drivername+0x3b/0x40
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a242>] ? dev_watchdog+0xe2/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8105b5b0>] ? run_timer_softirq+0x1ee/0x2c0
Sep 12 19:47:17 kvm02 kernel: [<ffffffff810419c5>] ? enqueue_task_fair+0x3e/0x82
Sep 12 19:47:17 kvm02 kernel: [<ffffffff810549b8>] ? __do_softirq+0x127/0x22f
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81011d6c>] ? call_softirq+0x1c/0x30
Sep 12 19:47:17 kvm02 kernel: [<ffffffff810132eb>] ? do_softirq+0x3f/0x7c
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8105473f>] ? irq_exit+0x78/0xb8
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81025140>] ? smp_apic_timer_interrupt+0x87/0x95
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81011733>] ? apic_timer_interrupt+0x13/0x20
Sep 12 19:47:17 kvm02 kernel: <EOI> [<ffffffffa019d4f9>] ? acpi_idle_enter_bm+0x27d/0x2af [processor]
Sep 12 19:47:17 kvm02 kernel: [<ffffffffa019d4f2>] ? acpi_idle_enter_bm+0x276/0x2af [processor]
Sep 12 19:47:17 kvm02 kernel: [<ffffffff812508ae>] ? cpuidle_idle_call+0x94/0xee
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8100ff09>] ? cpu_idle+0xa2/0xda
Sep 12 19:47:17 kvm02 kernel: ---[ end trace 6bdcf165456a3f26 ]---
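
To see how often a host hits this, the watchdog lines can be counted per interface, driver and queue. This is only a minimal sketch: the function name `watchdog_timeouts` and the default path /var/log/messages are assumptions for illustration, and the pattern is taken from the "NETDEV WATCHDOG" line in the trace above.

```shell
#!/bin/sh
# Sketch: count NETDEV WATCHDOG transmit timeouts in a syslog-style file.
# The function name and the default log path are illustrative assumptions.
watchdog_timeouts() {
    # $1: log file to scan; defaults to /var/log/messages
    # sed keeps only the watchdog lines and reduces each one to
    # "<interface> <driver> queue=<n>"; sort + uniq -c then count repeats.
    sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) (\([^)]*\)): transmit queue \([0-9]*\) timed out.*/\1 \2 queue=\3/p' \
        "${1:-/var/log/messages}" | sort | uniq -c
}
```

Calling `watchdog_timeouts /var/log/messages` prints one line per interface/driver/queue combination, prefixed with the number of occurrences.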


kvm02:~# pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-33
pve-kernel-2.6.32-4-pve: 2.6.32-33
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.28-1pve1
vzdump: 1.2-14
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6

Looking around, I found a post about a similar problem on Red Hat which suggests adding this to /etc/modules.conf:

options bnx2 disable_msi=1,1,1,1

Any idea?
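
On Debian-based systems such as Proxmox VE, module options are normally read from files under /etc/modprobe.d/ rather than from /etc/modules.conf. A minimal sketch of persisting the option quoted above follows; the file name bnx2.conf and the helper name `write_bnx2_options` are assumptions, and the option value is copied as posted, not verified against this driver version.

```shell
#!/bin/sh
# Sketch: persist the bnx2 option quoted above in a modprobe.d file.
# The file name and the helper name are illustrative assumptions.
write_bnx2_options() {
    # $1: target file; defaults to /etc/modprobe.d/bnx2.conf
    conf="${1:-/etc/modprobe.d/bnx2.conf}"
    # Overwrites the target file with the single options line.
    printf '%s\n' 'options bnx2 disable_msi=1,1,1,1' > "$conf"
}
```

Run `write_bnx2_options` as root. `modinfo -p bnx2` lists the parameters the installed driver actually accepts, so it is worth checking there first. The option only takes effect once the module is reloaded or the host is rebooted, and reloading bnx2 drops the network on these NICs. If the driver is loaded from the initramfs, the initramfs may need regenerating as well.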
 
The current kernel uses this driver (2.6.32-4-pve #1 SMP Mon May 9 12:59:57 CEST 2011 x86_64 GNU/Linux)

Code:
modinfo bnx2

filename:       /lib/modules/2.6.32-4-pve/kernel/drivers/net/bnx2.ko
version:        2.0.8e
license:        GPL
description:    Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver


The latest kernel from pvetest (will be released tomorrow as 1.9): (2.6.32-6-pve #1 SMP Fri Sep 9 07:20:30 CEST 2011 x86_64 GNU/Linux)

Code:
modinfo bnx2

filename:       /lib/modules/2.6.32-6-pve/kernel/drivers/net/bnx2.ko
version:        2.0.23b
license:        GPL
description:    Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver

Therefore I suggest you upgrade to the new kernel and test again.
 
yes, as always.
 
Hi, will it be a simple apt-get update / upgrade / dist-upgrade?
It will be as simple as any kernel installation on Debian :)

apt-get install kernelX.XX.XXX

I am not sure, though, whether the testing kernel will be available via the Proxmox repo or whether you have to grab it from the web server.
 
The kernel is in the pvetest repo.
 
Hi, I also hit this bug during the night!

pve-kernel-2.6.32-6-pve

kvm1:~# modinfo bnx2
filename: /lib/modules/2.6.32-6-pve/kernel/drivers/net/bnx2.ko
version: 2.0.23b

Code:
Sep 17 09:26:22 kvm1 kernel: WARNING: at net/sched/sch_generic.c:267 dev_watchdog+0x29f/0x2b0() (Not tainted)
Sep 17 09:26:22 kvm1 kernel: Hardware name: PowerEdge R815
Sep 17 09:26:22 kvm1 kernel: NETDEV WATCHDOG: eth0 (bnx2): transmit queue 4 timed out
Sep 17 09:26:22 kvm1 kernel: Modules linked in: vhost_net macvtap macvlan tun kvm_amd kvm vzethdev vznetdev simfs vzrst nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 vzcpt nfs lockd fscache nfs_acl auth_rpcgss sunrpc vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables vzevent ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge bonding ipv6 8021q garp stp llc dm_round_robin dm_multipath dcdbas snd_pcsp snd_pcm snd_timer tpm_tis tpm tpm_bios snd soundcore snd_page_alloc serio_raw ghes amd64_edac_mod k10temp edac_core edac_mce_amd i2c_piix4 i2c_core hed power_meter hwmon ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot sg ahci igb dca megaraid_sas bnx2 [last unloaded: scsi_wait_scan]
Sep 17 09:26:22 kvm1 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-6-pve #1
Sep 17 09:26:22 kvm1 kernel: Call Trace:
Sep 17 09:26:22 kvm1 kernel: <IRQ>  [<ffffffff8145676f>] ? dev_watchdog+0x29f/0x2b0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8145676f>] ? dev_watchdog+0x29f/0x2b0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff810694f8>] ? warn_slowpath_common+0x88/0xe0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81054e9c>] ? enqueue_task_fair+0x1c/0x60
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8106964e>] ? warn_slowpath_fmt+0x6e/0x70
Sep 17 09:26:22 kvm1 kernel: [<ffffffff810533ab>] ? task_rq_lock+0x5b/0xa0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8105e05b>] ? try_to_wake_up+0xfb/0x410
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8125fdda>] ? strlcpy+0x4a/0x60
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81437a08>] ? netdev_drivername+0x48/0x60
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8119c94c>] ? pollwake+0x5c/0x60
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8145676f>] ? dev_watchdog+0x29f/0x2b0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8101cbc4>] ? x86_pmu_enable+0x1f4/0x280
Sep 17 09:26:22 kvm1 kernel: [<ffffffff814564d0>] ? dev_watchdog+0x0/0x2b0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8107d56e>] ? run_timer_softirq+0x1be/0x350
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8107284a>] ? __do_softirq+0x13a/0x230
Sep 17 09:26:22 kvm1 kernel: [<ffffffff810a1494>] ? clockevents_program_event+0x54/0xa0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8100c48c>] ? call_softirq+0x1c/0x30
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8100e0c5>] ? do_softirq+0x65/0xa0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff810724e5>] ? irq_exit+0xc5/0xf0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff814fc191>] ? smp_apic_timer_interrupt+0x71/0x9a
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8100be53>] ? apic_timer_interrupt+0x13/0x20
Sep 17 09:26:22 kvm1 kernel: <EOI>  [<ffffffff8103734b>] ? native_safe_halt+0xb/0x10
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81014455>] ? default_idle+0x75/0xb0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff8100a251>] ? cpu_idle+0xb1/0x110
Sep 17 09:26:22 kvm1 kernel: [<ffffffff814da435>] ? rest_init+0x85/0xa0
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81bc1f9d>] ? start_kernel+0x446/0x511
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81bc12b9>] ? x86_64_start_reservations+0x99/0xb9
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81bc13df>] ? x86_64_start_kernel+0x106/0x121
Sep 17 09:26:22 kvm1 kernel: [<ffffffff81bc1140>] ? early_idt_handler+0x0/0x71
Sep 17 09:26:22 kvm1 kernel: ---[ end trace a9e627456250ecb6 ]---
Sep 17 09:42:43 kvm1 shutdown[899016]: shutting down for system reboot
 
Re: FATAL! BUG in ethernet cards IRQ

I have the same problem on a system with 4 Ethernet cards (eth0: 1 onboard; eth1, eth2 and eth3: D-Link DGE-528T).
I installed a Windows 2008 R2 x64 guest. After I start it, the host's network becomes slow and /var/log/messages shows this error:

Sep 21 16:23:13 vmhost1 kernel: Pid: 3564, comm: kvm Not tainted 2.6.32-6-pve #1
Sep 21 16:23:13 vmhost1 kernel: Call Trace:
Sep 21 16:23:13 vmhost1 kernel: <IRQ> [<ffffffff810e801b>] ? __report_bad_irq+0x2b/0xb0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff810e8238>] ? note_interrupt+0x198/0x1e0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff810e8925>] ? handle_fasteoi_irq+0xc5/0xf0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8100c48c>] ? call_softirq+0x1c/0x30
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8100e14b>] ? handle_irq+0x4b/0xb0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff814fadcf>] ? do_IRQ+0x6f/0xf0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8100bc93>] ? ret_from_intr+0x0/0x11
Sep 21 16:23:13 vmhost1 kernel: <EOI> [<ffffffffa0601b61>] ? kvm_arch_vcpu_ioctl_run+0xbf1/0xf70 [kvm]
Sep 21 16:23:13 vmhost1 kernel: [<ffffffffa0601b4f>] ? kvm_arch_vcpu_ioctl_run+0xbdf/0xf70 [kvm]
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8100983d>] ? __switch_to+0xcd/0x320
Sep 21 16:23:13 vmhost1 kernel: [<ffffffffa05eb493>] ? kvm_vcpu_ioctl+0x433/0x790 [kvm]
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8111e01a>] ? fire_user_return_notifiers+0x3a/0x50
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8100a47d>] ? do_notify_resume+0x9d/0x910
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff81199e86>] ? vfs_ioctl+0x36/0xb0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8119a385>] ? do_vfs_ioctl+0x3e5/0x5e0
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8119a5cf>] ? sys_ioctl+0x4f/0x80
Sep 21 16:23:13 vmhost1 kernel: [<ffffffff8100b302>] ? system_call_fastpath+0x16/0x1b




I have another problem with this system:
after a while the kernel prints "kernel:disabling #IRQ16" and the system's network becomes slow. IRQ 16 is eth1.
cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
0: 130 0 0 0 0 0 0 0 IO-APIC-edge timer
1: 8 0 0 0 0 0 0 0 IO-APIC-edge i8042
8: 1 0 0 0 0 0 0 0 IO-APIC-edge rtc0
9: 0 0 0 0 0 0 0 0 IO-APIC-fasteoi acpi
16: 67618 0 0 0 0 0 0 0 IO-APIC-fasteoi eth1
17: 497230 0 0 0 0 0 0 0 IO-APIC-fasteoi hda_intel, eth2
18: 1000001 0 0 0 0 0 0 0 IO-APIC-fasteoi eth3
19: 40 0 0 0 0 0 0 0 IO-APIC-fasteoi firewire_ohci
23: 60 0 0 0 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2
24: 8 0 0 0 0 0 0 0 HPET_MSI-edge hpet2
25: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet3
26: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet4
27: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet5
28: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet6
36: 242054 0 0 0 0 0 0 0 PCI-MSI-edge ahci
37: 171062 0 0 0 0 0 0 0 PCI-MSI-edge eth0
38: 263 0 0 0 0 0 0 0 PCI-MSI-edge hda_intel
39: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
40: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
41: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
42: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
43: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
44: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
45: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
46: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
NMI: 844 792 713 605 592 578 506 433 Non-maskable interrupts
LOC: 5771651 4029676 4037051 4078146 6556692 4582591 4733845 4912658 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 844 792 713 605 592 578 506 433 Performance monitoring interrupts
PND: 0 0 0 0 0 0 0 0 Performance pending work
RES: 4991280 4919044 4364085 3750766 3214401 2715127 2078728 1595495 Rescheduling interrupts
CAL: 567244 478795 415497 349449 364810 287708 239163 205423 Function call interrupts
TLB: 44 28 34 32 115 232 256 121 TLB shootdowns
TRM: 0 0 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 0 0 0 0 Machine check exceptions
MCP: 10 10 10 10 10 10 10 10 Machine check polls
ERR: 7
MIS: 0

--------------
pveversion -v
pve-manager: 1.9-24 (pve-manager/1.9/6542)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 1.9-43
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.32-6-pve: 2.6.32-43
qemu-server: 1.1-32
pve-firmware: 1.0-13
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.28-1pve5
vzdump: 1.2-15
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.15.0-1
ksm-control-daemon: 1.0-6
 

Upgrade to the latest pvetest kernel.
(pve-kernel-2.6.32-6-pve: 2.6.32-46)
 
Hi, I have not tested the new kernel yet because several bugs have been reported against it, but with the option bnx2 disable_msi=1,1,1,1 it seems to be working: no lockup in one week.
 

I upgraded to "pve-kernel-2.6.32-6-pve: 2.6.32-46" but I still have the problem.
I run a pfSense server as a VM on Proxmox, and pfSense does my routing. When I copy a large file (3 GB) from one machine to another, Proxmox logs the following message in /var/log/messages and the copy is disconnected.


Sep 24 11:58:40 vmhost1 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-6-pve #1
Sep 24 11:58:40 vmhost1 kernel: Call Trace:
Sep 24 11:58:40 vmhost1 kernel: <IRQ> [<ffffffff810e807b>] ? __report_bad_irq+0x2b/0xb0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff810e8298>] ? note_interrupt+0x198/0x1e0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff810e8985>] ? handle_fasteoi_irq+0xc5/0xf0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8100e14b>] ? handle_irq+0x4b/0xb0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff814fb0bf>] ? do_IRQ+0x6f/0xf0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8100bc93>] ? ret_from_intr+0x0/0x11
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81072c1b>] ? __do_softirq+0x9b/0x230
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81012ae0>] ? native_sched_clock+0x20/0x80
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8100c48c>] ? call_softirq+0x1c/0x30
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8100e0c5>] ? do_softirq+0x65/0xa0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81072955>] ? irq_exit+0xc5/0xf0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff814fb0c8>] ? do_IRQ+0x78/0xf0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8100bc93>] ? ret_from_intr+0x0/0x11
Sep 24 11:58:40 vmhost1 kernel: <EOI> [<ffffffff8103734b>] ? native_safe_halt+0xb/0x10
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8109e486>] ? ktime_get_real+0x16/0x50
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff812e0e84>] ? acpi_idle_do_entry+0x3c/0x65
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff812e0f1f>] ? acpi_idle_enter_c1+0x72/0xc3
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff814f8846>] ? notifier_call_chain+0x16/0x80
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81406e77>] ? menu_select+0x157/0x350
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81405d74>] ? cpuidle_idle_call+0xb4/0x140
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff8100a251>] ? cpu_idle+0xb1/0x110
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff814d9445>] ? rest_init+0x85/0xa0
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81c21f9d>] ? start_kernel+0x446/0x511
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81c212b9>] ? x86_64_start_reservations+0x99/0xb9
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81c213df>] ? x86_64_start_kernel+0x106/0x121
Sep 24 11:58:40 vmhost1 kernel: [<ffffffff81c21140>] ? early_idt_handler+0x0/0x71


Message from syslogd@vmhost1 at Sep 24 12:00:53 ...
kernel:Disabling IRQ #17

IRQ17 is:
17: 1100182 0 0 0 0 0 0 0 IO-APIC-fasteoi hda_intel, eth2
 

Can anybody resolve the problem above? Please help!
I changed the NIC but I still have this problem!
--------------------------------------------------
MB:ASUS P8P67 LE
CPU:COREi7 2600
 


I upgraded my kernel to this version but I still have the problem.

Oct 16 09:35:55 vmhost1 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-6-pve #1
Oct 16 09:35:55 vmhost1 kernel: Call Trace:
Oct 16 09:35:55 vmhost1 kernel: <IRQ> [<ffffffff810e804b>] ? __report_bad_irq+0x2b/0xb0
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff810e8268>] ? note_interrupt+0x198/0x1e0
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff810e8955>] ? handle_fasteoi_irq+0xc5/0xf0
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff8100e14b>] ? handle_irq+0x4b/0xb0
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff814fb5cf>] ? do_IRQ+0x6f/0xf0
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff8100bc93>] ? ret_from_intr+0x0/0x11
Oct 16 09:35:55 vmhost1 kernel: <EOI> [<ffffffff8103734b>] ? native_safe_halt+0xb/0x10
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff8109e446>] ? ktime_get_real+0x16/0x50
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff812e13c4>] ? acpi_idle_do_entry+0x3c/0x65
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff812e145f>] ? acpi_idle_enter_c1+0x72/0xc3
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff814f8d56>] ? notifier_call_chain+0x16/0x80
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff81407437>] ? menu_select+0x157/0x350
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff81406334>] ? cpuidle_idle_call+0xb4/0x140
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff8100a251>] ? cpu_idle+0xb1/0x110
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff814d9a15>] ? rest_init+0x85/0xa0
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff81c21f9d>] ? start_kernel+0x446/0x511
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff81c212b9>] ? x86_64_start_reservations+0x99/0xb9
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff81c213df>] ? x86_64_start_kernel+0x106/0x121
Oct 16 09:35:55 vmhost1 kernel: [<ffffffff81c21140>] ? early_idt_handler+0x0/0x71


--------------------------------------------------
Linux vmhost1 2.6.32-6-pve #1 SMP Mon Oct 10 06:53:48 CEST 2011 x86_64 GNU/Linux
MB:ASUS P8P67 LE
CPU:COREi7 2600
RAM:8GB 1600Mhz
HDD:500GB&1TB
NIC:1 onboard & 2 PCI Slot
 
Hi smb3843,

unlike ours, your problem doesn't seem to be related to your network card.

What is your config (server / motherboard / CPU / network card)?
I'll try to help you.




Linux vmhost1 2.6.32-6-pve #1 SMP Mon Oct 10 06:53:48 CEST 2011 x86_64 GNU/Linux
MB:ASUS P8P67 LE
CPU:COREi7 2600
RAM:8GB 1600Mhz
HDD:500GB&1TB
NIC:1 onboard & 2 PCI Slot(Dlink)




dmesg:

i801_smbus 0000:00:1f.3: PCI INT C -> GSI 18 (level, low) -> IRQ 18
HDA Intel 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
alloc irq_desc for 47 on node -1
alloc kstat_irqs on node -1
HDA Intel 0000:01:00.1: irq 47 for MSI/MSI-X
HDA Intel 0000:01:00.1: setting latency timer to 64
Error: Driver 'pcspkr' is already registered, aborting...
Adding 7340024k swap on /dev/mapper/pve-swap. Priority:-1 extents:1 across:7340024k
EXT3-fs (dm-1): using internal journal
EXT3-fs: barriers disabled
kjournald starting. Commit interval 5 seconds
EXT3-fs (dm-2): using internal journal
EXT3-fs (dm-2): mounted filesystem with ordered data mode
EXT4-fs (sdb1): mounted filesystem with ordered data mode
EXT3-fs: barriers disabled
kjournald starting. Commit interval 5 seconds
EXT3-fs (sda1): using internal journal
EXT3-fs (sda1): mounted filesystem with ordered data mode
Bridge firewalling registered
device eth0 entered promiscuous mode
r8169 0000:07:00.0: eth0: link up
r8169 0000:07:00.0: eth0: link up
vmbr0: port 1(eth0) entering forwarding state
device eth1 entered promiscuous mode
skge eth1: enabling interface
skge eth1: Link is up at 100 Mbps, full duplex, flow control both
vmbr1: port 1(eth1) entering forwarding state
device eth2 entered promiscuous mode
skge eth2: enabling interface
skge eth2: Link is up at 100 Mbps, full duplex, flow control both
vmbr2: port 1(eth2) entering forwarding state
Loading iSCSI transport class v2.0-870.
iscsi: registered transport (tcp)
NET: Registered protocol family 10
iscsi: registered transport (iser)
ip_tables: (C) 2000-2006 Netfilter Core Team
ip6_tables: (C) 2000-2006 Netfilter Core Team
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Slow work thread pool: Starting up
Slow work thread pool: Ready
FS-Cache: Loaded
FS-Cache: Netfs 'nfs' registered for caching
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
device tap101i0d0 entered promiscuous mode
vmbr0: port 2(tap101i0d0) entering forwarding state
device tap101i1d0 entered promiscuous mode
vmbr1: port 2(tap101i1d0) entering forwarding state
device tap101i1d1 entered promiscuous mode
vmbr1: port 3(tap101i1d1) entering forwarding state
device tap101i1d2 entered promiscuous mode
vmbr1: port 4(tap101i1d2) entering forwarding state
device tap101i2d0 entered promiscuous mode
vmbr2: port 2(tap101i2d0) entering forwarding state
device tap101i99d0 entered promiscuous mode
vmbr99: port 1(tap101i99d0) entering forwarding state
device tap102i0d0 entered promiscuous mode
vmbr0: port 3(tap102i0d0) entering forwarding state
device tap102i99d0 entered promiscuous mode
vmbr99: port 2(tap102i99d0) entering forwarding state
vmbr99: no IPv6 routers present
vmbr0: no IPv6 routers present
eth2: no IPv6 routers present
vmbr1: no IPv6 routers present
eth1: no IPv6 routers present
eth0: no IPv6 routers present
vmbr2: no IPv6 routers present
venet0: no IPv6 routers present
tap101i99d0: no IPv6 routers present
tap101i0d0: no IPv6 routers present
tap101i2d0: no IPv6 routers present
tap102i0d0: no IPv6 routers present
tap101i1d0: no IPv6 routers present
tap101i1d1: no IPv6 routers present
tap101i1d2: no IPv6 routers present
tap102i99d0: no IPv6 routers present
warning: `ntpd' uses 32-bit capabilities (legacy support in use)
vmbr0: port 3(tap102i0d0) entering disabled state
vmbr0: port 3(tap102i0d0) entering disabled state
vmbr99: port 2(tap102i99d0) entering disabled state
vmbr99: port 2(tap102i99d0) entering disabled state
device tap102i0d0 entered promiscuous mode
vmbr0: port 3(tap102i0d0) entering forwarding state
device tap102i99d0 entered promiscuous mode
vmbr99: port 2(tap102i99d0) entering forwarding state
tap102i99d0: no IPv6 routers present
tap102i0d0: no IPv6 routers present
irq 17: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-6-pve #1
Call Trace:
<IRQ> [<ffffffff810e807b>] ? __report_bad_irq+0x2b/0xb0
[<ffffffff810e8298>] ? note_interrupt+0x198/0x1e0
[<ffffffff810e8985>] ? handle_fasteoi_irq+0xc5/0xf0
[<ffffffff8100e14b>] ? handle_irq+0x4b/0xb0
[<ffffffff814fb0bf>] ? do_IRQ+0x6f/0xf0
[<ffffffff8100bc93>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffff8103734b>] ? native_safe_halt+0xb/0x10
[<ffffffff8109e486>] ? ktime_get_real+0x16/0x50
[<ffffffff812e0e84>] ? acpi_idle_do_entry+0x3c/0x65
[<ffffffff812e0f1f>] ? acpi_idle_enter_c1+0x72/0xc3
[<ffffffff814f8846>] ? notifier_call_chain+0x16/0x80
[<ffffffff81406e77>] ? menu_select+0x157/0x350
[<ffffffff81405d74>] ? cpuidle_idle_call+0xb4/0x140
[<ffffffff8100a251>] ? cpu_idle+0xb1/0x110
[<ffffffff814d9445>] ? rest_init+0x85/0xa0
[<ffffffff81c21f9d>] ? start_kernel+0x446/0x511
[<ffffffff81c212b9>] ? x86_64_start_reservations+0x99/0xb9
[<ffffffff81c213df>] ? x86_64_start_kernel+0x106/0x121
[<ffffffff81c21140>] ? early_idt_handler+0x0/0x71
handlers:
[<ffffffffa0035720>] (skge_intr+0x0/0x5f0 [skge])
Disabling IRQ #17
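
The "nobody cared" report above names both the IRQ and, under "handlers:", the driver registered on it. As a minimal sketch, the pairing can be pulled out of dmesg output like this. The function name `irq_nobody_cared` is an assumption for illustration, and handlers built into the kernel (printed without a `[module]` suffix) are not reported.

```shell
#!/bin/sh
# Sketch: pair each "irq N: nobody cared" report with the modules listed
# under its "handlers:" section. The function name is an illustrative assumption.
irq_nobody_cared() {
    # Reads the files given as arguments, or stdin when there are none.
    awk '
        /nobody cared/ {
            irq = "?"
            # the IRQ number is the field right after the word "irq"
            for (i = 1; i < NF; i++)
                if ($i == "irq") { irq = $(i + 1); sub(/:$/, "", irq) }
            in_handlers = 0
        }
        /handlers:/ { in_handlers = 1; next }
        /Disabling IRQ/ { in_handlers = 0 }
        # a handler line ends in "[module])"; extract the module name
        in_handlers && match($0, /\[[A-Za-z0-9_]+\]\)/) {
            print "IRQ " irq " handler module: " substr($0, RSTART + 1, RLENGTH - 3)
        }
    ' "$@"
}
```

Typical use would be `dmesg | irq_nobody_cared`. For the dump above it should point at skge on IRQ 17.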

--------------------------------------------------
/proc/interrupts:
[.....]
17: 200001 0 0 0 0 0 0 0 IO-APIC-fasteoi skge@pci:0000:06:01.0
[.....]
 
OK, so a workstation motherboard, hmm.

You have only 2 PCI slots, and both are filled with your 2 D-Link NIC cards?


If it really is an IRQ sharing problem, try disabling all unused devices: sound card, Bluetooth, FireWire, USB controllers (I see 16 USB ports in the documentation), and maybe some SATA controllers.

Afterwards, check with

cat /proc/interrupts

You don't want to have 2 devices on the same IRQ.
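
The check for shared IRQ lines can be sketched as a small shell function. It is only a sketch under assumptions: the function name `shared_irqs` is made up for illustration, and the parsing relies on the 2.6.32-era layout shown in this thread, where one interrupt-type column (for example IO-APIC-fasteoi) sits between the per-CPU counters and the device names, and devices sharing a line are separated by ", ".

```shell
#!/bin/sh
# Sketch: list numbered IRQ lines that have more than one device registered.
# The function name is an illustrative assumption.
shared_irqs() {
    # $1: file to read; defaults to the live /proc/interrupts
    awk '
        /^ *[0-9]+:/ && /, / {
            irq = $1
            sub(/:$/, "", irq)
            # skip the per-CPU counters: the first non-numeric field
            # after them is the interrupt type column
            i = 2
            while (i <= NF && $i ~ /^[0-9]+$/) i++
            # everything after the type column is the device list
            devs = ""
            for (j = i + 1; j <= NF; j++)
                devs = devs (devs == "" ? "" : " ") $j
            print "IRQ " irq ": " devs
        }
    ' "${1:-/proc/interrupts}"
}
```

Running `shared_irqs` with no argument reads the live table. On the first listing posted in this thread it would flag IRQ 17 (hda_intel, eth2) and also IRQ 23, where two USB controllers share a line.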



Also, maybe you could try PCI Express NIC cards?
 
I have 3 PCI slots, with 2 NIC cards in 2 of them.
I have the problem on IRQ 17.
Here is the output of
cat /proc/interrupts

CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
0: 131 0 0 0 0 0 0 0 IO-APIC-edge timer
1: 8 0 0 0 0 0 0 0 IO-APIC-edge i8042
8: 1 0 0 0 0 0 0 0 IO-APIC-edge rtc0
9: 0 0 0 0 0 0 0 0 IO-APIC-fasteoi acpi
16: 816032 0 0 0 0 0 0 0 IO-APIC-fasteoi skge@pci:0000:06:00.0
17: 28243849 0 0 0 0 0 0 0 IO-APIC-fasteoi skge@pci:0000:06:01.0
19: 136 0 0 0 0 0 0 0 IO-APIC-fasteoi firewire_ohci
23: 60 0 0 0 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2
24: 28 0 0 0 0 0 0 0 HPET_MSI-edge hpet2
25: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet3
26: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet4
27: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet5
28: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet6
36: 720772 0 0 0 0 0 0 0 PCI-MSI-edge ahci
37: 32181360 0 0 0 0 0 0 0 PCI-MSI-edge eth0
38: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
39: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
40: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
41: 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
42: 0 0
 