Hi all,
This evening one of my Cisco UCS200 servers got isolated from the network because of a bug in the bnx2 module:
Sep 12 19:47:17 kvm02 kernel: ------------[ cut here ]------------
Sep 12 19:47:17 kvm02 kernel: WARNING: at net/sched/sch_generic.c:267 dev_watchdog+0xe2/0x194()
Sep 12 19:47:17 kvm02 kernel: Hardware name: R200-1120402W
Sep 12 19:47:17 kvm02 kernel: NETDEV WATCHDOG: eth0 (bnx2): transmit queue 3 timed out
Sep 12 19:47:17 kvm02 kernel: Modules linked in: tun crc32c nfs lockd fscache nfs_acl auth_rpcgss sunrpc kvm_intel kvm vzethdev vznetdev simfs vzrst vzcpt vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_tcpudp xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables x_tables vzevent ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp bonding snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core joydev pcspkr power_meter evdev ac ioatdma button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot usbhid hid usb_storage ata_piix ata_generic libata igb ehci_hcd uhci_hcd dca usbcore nls_base bnx2 mptsas mptscsih mptbase scsi_transport_sas thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Sep 12 19:47:17 kvm02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-4-pve #1
Sep 12 19:47:17 kvm02 kernel: Call Trace:
Sep 12 19:47:17 kvm02 kernel: <IRQ> [<ffffffff8127a242>] ? dev_watchdog+0xe2/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a242>] ? dev_watchdog+0xe2/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8104e21c>] ? warn_slowpath_common+0x77/0xa3
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a160>] ? dev_watchdog+0x0/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8104e2a4>] ? warn_slowpath_fmt+0x51/0x59
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81040ab0>] ? tg_shares_up+0x0/0x259
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81039e61>] ? tg_nop+0x0/0x3
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8103f367>] ? walk_tg_tree+0x5e/0x73
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a134>] ? netif_tx_lock+0x3d/0x69
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81264868>] ? netdev_drivername+0x3b/0x40
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8127a242>] ? dev_watchdog+0xe2/0x194
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8105b5b0>] ? run_timer_softirq+0x1ee/0x2c0
Sep 12 19:47:17 kvm02 kernel: [<ffffffff810419c5>] ? enqueue_task_fair+0x3e/0x82
Sep 12 19:47:17 kvm02 kernel: [<ffffffff810549b8>] ? __do_softirq+0x127/0x22f
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81011d6c>] ? call_softirq+0x1c/0x30
Sep 12 19:47:17 kvm02 kernel: [<ffffffff810132eb>] ? do_softirq+0x3f/0x7c
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8105473f>] ? irq_exit+0x78/0xb8
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81025140>] ? smp_apic_timer_interrupt+0x87/0x95
Sep 12 19:47:17 kvm02 kernel: [<ffffffff81011733>] ? apic_timer_interrupt+0x13/0x20
Sep 12 19:47:17 kvm02 kernel: <EOI> [<ffffffffa019d4f9>] ? acpi_idle_enter_bm+0x27d/0x2af [processor]
Sep 12 19:47:17 kvm02 kernel: [<ffffffffa019d4f2>] ? acpi_idle_enter_bm+0x276/0x2af [processor]
Sep 12 19:47:17 kvm02 kernel: [<ffffffff812508ae>] ? cpuidle_idle_call+0x94/0xee
Sep 12 19:47:17 kvm02 kernel: [<ffffffff8100ff09>] ? cpu_idle+0xa2/0xda
Sep 12 19:47:17 kvm02 kernel: ---[ end trace 6bdcf165456a3f26 ]---
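To see how often the timeout occurs and on which queue, the watchdog lines can be pulled out of the syslog. This is only a minimal sketch in Python, assuming the message format matches the "NETDEV WATCHDOG" line in the trace above; the function name find_tx_timeouts is made up for illustration and is not part of any tool.

```python
import re

# Matches kernel messages of the form seen in the trace above:
#   NETDEV WATCHDOG: eth0 (bnx2): transmit queue 3 timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(lines):
    """Return (interface, driver, queue) for every watchdog timeout line.

    `lines` is any iterable of log lines (hypothetical helper, not a
    kernel or Proxmox API).
    """
    hits = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            hits.append(
                (match.group("iface"), match.group("driver"), int(match.group("queue")))
            )
    return hits
```

It could be fed an open log file, for example find_tx_timeouts(open("/var/log/syslog")); the log path is an assumption and may differ on your system.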
kvm02:~# pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-33
pve-kernel-2.6.32-4-pve: 2.6.32-33
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.28-1pve1
vzdump: 1.2-14
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6
Looking around, I found a post about a similar problem on Red Hat which suggests adding this to /etc/modules.conf:
options bnx2 disable_msi=1,1,1,1
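If the option is tried, one way to confirm whether it took effect is to look at how the interface's interrupts are registered. The sketch below (Python) assumes that /proc/interrupts marks MSI/MSI-X vectors with "MSI" in the type column (for example PCI-MSI-edge) and lists the interface as eth0 or eth0-N; the function name msi_in_use is hypothetical.

```python
def msi_in_use(proc_interrupts_text, iface):
    """Map each IRQ number used by `iface` to True if it looks like MSI.

    Assumption: the interrupt type column contains "MSI" for MSI/MSI-X
    vectors and the device column lists names such as "eth0" or "eth0-1".
    """
    result = {}
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        # Skip the CPU header line and anything that is not "<irq>: ..."
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        rest = fields[1:]
        # Drop the per-CPU counters; what remains is type plus device names.
        while rest and rest[0].isdigit():
            rest = rest[1:]
        names = [token.rstrip(",") for token in rest]
        if any(n == iface or n.startswith(iface + "-") for n in names):
            result[fields[0].rstrip(":")] = any("MSI" in token for token in rest)
    return result
```

For example, msi_in_use(open("/proc/interrupts").read(), "eth0") run before and after reloading the module would show whether the vectors changed from MSI to legacy interrupts.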
Any ideas?