.32 kernel crashing on our hardware.

mvrhov

Active Member
Jan 29, 2011
I tried to upgrade our server from Proxmox 1.5 with the .24 kernel to 1.8 with the .32 kernel.
The Proxmox upgrade itself went well, but we are having problems with the .32 kernel (2.6.32-4-pve).
The first crash is in the vzwdog module:

Code:
Apr 23 00:30:00 master kernel: disk_io:  104       0 cciss/c0d0 1877 409 56530 8272 111 126 1896 192 0 6088 8456
Apr 23 00:30:00 master kernel: 104       1 cciss/c0d0p1 44 264 1434 188 1 0 8 4 0 124 192
Apr 23 00:30:00 master kernel: 104       2 cciss/c0d0p2 1808 85 54756 8032 110 126 1888 188 0 5956 8212
Apr 23 00:30:00 master kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
Apr 23 00:30:00 master kernel: IP: [<ffffffff811705ee>] disk_part_iter_next+0x59/0xb6
Apr 23 00:30:00 master kernel: PGD 0 
Apr 23 00:30:00 master kernel: Oops: 0000 [#1] SMP 
Apr 23 00:30:00 master kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/class
Apr 23 00:30:00 master kernel: CPU 12 
Apr 23 00:30:00 master kernel: Modules linked in: vzrst vzcpt vzwdog vzdquota vzmon vzdev xt_tcpudp ipt_ULOG nf_nat_ftp iptable_nat nf_nat xt_state xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_owner xt_multiport xt_mac xt_limit ipt_ecn xt_recent nf_conntrack_ftp nf_conntrack_irc nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ipt_LOG xt_DSCP xt_dscp ipt_REJECT ip_tables x_tables ipmi_devintf ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp bonding ipmi_si ipmi_msghandler usbhid container snd_pcm evdev hid snd_timer snd soundcore snd_page_alloc hpilo psmouse serio_raw power_meter pcspkr processor button ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot ehci_hcd uhci_hcd usbcore nls_base bnx2 cciss thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Apr 23 00:30:00 master kernel: Pid: 2639, comm: vzwdog Not tainted 2.6.32-4-pve #1 feoktistov ProLiant DL360 G6
Apr 23 00:30:00 master kernel: RIP: 0010:[<ffffffff811705ee>]  [<ffffffff811705ee>] disk_part_iter_next+0x59/0xb6
Apr 23 00:30:00 master kernel: RSP: 0018:ffff88031b4dfcc0  EFLAGS: 00010246
Apr 23 00:30:00 master kernel: RAX: 0000000000000008 RBX: ffff88031b4dfd60 RCX: 0000000000000001
Apr 23 00:30:00 master kernel: RDX: 0000000000000000 RSI: ffff88031c111bc0 RDI: ffff88031b4dfd60
Apr 23 00:30:00 master kernel: RBP: ffff88031b4dffd8 R08: 0000000000000002 R09: ffffffff81403814
Apr 23 00:30:00 master kernel: R10: 0000000000000004 R11: 0000000000000710 R12: ffffffffa02cd000
Apr 23 00:30:00 master kernel: R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff8150a5c0
Apr 23 00:30:00 master kernel: FS:  0000000000000000(0000) GS:ffff88000f580000(0000) knlGS:0000000000000000
Apr 23 00:30:00 master kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Apr 23 00:30:00 master kernel: CR2: 0000000000000010 CR3: 0000000001001000 CR4: 00000000000006e0
Apr 23 00:30:00 master kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 23 00:30:00 master kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 23 00:30:00 master kernel: Process vzwdog (pid: 2639, veid=0, threadinfo ffff88031b4de000, task ffff88031c7db000)
Apr 23 00:30:00 master kernel: Stack:
Apr 23 00:30:00 master kernel: ffff88031b4dfd60 ffff88031b4dffd8 ffffffffa02cd000 ffffffffa02cd428
Apr 23 00:30:00 master kernel: <0> 000000000000d5e4 0000000000001f60 000000000000006e 000000000000007e
Apr 23 00:30:00 master kernel: <0> 0000000000000760 ffffffff000000bc 0000000000000000 0000000000001744
Apr 23 00:30:00 master kernel: Call Trace:
Apr 23 00:30:00 master kernel: [<ffffffffa02cd000>] ? show_one_disk_io+0x0/0x46d [vzwdog]
Apr 23 00:30:00 master kernel: [<ffffffffa02cd428>] ? show_one_disk_io+0x428/0x46d [vzwdog]
Apr 23 00:30:00 master kernel: [<ffffffff8131386d>] ? printk+0x4e/0x59
Apr 23 00:30:00 master kernel: [<ffffffffa02cd000>] ? show_one_disk_io+0x0/0x46d [vzwdog]
Apr 23 00:30:00 master kernel: [<ffffffff812140d2>] ? class_for_each_device+0x7f/0xac
Apr 23 00:30:00 master kernel: [<ffffffffa02cd46d>] ? wdog_loop+0x0/0x36b [vzwdog]
Apr 23 00:30:00 master kernel: [<ffffffffa02cd64a>] ? wdog_loop+0x1dd/0x36b [vzwdog]
Apr 23 00:30:00 master kernel: [<ffffffffa02cd46d>] ? wdog_loop+0x0/0x36b [vzwdog]
Apr 23 00:30:00 master kernel: [<ffffffff81066742>] ? kthread+0xc0/0xca
Apr 23 00:30:00 master kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Apr 23 00:30:00 master kernel: [<ffffffff81066682>] ? kthread+0x0/0xca
Apr 23 00:30:00 master kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Apr 23 00:30:00 master kernel: Code: 2a 83 c9 ff a8 0c b8 00 00 00 00 41 89 cc 0f 44 c8 48 8d 72 20 49 b8 ff ff ff ff 08 00 00 00 48 bf 00 00 00 00 08 00 00 00 eb 4d <8b> 4a 10 41 bc 01 00 00 00 eb db 48 63 c2 48 8d 04 c6 48 8b 28 
Apr 23 00:30:00 master kernel: RIP  [<ffffffff811705ee>] disk_part_iter_next+0x59/0xb6
Apr 23 00:30:00 master kernel: RSP <ffff88031b4dfcc0>
Apr 23 00:30:00 master kernel: CR2: 0000000000000010
Apr 23 00:30:00 master kernel: ---[ end trace 377c218aff8daffa ]---
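A side note on reading this first trace, offered as a sketch rather than a diagnosis: the byte marked `<8b>` in the `Code:` line starts the faulting instruction, and `8b 4a 10` decodes as a 32-bit load from `[rdx+0x10]`. With RDX at zero in the register dump, that gives exactly the CR2 value reported, which is consistent with a NULL structure pointer being followed to a field at offset 0x10. The helper name `effective_address` below is made up for illustration.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def effective_address(base: int, displacement: int) -> int:
    """Address touched by an x86-64 [base + displacement] operand (hypothetical helper)."""
    return (base + displacement) & MASK_64


# Values copied from the register dump in the trace above.
rdx = 0x0000000000000000
cr2 = 0x0000000000000010

# `8b 4a 10` is `mov ecx, [rdx+0x10]`, so the displacement is 0x10.
assert effective_address(rdx, 0x10) == cr2
```

This only confirms that the registers and the fault address agree with each other; it does not say why the pointer was NULL.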
The other one is:
Code:
Apr 23 00:31:06 master kernel: ------------[ cut here ]------------
Apr 23 00:31:06 master kernel: WARNING: at mm/page_alloc.c:1828 __alloc_pages_nodemask+0x183/0x6a8()
Apr 23 00:31:06 master kernel: Hardware name: ProLiant DL360 G6
Apr 23 00:31:06 master kernel: Modules linked in: vzethdev vznetdev simfs vzrst vzcpt vzwdog vzdquota vzmon vzdev xt_tcpudp ipt_ULOG nf_nat_ftp iptable_nat nf_nat xt_state xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_owner xt_multiport xt_mac xt_limit ipt_ecn xt_recent nf_conntrack_ftp nf_conntrack_irc nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ipt_LOG xt_DSCP xt_dscp ipt_REJECT ip_tables x_tables ipmi_devintf ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp bonding ipmi_si ipmi_msghandler usbhid container snd_pcm evdev hid snd_timer snd soundcore snd_page_alloc hpilo psmouse serio_raw power_meter pcspkr processor button ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot ehci_hcd uhci_hcd usbcore nls_base bnx2 cciss thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Apr 23 00:31:06 master kernel: Pid: 13308, comm: mountall Tainted: G      D    2.6.32-4-pve #1
Apr 23 00:31:06 master kernel: Call Trace:
Apr 23 00:31:06 master kernel: [<ffffffff810bd037>] ? __alloc_pages_nodemask+0x183/0x6a8
Apr 23 00:31:06 master kernel: [<ffffffff810bd037>] ? __alloc_pages_nodemask+0x183/0x6a8
Apr 23 00:31:06 master kernel: [<ffffffff8104e21c>] ? warn_slowpath_common+0x77/0xa3
Apr 23 00:31:06 master kernel: [<ffffffff810bd037>] ? __alloc_pages_nodemask+0x183/0x6a8
Apr 23 00:31:06 master kernel: [<ffffffff810e99a7>] ? new_slab+0x104/0x236
Apr 23 00:31:06 master kernel: [<ffffffff81100735>] ? __d_path+0x116/0x1e0
Apr 23 00:31:06 master kernel: [<ffffffff810bc4e1>] ? __get_free_pages+0x9/0x46
Apr 23 00:31:06 master kernel: [<ffffffff810e9435>] ? __kmalloc+0x3f/0x17f
Apr 23 00:31:06 master kernel: [<ffffffff81109324>] ? seq_read+0x226/0x388
Apr 23 00:31:06 master kernel: [<ffffffff810f2122>] ? vfs_read+0xa6/0xff
Apr 23 00:31:06 master kernel: [<ffffffff810f2295>] ? sys_read+0x49/0xc4
Apr 23 00:31:06 master kernel: [<ffffffff81037623>] ? ia32_sysret+0x0/0x5
Apr 23 00:31:06 master kernel: ---[ end trace 377c218aff8daffb ]---
And I have no idea where it's coming from.

Regards,
Miha
 
This is the second part, as the forum rejected the full post for being too long.

This is the one I got after I decided it was best to restart and boot back into the .24 kernel:
Code:
Apr 23 00:42:10 master kernel: BUG: unable to handle kernel paging request at 00000000dead111c
Apr 23 00:42:10 master kernel: IP: [<ffffffff81074aa6>] ub_task_put+0x15/0xf4
Apr 23 00:42:10 master kernel: PGD 61ba2e067 PUD 0 
Apr 23 00:42:10 master kernel: Oops: 0000 [#2] SMP 
Apr 23 00:42:10 master kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/class
Apr 23 00:42:10 master kernel: CPU 3 
Apr 23 00:42:10 master kernel: Modules linked in: kvm_intel kvm simfs vzwdog(-) vzdquota vzmon vzdev xt_tcpudp ipt_ULOG nf_nat_ftp iptable_nat nf_nat xt_state xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_owner xt_multiport xt_mac xt_limit ipt_ecn xt_recent nf_conntrack_ftp nf_conntrack_irc nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ipt_LOG xt_DSCP xt_dscp ipt_REJECT ip_tables x_tables ipmi_devintf ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp bonding ipmi_si ipmi_msghandler usbhid container snd_pcm evdev hid snd_timer snd soundcore snd_page_alloc hpilo psmouse serio_raw power_meter pcspkr processor button ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot ehci_hcd uhci_hcd usbcore nls_base bnx2 cciss thermal fan thermal_sys [last unloaded: vzrst]
Apr 23 00:42:10 master kernel: Pid: 22096, comm: modprobe Tainted: G      D W  2.6.32-4-pve #1 feoktistov ProLiant DL360 G6
Apr 23 00:42:10 master kernel: RIP: 0010:[<ffffffff81074aa6>]  [<ffffffff81074aa6>] ub_task_put+0x15/0xf4
Apr 23 00:42:10 master kernel: RSP: 0018:ffff8805f6933e98  EFLAGS: 00010287
Apr 23 00:42:10 master kernel: RAX: 00000000dead100c RBX: ffff88031c7db000 RCX: 0000000000000200
Apr 23 00:42:10 master kernel: RDX: 0000000000000000 RSI: 0000000000000200 RDI: ffff88031c7db000
Apr 23 00:42:10 master kernel: RBP: ffff88031c7db000 R08: 0000000000000000 R09: ffffffff8150a600
Apr 23 00:42:10 master kernel: R10: 0000000100000000 R11: ffffffff813e1c43 R12: 0000000000000000
Apr 23 00:42:10 master kernel: R13: 00000000dead100c R14: 0000000000000000 R15: 0000000000000000
Apr 23 00:42:10 master kernel: FS:  00007f7adc23c6e0(0000) GS:ffff88032dc40000(0000) knlGS:0000000000000000
Apr 23 00:42:10 master kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 23 00:42:10 master kernel: CR2: 00000000dead111c CR3: 000000061690a000 CR4: 00000000000026e0
Apr 23 00:42:10 master kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 23 00:42:10 master kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 23 00:42:10 master kernel: Process modprobe (pid: 22096, veid=0, threadinfo ffff8805f6932000, task ffff8805cac63000)
Apr 23 00:42:10 master kernel: Stack:
Apr 23 00:42:10 master kernel: ffff88031c7db000 ffff88031c7db000 0000000000000000 00007fffa584d650
Apr 23 00:42:10 master kernel: <0> 0000000000000000 ffffffff8104dbe8 ffff88031c7db010 ffffffff810667c2
Apr 23 00:42:10 master kernel: <0> 00000000fffffff5 ffffffffa02cda10 0000000000000080 ffffffffa02cd7e8
Apr 23 00:42:10 master kernel: Call Trace:
Apr 23 00:42:10 master kernel: [<ffffffff8104dbe8>] ? __put_task_struct+0x5d/0xc5
Apr 23 00:42:10 master kernel: [<ffffffff810667c2>] ? kthread_stop+0x76/0xa2
Apr 23 00:42:10 master kernel: [<ffffffffa02cd7e8>] ? wdog_exit+0x10/0x1f [vzwdog]
Apr 23 00:42:10 master kernel: [<ffffffff81081804>] ? sys_delete_module+0x1d2/0x258
Apr 23 00:42:10 master kernel: [<ffffffff810d4f6c>] ? sys_munmap+0x4d/0x59
Apr 23 00:42:10 master kernel: [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
Apr 23 00:42:10 master kernel: Code: a7 90 00 00 00 48 83 c4 28 89 d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 41 56 41 55 41 54 55 53 48 8b 87 10 07 00 00 48 89 fb 49 89 c5 <48> 8b 80 10 01 00 00 48 85 c0 75 f1 4d 8d 65 50 4c 89 e7 e8 91
Apr 23 00:42:10 master kernel: RIP  [<ffffffff81074aa6>] ub_task_put+0x15/0xf4
Apr 23 00:42:10 master kernel: RSP <ffff8805f6933e98>
Apr 23 00:42:10 master kernel: CR2: 00000000dead111c
Apr 23 00:42:10 master kernel: ---[ end trace 377c218aff8daffc ]---
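For anyone comparing these traces side by side, here is a minimal sketch of a script that reduces each `BUG:` report to its fault address, faulting function, and process name. It is not a Proxmox or kernel tool. The function name `summarize_oopses` and the regular expressions are my own assumptions, matched only against the 2.6.32 log format shown in this thread; other kernel versions print oopses differently.

```python
import re

# Patterns for the lines that summarize an oops in the logs above.
BUG_RE = re.compile(r"kernel: BUG: (?P<msg>.+?) at (?P<addr>[0-9a-f]+)\s*$")
IP_RE = re.compile(r"kernel: IP: \[<[0-9a-f]+>\] (?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
PID_RE = re.compile(r"kernel: Pid: (?P<pid>\d+), comm: (?P<comm>\S+)")
END_RE = re.compile(r"kernel: ---\[ end trace ")


def summarize_oopses(lines):
    """Return one dict per 'BUG:' report found in syslog-style kernel lines."""
    reports = []
    current = None  # report being filled in, or None between reports
    for line in lines:
        m = BUG_RE.search(line)
        if m:
            current = {
                "message": m.group("msg"),
                "address": int(m.group("addr"), 16),
                "symbol": None,
                "comm": None,
            }
            reports.append(current)
            continue
        if current is None:
            continue
        if END_RE.search(line):
            current = None  # stop attributing later lines to this report
            continue
        m = IP_RE.search(line)
        if m and current["symbol"] is None:
            current["symbol"] = m.group("sym")
            continue
        m = PID_RE.search(line)
        if m and current["comm"] is None:
            current["comm"] = m.group("comm")
    return reports


if __name__ == "__main__":
    import sys

    # Usage: python summarize_oopses.py < kern.log
    for report in summarize_oopses(sys.stdin):
        print(f"{report['comm']}: {report['message']} "
              f"at {report['address']:#x} in {report['symbol']}")
```

The `WARNING:` block is deliberately skipped, since it has no `BUG:` line; only the two oopses are reported.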
Output of pveversion, run on the .24 kernel:
Code:
pveversion -V
pve-manager: 1.8-15 (pve-manager/1.8/5754)
running kernel: 2.6.24-8-pve
proxmox-ve-2.6.32: 1.8-32
pve-kernel-2.6.32-4-pve: 2.6.32-32
pve-kernel-2.6.24-8-pve: 2.6.24-16
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-11
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.0-3
ksm-control-daemon: 1.0-5

Regards,
Miha
 
Try the 2.6.18 kernel. Does that work?
 
We had been running .24 for the past 18 months, and with .32 crashing I went back to it; .24 is working just fine with 1.8.
We also have a development environment built from standard PC components (Phenom II X6), and there Proxmox 1.8 with the .32 kernel works without a hitch.
But somehow .32 doesn't like the HP ProLiant DL360 G6.

Regards,
Miha
 
