Kernel error

zzhjkrqlne

Hi,

Has anyone come across the following kernel error on Proxmox 1.5? The machine was sitting almost idle with only 2 KVM guests running, on Proxmox with DRBD storage.

Code:
Jul 18 19:14:43 test kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
Jul 18 19:14:43 test kernel: IP: [<ffffffffa028e0da>] gfn_to_rmap+0x2a/0x80 [kvm]
Jul 18 19:14:43 test kernel: PGD 15a0c4067 PUD 1c194b067 PMD 0
Jul 18 19:14:43 test kernel: Oops: 0000 [#1] SMP
Jul 18 19:14:43 test kernel: last sysfs file: /sys/kernel/uevent_seqnum
Jul 18 19:14:43 test kernel: CPU 0
Jul 18 19:14:43 test kernel: Modules linked in: ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_mangle nf_conntrack_ftp ipt_REJECT ipt_LOG xt_limit xt_multiport xt_state nf_conntrack iptable_filter ip_tables x_tables kvm_intel kvm sha1_generic drbd bridge stp snd_hda_codec_analog video output psmouse i2c_i801 serio_raw pcspkr snd_hda_intel snd_hda_codec snd_hwdep intel_agp snd_pcm snd_timer snd soundcore snd_page_alloc firewire_ohci firewire_core crc_itu_t ohci1394 ieee1394 ahci e1000e [last unloaded: scsi_transport_iscsi]
Jul 18 19:14:43 test kernel: Pid: 5141, comm: kvm Not tainted 2.6.32-2-pve #1
Jul 18 19:14:43 test kernel: RIP: 0010:[<ffffffffa028e0da>]  [<ffffffffa028e0da>] gfn_to_rmap+0x2a/0x80 [kvm]
Jul 18 19:14:43 test kernel: RSP: 0018:ffff8801317af9c8  EFLAGS: 00010246
Jul 18 19:14:43 test kernel: RAX: 0000000000000000 RBX: 000000000140f520 RCX: 0000000000000022
Jul 18 19:14:43 test kernel: RDX: 00000000000fee01 RSI: ffff880213360990 RDI: 0000000000000000
Jul 18 19:14:43 test kernel: RBP: ffff8801317af9d8 R08: 0000000000000022 R09: 0000000000000000
Jul 18 19:14:43 test kernel: R10: 0000000000110b3f R11: 0000000000000000 R12: 0000000000000001
Jul 18 19:14:43 test kernel: R13: ffff8801c2a6a210 R14: ffff880213360000 R15: ffff8801317afa58
Jul 18 19:14:43 test kernel: FS:  0000000040fdb950(0000) GS:ffff880009e00000(0000) knlGS:0000000000000000
Jul 18 19:14:43 test kernel: CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
Jul 18 19:14:43 test kernel: CR2: 0000000000000000 CR3: 000000021fedd000 CR4: 00000000000426e0
Jul 18 19:14:43 test kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 18 19:14:43 test kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 18 19:14:43 test kernel: Process kvm (pid: 5141, threadinfo ffff8801317ae000, task ffff88021ffb8000)
Jul 18 19:14:43 test kernel: Stack:
Jul 18 19:14:43 test kernel: 000000ffffbfffff ffff88019795f4e8 ffff8801317afa08 ffffffffa028e1d0
Jul 18 19:14:43 test kernel: <0> 00000000000004e8 ffff8801c2a6a210 000000000140f520 ffff880213368000
Jul 18 19:14:43 test kernel: <0> ffff8801317afa88 ffffffffa02912bf ffff880200000001 0000000000110b3f
Jul 18 19:14:43 test kernel: Call Trace:
Jul 18 19:14:43 test kernel: [<ffffffffa028e1d0>] rmap_remove+0xa0/0x210 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa02912bf>] paging64_sync_page+0xaf/0x1b0 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa028e8de>] ? rmap_write_protect+0xee/0x180 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa028f469>] kvm_sync_page+0x99/0x110 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa0291ba6>] mmu_sync_children+0x246/0x3a0 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa028f3be>] ? paging_new_cr3+0xe/0x10 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa0291db2>] mmu_sync_roots+0xb2/0xc0 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa0293c4d>] kvm_mmu_load+0x15d/0x290 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa02d4805>] ? vmx_handle_exit+0x145/0x260 [kvm_intel]
Jul 18 19:14:43 test kernel: [<ffffffffa0289c0d>] kvm_arch_vcpu_ioctl_run+0x67d/0xd50 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffffa0276332>] kvm_vcpu_ioctl+0x2f2/0x640 [kvm]
Jul 18 19:14:43 test kernel: [<ffffffff81154486>] vfs_ioctl+0x36/0xb0
Jul 18 19:14:43 test kernel: [<ffffffff8115462a>] do_vfs_ioctl+0x8a/0x5b0
Jul 18 19:14:43 test kernel: [<ffffffff810999f9>] ? sys_futex+0x89/0x160
Jul 18 19:14:43 test kernel: [<ffffffff81154bf1>] sys_ioctl+0xa1/0xb0
Jul 18 19:14:43 test kernel: [<ffffffff81084261>] ? posix_ktime_get_ts+0x11/0x20
Jul 18 19:14:43 test kernel: [<ffffffff810131f2>] system_call_fastpath+0x16/0x1b
Jul 18 19:14:43 test kernel: Code: 00 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 0f 1f 44 00 00 41 89 d4 48 89 f3 e8 4f 99 fe ff 41 83 fc 01 48 89 c7 75 1d <48> 2b 18 48 8d 34 dd 00 00 00 00 48 03 70 18 48 89 f0 48 8b 1c
Jul 18 19:14:43 test kernel: RIP  [<ffffffffa028e0da>] gfn_to_rmap+0x2a/0x80 [kvm]
Jul 18 19:14:43 test kernel: RSP <ffff8801317af9c8>
Jul 18 19:14:43 test kernel: CR2: 0000000000000000
Jul 18 19:14:43 test kernel: ---[ end trace 3f551af8855f21f7 ]---
Jul 18 19:15:48 test kernel: BUG: soft lockup - CPU#1 stuck for 61s! [kvm:5142]
Jul 18 19:15:48 test kernel: Modules linked in: ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_mangle nf_conntrack_ftp ipt_REJECT ipt_LOG xt_limit xt_multiport xt_state nf_conntrack iptable_filter ip_tables x_tables kvm_intel kvm sha1_generic drbd bridge stp snd_hda_codec_analog video output psmouse i2c_i801 serio_raw pcspkr snd_hda_intel snd_hda_codec snd_hwdep intel_agp snd_pcm snd_timer snd soundcore snd_page_alloc firewire_ohci firewire_core crc_itu_t ohci1394 ieee1394 ahci e1000e [last unloaded: scsi_transport_iscsi]
Jul 18 19:15:48 test kernel: CPU 1:
Jul 18 19:15:48 test kernel: Modules linked in: ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_mangle nf_conntrack_ftp ipt_REJECT ipt_LOG xt_limit xt_multiport xt_state nf_conntrack iptable_filter ip_tables x_tables kvm_intel kvm sha1_generic drbd bridge stp snd_hda_codec_analog video output psmouse i2c_i801 serio_raw pcspkr snd_hda_intel snd_hda_codec snd_hwdep intel_agp snd_pcm snd_timer snd soundcore snd_page_alloc firewire_ohci firewire_core crc_itu_t ohci1394 ieee1394 ahci e1000e [last unloaded: scsi_transport_iscsi]
Jul 18 19:15:48 test kernel: Pid: 5142, comm: kvm Tainted: G      D    2.6.32-2-pve #1
Jul 18 19:15:48 test kernel: RIP: 0010:[<ffffffff810397b2>]  [<ffffffff810397b2>] __ticket_spin_lock+0x12/0x20
Jul 18 19:15:48 test kernel: RSP: 0018:ffff8801e1e8db98  EFLAGS: 00000297
Jul 18 19:15:48 test kernel: RAX: 0000000000001514 RBX: ffff8801e1e8db98 RCX: fffffffffffff000
Jul 18 19:15:48 test kernel: RDX: 00000000fed000f0 RSI: 0000000000000000 RDI: ffff880213360000
Jul 18 19:15:48 test kernel: RBP: ffffffff81013cee R08: ffff880213360078 R09: 0000000000000022
Jul 18 19:15:48 test kernel: R10: ffff88021336a860 R11: ffffffff80221610 R12: 0000000000000203
Jul 18 19:15:48 test kernel: R13: 0000000000000204 R14: 0000000000000205 R15: ffff880100000001
Jul 18 19:15:48 test kernel: FS:  00000000419dc950(0000) GS:ffff880009e80000(0000) knlGS:0000000000000000
Jul 18 19:15:48 test kernel: CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
Jul 18 19:15:48 test kernel: CR2: 00007faea8ce6f20 CR3: 000000021fedd000 CR4: 00000000000426e0
Jul 18 19:15:48 test kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 18 19:15:48 test kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 18 19:15:48 test kernel: Call Trace:
Jul 18 19:15:48 test kernel: [<ffffffff815701be>] ? _spin_lock+0xe/0x20
Jul 18 19:15:48 test kernel: [<ffffffffa0291896>] ? kvm_mmu_unprotect_page_virt+0x66/0xf0 [kvm]
Jul 18 19:15:48 test kernel: [<ffffffffa0287210>] ? emulate_instruction+0x270/0x3a0 [kvm]
Jul 18 19:15:48 test kernel: [<ffffffffa02917e5>] ? kvm_mmu_page_fault+0x65/0xb0 [kvm]
Jul 18 19:15:48 test kernel: [<ffffffffa02d5067>] ? handle_exception+0x357/0x3b0 [kvm_intel]
Jul 18 19:15:48 test kernel: [<ffffffffa02d4805>] ? vmx_handle_exit+0x145/0x260 [kvm_intel]
Jul 18 19:15:48 test kernel: [<ffffffffa0289fbc>] ? kvm_arch_vcpu_ioctl_run+0xa2c/0xd50 [kvm]
Jul 18 19:15:48 test kernel: [<ffffffffa0276332>] ? kvm_vcpu_ioctl+0x2f2/0x640 [kvm]
Jul 18 19:15:48 test kernel: [<ffffffff81154486>] ? vfs_ioctl+0x36/0xb0
Jul 18 19:15:48 test kernel: [<ffffffff8115462a>] ? do_vfs_ioctl+0x8a/0x5b0
Jul 18 19:15:48 test kernel: [<ffffffff810999f9>] ? sys_futex+0x89/0x160
Jul 18 19:15:48 test kernel: [<ffffffff81154bf1>] ? sys_ioctl+0xa1/0xb0
Jul 18 19:15:48 test kernel: [<ffffffff81084261>] ? posix_ktime_get_ts+0x11/0x20
Jul 18 19:15:48 test kernel: [<ffffffff810131f2>] ? system_call_fastpath+0x16/0x1b

Code:
root@test:~# pveversion -v
pve-manager: 1.5-10 (pve-manager/1.5/4822)
running kernel: 2.6.32-2-pve
proxmox-ve-2.6.32: 1.5-7
pve-kernel-2.6.32-2-pve: 2.6.32-7
pve-kernel-2.6.24-11-pve: 2.6.24-23
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-16
pve-firmware: 1.0-5
libpve-storage-perl: 1.0-13
vncterm: 0.9-2
vzctl: 3.0.23-1pve11
vzdump: 1.2-5
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.4-1
ksm-control-daemon: 1.0-3

The machine became inaccessible once this happened and required a reboot to get it going again. Any ideas?
 
Is it possible to reproduce/trigger that bug somehow?

The machine had been left pretty much untouched for almost a month with those 2 KVM guests running before it hit this issue. Since nothing in particular was running in them beyond a plain Proxmox install, I can't tell what triggered it, so I doubt I can reproduce it manually.

I'm leaning towards a hardware issue for now. Just wondering if anyone can either confirm that or tell me otherwise.
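For anyone hitting the same trace and suspecting hardware, a couple of quick userspace checks can be run without taking the host down. This is only a sketch: memtester is a separate Debian package, and the size/pass counts below are examples, not recommendations.

```shell
# Look for machine-check events the kernel may already have logged;
# prints a placeholder line if none are found (or dmesg is restricted)
dmesg 2>/dev/null | grep -iE 'machine check|hardware error' \
    || echo "no machine-check lines in dmesg"

# Exercise RAM from userspace while the host is otherwise idle
# (assumption: memtester installed via 'apt-get install memtester');
# any reported failure points at bad RAM. Commented out here since
# it runs for a long time:
# memtester 1024M 3
```

A full pass of memtest86+ from the boot menu is still the more thorough check, but the above at least gives a first indication without downtime.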
 
