Hello,
While running a backup in suspend mode, the resume step
at the end of the process failed and left the OpenVZ
VM hung. It was impossible to stop the VM with vzctl, and
I had to reboot the host node to regain control of it.
I noticed many processes in state "D" while the VM was hung.
Other VMs on the same host node showed no
problems after this Oops.
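For anyone who wants to check for the same symptom, here is a minimal sketch of how processes in state "D" (uninterruptible sleep) could be listed by reading /proc/<pid>/stat. It assumes the standard Linux procfs layout; the function names `parse_stat` and `list_d_state` are illustrative only and are not part of vzctl or any OpenVZ tool.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = stat_text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state


def list_d_state(proc_root="/proc"):
    """Collect (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable mid-scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in list_d_state():
        print(pid, comm)
```

Run on the host node, this would list processes from all containers, since the host sees every container's processes.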
Oct 26 03:38:08 ns300364 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000808 RIP:
Oct 26 03:38:08 ns300364 kernel: [<ffffffff804c9708>] _spin_lock_irqsave+0x28/0xb0
Oct 26 03:38:08 ns300364 kernel: PGD 114c1b067 PUD 102886067 PMD 0
Oct 26 03:38:08 ns300364 kernel: Oops: 0002 [1] PREEMPT SMP
Oct 26 03:38:08 ns300364 kernel: CPU: 1
Oct 26 03:38:08 ns300364 kernel: Modules linked in: kvm_intel kvm vzethdev vznetdev simfs vzrst vzcpt tun vzdquota vzmon vzdev xt_length $
Oct 26 03:38:08 ns300364 kernel: Pid: 7467, comm: vzctl Not tainted 2.6.24-7-pve #1 ovz005
Oct 26 03:38:08 ns300364 kernel: RIP: 0010:[<ffffffff804c9708>] [<ffffffff804c9708>] _spin_lock_irqsave+0x28/0xb0
Oct 26 03:38:08 ns300364 kernel: RSP: 0000:ffff8101091cfdf8 EFLAGS: 00010002
Oct 26 03:38:08 ns300364 kernel: RAX: 0000000000000000 RBX: 0000000000000808 RCX: 00000000c0000100
Oct 26 03:38:08 ns300364 kernel: RDX: 0000000000000202 RSI: ffff81009bd2c8e0 RDI: 0000000000000808
Oct 26 03:38:08 ns300364 kernel: RBP: 0000000000000000 R08: ffff8101091ce000 R09: 0000000000000000
Oct 26 03:38:08 ns300364 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff81010bd44380
Oct 26 03:38:08 ns300364 kernel: R13: ffff81010bd50800 R14: ffff81010bd50998 R15: 0000000000001000
Oct 26 03:38:08 ns300364 kernel: FS: 00007f3bf90e26e0(0000) GS:ffff810117402880(0000) knlGS:0000000000000000
Oct 26 03:38:08 ns300364 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 26 03:38:08 ns300364 kernel: CR2: 0000000000000808 CR3: 000000010bd87000 CR4: 00000000000026e0
Oct 26 03:38:08 ns300364 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 26 03:38:08 ns300364 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 26 03:38:08 ns300364 kernel: Process vzctl (pid: 7467, veid=0, threadinfo ffff8101091ce000, task ffff81009880c8e0)
Oct 26 03:38:08 ns300364 kernel: Stack: ffff81009bd2c8e0 ffff8100b8c3d1c0 ffff81010bd50820 ffffffff884df796
Oct 26 03:38:08 ns300364 kernel: 0000000000000000 0000000000002d08 ffff81010bd50820 ffff81010c18ff00
Oct 26 03:38:08 ns300364 kernel: 0000000000000000 ffff81010bd50800 0000000000000000 ffffffff884dbfed
Oct 26 03:38:08 ns300364 kernel: Call Trace:
Oct 26 03:38:08 ns300364 kernel: [<ffffffff884df796>] :vzcpt:cpt_resume+0xe6/0x210
Oct 26 03:38:08 ns300364 kernel: [<ffffffff884dbfed>] :vzcpt:cpt_ioctl+0xa3d/0xeb0
Oct 26 03:38:08 ns300364 kernel: [<ffffffff804cba66>] do_page_fault+0x176/0x890
Oct 26 03:38:08 ns300364 kernel: [<ffffffff884db5b0>] :vzcpt:cpt_ioctl+0x0/0xeb0
Oct 26 03:38:08 ns300364 kernel: [<ffffffff8031c43e>] proc_reg_unlocked_ioctl+0xee/0x110
Oct 26 03:38:08 ns300364 kernel: [<ffffffff802e18af>] do_ioctl+0x2f/0xb0
Oct 26 03:38:08 ns300364 kernel: [<ffffffff802e1bbb>] vfs_ioctl+0x28b/0x300
Oct 26 03:38:08 ns300364 kernel: [<ffffffff802d2a3c>] vfs_write+0x12c/0x190
Oct 26 03:38:08 ns300364 kernel: [<ffffffff802e1c79>] sys_ioctl+0x49/0x80
Oct 26 03:38:08 ns300364 kernel: [<ffffffff8020c69e>] system_call+0x7e/0x83
Oct 26 03:38:08 ns300364 kernel:
Oct 26 03:38:08 ns300364 kernel:
Oct 26 03:38:08 ns300364 kernel: Code: 87 03 85 c0 7e 19 c7 43 04 00 00 00 00 48 89 d0 48 8b 5c 24
Oct 26 03:38:08 ns300364 kernel: RIP [<ffffffff804c9708>] _spin_lock_irqsave+0x28/0xb0
Oct 26 03:38:08 ns300364 kernel: RSP <ffff8101091cfdf8>
Oct 26 03:38:08 ns300364 kernel: CR2: 0000000000000808
Oct 26 03:38:08 ns300364 kernel: ---[ end trace 61c9cdf94503b360 ]---
Oct 26 03:38:08 ns300364 kernel: note: vzctl[7467] exited with preempt_count 1
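As a reading aid for the trace above, here is a small sketch that decodes the x86 page-fault error code printed on the "Oops: 0002" line. The bit meanings are the standard x86 ones; the function name `decode_page_fault_error` is just an illustration.

```python
def decode_page_fault_error(code):
    """Split an x86 page-fault error code into its documented flag bits."""
    return {
        "page_present": bool(code & 0x01),       # 0 = page not present
        "write": bool(code & 0x02),              # 1 = write access, 0 = read
        "user_mode": bool(code & 0x04),          # 1 = fault raised in user mode
        "reserved_bit": bool(code & 0x08),       # 1 = reserved bit set in page table
        "instruction_fetch": bool(code & 0x10),  # 1 = fault on instruction fetch
    }


# The Oops line prints the code in hexadecimal.
info = decode_page_fault_error(int("0002", 16))
for name, value in info.items():
    print(name, value)
```

For 0002 this gives a write to a not-present page in kernel mode. Together with CR2, RDI and RBX all holding 0x808, that looks consistent with _spin_lock_irqsave being handed a lock at offset 0x808 of a NULL structure pointer from cpt_resume, but this is only my reading of the trace and not a confirmed diagnosis.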
root@ns300364:~# pveversion
pve-manager/1.3/4023
root@ns300364:~# pveversion -v
pve-manager: 1.3-1 (pve-manager/1.3/4023)
qemu-server: 1.0-14
pve-kernel: 2.6.24-11
pve-kvm: 86-3
pve-firmware: not correctly installed
vncterm: 0.9-2
vzctl: 3.0.23-1pve3
vzdump: 1.1-2
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1dso1
root@ns300364:~# uname -a
Linux ns300364.ovh.net 2.6.24-7-pve #1 SMP PREEMPT Fri Aug 21 09:07:39 CEST 2009 x86_64 GNU/Linux
Any ideas about what could be the cause?
Thanks in advance
Phil Ten