KVM box needs reset

hk@

Hi
on the host we have:
pveversion -v
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.6-25
pve-kernel-2.6.32-4-pve: 2.6.32-25
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-22
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-14
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-8
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.5-2
ksm-control-daemon: 1.0-4

The KVM instance needs a reset, and it seems we hit a kernel bug on the host, because inside the KVM instance we get this:
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] BUG: soft lockup - CPU#0
stuck for 4096s! [swapper:0]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Modules linked in: ipv6
nfs lockd nfs_acl sunrpc ipt_LOG xt_limit nf_conntrack_ipv4 xt_state
nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm
snd_timer snd soundcore snd_page_alloc pcspkr serio_raw psmouse button
i2c_piix4 i2c_core joydev evdev ext3 jbd mbcache ide_cd_mod cdrom
ata_generic libata scsi_mod dock usbhid hid ff_memless virtio_blk piix
e1000 floppy virtio_pci virtio_ring virtio ide_pci_generic ide_core
uhci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] CPU 0:
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Modules linked in: ipv6
nfs lockd nfs_acl sunrpc ipt_LOG xt_limit nf_conntrack_ipv4 xt_state
nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm
snd_timer snd soundcore snd_page_alloc pcspkr serio_raw psmouse button
i2c_piix4 i2c_core joydev evdev ext3 jbd mbcache ide_cd_mod cdrom
ata_generic libata scsi_mod dock usbhid hid ff_memless virtio_blk piix
e1000 floppy virtio_pci virtio_ring virtio ide_pci_generic ide_core
uhci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Pid: 0, comm: swapper Not
tainted 2.6.26-2-amd64 #1
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RIP:
0010:[<ffffffff8021eb64>] [<ffffffff8021eb64>] native_safe_halt+0x2/0x3
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RSP: 0000:ffffffff80575f38
EFLAGS: 00000246
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RAX: ffffffff80575fd8 RBX:
0000000000000000 RCX: 0000000000000000
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RDX: 0000000000000000 RSI:
0000000000000001 RDI: ffffffff804fce70
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RBP: 0000000012e0d0a8 R08:
ffffffff8021eb64 R09: ffff81011d0624c0
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] R10: ffff810100f7d938 R11:
ffff81011d0624c0 R12: ffffffff80575ed8
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] R13: 0000000000000000 R14:
ffffffff8023ce76 R15: 00035f9663eb9df5
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] FS:
0000000000000000(0000) GS:ffffffff8053d000(0000) knlGS:0000000000000000
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] CS: 0010 DS: 0018 ES:
0018 CR0: 000000008005003b
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] CR2: 00000000e78eb730 CR3:
000000011086c000 CR4: 00000000000006e0
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Nov 18 22:16:23 fin01tn kernel: [1082291.622189]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Call Trace:
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] [<ffffffff8020b0d8>] ?
default_idle+0x2a/0x46
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] [<ffffffff8020ad04>] ?
cpu_idle+0x8e/0xb8
Nov 18 22:16:23 fin01tn kernel: [1082291.622189]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] BUG: soft lockup - CPU#1
stuck for 4096s! [swapper:0]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Modules linked in: ipv6
nfs lockd nfs_acl sunrpc ipt_LOG xt_limit nf_conntrack_ipv4 xt_state
nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm
snd_timer snd soundcore snd_page_alloc pcspkr serio_raw psmouse button
i2c_piix4 i2c_core joydev evdev ext3 jbd mbcache ide_cd_mod cdrom
ata_generic libata scsi_mod dock usbhid hid ff_memless virtio_blk piix
e1000 floppy virtio_pci virtio_ring virtio ide_pci_generic ide_core
uhci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] CPU 1:
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Modules linked in: ipv6
nfs lockd nfs_acl sunrpc ipt_LOG xt_limit nf_conntrack_ipv4 xt_state
nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm
snd_timer snd soundcore snd_page_alloc pcspkr serio_raw psmouse button
i2c_piix4 i2c_core joydev evdev ext3 jbd mbcache ide_cd_mod cdrom
ata_generic libata scsi_mod dock usbhid hid ff_memless virtio_blk piix
e1000 floppy virtio_pci virtio_ring virtio ide_pci_generic ide_core
uhci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Pid: 0, comm: swapper Not
tainted 2.6.26-2-amd64 #1
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RIP:
0010:[<ffffffff8021eb64>] [<ffffffff8021eb64>] native_safe_halt+0x2/0x3
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RSP: 0000:ffff81011faa5f38
EFLAGS: 00000246
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RAX: ffff81011faa5fd8 RBX:
0000000000000000 RCX: 0000000000000000
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RDX: 0000000000000000 RSI:
0000000000000001 RDI: ffffffff804fce70
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] RBP: 0000000012e0d0aa R08:
ffffffff8021eb64 R09: ffff81010417b260
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] R10: ffff810100f7dc48 R11:
ffff81011d0c8790 R12: ffff81011faa5ed8
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] R13: 0000000000000000 R14:
ffffffff8023ce76 R15: 00035f9663f2dbe7
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] FS:
0000000000000000(0000) GS:ffff81011fa738c0(0000) knlGS:0000000000000000
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] CS: 0010 DS: 0018 ES:
0018 CR0: 000000008005003b
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] CR2: 00007f26a377cba0 CR3:
000000011086c000 CR4: 00000000000006e0
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Nov 18 22:16:23 fin01tn kernel: [1082291.622189]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] Call Trace:
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] [<ffffffff8020b0d8>] ?
default_idle+0x2a/0x46
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] [<ffffffff8020ad04>] ?
cpu_idle+0x8e/0xb8
Nov 18 22:16:23 fin01tn kernel: [1082291.622189]
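
For anyone triaging a similar dump, here is a minimal Python sketch that pulls the soft-lockup banners out of a pasted syslog and lists which CPUs were reported stuck and for how long. The function name `find_soft_lockups` and the regular expression are illustrative assumptions, not part of any Proxmox tool; the pattern is derived only from the banner lines shown above and tolerates the line wrapping in the paste.

```python
import re

# Matches the guest kernel's soft-lockup banner as it appears in the paste.
# \s+ allows the banner to be wrapped across two lines.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+)\s+stuck for (\d+)s! \[([^\]:]+):(\d+)\]"
)


def find_soft_lockups(log_text):
    """Return a list of (cpu, seconds, comm, pid) tuples found in log_text."""
    return [
        (int(cpu), int(secs), comm, int(pid))
        for cpu, secs, comm, pid in LOCKUP_RE.findall(log_text)
    ]


# Two banner lines copied from the log above (wrapped as pasted).
sample = """\
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] BUG: soft lockup - CPU#0
stuck for 4096s! [swapper:0]
Nov 18 22:16:23 fin01tn kernel: [1082291.622189] BUG: soft lockup - CPU#1
stuck for 4096s! [swapper:0]
"""

for cpu, secs, comm, pid in find_soft_lockups(sample):
    print(f"CPU {cpu}: stuck {secs}s in {comm} (pid {pid})")
```

In this dump both vCPUs report the same 4096s value at the same timestamp while sitting in the idle task (swapper), which is the detail worth mentioning when reporting the problem.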
 
Try the latest 2.6.32 kernel/KVM from pvetest (we plan to release these packages to stable anyway).
 
Hi
As this is a six-server cluster, I'd appreciate it if you released it to stable before we upgrade to test kernels.

That said, is there a clean way to get only this one kernel from pvetest?

Thank you in advance
hk
 
Hi
As this is a six-server cluster, I'd appreciate it if you released it to stable before we upgrade to test kernels.

Whatever you prefer. The only reason we did not release it this week was the team's limited time.

That said, is there a clean way to get only this one kernel from pvetest?

Thank you in advance
hk

Just download the kernel package with wget, install it with dpkg -i, and reboot.
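
As a rough illustration of those three steps, the Python sketch below only builds and prints the commands; it does not run them. The helper name `kernel_install_commands` and the URL are hypothetical placeholders (the actual pvetest package location is not given in this thread), so substitute the real .deb URL before using any of it.

```python
import posixpath
from urllib.parse import urlparse


def kernel_install_commands(deb_url):
    """Build (but do not run) the commands for a manual kernel install."""
    # wget saves the download under the last path component of the URL.
    deb_file = posixpath.basename(urlparse(deb_url).path)
    if not deb_file.endswith(".deb"):
        raise ValueError(f"not a .deb URL: {deb_url}")
    return [
        ["wget", deb_url],         # download the kernel package
        ["dpkg", "-i", deb_file],  # install it
        ["reboot"],                # boot into the new kernel
    ]


# Hypothetical placeholder; this is NOT a real package location.
PLACEHOLDER_URL = "http://example.invalid/path/to/pve-kernel.deb"

for cmd in kernel_install_commands(PLACEHOLDER_URL):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them on each node of a cluster before anything is installed.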

By the way, the plan is to release 1.7 next week.
 
Sorry, but just to be sure: only the kernel is needed, and no further packages, to get a clean upgrade without further trouble?

Thank you
hk
 
If you want just the kernel, install it as described; if you want all the new packages (which I recommend), just upgrade everything. I am not sure whether the new kernel solves your problem, but it makes sense to try.
 
Tom,

Sorry for hijacking the thread, but you're breaking the news on 1.7 for next week above ... :)

Do you have a list of release notes and/or features for 1.7 you can share with the community?
 
Sorry for hijacking the thread, but you're breaking the news on 1.7 for next week above ... :)

Sorry, there was a (very) serious bug in the latest 2.6.18 kernel, so we decided to wait until that is fixed.

Do you have a list of release notes and/or features for 1.7 you can share with the community?

Just small fixes (KVM 0.13) and kernel updates. The release will have release notes :)
 
Hi,

I have exactly the same bug:

Code:
Dec  1 01:30:34 debian06 kernel: [154553.100143] BUG: soft lockup - CPU#1 stuck for 4096s! [swapper:0]
Dec  1 01:30:34 debian06 kernel: [154553.100143] Modules linked in: iptable_filter ip_tables x_tables ipv6 loop snd_pcsp virtio_net snd_pcm snd_timer snd soundcore snd_page_alloc psmouse button i2c_piix4 i2c_core serio_raw joydev evdev ext3 jbd mbcache ide_cd_mod cdrom usbhid hid ff_memless virtio_blk piix ide_pci_generic ide_core floppy ata_generic virtio_pci libata scsi_mod dock virtio_ring virtio uhci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Dec  1 01:30:34 debian06 kernel: [154553.100143] CPU 1:
Dec  1 01:30:34 debian06 kernel: [154553.100143] Modules linked in: iptable_filter ip_tables x_tables ipv6 loop snd_pcsp virtio_net snd_pcm snd_timer snd soundcore snd_page_alloc psmouse button i2c_piix4 i2c_core serio_raw joydev evdev ext3 jbd mbcache ide_cd_mod cdrom usbhid hid ff_memless virtio_blk piix ide_pci_generic ide_core floppy ata_generic virtio_pci libata scsi_mod dock virtio_ring virtio uhci_hcd thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Dec  1 01:30:34 debian06 kernel: [154553.100143] Pid: 0, comm: swapper Not tainted 2.6.26-2-amd64 #1
Dec  1 01:30:34 debian06 kernel: [154553.100143] RIP: 0010:[<ffffffff8021eb64>]  [<ffffffff8021eb64>] native_safe_halt+0x2/0x3
Dec  1 01:30:34 debian06 kernel: [154553.100143] RSP: 0018:ffff81031e4b1f38  EFLAGS: 00000246
Dec  1 01:30:34 debian06 kernel: [154553.100143] RAX: ffff81031e4b1fd8 RBX: 0000000000000000 RCX: 0000000000000000
Dec  1 01:30:34 debian06 kernel: [154553.100143] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff804fce70
Dec  1 01:30:34 debian06 kernel: [154553.100143] RBP: 0000000004f472c6 R08: ffffffff8021eb64 R09: ffff81031be94d60
Dec  1 08:13:10 debian06 kernel: imklog 3.18.6, log source = /proc/kmsg started.
On this host:

pveversion -v
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.32-3-pve
pve-kernel-2.6.32-3-pve: 2.6.32-18
qemu-server: 1.1-22
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-14
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-8
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1dso1

The host is http://www.ovh.co.uk/products/mg_hybrid.xml
with hardware RAID; my guest is on the SSD RAID 1 (formatted as ext3).

With this guest:

debian06:~# uname -a
Linux debian06.localdomain 2.6.26-2-amd64 #1 SMP Thu Sep 16 15:56:38 UTC 2010 x86_64 GNU/Linux

ostype: l26
memory: 12288
sockets: 4
vlan2: virtio=06:67:60:33:97:8E
name: VM-XXXXXXXXXXXXXX
ide2: none,media=cdrom
bootdisk: virtio0
virtio0: local:122/vm-122-disk-1.raw
virtio1: local:122/vm-122-disk-2.raw
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
onboot: 0
cores: 1
description: IP PRIVE 192.168.115.20

I have two other guests on this host, each with only 1 core and 512 MB RAM, and they never freeze, so it looks like a problem with multiple vCPUs. When the freeze happens there is no load on the server; it's really "random".
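
To compare guests on the vCPU question, a small Python sketch like the following can read the `key: value` lines of a guest config and report the vCPU count. It assumes the count is `sockets` times `cores` and that a missing key defaults to 1; both are assumptions here, and the helper names `parse_vm_conf` and `vcpu_count` are made up for illustration.

```python
def parse_vm_conf(text):
    """Parse 'key: value' lines into a dict, splitting at the first colon."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def vcpu_count(conf):
    """Assumed rule: vCPUs = sockets * cores, each defaulting to 1."""
    return int(conf.get("sockets", 1)) * int(conf.get("cores", 1))


# A few lines taken from the guest config posted above.
sample_conf = """\
memory: 12288
sockets: 4
vlan2: virtio=06:67:60:33:97:8E
cores: 1
"""

conf = parse_vm_conf(sample_conf)
print(f"vCPUs: {vcpu_count(conf)}, memory: {conf['memory']} MB")
```

Under that assumption the freezing guest has 4 vCPUs, while the two guests that never freeze have a single one.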

I haven't tried 1.7 yet...
 
Try the latest 2.6.32 from 1.7.

I will... but I have to fix my Windows 2003 problem with 1.7 first :(

To be clear:
- on my production server I have random freezes in Debian but no problem with 2003
- in my test environment with 1.7 I have freezes with 2003

A bit hard to fix, no? :)
 
Any details about the Win2003 problem? Post 'pveversion -v' and the VMID.conf file of the guest.
 
