PVE 5: VMs hanging, host server crashing

Rais Ahmed

Active Member
Apr 14, 2017
Hi,
I have a PVE 5 server hosting 7 VMs, which had been working fine for a few months. For the last few days, however, different VMs have been hanging and the host server has also been crashing. I found the logs below. Please help.


Mar 13 09:44:05 kernel: NMI watchdog: BUG: soft lockup - CPU#36 stuck for 23s! [kvm:69907]
Mar 13 09:44:05 kernel: Modules linked in: tcp_diag inet_diag ip_set ip6table_filter ip6_tables ses enclosure dm_round_robin binfmt_misc 8021q garp mrp bonding softdog ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack nfnetlink_log nfnetlink iptable_filter nls_iso8859_1 dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c ipmi_ssif intel_rapl sb_edac edac_core mgag200 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ttm kvm drm_kms_helper drm irqbypass crct10dif_pclmul i2c_algo_bit crc32_pclmul fb_sys_fops ghash_clmulni_intel syscopyarea pcbc sysfillrect aesni_intel ioatdma sysimgblt hpilo joydev input_leds aes_x86_64 crypto_simd snd_pcm snd_timer snd glue_helper soundcore lpc_ich cryptd pcspkr shpchp intel_cstate intel_rapl_perf acpi_power_meter
Mar 13 09:44:05 kernel: ipmi_si ipmi_devintf wmi ipmi_msghandler mac_hid dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua vhost_net vhost macvtap macvlan ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp sunrpc libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs xor raid6_pq hid_generic usbkbd usbmouse usbhid i2c_i801 hid ixgbe(O) hpsa dca scsi_transport_sas ptp pps_core fjes
Mar 13 09:44:05 kernel: CPU: 36 PID: 69907 Comm: kvm Tainted: G D W O L 4.10.17-2-pve #1
Mar 13 09:44:05 kernel: Hardware name: HP ProLiant BL460c Gen9, BIOS I36 09/12/2016
Mar 13 09:44:05 kernel: task: ffff8df532338000 task.stack: ffffb9ee1bff0000
Mar 13 09:44:05 kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x17b/0x1a0
Mar 13 09:44:05 kernel: RSP: 0018:ffffb9ee1bff3c20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Mar 13 09:44:05 kernel: RAX: 0000000000000101 RBX: 000055e71df96ce0 RCX: 0000000000000001
Mar 13 09:44:05 kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffb9ee19a3c104
Mar 13 09:44:05 kernel: RBP: ffffb9ee1bff3c20 R08: 0000000000000101 R09: 0000000000000000
Mar 13 09:44:05 kernel: R10: 0000000000000002 R11: 000055e71df96ce0 R12: ffffb9ee1bff3cc8
Mar 13 09:44:05 kernel: R13: ffffb9ee1bff3cd0 R14: ffffb9ee1bff3d08 R15: ffffb9ee19a3c100
Mar 13 09:44:05 kernel: FS: 00007f32e4fff700(0000) GS:ffff8ddc3f880000(0000) knlGS:0000000000000000
Mar 13 09:44:05 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 13 09:44:05 kernel: CR2: 00007f32e4ffc640 CR3: 00000014416eb000 CR4: 00000000003426e0
Mar 13 09:44:05 kernel: Call Trace:
Mar 13 09:44:05 kernel: _raw_spin_lock+0x20/0x30
Mar 13 09:44:05 kernel: futex_wait_setup+0x82/0x120
Mar 13 09:44:05 kernel: ? futex_wait_queue_me+0xd3/0x120
Mar 13 09:44:05 kernel: futex_wait+0xf9/0x270
Mar 13 09:44:05 kernel: ? numa_migrate_preferred+0x2b/0x80
Mar 13 09:44:05 kernel: ? task_numa_fault+0x8f1/0xaf0
Mar 13 09:44:05 kernel: do_futex+0x2cd/0xb60
Mar 13 09:44:05 kernel: ? handle_mm_fault+0xac2/0x1330
Mar 13 09:44:05 kernel: SyS_futex+0x85/0x180
Mar 13 09:44:05 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad
Mar 13 09:44:05 kernel: RIP: 0033:0x7f36f2afbf5c
Mar 13 09:44:05 kernel: RSP: 002b:00007f32e4ffc598 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
Mar 13 09:44:05 kernel: RAX: ffffffffffffffda RBX: 00007f32e5006000 RCX: 00007f36f2afbf5c
Mar 13 09:44:05 kernel: RDX: 0000000000000002 RSI: 0000000000000080 RDI: 000055e71df96ce0
Mar 13 09:44:05 kernel: RBP: 000055e71df96ce0 R08: 000055e71df96ce0 R09: 0000000000000009
Mar 13 09:44:05 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f32e50060a3
Mar 13 09:44:05 kernel: R13: 00007ffe1765d88f R14: 0000000000000000 R15: 00007f370ab04040
Mar 13 09:44:05 kernel: Code: c0 74 e6 4d 85 c9 c6 07 01 74 30 41 c7 41 08 01 00 00 00 e9 52 ff ff ff 83 fa 01 0f 84 b0 fe ff ff 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 5d c3 f3 90 4c 8b 09 4d 85
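
For anyone triaging a longer log of this kind, here is a minimal sketch (not part of the original post) that pulls the soft-lockup reports out of syslog-style lines and counts them per process. The function name `find_soft_lockups` and the sample list are illustrative assumptions. The regular expression only assumes the message format shown in the first log line above.

```python
import re
from collections import Counter

# Matches the watchdog message format seen in the log above, e.g.
#   "... BUG: soft lockup - CPU#36 stuck for 23s! [kvm:69907]"
# The greedy ".+" keeps process names that themselves contain a colon intact.
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft-lockup report found in the given log lines."""
    events = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            events.append({
                "cpu": int(m.group("cpu")),
                "secs": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return events


# Two lines copied from the log above; only the first is a lockup report.
sample = [
    "Mar 13 09:44:05 kernel: NMI watchdog: BUG: soft lockup - CPU#36 stuck for 23s! [kvm:69907]",
    "Mar 13 09:44:05 kernel: CPU: 36 PID: 69907 Comm: kvm Tainted: G D W O L 4.10.17-2-pve #1",
]

events = find_soft_lockups(sample)
by_process = Counter(e["comm"] for e in events)
print(events)
print(by_process)
```

Feeding it the full journal instead of `sample` would show whether the lockups always hit the same CPU or the same kvm process, which can help narrow down whether a single VM is involved.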
 
Updated the PVE box with apt-get update && apt-get dist-upgrade.
#pveversion
pve-manager/5.1-46/ae8241d4 (running kernel: 4.13.13-6-pve)
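
A small sketch, offered only as an illustration, for confirming that the host is now running a different kernel from the one named in the lockup trace. It compares the kernel string in the `Tainted:` line of the log with the one reported by `pveversion`. The helper name `pve_kernel` is hypothetical, and the pattern assumes the `x.y.z-n-pve` naming seen in both lines.

```python
import re

# Assumes PVE kernel names of the form "4.10.17-2-pve", as seen in this thread.
KERNEL = re.compile(r"\d+\.\d+\.\d+-\d+-pve")


def pve_kernel(text):
    """Return the first PVE kernel version string found in text, or None."""
    m = KERNEL.search(text)
    return m.group(0) if m else None


# Both lines are copied verbatim from the post.
trace_line = "Mar 13 09:44:05 kernel: CPU: 36 PID: 69907 Comm: kvm Tainted: G D W O L 4.10.17-2-pve #1"
pveversion_line = "pve-manager/5.1-46/ae8241d4 (running kernel: 4.13.13-6-pve)"

crashed_on = pve_kernel(trace_line)
running = pve_kernel(pveversion_line)
kernel_changed = crashed_on != running

print("kernel in trace:", crashed_on)
print("running kernel: ", running)
print("kernel changed: ", kernel_changed)
```

If the two strings were still identical after an upgrade, the host would most likely not have been rebooted into the new kernel yet.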

Now keeping it under observation.
 
Never run "apt-get upgrade". Please always follow our upgrade guides, which say:

> apt update
> apt dist-upgrade
 