Node crash with CPU stuck

steff123

Member
Aug 1, 2020
Hey, my node crashed with the following messages in kern.log. Is this a kernel bug, or something I could avoid with a different configuration?

Code:
Aug  6 23:14:51 node03 kernel: [300260.313502] watchdog: BUG: soft lockup - CPU#8 stuck for 134s! [kvm:3135]
Aug  6 23:14:51 node03 kernel: [300260.313508] Modules linked in: vhost_vsock vmw_vsock_virtio_transport_common vsock ip6t_REJECT nf_reject_ipv6 xt_mark xt_set xt_physdev xt_comment xt_multiport ip_set_hash_net tcp_diag inet_diag nf_conntrack_netlink xt_nat nft_chain_nat xt_MASQUERADE nf_nat xfrm_user xfrm_algo overlay ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter veth rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter sctp ip6_udp_tunnel udp_tunnel nf_tables libcrc32c bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common amdgpu edac_mce_amd snd_usb_audio btusb iommu_v2 snd_hda_codec_hdmi snd_usbmidi_lib gpu_sched btrtl btbcm drm_ttm_helper snd_rawmidi kvm_amd snd_hda_intel btintel ttm snd_intel_dspcfg bluetooth snd_seq_device snd_intel_sdw_acpi ecdh_generic mc
Aug  6 23:14:51 node03 kernel: [300260.313549]  drm_kms_helper ecc kvm mt7921e irqbypass snd_hda_codec mt76_connac_lib snd_pci_acp6x cec rapl snd_hda_core mt76 rc_core snd_hwdep i2c_algo_bit snd_pci_acp5x mac80211 fb_sys_fops snd_pcm snd_timer syscopyarea cfg80211 sysfillrect efi_pstore snd_rn_pci_acp3x pcspkr snd sysimgblt k10temp libarc4 soundcore snd_pci_acp3x ccp zfs(PO) cm32181 industrialio zunicode(PO) mac_hid zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 dm_crypt simplefb hid_cmedia hid_generic usbhid xhci_pci crct10dif_pclmul xhci_pci_renesas crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd i2c_piix4 ahci libahci nvme amd_sfh r8169 xhci_hcd igc realtek nvme_core video i2c_hid_acpi i2c_hid hid
Aug  6 23:14:51 node03 kernel: [300260.313549] CPU: 8 PID: 3135 Comm: kvm Tainted: P      D    O L    5.15.39-2-pve #1
Aug  6 23:14:51 node03 kernel: [300260.313549] Hardware name: BESSTAR TECH LIMITED HM90/HM90, BIOS 5.16 10/13/2021
Aug  6 23:14:51 node03 kernel: [300260.313549] RIP: 0010:queued_write_lock_slowpath+0x61/0x90
Aug  6 23:14:51 node03 kernel: [300260.313549] Code: 00 0f 1f 40 00 5b 41 5c 5d e9 4b d8 ec 00 f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 <8b> 03 3d 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 4c
Aug  6 23:14:51 node03 kernel: [300260.313549] RSP: 0018:ffffbcc785317cc8 EFLAGS: 00000206
Aug  6 23:14:51 node03 kernel: [300260.313549] RAX: 0000000000000500 RBX: ffffbcc785329000 RCX: 0000000000000100
Aug  6 23:14:51 node03 kernel: [300260.313549] RDX: 00000000000000ff RSI: 00007f575c60a9a8 RDI: ffffbcc785329000
Aug  6 23:14:51 node03 kernel: [300260.313549] RBP: ffffbcc785317cd8 R08: 0000000000000008 R09: 0000000000000009
Aug  6 23:14:51 node03 kernel: [300260.313549] R10: 00007f575c60a9a0 R11: 0000000000000000 R12: ffffbcc785329004
Aug  6 23:14:51 node03 kernel: [300260.313549] R13: ffff9bd6b6ff0328 R14: ffffbcc785317d58 R15: ffffbcc785329000
Aug  6 23:14:51 node03 kernel: [300260.313549] FS:  00007f5766664000(0000) GS:ffff9bdc2f800000(0000) knlGS:0000000000000000
Aug  6 23:14:51 node03 kernel: [300260.313549] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug  6 23:14:51 node03 kernel: [300260.313549] CR2: 00007f56549c0010 CR3: 0000000104dfe000 CR4: 0000000000350ee0
Aug  6 23:14:51 node03 kernel: [300260.313549] Call Trace:
Aug  6 23:14:51 node03 kernel: [300260.313549]  <TASK>
Aug  6 23:14:51 node03 kernel: [300260.313549]  _raw_write_lock+0x20/0x30
Aug  6 23:14:51 node03 kernel: [300260.313549]  kvm_clear_dirty_log_protect+0x1b3/0x2e0 [kvm]
Aug  6 23:14:51 node03 kernel: [300260.313549]  kvm_vm_ioctl+0x165/0xf70 [kvm]
Aug  6 23:14:51 node03 kernel: [300260.313549]  ? __fget_files+0x86/0xc0
Aug  6 23:14:51 node03 kernel: [300260.313549]  __x64_sys_ioctl+0x95/0xd0
Aug  6 23:14:51 node03 kernel: [300260.313549]  do_syscall_64+0x5c/0xc0
Aug  6 23:14:51 node03 kernel: [300260.313549]  ? handle_mm_fault+0xd8/0x2c0
Aug  6 23:14:51 node03 kernel: [300260.313549]  ? exit_to_user_mode_prepare+0x90/0x1b0
Aug  6 23:14:51 node03 kernel: [300260.313549]  ? irqentry_exit_to_user_mode+0x9/0x20
Aug  6 23:14:51 node03 kernel: [300260.313549]  ? irqentry_exit+0x1d/0x30
Aug  6 23:14:51 node03 kernel: [300260.313549]  ? exc_page_fault+0x89/0x170
Aug  6 23:14:51 node03 kernel: [300260.313549]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
Aug  6 23:14:51 node03 kernel: [300260.313549] RIP: 0033:0x7f5771279cc7
Aug  6 23:14:51 node03 kernel: [300260.313549] Code: 00 00 00 48 8b 05 c9 91 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 99 91 0c 00 f7 d8 64 89 01 48
Aug  6 23:14:51 node03 kernel: [300260.313549] RSP: 002b:00007ffc33cef358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug  6 23:14:51 node03 kernel: [300260.313549] RAX: ffffffffffffffda RBX: 00000000c018aec0 RCX: 00007f5771279cc7
Aug  6 23:14:51 node03 kernel: [300260.313549] RDX: 00007ffc33cef470 RSI: ffffffffc018aec0 RDI: 0000000000000012
Aug  6 23:14:51 node03 kernel: [300260.313549] RBP: 000055dec78d5b60 R08: 0000000000000010 R09: 000055dec7595010
Aug  6 23:14:51 node03 kernel: [300260.313549] R10: 00007f5771343b80 R11: 0000000000000246 R12: 00007ffc33cef470
Aug  6 23:14:51 node03 kernel: [300260.313549] R13: 000055dec78d6c20 R14: 0000000000000000 R15: 00007f57651c02e0
Aug  6 23:14:51 node03 kernel: [300260.313549]  </TASK>
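
To check whether this soft lockup is a one-off or keeps recurring, the log can be scanned for the watchdog message. Below is a minimal Python sketch, assuming the line format shown in the log above; the function name `find_soft_lockups` and the dictionary keys are made up for illustration and are not part of any Proxmox or kernel tool.

```python
import re

# Pattern for the watchdog line as it appears in the log above, e.g.
# "watchdog: BUG: soft lockup - CPU#8 stuck for 134s! [kvm:3135]"
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft-lockup report found in the given log lines.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    hits = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            hits.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return hits


if __name__ == "__main__":
    # First line of the log posted above, used as sample input.
    sample = [
        "Aug  6 23:14:51 node03 kernel: [300260.313502] watchdog: BUG: "
        "soft lockup - CPU#8 stuck for 134s! [kvm:3135]",
    ]
    for hit in find_soft_lockups(sample):
        print(hit)
```

To run it against a real log, pass an open file object instead of the sample list; the location of kern.log on a given node is an assumption to verify locally.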
 
