Hey, my node crashed with the following messages in kern.log. Is this a kernel bug, or something I could avoid with a different configuration?
Code:
Aug 6 23:14:51 node03 kernel: [300260.313502] watchdog: BUG: soft lockup - CPU#8 stuck for 134s! [kvm:3135]
Aug 6 23:14:51 node03 kernel: [300260.313508] Modules linked in: vhost_vsock vmw_vsock_virtio_transport_common vsock ip6t_REJECT nf_reject_ipv6 xt_mark xt_set xt_physdev xt_comment xt_multiport ip_set_hash_net tcp_diag inet_diag nf_conntrack_netlink xt_nat nft_chain_nat xt_MASQUERADE nf_nat xfrm_user xfrm_algo overlay ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter veth rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter sctp ip6_udp_tunnel udp_tunnel nf_tables libcrc32c bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common amdgpu edac_mce_amd snd_usb_audio btusb iommu_v2 snd_hda_codec_hdmi snd_usbmidi_lib gpu_sched btrtl btbcm drm_ttm_helper snd_rawmidi kvm_amd snd_hda_intel btintel ttm snd_intel_dspcfg bluetooth snd_seq_device snd_intel_sdw_acpi ecdh_generic mc
Aug 6 23:14:51 node03 kernel: [300260.313549] drm_kms_helper ecc kvm mt7921e irqbypass snd_hda_codec mt76_connac_lib snd_pci_acp6x cec rapl snd_hda_core mt76 rc_core snd_hwdep i2c_algo_bit snd_pci_acp5x mac80211 fb_sys_fops snd_pcm snd_timer syscopyarea cfg80211 sysfillrect efi_pstore snd_rn_pci_acp3x pcspkr snd sysimgblt k10temp libarc4 soundcore snd_pci_acp3x ccp zfs(PO) cm32181 industrialio zunicode(PO) mac_hid zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 dm_crypt simplefb hid_cmedia hid_generic usbhid xhci_pci crct10dif_pclmul xhci_pci_renesas crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd i2c_piix4 ahci libahci nvme amd_sfh r8169 xhci_hcd igc realtek nvme_core video i2c_hid_acpi i2c_hid hid
Aug 6 23:14:51 node03 kernel: [300260.313549] CPU: 8 PID: 3135 Comm: kvm Tainted: P D O L 5.15.39-2-pve #1
Aug 6 23:14:51 node03 kernel: [300260.313549] Hardware name: BESSTAR TECH LIMITED HM90/HM90, BIOS 5.16 10/13/2021
Aug 6 23:14:51 node03 kernel: [300260.313549] RIP: 0010:queued_write_lock_slowpath+0x61/0x90
Aug 6 23:14:51 node03 kernel: [300260.313549] Code: 00 0f 1f 40 00 5b 41 5c 5d e9 4b d8 ec 00 f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 <8b> 03 3d 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 4c
Aug 6 23:14:51 node03 kernel: [300260.313549] RSP: 0018:ffffbcc785317cc8 EFLAGS: 00000206
Aug 6 23:14:51 node03 kernel: [300260.313549] RAX: 0000000000000500 RBX: ffffbcc785329000 RCX: 0000000000000100
Aug 6 23:14:51 node03 kernel: [300260.313549] RDX: 00000000000000ff RSI: 00007f575c60a9a8 RDI: ffffbcc785329000
Aug 6 23:14:51 node03 kernel: [300260.313549] RBP: ffffbcc785317cd8 R08: 0000000000000008 R09: 0000000000000009
Aug 6 23:14:51 node03 kernel: [300260.313549] R10: 00007f575c60a9a0 R11: 0000000000000000 R12: ffffbcc785329004
Aug 6 23:14:51 node03 kernel: [300260.313549] R13: ffff9bd6b6ff0328 R14: ffffbcc785317d58 R15: ffffbcc785329000
Aug 6 23:14:51 node03 kernel: [300260.313549] FS: 00007f5766664000(0000) GS:ffff9bdc2f800000(0000) knlGS:0000000000000000
Aug 6 23:14:51 node03 kernel: [300260.313549] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 6 23:14:51 node03 kernel: [300260.313549] CR2: 00007f56549c0010 CR3: 0000000104dfe000 CR4: 0000000000350ee0
Aug 6 23:14:51 node03 kernel: [300260.313549] Call Trace:
Aug 6 23:14:51 node03 kernel: [300260.313549] <TASK>
Aug 6 23:14:51 node03 kernel: [300260.313549] _raw_write_lock+0x20/0x30
Aug 6 23:14:51 node03 kernel: [300260.313549] kvm_clear_dirty_log_protect+0x1b3/0x2e0 [kvm]
Aug 6 23:14:51 node03 kernel: [300260.313549] kvm_vm_ioctl+0x165/0xf70 [kvm]
Aug 6 23:14:51 node03 kernel: [300260.313549] ? __fget_files+0x86/0xc0
Aug 6 23:14:51 node03 kernel: [300260.313549] __x64_sys_ioctl+0x95/0xd0
Aug 6 23:14:51 node03 kernel: [300260.313549] do_syscall_64+0x5c/0xc0
Aug 6 23:14:51 node03 kernel: [300260.313549] ? handle_mm_fault+0xd8/0x2c0
Aug 6 23:14:51 node03 kernel: [300260.313549] ? exit_to_user_mode_prepare+0x90/0x1b0
Aug 6 23:14:51 node03 kernel: [300260.313549] ? irqentry_exit_to_user_mode+0x9/0x20
Aug 6 23:14:51 node03 kernel: [300260.313549] ? irqentry_exit+0x1d/0x30
Aug 6 23:14:51 node03 kernel: [300260.313549] ? exc_page_fault+0x89/0x170
Aug 6 23:14:51 node03 kernel: [300260.313549] entry_SYSCALL_64_after_hwframe+0x61/0xcb
Aug 6 23:14:51 node03 kernel: [300260.313549] RIP: 0033:0x7f5771279cc7
Aug 6 23:14:51 node03 kernel: [300260.313549] Code: 00 00 00 48 8b 05 c9 91 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 99 91 0c 00 f7 d8 64 89 01 48
Aug 6 23:14:51 node03 kernel: [300260.313549] RSP: 002b:00007ffc33cef358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug 6 23:14:51 node03 kernel: [300260.313549] RAX: ffffffffffffffda RBX: 00000000c018aec0 RCX: 00007f5771279cc7
Aug 6 23:14:51 node03 kernel: [300260.313549] RDX: 00007ffc33cef470 RSI: ffffffffc018aec0 RDI: 0000000000000012
Aug 6 23:14:51 node03 kernel: [300260.313549] RBP: 000055dec78d5b60 R08: 0000000000000010 R09: 000055dec7595010
Aug 6 23:14:51 node03 kernel: [300260.313549] R10: 00007f5771343b80 R11: 0000000000000246 R12: 00007ffc33cef470
Aug 6 23:14:51 node03 kernel: [300260.313549] R13: 000055dec78d6c20 R14: 0000000000000000 R15: 00007f57651c02e0
Aug 6 23:14:51 node03 kernel: [300260.313549] </TASK>