I observed the following logs in the system journal:
Code:
Jan 19 10:44:01 argynvostholt kernel: neighbour: ndisc_cache: neighbor table overflow!
Jan 19 10:44:02 argynvostholt kernel: Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
Which was then followed by:
Code:
Jan 19 10:44:23 argynvostholt kernel: watchdog: BUG: soft lockup - CPU#25 stuck for 22s! [swapper/25:0]
Jan 19 10:44:23 argynvostholt kernel: Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables xt_MASQUERADE iptable_nat xt_REDIRECT nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp bpfilter bonding tls softdog nfnetlink_log nfnetlink ipmi_ssif zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd ib_cm ib_core kvm_amd iscsi_tcp libiscsi_tcp libiscsi kvm scsi_transport_iscsi nct6775 crct10dif_pclmul ghash_clmulni_intel hwmon_vid aesni_intel crypto_simd cryptd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi drm_vram_helper snd_hda_codec drm_ttm_helper ttm snd_hda_core snd_hwdep drm_kms_helper rapl snd_pcm cec snd_timer rc_core wmi_bmof snd cdc_ether fb_sys_fops pcspkr efi_pstore syscopyarea usbnet soundcore k10temp sysfillrect ccp joydev input_leds
Jan 19 10:44:23 argynvostholt kernel: sysimgblt mii acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid vfio_pci vfio_pci_core vfio_virqfd irqbypass vfio_iommu_type1 vfio vendor_reset(O) drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid0 multipath linear simplefb hid_generic usbmouse usbkbd dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c usbhid hid raid1 crc32_pclmul i2c_piix4 nvme xhci_pci xhci_pci_renesas nvme_core ahci xhci_hcd igb libahci i2c_algo_bit dca wmi gpio_amdpt gpio_generic
Jan 19 10:44:23 argynvostholt kernel: CPU: 25 PID: 0 Comm: swapper/25 Tainted: P O 5.15.74-1-pve #1
Jan 19 10:44:23 argynvostholt kernel: Hardware name: To Be Filled By O.E.M. B550D4-4L/B550D4-4L, BIOS L1.29 06/13/2022
Jan 19 10:44:23 argynvostholt kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x1f5/0x240
Jan 19 10:44:23 argynvostholt kernel: Code: c5 40 19 03 00 49 81 fe ff 1f 00 00 77 49 4e 03 2c f5 e0 fa cb a5 4d 89 65 00 41 8b 44 24 08 85 c0 75 0b f3 90 41 8b 44 24 08 <85> c0 74 f5 49 8b 0c 24 48 85 c9 0f 84 47 ff ff ff 0f 0d 09 e9 3f
Jan 19 10:44:23 argynvostholt kernel: RSP: 0018:ffffbe41c077cb40 EFLAGS: 00000246
Jan 19 10:44:23 argynvostholt kernel: RAX: 0000000000000000 RBX: ffffffffa6c05400 RCX: 0000000000000009
Jan 19 10:44:23 argynvostholt kernel: RDX: 0000000000680000 RSI: 0000000000680000 RDI: ffffffffa6c05400
Jan 19 10:44:23 argynvostholt kernel: RBP: ffffbe41c077cb68 R08: 0000000000000000 R09: 0000000000000000
Jan 19 10:44:23 argynvostholt kernel: R10: 0000000000000020 R11: ffffffff80000000 R12: ffff959fbf071940
Jan 19 10:44:23 argynvostholt kernel: R13: ffff959fbec31940 R14: 0000000000000008 R15: 0000000000680000
Jan 19 10:44:23 argynvostholt kernel: FS: 0000000000000000(0000) GS:ffff959fbf040000(0000) knlGS:0000000000000000
Jan 19 10:44:23 argynvostholt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 19 10:44:23 argynvostholt kernel: CR2: 000000c0008fd000 CR3: 000000010c590000 CR4: 0000000000750ee0
Jan 19 10:44:23 argynvostholt kernel: PKRU: 55555554
Jan 19 10:44:23 argynvostholt kernel: Call Trace:
Jan 19 10:44:23 argynvostholt kernel: <IRQ>
Jan 19 10:44:23 argynvostholt kernel: _raw_spin_lock_bh+0x2d/0x40
Jan 19 10:44:23 argynvostholt kernel: fib6_run_gc+0x43/0x110
Jan 19 10:44:23 argynvostholt kernel: ip6_dst_gc+0x95/0x160
Jan 19 10:44:23 argynvostholt kernel: dst_alloc+0x126/0x170
Jan 19 10:44:23 argynvostholt kernel: ip6_dst_alloc+0x27/0x90
Jan 19 10:44:23 argynvostholt kernel: icmp6_dst_alloc+0x76/0x220
Jan 19 10:44:23 argynvostholt kernel: ndisc_send_skb+0x96/0x380
Jan 19 10:44:23 argynvostholt kernel: ? __kmalloc_node_track_caller+0x16f/0x3a0
Jan 19 10:44:23 argynvostholt kernel: ? ksize+0x30/0x50
Jan 19 10:44:23 argynvostholt kernel: ? __build_skb_around+0xb4/0xc0
Jan 19 10:44:23 argynvostholt kernel: ndisc_send_ns+0xcd/0x200
Jan 19 10:44:23 argynvostholt kernel: ndisc_solicit+0xc1/0x170
Jan 19 10:44:23 argynvostholt kernel: ? __skb_clone+0x2e/0x140
Jan 19 10:44:23 argynvostholt kernel: neigh_probe+0x52/0x70
Jan 19 10:44:23 argynvostholt kernel: neigh_timer_handler+0x218/0x300
Jan 19 10:44:23 argynvostholt kernel: ? neigh_changeaddr+0x50/0x50
Jan 19 10:44:23 argynvostholt kernel: call_timer_fn+0x2b/0x120
Jan 19 10:44:23 argynvostholt kernel: __run_timers.part.0+0x1e1/0x270
Jan 19 10:44:23 argynvostholt kernel: ? ktime_get+0x46/0xc0
Jan 19 10:44:23 argynvostholt kernel: ? native_x2apic_icr_read+0x20/0x20
Jan 19 10:44:23 argynvostholt kernel: ? lapic_next_event+0x21/0x30
Jan 19 10:44:23 argynvostholt kernel: ? clockevents_program_event+0xab/0x130
Jan 19 10:44:23 argynvostholt kernel: run_timer_softirq+0x2a/0x60
Jan 19 10:44:23 argynvostholt kernel: __do_softirq+0xd9/0x2ea
Jan 19 10:44:23 argynvostholt kernel: irq_exit_rcu+0x94/0xc0
Jan 19 10:44:23 argynvostholt kernel: sysvec_apic_timer_interrupt+0x80/0x90
Jan 19 10:44:23 argynvostholt kernel: </IRQ>
Jan 19 10:44:23 argynvostholt kernel: <TASK>
Jan 19 10:44:23 argynvostholt kernel: asm_sysvec_apic_timer_interrupt+0x1b/0x20
Jan 19 10:44:23 argynvostholt kernel: RIP: 0010:cpuidle_enter_state+0xd9/0x620
Jan 19 10:44:23 argynvostholt kernel: Code: 3d 64 64 ff 5a e8 27 24 6e ff 49 89 c7 0f 1f 44 00 00 31 ff e8 68 31 6e ff 80 7d d0 00 0f 85 5e 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 6a 01 00 00 4d 63 ee 49 83 fd 09 0f 87 e5 03 00 00
Jan 19 10:44:23 argynvostholt kernel: RSP: 0018:ffffbe41c022fe38 EFLAGS: 00000246
Jan 19 10:44:23 argynvostholt kernel: RAX: ffff959fbf070bc0 RBX: ffff958104948400 RCX: 0000000000000000
Jan 19 10:44:23 argynvostholt kernel: RDX: 0000000000009235 RSI: 0000000025b7c068 RDI: 0000000000000000
Jan 19 10:44:23 argynvostholt kernel: RBP: ffffbe41c022fe88 R08: 0001244744b8c72c R09: 0000000000008ca0
Jan 19 10:44:23 argynvostholt kernel: R10: 0000000000000003 R11: 071c71c71c71c71c R12: ffffffffa64e6cc0
Jan 19 10:44:23 argynvostholt kernel: R13: 0000000000000001 R14: 0000000000000001 R15: 0001244744b8c72c
Jan 19 10:44:23 argynvostholt kernel: ? cpuidle_enter_state+0xc8/0x620
Jan 19 10:44:23 argynvostholt kernel: cpuidle_enter+0x2e/0x50
Jan 19 10:44:23 argynvostholt kernel: do_idle+0x20d/0x2b0
Jan 19 10:44:23 argynvostholt kernel: cpu_startup_entry+0x20/0x30
Jan 19 10:44:23 argynvostholt kernel: start_secondary+0x12a/0x180
Jan 19 10:44:23 argynvostholt kernel: secondary_startup_64_no_verify+0xc2/0xcb
Jan 19 10:44:23 argynvostholt kernel: </TASK>
This soft lockup report was logged several times, for every CPU core, and server power draw spiked from 80 W to 180 W (likely because the cores were stuck spinning on a spinlock?). The system then reset.
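To check how many cores were affected, something like the following rough sketch could be run over the exported journal. It assumes the journal has been saved as plain text and that every report uses the same `soft lockup - CPU#N stuck for Ns!` wording as the excerpt above; the function name `count_soft_lockups` and the script name in the usage comment are made up for illustration.

```python
import re
import sys
from collections import Counter

# Matches the watchdog line format seen in the journal excerpt above.
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s!")


def count_soft_lockups(lines):
    """Return a Counter mapping CPU number -> number of soft lockup reports."""
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical usage: python3 count_lockups.py journal.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as journal:
        for cpu, hits in sorted(count_soft_lockups(journal).items()):
            print(f"CPU#{cpu}: {hits} report(s)")
```

This only counts reports per core; it does not say which core was actually holding the lock.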
Kernel version 5.15.74-1-pve, PVE version 7.3-3
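Since the first two messages point at the IPv6 route cache and neighbour table limits, the current values may be useful context. Below is a minimal sketch for reading them from `/proc/sys`. It assumes the `gc_thresh1`/`gc_thresh2`/`gc_thresh3` settings under `net.ipv6.neigh.default` are the ones relevant to the `ndisc_cache` overflow message; the helper name `read_limits` is just for illustration.

```python
from pathlib import Path

# Limits assumed to relate to the two kernel messages; paths are relative to /proc/sys.
LIMIT_FILES = [
    "net/ipv6/route/max_size",
    "net/ipv6/neigh/default/gc_thresh1",
    "net/ipv6/neigh/default/gc_thresh2",
    "net/ipv6/neigh/default/gc_thresh3",
]


def read_limits(proc_sys="/proc/sys"):
    """Return {sysctl name: int value, or None if the file cannot be read}."""
    limits = {}
    for rel in LIMIT_FILES:
        name = rel.replace("/", ".")
        try:
            limits[name] = int((Path(proc_sys) / rel).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_limits().items():
        print(f"{name} = {value}")
```

The script only reads the values and changes nothing on the host.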
This is not the first time this has happened, and I am not sure why. I have included the complete logs here: https://cdn.discordapp.com/attachments/329653697422295040/1065611992766877818/journal_2
Any ideas as to why this is happening?