Latest update broke the whole system. Need help!

jim.bond.9862

Hi everyone.

I have been running the PVE 5.2 release for the last several months with no major issues.
It is all ZFS based.
Since it is a single server, I have 3 ZFS pools set up on the machine, but they are not part of local-store.
They are my own data pools that I share via bind mounts to dedicated LXC containers.

However, over the last few weeks drives started dropping out of the pools, and I thought I had traced it to my old PSU.
I now have a new 1050W PSU and all drives stay online as expected, but the system started to lock up and needed a hard restart every few days.
Last week, after a reboot, it never came back up. It ran through POST,
got to the boot screen, and then hung on a black screen with a blinking cursor.

I tried booting with an earlier kernel and it boots, but none of my data pools come up.
The only pools I see are the system pool "rpool" and the second local-store pool "pvstore".

None of the data pools start. I can see them with zpool import,
but when I run zpool import <pool>
the whole thing hangs and I start getting CPU lockup errors,
mostly soft lockups but sometimes hard lockups.

What can it be?

Thanks.

Below is my system log for today.

Code:
Aug 15 12:18:02 pve2 systemd[1]: Started Proxmox VE replication runner.
Aug 15 12:19:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Aug 15 12:19:02 pve2 systemd[1]: Started Proxmox VE replication runner.
Aug 15 12:19:12 pve2 systemd[1]: systemd-timesyncd.service: State 'stop-sigabrt' timed out. Terminating.
Aug 15 12:19:17 pve2 kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Aug 15 12:19:17 pve2 kernel:         4-...: (5 GPs behind) idle=dbe/140000000000000/0 softirq=14882/14882 fqs=20511
Aug 15 12:19:17 pve2 kernel:         (detected by 1, t=60048 jiffies, g=16424, c=16423, q=29645)
Aug 15 12:19:17 pve2 kernel: Sending NMI from CPU 1 to CPUs 4:
Aug 15 12:19:17 pve2 kernel: NMI backtrace for cpu 4
Aug 15 12:19:17 pve2 kernel: CPU: 4 PID: 7803 Comm: smartd Tainted: P      D    O    4.13.16-4-pve #1
Aug 15 12:19:17 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:19:17 pve2 kernel: task: ffff889542fd0000 task.stack: ffffa9aed9ae8000
Aug 15 12:19:17 pve2 kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x179/0x1a0
Aug 15 12:19:17 pve2 kernel: RSP: 0018:ffffa9aed9aeba98 EFLAGS: 00000002
Aug 15 12:19:17 pve2 kernel: RAX: 0000000000000101 RBX: 0000000000000293 RCX: 0000000000000001
Aug 15 12:19:17 pve2 kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff889b60784018
Aug 15 12:19:17 pve2 kernel: RBP: ffffa9aed9aeba98 R08: 0000000000000101 R09: ffff88955f772918
Aug 15 12:19:17 pve2 kernel: R10: 0000000000000400 R11: ffff889563d70028 R12: ffff8895603d4200
Aug 15 12:19:17 pve2 kernel: R13: ffff88954734dd48 R14: ffff889552305800 R15: ffff889552305800
Aug 15 12:19:17 pve2 kernel: FS:  00007ff033688480(0000) GS:ffff889567d00000(0000) knlGS:0000000000000000
Aug 15 12:19:17 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:19:17 pve2 kernel: CR2: 00007fa970e84b80 CR3: 000000060d4d2000 CR4: 00000000000006e0
Aug 15 12:19:17 pve2 kernel: Call Trace:
Aug 15 12:19:17 pve2 kernel:  _raw_spin_lock_irqsave+0x37/0x40
Aug 15 12:19:17 pve2 kernel:  ata_scsi_queuecmd+0x27/0x210
Aug 15 12:19:17 pve2 kernel:  scsi_dispatch_cmd+0xec/0x220
Aug 15 12:19:17 pve2 kernel:  scsi_request_fn+0x47c/0x620
Aug 15 12:19:17 pve2 kernel:  ? sched_clock+0x9/0x10
Aug 15 12:19:17 pve2 kernel:  __blk_run_queue+0x43/0x70
Aug 15 12:19:17 pve2 kernel:  __elv_add_request+0x134/0x250
Aug 15 12:19:17 pve2 kernel:  blk_execute_rq_nowait+0x9d/0x100
Aug 15 12:19:17 pve2 kernel:  blk_execute_rq+0x50/0xb0
Aug 15 12:19:17 pve2 kernel:  sg_io+0x1a7/0x410
Aug 15 12:19:17 pve2 kernel:  ? dput+0x34/0x1f0
Aug 15 12:19:17 pve2 kernel:  scsi_cmd_ioctl+0x2a9/0x430
Aug 15 12:19:17 pve2 kernel:  ? __slab_free+0xa9/0x300
Aug 15 12:19:17 pve2 kernel:  ? scsi_disk_put+0x3c/0x50
Aug 15 12:19:17 pve2 kernel:  scsi_cmd_blk_ioctl+0x42/0x50
Aug 15 12:19:17 pve2 kernel:  sd_ioctl+0xd9/0x1d0
Aug 15 12:19:17 pve2 kernel:  blkdev_ioctl+0x8cd/0x970
Aug 15 12:19:17 pve2 kernel:  ? __check_object_size+0xb3/0x190
Aug 15 12:19:17 pve2 kernel:  block_ioctl+0x3d/0x50
Aug 15 12:19:17 pve2 kernel:  do_vfs_ioctl+0xa6/0x620
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xe9/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xe2/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xdb/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xd4/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xcd/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xc6/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xbf/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xb8/0x139
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0xb1/0x139
Aug 15 12:19:17 pve2 kernel:  SyS_ioctl+0x79/0x90
Aug 15 12:19:17 pve2 kernel:  ? entry_SYSCALL_64_after_hwframe+0x72/0x139
Aug 15 12:19:17 pve2 kernel:  entry_SYSCALL_64_fastpath+0x24/0xab
Aug 15 12:19:17 pve2 kernel: RIP: 0033:0x7ff0324ebdd7
Aug 15 12:19:17 pve2 kernel: RSP: 002b:00007ffd7216b1d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug 15 12:19:17 pve2 kernel: RAX: ffffffffffffffda RBX: 000055c192e42678 RCX: 00007ff0324ebdd7
Aug 15 12:19:17 pve2 kernel: RDX: 00007ffd7216b1f0 RSI: 0000000000002285 RDI: 0000000000000003
Aug 15 12:19:17 pve2 kernel: RBP: 00007ff03357f250 R08: 0000000000000000 R09: 000055c19294a31a
Aug 15 12:19:17 pve2 kernel: R10: 000055c192b61ce0 R11: 0000000000000246 R12: 0000000000000008
Aug 15 12:19:17 pve2 kernel: R13: 00007ffd7216bb8a R14: 000055c192eba6e8 R15: 000055c192e664b0
Aug 15 12:19:17 pve2 kernel: Code: 41 39 c0 74 e6 4d 85 c9 c6 07 01 74 30 41 c7 41 08 01 00 00 00 e9 52 ff ff ff 83 fa 01 0f 84 b0 fe ff ff 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 5d c3 f3 90 4c 8b 09
Aug 15 12:19:31 pve2 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [kworker/9:0:26908]
Aug 15 12:19:31 pve2 kernel: Modules linked in: ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink amd64_edac_mod ppdev edac_mce_amd ttm snd_pcm kvm_amd snd_timer snd drm_kms_helper kvm soundcore ipmi_si drm ipmi_devintf irqbypass k10temp serio_raw pcspkr ipmi_msghandler joydev input_leds i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp parport_pc parport mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor raid6_pq hid_generic usbkbd uas usbhid usb_storage hid pata_acpi e1000e(O) psmouse ptp pps_core sata_mv forcedeth sata_nv i2c_nforce2
Aug 15 12:19:31 pve2 kernel: CPU: 9 PID: 26908 Comm: kworker/9:0 Tainted: P      D    O    4.13.16-4-pve #1
Aug 15 12:19:31 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:19:31 pve2 kernel: Workqueue: events free_work
Aug 15 12:19:31 pve2 kernel: task: ffff889b55b31600 task.stack: ffffa9aed9b80000
Aug 15 12:19:31 pve2 kernel: RIP: 0010:smp_call_function_many+0x209/0x260
Aug 15 12:19:31 pve2 kernel: RSP: 0018:ffffa9aed9b83d20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Aug 15 12:19:31 pve2 kernel: RAX: 0000000000000003 RBX: 000000000000000c RCX: 0000000000000004
Aug 15 12:19:31 pve2 kernel: RDX: ffff889567d27798 RSI: 0000000000000000 RDI: ffff889b67001358
Aug 15 12:19:31 pve2 kernel: RBP: ffffa9aed9b83d58 R08: fffffffffffffff0 R09: 0000000000000dff
Aug 15 12:19:31 pve2 kernel: R10: ffffd0bff08e77c0 R11: 0000000000000342 R12: ffffffffb88776d0
Aug 15 12:19:31 pve2 kernel: R13: 0000000000000000 R14: ffff889b7fce3a80 R15: 000000000000000c
Aug 15 12:19:31 pve2 kernel: FS:  0000000000000000(0000) GS:ffff889b7fcc0000(0000) knlGS:0000000000000000
Aug 15 12:19:31 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:19:31 pve2 kernel: CR2: 00007fe4329685e0 CR3: 000000004520a000 CR4: 00000000000006e0
Aug 15 12:19:31 pve2 kernel: Call Trace:
Aug 15 12:19:31 pve2 kernel:  ? load_new_mm_cr3+0xe0/0xe0
Aug 15 12:19:31 pve2 kernel:  on_each_cpu+0x2d/0x60
Aug 15 12:19:31 pve2 kernel:  flush_tlb_kernel_range+0x4b/0x80
Aug 15 12:19:31 pve2 kernel:  ? vunmap_page_range+0x214/0x340
Aug 15 12:19:31 pve2 kernel:  __purge_vmap_area_lazy+0x52/0xc0
Aug 15 12:19:31 pve2 kernel:  free_vmap_area_noflush+0x7e/0x90
Aug 15 12:19:31 pve2 kernel:  remove_vm_area+0x77/0x90
Aug 15 12:19:31 pve2 kernel:  __vunmap+0x26/0xb0
Aug 15 12:19:31 pve2 kernel:  free_work+0x25/0x40
Aug 15 12:19:31 pve2 kernel:  process_one_work+0x1ee/0x410
Aug 15 12:19:31 pve2 kernel:  worker_thread+0x4b/0x420
Aug 15 12:19:31 pve2 kernel:  kthread+0x10c/0x140
Aug 15 12:19:31 pve2 kernel:  ? process_one_work+0x410/0x410
Aug 15 12:19:31 pve2 kernel:  ? kthread_create_on_node+0x70/0x70
Aug 15 12:19:31 pve2 kernel:  ? do_syscall_64+0x67/0x120
Aug 15 12:19:31 pve2 kernel:  ? SyS_exit_group+0x14/0x20
Aug 15 12:19:31 pve2 kernel:  ret_from_fork+0x22/0x40
Aug 15 12:19:31 pve2 kernel: Code: 3e af 35 00 3b 05 8c 9b 34 01 89 c1 0f 8d 80 fe ff ff 48 98 49 8b 16 48 03 14 c5 00 c4 96 b9 8b 42 18 a8 01 74 09 f3 90 8b 42 18 <a8> 01 75 f7 eb be 0f b6 4d d0 4c 89 ea 4c 89 e6 44 89 f7 e8 0f
Aug 15 12:19:58 pve2 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 21s! [kworker/9:0:26908]
Aug 15 12:19:58 pve2 kernel: Modules linked in: ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink amd64_edac_mod ppdev edac_mce_amd ttm snd_pcm kvm_amd snd_timer snd drm_kms_helper kvm soundcore ipmi_si drm ipmi_devintf irqbypass k10temp serio_raw pcspkr ipmi_msghandler joydev input_leds i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp parport_pc parport mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor raid6_pq hid_generic usbkbd uas usbhid usb_storage hid pata_acpi e1000e(O) psmouse ptp pps_core sata_mv forcedeth sata_nv i2c_nforce2
Aug 15 12:19:58 pve2 kernel: CPU: 9 PID: 26908 Comm: kworker/9:0 Tainted: P      D    O L  4.13.16-4-pve #1
Aug 15 12:19:58 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:19:58 pve2 kernel: Workqueue: events free_work
Aug 15 12:19:58 pve2 kernel: task: ffff889b55b31600 task.stack: ffffa9aed9b80000
Aug 15 12:19:58 pve2 kernel: RIP: 0010:smp_call_function_many+0x206/0x260
Aug 15 12:19:58 pve2 kernel: RSP: 0018:ffffa9aed9b83d20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Aug 15 12:19:58 pve2 kernel: RAX: 0000000000000003 RBX: 000000000000000c RCX: 0000000000000004
Aug 15 12:19:58 pve2 kernel: RDX: ffff889567d27798 RSI: 0000000000000000 RDI: ffff889b67001358
Aug 15 12:19:58 pve2 kernel: RBP: ffffa9aed9b83d58 R08: fffffffffffffff0 R09: 0000000000000dff
Aug 15 12:19:58 pve2 kernel: R10: ffffd0bff08e77c0 R11: 0000000000000342 R12: ffffffffb88776d0
Aug 15 12:19:58 pve2 kernel: R13: 0000000000000000 R14: ffff889b7fce3a80 R15: 000000000000000c
Aug 15 12:19:58 pve2 kernel: FS:  0000000000000000(0000) GS:ffff889b7fcc0000(0000) knlGS:0000000000000000
Aug 15 12:19:58 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:19:58 pve2 kernel: CR2: 00007fe4329685e0 CR3: 000000004520a000 CR4: 00000000000006e0
Aug 15 12:19:58 pve2 kernel: Call Trace:
Aug 15 12:19:58 pve2 kernel:  ? load_new_mm_cr3+0xe0/0xe0
Aug 15 12:19:58 pve2 kernel:  on_each_cpu+0x2d/0x60
Aug 15 12:19:58 pve2 kernel:  flush_tlb_kernel_range+0x4b/0x80
Aug 15 12:19:58 pve2 kernel:  ? vunmap_page_range+0x214/0x340
Aug 15 12:19:58 pve2 kernel:  __purge_vmap_area_lazy+0x52/0xc0
Aug 15 12:19:58 pve2 kernel:  free_vmap_area_noflush+0x7e/0x90
Aug 15 12:19:58 pve2 kernel:  remove_vm_area+0x77/0x90
Aug 15 12:19:58 pve2 kernel:  __vunmap+0x26/0xb0
Aug 15 12:19:58 pve2 kernel:  free_work+0x25/0x40
Aug 15 12:19:58 pve2 kernel:  process_one_work+0x1ee/0x410
Aug 15 12:19:58 pve2 kernel:  worker_thread+0x4b/0x420
Aug 15 12:19:58 pve2 kernel:  kthread+0x10c/0x140
Aug 15 12:19:58 pve2 kernel:  ? process_one_work+0x410/0x410
Aug 15 12:19:58 pve2 kernel:  ? kthread_create_on_node+0x70/0x70
Aug 15 12:19:58 pve2 kernel:  ? do_syscall_64+0x67/0x120
Aug 15 12:19:58 pve2 kernel:  ? SyS_exit_group+0x14/0x20
Aug 15 12:19:58 pve2 kernel:  ret_from_fork+0x22/0x40
Aug 15 12:19:58 pve2 kernel: Code: 63 d2 e8 3e af 35 00 3b 05 8c 9b 34 01 89 c1 0f 8d 80 fe ff ff 48 98 49 8b 16 48 03 14 c5 00 c4 96 b9 8b 42 18 a8 01 74 09 f3 90 <8b> 42 18 a8 01 75 f7 eb be 0f b6 4d d0 4c 89 ea 4c 89 e6 44 89
Aug 15 12:20:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Aug 15 12:20:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Aug 15 12:20:09 pve2 pvedaemon[9054]: <root@pam> successful auth for user 'root@pam'
Aug 15 12:20:25 pve2 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [kworker/9:0:26908]
Aug 15 12:20:25 pve2 kernel: Modules linked in: ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink amd64_edac_mod ppdev edac_mce_amd ttm snd_pcm kvm_amd snd_timer snd drm_kms_helper kvm soundcore ipmi_si drm ipmi_devintf irqbypass k10temp serio_raw pcspkr ipmi_msghandler joydev input_leds i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp parport_pc parport mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor raid6_pq hid_generic usbkbd uas usbhid usb_storage hid pata_acpi e1000e(O) psmouse ptp pps_core sata_mv forcedeth sata_nv i2c_nforce2
Aug 15 12:20:25 pve2 kernel: CPU: 9 PID: 26908 Comm: kworker/9:0 Tainted: P      D    O L  4.13.16-4-pve #1
Aug 15 12:20:25 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:20:25 pve2 kernel: Workqueue: events free_work
Aug 15 12:20:25 pve2 kernel: task: ffff889b55b31600 task.stack: ffffa9aed9b80000
Aug 15 12:20:25 pve2 kernel: RIP: 0010:smp_call_function_many+0x206/0x260
Aug 15 12:20:25 pve2 kernel: RSP: 0018:ffffa9aed9b83d20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Aug 15 12:20:25 pve2 kernel: RAX: 0000000000000003 RBX: 000000000000000c RCX: 0000000000000004
Aug 15 12:20:25 pve2 kernel: RDX: ffff889567d27798 RSI: 0000000000000000 RDI: ffff889b67001358
Aug 15 12:20:25 pve2 kernel: RBP: ffffa9aed9b83d58 R08: fffffffffffffff0 R09: 0000000000000dff
Aug 15 12:20:25 pve2 kernel: R10: ffffd0bff08e77c0 R11: 0000000000000342 R12: ffffffffb88776d0
Aug 15 12:20:25 pve2 kernel: R13: 0000000000000000 R14: ffff889b7fce3a80 R15: 000000000000000c
Aug 15 12:20:25 pve2 kernel: FS:  0000000000000000(0000) GS:ffff889b7fcc0000(0000) knlGS:0000000000000000
Aug 15 12:20:25 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:20:25 pve2 kernel: CR2: 00007fe4329685e0 CR3: 000000004520a000 CR4: 00000000000006e0
Aug 15 12:20:25 pve2 kernel: Call Trace:
Aug 15 12:20:25 pve2 kernel:  ? load_new_mm_cr3+0xe0/0xe0
Aug 15 12:20:25 pve2 kernel:  on_each_cpu+0x2d/0x60
Aug 15 12:20:25 pve2 kernel:  flush_tlb_kernel_range+0x4b/0x80
Aug 15 12:20:25 pve2 kernel:  ? vunmap_page_range+0x214/0x340
Aug 15 12:20:25 pve2 kernel:  __purge_vmap_area_lazy+0x52/0xc0
Aug 15 12:20:25 pve2 kernel:  free_vmap_area_noflush+0x7e/0x90
Aug 15 12:20:25 pve2 kernel:  remove_vm_area+0x77/0x90
Aug 15 12:20:25 pve2 kernel:  __vunmap+0x26/0xb0
Aug 15 12:20:25 pve2 kernel:  free_work+0x25/0x40
Aug 15 12:20:25 pve2 kernel:  process_one_work+0x1ee/0x410
Aug 15 12:20:25 pve2 kernel:  worker_thread+0x4b/0x420
Aug 15 12:20:25 pve2 kernel:  kthread+0x10c/0x140
Aug 15 12:20:25 pve2 kernel:  ? process_one_work+0x410/0x410
Aug 15 12:20:25 pve2 kernel:  ? kthread_create_on_node+0x70/0x70
Aug 15 12:20:25 pve2 kernel:  ? do_syscall_64+0x67/0x120
Aug 15 12:20:25 pve2 kernel:  ? SyS_exit_group+0x14/0x20
Aug 15 12:20:25 pve2 kernel:  ret_from_fork+0x22/0x40
Aug 15 12:20:25 pve2 kernel: Code: 63 d2 e8 3e af 35 00 3b 05 8c 9b 34 01 89 c1 0f 8d 80 fe ff ff 48 98 49 8b 16 48 03 14 c5 00 c4 96 b9 8b 42 18 a8 01 74 09 f3 90 <8b> 42 18 a8 01 75 f7 eb be 0f b6 4d d0 4c 89 ea 4c 89 e6 44 89
Aug 15 12:20:43 pve2 systemd[1]: systemd-timesyncd.service: State 'stop-sigterm' timed out. Killing.
Aug 15 12:20:43 pve2 systemd[1]: systemd-timesyncd.service: Killing process 7433 (systemd-timesyn) with signal SIGKILL.
Aug 15 12:20:53 pve2 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [kworker/9:0:26908]
Aug 15 12:20:53 pve2 kernel: Modules linked in: ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink amd64_edac_mod ppdev edac_mce_amd ttm snd_pcm kvm_amd snd_timer snd drm_kms_helper kvm soundcore ipmi_si drm ipmi_devintf irqbypass k10temp serio_raw pcspkr ipmi_msghandler joydev input_leds i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp parport_pc parport mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor raid6_pq hid_generic usbkbd uas usbhid usb_storage hid pata_acpi e1000e(O) psmouse ptp pps_core sata_mv forcedeth sata_nv i2c_nforce2
Aug 15 12:20:53 pve2 kernel: CPU: 9 PID: 26908 Comm: kworker/9:0 Tainted: P      D    O L  4.13.16-4-pve #1
Aug 15 12:20:53 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:20:53 pve2 kernel: Workqueue: events free_work
Aug 15 12:20:53 pve2 kernel: task: ffff889b55b31600 task.stack: ffffa9aed9b80000
Aug 15 12:20:53 pve2 kernel: RIP: 0010:smp_call_function_many+0x206/0x260
Aug 15 12:20:53 pve2 kernel: RSP: 0018:ffffa9aed9b83d20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Aug 15 12:20:53 pve2 kernel: RAX: 0000000000000003 RBX: 000000000000000c RCX: 0000000000000004
Aug 15 12:20:53 pve2 kernel: RDX: ffff889567d27798 RSI: 0000000000000000 RDI: ffff889b67001358
Aug 15 12:20:53 pve2 kernel: RBP: ffffa9aed9b83d58 R08: fffffffffffffff0 R09: 0000000000000dff
Aug 15 12:20:53 pve2 kernel: R10: ffffd0bff08e77c0 R11: 0000000000000342 R12: ffffffffb88776d0
Aug 15 12:20:53 pve2 kernel: R13: 0000000000000000 R14: ffff889b7fce3a80 R15: 000000000000000c
Aug 15 12:20:53 pve2 kernel: FS:  0000000000000000(0000) GS:ffff889b7fcc0000(0000) knlGS:0000000000000000
Aug 15 12:20:53 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:20:53 pve2 kernel: CR2: 00007fe4329685e0 CR3: 000000004520a000 CR4: 00000000000006e0
Aug 15 12:20:53 pve2 kernel: Call Trace:
Aug 15 12:20:53 pve2 kernel:  ? load_new_mm_cr3+0xe0/0xe0
Aug 15 12:20:53 pve2 kernel:  on_each_cpu+0x2d/0x60
Aug 15 12:20:53 pve2 kernel:  flush_tlb_kernel_range+0x4b/0x80
Aug 15 12:20:53 pve2 kernel:  ? vunmap_page_range+0x214/0x340
Aug 15 12:20:53 pve2 kernel:  __purge_vmap_area_lazy+0x52/0xc0
Aug 15 12:20:53 pve2 kernel:  free_vmap_area_noflush+0x7e/0x90
Aug 15 12:20:53 pve2 kernel:  remove_vm_area+0x77/0x90
Aug 15 12:20:53 pve2 kernel:  __vunmap+0x26/0xb0
Aug 15 12:20:53 pve2 kernel:  free_work+0x25/0x40
Aug 15 12:20:53 pve2 kernel:  process_one_work+0x1ee/0x410
Aug 15 12:20:53 pve2 kernel:  worker_thread+0x4b/0x420
Aug 15 12:20:53 pve2 kernel:  kthread+0x10c/0x140
Aug 15 12:20:53 pve2 kernel:  ? process_one_work+0x410/0x410
Aug 15 12:20:53 pve2 kernel:  ? kthread_create_on_node+0x70/0x70
Aug 15 12:20:53 pve2 kernel:  ? do_syscall_64+0x67/0x120
Aug 15 12:20:53 pve2 kernel:  ? SyS_exit_group+0x14/0x20
Aug 15 12:20:53 pve2 kernel:  ret_from_fork+0x22/0x40
Aug 15 12:20:53 pve2 kernel: Code: 63 d2 e8 3e af 35 00 3b 05 8c 9b 34 01 89 c1 0f 8d 80 fe ff ff 48 98 49 8b 16 48 03 14 c5 00 c4 96 b9 8b 42 18 a8 01 74 09 f3 90 <8b> 42 18 a8 01 75 f7 eb be 0f b6 4d d0 4c 89 ea 4c 89 e6 44 89
Aug 15 12:21:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Aug 15 12:21:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Aug 15 12:21:21 pve2 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [kworker/9:0:26908]
Aug 15 12:21:21 pve2 kernel: Modules linked in: ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink amd64_edac_mod ppdev edac_mce_amd ttm snd_pcm kvm_amd snd_timer snd drm_kms_helper kvm soundcore ipmi_si drm ipmi_devintf irqbypass k10temp serio_raw pcspkr ipmi_msghandler joydev input_leds i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp parport_pc parport mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor raid6_pq hid_generic usbkbd uas usbhid usb_storage hid pata_acpi e1000e(O) psmouse ptp pps_core sata_mv forcedeth sata_nv i2c_nforce2
Aug 15 12:21:21 pve2 kernel: CPU: 9 PID: 26908 Comm: kworker/9:0 Tainted: P      D    O L  4.13.16-4-pve #1
Aug 15 12:21:21 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:21:21 pve2 kernel: Workqueue: events free_work
Aug 15 12:21:21 pve2 kernel: task: ffff889b55b31600 task.stack: ffffa9aed9b80000
Aug 15 12:21:21 pve2 kernel: RIP: 0010:smp_call_function_many+0x206/0x260
Aug 15 12:21:21 pve2 kernel: RSP: 0018:ffffa9aed9b83d20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Aug 15 12:21:21 pve2 kernel: RAX: 0000000000000003 RBX: 000000000000000c RCX: 0000000000000004
Aug 15 12:21:21 pve2 kernel: RDX: ffff889567d27798 RSI: 0000000000000000 RDI: ffff889b67001358
Aug 15 12:21:21 pve2 kernel: RBP: ffffa9aed9b83d58 R08: fffffffffffffff0 R09: 0000000000000dff
Aug 15 12:21:21 pve2 kernel: R10: ffffd0bff08e77c0 R11: 0000000000000342 R12: ffffffffb88776d0
Aug 15 12:21:21 pve2 kernel: R13: 0000000000000000 R14: ffff889b7fce3a80 R15: 000000000000000c
Aug 15 12:21:21 pve2 kernel: FS:  0000000000000000(0000) GS:ffff889b7fcc0000(0000) knlGS:0000000000000000
Aug 15 12:21:21 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:21:21 pve2 kernel: CR2: 00007fe4329685e0 CR3: 000000004520a000 CR4: 00000000000006e0
Aug 15 12:21:21 pve2 kernel: Call Trace:
Aug 15 12:21:21 pve2 kernel:  ? load_new_mm_cr3+0xe0/0xe0
Aug 15 12:21:21 pve2 kernel:  on_each_cpu+0x2d/0x60
Aug 15 12:21:21 pve2 kernel:  flush_tlb_kernel_range+0x4b/0x80
Aug 15 12:21:21 pve2 kernel:  ? vunmap_page_range+0x214/0x340
Aug 15 12:21:21 pve2 kernel:  __purge_vmap_area_lazy+0x52/0xc0
Aug 15 12:21:21 pve2 kernel:  free_vmap_area_noflush+0x7e/0x90
Aug 15 12:21:21 pve2 kernel:  remove_vm_area+0x77/0x90
Aug 15 12:21:21 pve2 kernel:  __vunmap+0x26/0xb0
Aug 15 12:21:21 pve2 kernel:  free_work+0x25/0x40
Aug 15 12:21:21 pve2 kernel:  process_one_work+0x1ee/0x410
Aug 15 12:21:21 pve2 kernel:  worker_thread+0x4b/0x420
Aug 15 12:21:21 pve2 kernel:  kthread+0x10c/0x140
Aug 15 12:21:21 pve2 kernel:  ? process_one_work+0x410/0x410
Aug 15 12:21:21 pve2 kernel:  ? kthread_create_on_node+0x70/0x70
Aug 15 12:21:21 pve2 kernel:  ? do_syscall_64+0x67/0x120
Aug 15 12:21:21 pve2 kernel:  ? SyS_exit_group+0x14/0x20
Aug 15 12:21:21 pve2 kernel:  ret_from_fork+0x22/0x40
Aug 15 12:21:21 pve2 kernel: Code: 63 d2 e8 3e af 35 00 3b 05 8c 9b 34 01 89 c1 0f 8d 80 fe ff ff 48 98 49 8b 16 48 03 14 c5 00 c4 96 b9 8b 42 18 a8 01 74 09 f3 90 <8b> 42 18 a8 01 75 f7 eb be 0f b6 4d d0 4c 89 ea 4c 89 e6 44 89
Aug 15 12:21:50 pve2 kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [kworker/9:0:26908]
Aug 15 12:21:50 pve2 kernel: Modules linked in: ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink amd64_edac_mod ppdev edac_mce_amd ttm snd_pcm kvm_amd snd_timer snd drm_kms_helper kvm soundcore ipmi_si drm ipmi_devintf irqbypass k10temp serio_raw pcspkr ipmi_msghandler joydev input_leds i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp parport_pc parport mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor raid6_pq hid_generic usbkbd uas usbhid usb_storage hid pata_acpi e1000e(O) psmouse ptp pps_core sata_mv forcedeth sata_nv i2c_nforce2
Aug 15 12:21:50 pve2 kernel: CPU: 9 PID: 26908 Comm: kworker/9:0 Tainted: P      D    O L  4.13.16-4-pve #1
Aug 15 12:21:50 pve2 kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Aug 15 12:21:50 pve2 kernel: Workqueue: events free_work
Aug 15 12:21:50 pve2 kernel: task: ffff889b55b31600 task.stack: ffffa9aed9b80000
Aug 15 12:21:50 pve2 kernel: RIP: 0010:smp_call_function_many+0x206/0x260
Aug 15 12:21:50 pve2 kernel: RSP: 0018:ffffa9aed9b83d20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Aug 15 12:21:50 pve2 kernel: RAX: 0000000000000003 RBX: 000000000000000c RCX: 0000000000000004
Aug 15 12:21:50 pve2 kernel: RDX: ffff889567d27798 RSI: 0000000000000000 RDI: ffff889b67001358
Aug 15 12:21:50 pve2 kernel: RBP: ffffa9aed9b83d58 R08: fffffffffffffff0 R09: 0000000000000dff
Aug 15 12:21:50 pve2 kernel: R10: ffffd0bff08e77c0 R11: 0000000000000342 R12: ffffffffb88776d0
Aug 15 12:21:50 pve2 kernel: R13: 0000000000000000 R14: ffff889b7fce3a80 R15: 000000000000000c
Aug 15 12:21:50 pve2 kernel: FS:  0000000000000000(0000) GS:ffff889b7fcc0000(0000) knlGS:0000000000000000
Aug 15 12:21:50 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 12:21:50 pve2 kernel: CR2: 00007fe4329685e0 CR3: 000000004520a000 CR4: 00000000000006e0
Aug 15 12:21:50 pve2 kernel: Call Trace:
Aug 15 12:21:50 pve2 kernel:  ? load_new_mm_cr3+0xe0/0xe0
Aug 15 12:21:50 pve2 kernel:  on_each_cpu+0x2d/0x60
Aug 15 12:21:50 pve2 kernel:  flush_tlb_kernel_range+0x4b/0x80
Aug 15 12:21:50 pve2 kernel:  ? vunmap_page_range+0x214/0x340
Aug 15 12:21:50 pve2 kernel:  __purge_vmap_area_lazy+0x52/0xc0
Aug 15 12:21:50 pve2 kernel:  free_vmap_area_noflush+0x7e/0x90
Aug 15 12:21:50 pve2 kernel:  remove_vm_area+0x77/0x90
Aug 15 12:21:50 pve2 kernel:  __vunmap+0x26/0xb0
Aug 15 12:21:50 pve2 kernel:  free_work+0x25/0x40
Aug 15 12:21:50 pve2 kernel:  process_one_work+0x1ee/0x410
Aug 15 12:21:50 pve2 kernel:  worker_thread+0x4b/0x420
Aug 15 12:21:50 pve2 kernel:  kthread+0x10c/0x140
Aug 15 12:21:50 pve2 kernel:  ? process_one_work+0x410/0x410
Aug 15 12:21:50 pve2 kernel:  ? kthread_create_on_node+0x70/0x70
Aug 15 12:21:50 pve2 kernel:  ? do_syscall_64+0x67/0x120
Aug 15 12:21:50 pve2 kernel:  ? SyS_exit_group+0x14/0x20
Aug 15 12:21:50 pve2 kernel:  ret_from_fork+0x22/0x40
Aug 15 12:21:50 pve2 kernel: Code: 63 d2 e8 3e af 35 00 3b 05 8c 9b 34 01 89 c1 0f 8d 80 fe ff ff 48 98 49 8b 16 48 03 14 c5 00 c4 96 b9 8b 42 18 a8 01 74 09 f3 90 <8b> 42 18 a8 01 75 f7 eb be 0f b6 4d d0 4c 89 ea 4c 89 e6 44 89
Aug 15 12:22:01 pve2 systemd[1]: Starting Proxmox VE replication runner...
Aug 15 12:22:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Aug 15 12:22:13 pve2 systemd[1]: systemd-timesyncd.service: Processes still around after SIGKILL. Ignoring.
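
For anyone reading a long log like the one above, a quick way to see which CPU and task keep getting reported is to count the "soft lockup" lines. This is only a rough sketch: `summarize_lockups` is a made-up helper name (not a Proxmox tool), and it assumes the log has been saved to a plain text file in the same format as above.

```shell
# summarize_lockups FILE
# Count "soft lockup" reports in FILE, grouped by CPU and stuck task.
# Hypothetical helper for illustration; assumes lines shaped like
#   ... watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [kworker/9:0:26908]
summarize_lockups() {
    grep 'soft lockup' "$1" \
        | sed -E 's/.*soft lockup - (CPU#[0-9]+) stuck for [0-9]+s! \[([^]]+)\].*/\1 \2/' \
        | sort | uniq -c
}

# Example call (the file name is a placeholder):
# summarize_lockups syslog-today.txt
```

In the log above every soft lockup line names CPU#9 and kworker/9:0:26908, so the output should be a single group.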
 
What is ipmi_si?
I managed to boot everything using one of the older kernels from the advanced menu.
Any 4.15.18 kernel hangs on boot.
I managed to boot 4.15.17,
and it has been up for the last month with no issues. It is something in the 4.15.18 kernel.
I even did a whole server overhaul:
new motherboard, RAM, and HBA,
in addition to the more powerful PSU.
Only dropping down to an older kernel worked.
 
Sounds like exactly what I have, and what's in that link.

In your old kernel, check and see if ipmi_si is loading:
lsmod | grep ipmi

Look in the /etc/modprobe.d/*.conf files and see if it's configured at all.
You can blacklist it with something like:
echo "blacklist ipmi_si" > /etc/modprobe.d/ipmi-blacklist.conf

Then, for good measure:
update-initramfs -u -k all

Then reboot into the newer kernel.
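
Before rebooting, it may be worth double-checking that the blacklist entry is actually in place. A minimal sketch, not an official tool: `is_blacklisted` is a hypothetical helper, and its optional second argument (the config directory) is only there so it can be tried against a scratch directory first.

```shell
# is_blacklisted MODULE [DIR]
# Succeeds if any *.conf file under DIR (default: /etc/modprobe.d)
# contains a "blacklist MODULE" line. Hypothetical helper for illustration.
is_blacklisted() {
    module="$1"
    conf_dir="${2:-/etc/modprobe.d}"
    # -q: quiet, -s: ignore missing files, -E: extended regex
    grep -qsE "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" "$conf_dir"/*.conf
}

if is_blacklisted ipmi_si; then
    echo "ipmi_si is blacklisted"
else
    echo "ipmi_si is NOT blacklisted"
fi
```

After the reboot, `lsmod | grep ipmi` shows whether the module was loaded anyway.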
 
No can do.
Had a power failure over the weekend and shut down the server.
It is back at the blinking cursor again.
I managed to copy off all of my data while it was working,
but not the backups, VM configs, and other configuration.
I guess I will have to start from scratch.

Luckily I only have 2 LXC containers set up and all my data is safe.
It's just annoying.
I have 2 extra SSDs, so I will preserve the current pair as-is and see if I can fix them at a later time.
 
