Hello, about 10-15 minutes ago the whole PVE server went down. There wasn't any unusual CPU or network usage.
Here's what's in the log:
Feb 25 19:49:40 proxmox kernel: [425423.794455] PGD 0 P4D 0
Feb 25 19:49:40 proxmox kernel: [425423.794458] Oops: 0002 [#1] SMP NOPTI
Feb 25 19:49:40 proxmox kernel: [425423.794460] CPU: 16 PID: 849 Comm: kworker/16:1H Tainted: P O 5.3.18-2-pve #1
Feb 25 19:49:40 proxmox kernel: [425423.794462] Hardware name: System manufacturer System Product Name/PRIME TRX40-PRO, BIOS 0702 12/12/2019
Feb 25 19:49:40 proxmox kernel: [425423.794467] Workqueue: kblockd blk_mq_requeue_work
Feb 25 19:49:40 proxmox kernel: [425423.794471] RIP: 0010:_raw_spin_lock+0x10/0x30
Feb 25 19:49:40 proxmox kernel: [425423.794473] Code: 75 06 48 89 d8 5b 5d c3 e8 dd 27 63 ff eb f3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 5d c3 89 c6 e8 a1 13 63 ff 66 90 5d c3 66 66 2e
Feb 25 19:49:40 proxmox kernel: [425423.794476] RSP: 0018:ffffb91601b67e10 EFLAGS: 00010246
Feb 25 19:49:40 proxmox kernel: [425423.794478] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffb91601b67e48
Feb 25 19:49:40 proxmox kernel: [425423.794480] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Feb 25 19:49:40 proxmox kernel: [425423.794481] RBP: ffffb91601b67e10 R08: 0000000000000000 R09: 00646b636f6c626b
Feb 25 19:49:40 proxmox kernel: [425423.794483] R10: 8080808080808080 R11: ffff9c85bd2294c4 R12: ffff9c85a8adbb80
Feb 25 19:49:40 proxmox kernel: [425423.794485] R13: 0000000000000000 R14: ffff9c85a8871d68 R15: 0ffff9c85bd23260
Feb 25 19:49:40 proxmox kernel: [425423.794487] FS: 0000000000000000(0000) GS:ffff9c85bd200000(0000) knlGS:0000000000000000
Feb 25 19:49:40 proxmox kernel: [425423.794489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 25 19:49:40 proxmox kernel: [425423.794490] CR2: 0000000000000000 CR3: 000000075b7f2000 CR4: 0000000000340ee0
Feb 25 19:49:40 proxmox kernel: [425423.794492] Call Trace:
Feb 25 19:49:40 proxmox kernel: [425423.794495] blk_mq_request_bypass_insert+0x20/0x70
Feb 25 19:49:40 proxmox kernel: [425423.794497] blk_mq_requeue_work+0xa6/0x160
Feb 25 19:49:40 proxmox kernel: [425423.794500] process_one_work+0x20f/0x3d0
Feb 25 19:49:40 proxmox kernel: [425423.794502] worker_thread+0x34/0x400
Feb 25 19:49:40 proxmox kernel: [425423.794504] kthread+0x120/0x140
Feb 25 19:49:40 proxmox kernel: [425423.794505] ? process_one_work+0x3d0/0x3d0
Feb 25 19:49:40 proxmox kernel: [425423.794507] ? __kthread_parkme+0x70/0x70
Feb 25 19:49:40 proxmox kernel: [425423.794509] ret_from_fork+0x22/0x40
Feb 25 19:49:40 proxmox kernel: [425423.794511] Modules linked in: veth tcp_diag inet_diag vfio_pci vfio_virqfd vfio_iommu_type1 vfio ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev xt_addrtype xt_comment xt_multiport xt_conntrack xt_tcpudp ip_set_hash_net ip_set iptable_filter bpfilter bonding softdog openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 snd_usb_audio snd_usbmidi_lib snd_hwdep edac_mce_amd snd_rawmidi kvm_amd kvm snd_seq_device irqbypass mc tcp_bbr snd_pcm snd_timer snd crct10dif_pclmul crc32_pclmul soundcore ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd ccp eeepc_wmi joydev input_leds cryptd k10temp glue_helper asus_wmi sparse_keymap video pcspkr mac_hid wmi_bmof mxm_wmi zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) nfnetlink_log nfnetlink zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi
Feb 25 19:49:40 proxmox kernel: [425423.794540] scsi_transport_iscsi nct6775 hwmon_vid sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c uas usb_storage usbmouse hid_generic usbkbd usbhid hid ahci libahci igb i2c_algo_bit dca i2c_piix4 wmi
Feb 25 19:49:40 proxmox kernel: [425423.794564] CR2: 0000000000000000
Feb 25 19:49:40 proxmox kernel: [425423.794566] ---[ end trace 0d4be7da105ef9bb ]---
Feb 25 19:49:40 proxmox kernel: [425423.794568] RIP: 0010:_raw_spin_lock+0x10/0x30
Feb 25 19:49:40 proxmox kernel: [425423.794569] Code: 75 06 48 89 d8 5b 5d c3 e8 dd 27 63 ff eb f3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 5d c3 89 c6 e8 a1 13 63 ff 66 90 5d c3 66 66 2e
Feb 25 19:49:40 proxmox kernel: [425423.794572] RSP: 0018:ffffb91601b67e10 EFLAGS: 00010246
Feb 25 19:49:40 proxmox kernel: [425423.794574] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffb91601b67e48
Feb 25 19:49:40 proxmox kernel: [425423.794575] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Feb 25 19:49:40 proxmox kernel: [425423.794577] RBP: ffffb91601b67e10 R08: 0000000000000000 R09: 00646b636f6c626b
Feb 25 19:49:40 proxmox kernel: [425423.794579] R10: 8080808080808080 R11: ffff9c85bd2294c4 R12: ffff9c85a8adbb80
Feb 25 19:49:40 proxmox kernel: [425423.794580] R13: 0000000000000000 R14: ffff9c85a8871d68 R15: 0ffff9c85bd23260
Feb 25 19:49:40 proxmox kernel: [425423.794582] FS: 0000000000000000(0000) GS:ffff9c85bd200000(0000) knlGS:0000000000000000
Feb 25 19:49:40 proxmox kernel: [425423.794584] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 25 19:49:40 proxmox kernel: [425423.794585] CR2: 0000000000000000 CR3: 000000075b7f2000 CR4: 0000000000340ee0
Feb 25 19:50:10 proxmox kernel: [425453.807926] nvme nvme0: I/O 625 QID 5 timeout, aborting
Feb 25 19:50:10 proxmox kernel: [425453.813494] nvme nvme0: Abort status: 0x0
Feb 25 19:50:40 proxmox kernel: [425484.011717] nvme nvme0: I/O 625 QID 5 timeout, reset controller
Feb 25 19:50:40 proxmox kernel: [425484.351369] nvme nvme0: 7/0/0 default/read/poll queues
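For anyone who wants to skim a trace like this quickly, here is a rough Python sketch that pulls the oops error code, the faulting RIP, CR2 and the call trace out of syslog-style kernel lines. The function name `summarize_oops`, the `SAMPLE` list and the prefix regex are my own assumptions based on the line format pasted above, not part of any Proxmox or kernel tooling, and it only handles the first oops it sees.

```python
import re

# Matches the syslog prefix seen above, e.g.
# "Feb 25 19:49:40 proxmox kernel: [425423.794455] " (format assumed from this log).
PREFIX = re.compile(r"^\w{3}\s+\d+ [\d:]+ \S+ kernel: \[\s*[\d.]+\]\s*")
CR2 = re.compile(r"^CR2: ([0-9a-f]+)")


def summarize_oops(lines):
    """Collect error code, RIP, CR2 and call trace from the first oops in `lines`."""
    summary = {"oops": None, "rip": None, "cr2": None, "trace": []}
    in_trace = False
    for raw in lines:
        msg = PREFIX.sub("", raw.strip())
        if msg.startswith("---[ end trace"):
            break  # the first oops report ends here
        if in_trace:
            if msg.startswith("Modules linked in:"):
                in_trace = False  # the module list follows the trace
            else:
                summary["trace"].append(msg)
            continue
        if msg == "Call Trace:":
            in_trace = True
        elif msg.startswith("Oops:"):
            summary["oops"] = msg.split()[1]
        elif msg.startswith("RIP:") and summary["rip"] is None:
            # "RIP: 0010:_raw_spin_lock+0x10/0x30" -> keep the symbol part
            summary["rip"] = msg.split(":", 2)[2]
        elif summary["cr2"] is None:
            match = CR2.match(msg)
            if match:
                summary["cr2"] = match.group(1)
    return summary


# A few lines copied from the log above, used only as sample input.
SAMPLE = [
    "Feb 25 19:49:40 proxmox kernel: [425423.794458] Oops: 0002 [#1] SMP NOPTI",
    "Feb 25 19:49:40 proxmox kernel: [425423.794471] RIP: 0010:_raw_spin_lock+0x10/0x30",
    "Feb 25 19:49:40 proxmox kernel: [425423.794490] CR2: 0000000000000000 CR3: 000000075b7f2000 CR4: 0000000000340ee0",
    "Feb 25 19:49:40 proxmox kernel: [425423.794492] Call Trace:",
    "Feb 25 19:49:40 proxmox kernel: [425423.794495] blk_mq_request_bypass_insert+0x20/0x70",
    "Feb 25 19:49:40 proxmox kernel: [425423.794497] blk_mq_requeue_work+0xa6/0x160",
    "Feb 25 19:49:40 proxmox kernel: [425423.794511] Modules linked in: veth tcp_diag",
    "Feb 25 19:49:40 proxmox kernel: [425423.794566] ---[ end trace 0d4be7da105ef9bb ]---",
]

summary = summarize_oops(SAMPLE)
print(summary)
```

To run it over a whole log, pass an open file instead of `SAMPLE`; the file location depends on your setup, so I have not hard-coded one.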