Hi,
Yesterday an updated PVE 7 node crashed: a Dell R440 with Xeon(R) Silver 4210 CPU @ 2.20GHz (2 sockets) and 256 GB RAM.
The weird thing is that the network was still OK, and corosync reported all 6 nodes as reachable.
The GUI showed the affected node grayed out. Its VMs continued to respond to ping but didn't work. HA didn't work either: the node in trouble was not fenced. The HA policy is the default (conditional).
Hope you have an idea. The node is completely up to date (packages, BIOS, firmware, ...).
I wasn't able to collect dmesg before the manual reboot. :/
proxmox-ve: 7.0-2 (running kernel: 5.11.22-4-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-5.11: 7.0-7
pve-kernel-helper: 7.0-7
pve-kernel-5.4: 6.4-5
pve-kernel-5.0: 6.0-11
pve-kernel-5.11.22-4-pve: 5.11.22-8
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.4.128-1-pve: 5.4.128-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 16.2.5-pve1
ceph-fuse: 16.2.5-pve1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-6
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-11
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.9-2
proxmox-backup-file-restore: 2.0.9-2
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-9
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.3-1
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.0.0-4
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-13
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293790] PGD 0 P4D 0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293801] Oops: 0000 [#1] SMP NOPTI
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293815] CPU: 38 PID: 426 Comm: kworker/38:1H Tainted: P IO 5.11.22-4-pve #1
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293840] Hardware name: Dell Inc. PowerEdge R440, BIOS 2.12.2 07/09/2021
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293862] Workqueue: kblockd blk_mq_timeout_work
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293884] RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293902] Code: 15 0f b6 d3 4c 89 e7 be 01 00 00 00 e8 cf fe ff ff 5b 41 5c 5d c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10 <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40 74 1f 4c 8d 87 e8 00 00
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293953] RSP: 0018:ffffb37e8f34bd68 EFLAGS: 00010287
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293970] RAX: 0000000000000000 RBX: ffffb37e8f34bde8 RCX: 0000000000000002
Sep 15 19:20:43 dc-prox-25 kernel: [103877.293991] RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffff94468eee0000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294011] RBP: ffffb37e8f34bda0 R08: 0000000000000000 R09: 0000000000000002
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294032] R10: 0000000000000008 R11: 0000000000000008 R12: ffff94468eee0000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294053] R13: ffff944690e84800 R14: 0000000000000000 R15: 0000000000000001
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294074] FS: 0000000000000000(0000) GS:ffff9446000c0000(0000) knlGS:0000000000000000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294115] CR2: 00000000000000c0 CR3: 000000287ce2c005 CR4: 00000000007726e0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294136] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294157] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294177] PKRU: 55555554
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294187] Call Trace:
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294198] ? bt_iter+0x54/0x90
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294212] blk_mq_queue_tag_busy_iter+0x1a2/0x2d0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294228] ? blk_mq_put_rq_ref+0x60/0x60
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294243] ? blk_mq_put_rq_ref+0x60/0x60
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294258] blk_mq_timeout_work+0x5f/0x120
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294274] process_one_work+0x220/0x3c0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294292] worker_thread+0x53/0x420
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294306] ? process_one_work+0x3c0/0x3c0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294321] kthread+0x12b/0x150
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294333] ? set_kthread_struct+0x50/0x50
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294347] ret_from_fork+0x1f/0x30
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294363] Modules linked in: veth 8021q garp mrp nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace nfs_ssc fscache ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw ipt_REJECT nf_reject_ipv4 xt_physdev xt_addrtype xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp xt_multiport xt_comment xt_set xt_mark ip_set_hash_net ip_set sctp ip6_udp_tunnel udp_tunnel iptable_filter bpfilter bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common ipmi_ssif isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqbypass drm_kms_helper cec crct10dif_pclmul rc_core ghash_clmulni_intel i2c_algo_bit aesni_intel crypto_simd fb_sys_fops syscopyarea cryptd glue_helper dell_smbios rapl zfs(PO) intel_cstate dcdbas sysfillrect mei_me joydev dell_wmi_descriptor wmi_bmof intel_pch_thermal mei sysimgblt pcspkr input_leds efi_pstore acpi_ipmi zunicode(PO) ipmi_si
Sep 15 19:20:43 dc-prox-25 kernel: [103877.294427] zzstd(O) ipmi_devintf zlua(O) zavl(PO) acpi_power_meter ipmi_msghandler icp(PO) mac_hid zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc drm ip_tables x_tables autofs4 btrfs blake2b_generic xor hid_generic usbmouse usbkbd usbhid hid raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c crc32_pclmul ixgbe xfrm_algo xhci_pci ahci xhci_pci_renesas i2c_i801 dca mdio megaraid_sas tg3 lpc_ich i2c_smbus xhci_hcd libahci wmi
Sep 15 19:20:43 dc-prox-25 kernel: [103877.304501] CR2: 00000000000000c0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.305254] ---[ end trace 913e8515690bbea4 ]---
Sep 15 19:20:43 dc-prox-25 kernel: [103877.338816] RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
Sep 15 19:20:43 dc-prox-25 kernel: [103877.339725] Code: 15 0f b6 d3 4c 89 e7 be 01 00 00 00 e8 cf fe ff ff 5b 41 5c 5d c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10 <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40 74 1f 4c 8d 87 e8 00 00
Sep 15 19:20:43 dc-prox-25 kernel: [103877.341403] RSP: 0018:ffffb37e8f34bd68 EFLAGS: 00010287
Sep 15 19:20:43 dc-prox-25 kernel: [103877.342209] RAX: 0000000000000000 RBX: ffffb37e8f34bde8 RCX: 0000000000000002
Sep 15 19:20:43 dc-prox-25 kernel: [103877.342970] RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffff94468eee0000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.343738] RBP: ffffb37e8f34bda0 R08: 0000000000000000 R09: 0000000000000002
Sep 15 19:20:43 dc-prox-25 kernel: [103877.344516] R10: 0000000000000008 R11: 0000000000000008 R12: ffff94468eee0000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.345298] R13: ffff944690e84800 R14: 0000000000000000 R15: 0000000000000001
Sep 15 19:20:43 dc-prox-25 kernel: [103877.346025] FS: 0000000000000000(0000) GS:ffff9446000c0000(0000) knlGS:0000000000000000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.346750] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 19:20:43 dc-prox-25 kernel: [103877.347474] CR2: 00000000000000c0 CR3: 000000287ce2c005 CR4: 00000000007726e0
Sep 15 19:20:43 dc-prox-25 kernel: [103877.348205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 15 19:20:43 dc-prox-25 kernel: [103877.348935] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 15 19:20:43 dc-prox-25 kernel: [103877.349660] PKRU: 55555554
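To make the trace above easier to read at a glance, here is a minimal sketch in Python that pulls out the faulting symbol and the register values. The helper names (`parse_oops`, `looks_like_null_deref`) are my own and not part of any PVE or kernel tooling, and the excerpt is copied from the log with the syslog prefix trimmed. The "fault address inside the first page" check is only a heuristic: here RAX is 0 and CR2 is 0xc0, which is consistent with a NULL pointer dereference at a small field offset, but it does not say which pointer was NULL or why.

```python
import re

# Excerpt copied from the oops above (syslog prefix trimmed for brevity).
OOPS = """\
RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
RAX: 0000000000000000 RBX: ffffb37e8f34bde8 RCX: 0000000000000002
CR2: 00000000000000c0 CR3: 000000287ce2c005 CR4: 00000000007726e0
"""


def parse_oops(text):
    """Return (faulting symbol, {register name: value}) found in an oops excerpt."""
    rip = re.search(r"RIP: [0-9a-f]{4}:(\S+)", text)
    # Registers are printed as NAME: followed by 16 hex digits.
    regs = {
        name: int(value, 16)
        for name, value in re.findall(
            r"\b([A-Z][A-Z0-9]{1,3}): ([0-9a-f]{16})\b", text
        )
    }
    return (rip.group(1) if rip else None), regs


def looks_like_null_deref(regs, page_size=4096):
    """Heuristic: a fault address (CR2) inside the first page usually means
    a NULL base pointer plus a small field offset."""
    return regs.get("CR2", page_size) < page_size


symbol, regs = parse_oops(OOPS)
print(symbol, hex(regs["CR2"]), looks_like_null_deref(regs))
```

Run against this excerpt, it reports the fault in `blk_mq_put_rq_ref` with a fault address of 0xc0, reached from the `kblockd` timeout worker shown in the call trace.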