I can't find this exact problem elsewhere on the forum. In essence, I have a host in a 3-node cluster that soft locks occasionally, maybe once every 6 months. I don't believe this is a bug; I'm pretty sure the host itself has a problem. I can still ping it, but it's unresponsive.
My issue, though, is that Proxmox is not handling this as expected. What I want is for all the running VMs to get spun up on another node, for that node to be marked as down, and for HA failover to do its thing. Everything carries on, and I'll address the host when I can.
Instead, what's actually happening is that the node gets marked as 'unknown' and none of the VMs move, leaving all of them stranded in limbo on that host. I can't find any way to tell Proxmox the node is actually dead and force the VMs to start on the other nodes. I have to either yank the power to the host or physically reboot it.
How do I fix this? It's a huge problem.