Cluster - HA enabled VMs not migrating when node fails

hansm

Well-Known Member
Feb 27, 2015
Hi,

We have a 3-node Proxmox cluster connected to our Ceph storage cluster. We're still testing all options and stability, and our last test did not deliver the high availability we expected. Let me start by saying we are very satisfied with Proxmox; everything seems very good and stable. Good work!

The cluster is set up and all nodes can see each other, quorum is OK, rgmanager is running, and fencing is configured and tested; everything works as expected. All VMs are HA enabled.

- When we reboot a node, the VMs on that node are shut down, migrated to the other nodes, and started, all automatically. Great!
- When we shut down the (cluster) network interface on a node (ifconfig vmbr199 down), the other nodes determine that the node is dead and fence it, the VMs are migrated and started, the dead node gets rebooted because of fencing, and the cluster state becomes OK again automatically.

So far so good. But...

When we cut off the power (hard, pull the plugs), the web interface shows that node in red (offline), but the VMs aren't started on the other nodes; we waited 10 minutes! Within this time we tried to start an HA-enabled VM that was stopped before, but this VM couldn't start and kept loading ("starting VM 103..." or something like that). After we powered the node back on, the VM could start, and the VMs on the 'failed' node were also migrated to the other nodes when it came back online.

This should work for HA-enabled VMs, right? When a node crashes because of severe hardware failure (power supplies, motherboard, etc.), the other nodes should start the VMs that were running on that node. And yes, I have redundant power supplies, and fencing is configured to use the Dell DRAC cards.

If you need additional information, please let me know. Thanks in advance.
 
Most likely a problem with fencing. I guess your fencing devices do not work if you cut the power (IPMI, DRAC, ...)?
 
Most likely a problem with fencing. I guess your fencing devices do not work if you cut the power (IPMI, DRAC, ...)?
I don't think so. Fencing works: when I shut down the management network interface on a node, this node is fenced by the other nodes. They connect to its iDRAC and issue a power reset or something, because the node is restarted.
So, basically, fencing works. But maybe you are right: when we cut the power, the node can't be fenced by the other hosts anymore because the DRAC card isn't reachable. Could that be the cause? How should that be solved?
 
They connect to its iDRAC

Yes, this is the cause of your problems. iDRAC will not work when you lose power.

You need a power-based fencing device (UPS), or make sure you always have power on the iDRAC card (redundant power supplies).
 
Yes, this is the cause of your problems. iDRAC will not work when you lose power.

You need a power-based fencing device (UPS), or make sure you always have power on the iDRAC card (redundant power supplies).
Hi Dietmar,
Thanks, I think you're right, but I don't fully understand. Can you please explain why fencing is still necessary if the failed node is already powered off? If the fencing device can't be reached, it seems reasonable to assume the node is dead. Can't it be configured that way? That is, when a node is unreachable for the remaining cluster nodes and its fencing device is unreachable too, assume the node is dead, kick it out of the cluster, and migrate the VMs. Am I missing something?

You're right, I could use a power fencing device; the power supplies are connected to APC devices which can do this (SSH and SNMP). But we have redundant power supplies in our servers, so I would need two fencing devices (two different APC power switches). That's more complex and more error-prone IMO. And it's not only the power or a power supply that can fail; it could be the motherboard, and redundant power supplies don't help with that ;-)
 
Thanks, I think you're right, but I don't fully understand. Can you please explain why fencing is still necessary if the failed node is already powered off? If the fencing device can't be reached, it seems reasonable to assume the node is dead.

No, that is not reasonable. It is more reasonable to assume that either the device is broken or something is configured badly.
The only way to know the node is down is to ask the fencing device (which needs to be online for that).

You're right, I could use a power fencing device; the power supplies are connected to APC devices which can do this (SSH and SNMP). But we have redundant power supplies in our servers, so I would need two fencing devices (two different APC power switches). That's more complex and more error-prone IMO. And it's not only the power or a power supply that can fail; it could be the motherboard, and redundant power supplies don't help with that ;-)

I would use IPMI as the first fencing device, and configure the two APC power switches as a fallback (if IPMI fencing fails). That is more complex to configure, but redundant.
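A setup along those lines could be sketched in cluster.conf roughly as follows. This is only a sketch: all node names, IP addresses, and credentials below are placeholders, and the exact agent options depend on your hardware and agent versions (check the fence agent man pages for your release):

```xml
<!-- Sketch only: redundant fencing for one node.
     Method "1" tries IPMI/iDRAC first; if that fails, the fence
     daemon falls through to method "2", which cuts both power feeds.
     With redundant PSUs, both feeds must be switched OFF before
     either is switched back ON, otherwise the surviving PSU keeps
     the node alive. All names, IPs, and passwords are placeholders. -->
<clusternode name="node1" nodeid="1" votes="1">
  <fence>
    <method name="1">
      <device name="ipmi-node1"/>
    </method>
    <method name="2">
      <device name="apc1" port="1" action="off"/>
      <device name="apc2" port="1" action="off"/>
      <device name="apc1" port="1" action="on"/>
      <device name="apc2" port="1" action="on"/>
    </method>
  </fence>
</clusternode>

<fencedevices>
  <fencedevice agent="fence_ipmilan" name="ipmi-node1"
               ipaddr="192.168.1.101" login="root" passwd="secret" lanplus="1"/>
  <fencedevice agent="fence_apc" name="apc1"
               ipaddr="192.168.1.201" login="apc" passwd="secret"/>
  <fencedevice agent="fence_apc" name="apc2"
               ipaddr="192.168.1.202" login="apc" passwd="secret"/>
</fencedevices>
```

The other cluster nodes would get the same pattern with their own device entries. Note that some older releases use `option="off"` instead of `action="off"`; verify against the documentation for the cluster stack you are running.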
 
Hi Dietmar,
Thank you very much. You were of great help! I'm going to like Proxmox even more :)

Your explanation is clear and I'm going to configure the redundant fencing devices like you suggested.
 
