how does HA actually work?

proxman

Renowned Member
Jul 1, 2012
Hi,

I followed http://pve.proxmox.com/wiki/High_Availability_Cluster on a 3-node cluster with Dell hardware, with fencing configured as described at http://pve.proxmox.com/wiki/Fencing#IPMI_.28generic.29.
Node evacuation by stopping rgmanager works.
But when I simulate a "real" node failure by forcing a kernel panic with echo "c" > /proc/sysrq-trigger, it does not work.

- The BMC fencing does not seem to work (how can I test this? The wiki only describes testing with an APC switchable PDU).
- The KVM guests don't restart on another node.
- The cluster has serious problems afterwards (sorry, I can't remember what exactly was wrong; I need to test again).

Additionally, is there a mechanism to avoid over-populating a node? When I stop rgmanager, all VMs seem to be evacuated to a random but single node.
Is there no check to see which hosts are the least utilized?

Thanks,
Proxman
(Side note: I will certainly consider buying a service contract, but first I want to be reasonably sure that this will run sufficiently smoothly.)
(There is a real lack of proper documentation, even as a paid but DRM-free PDF.)
 
I appreciate your announcement to buy support AFTER we have helped to configure everything. Probably you should go the other way, which could be more successful and faster.

Fencing is essential; without reliable fencing no test will succeed, so check your fencing devices first.
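
As a rough sketch of how a BMC fencing check could look: the fence_ipmilan agent can be called by hand with the "status" action, which only queries the power state and does not power-cycle anything. The address, user and password below are placeholders I made up, not values from this thread, and the -P (lanplus) flag is an assumption that may not fit every BMC. The script only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace with your BMC address and credentials.
BMC_IP="${BMC_IP:-192.0.2.10}"
BMC_USER="${BMC_USER:-root}"
BMC_PASS="${BMC_PASS:-changeme}"

# Print the fence_ipmilan command line for the given action
# (status, on, off, reboot). Nothing is executed here.
fence_cmd() {
    # -P requests IPMI lanplus; drop it if your BMC only speaks IPMI 1.5 (assumption).
    echo "fence_ipmilan -P -a $BMC_IP -l $BMC_USER -p $BMC_PASS -o $1"
}

# "status" is harmless: it only asks the BMC for the chassis power state.
fence_cmd status
```

If the printed status command works when run from each of the other nodes, the next step would probably be fence_node <nodename> from a surviving node, which should fence using the settings in cluster.conf rather than the options typed by hand. Note that this really power-cycles the target, so only try it on a node you can afford to lose.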