problem with HA

dbloquel

New Member
Mar 28, 2014
Hi,

I set up my cluster to do fencing and I activated HA.

My first problem is that when I do a clean shutdown of a VM or a CT from the web interface, Proxmox restarts it automatically right after.
This forces me to remove the VM from the HA list, shut it down, do my work, and then reactivate HA.

Isn't Proxmox able to tell the difference between a crash and a clean shutdown, or did I do something wrong in my configuration?

My second problem is more critical. Each time I update cluster.conf, Proxmox moves all my HA VMs and CTs from one node to another, even though everything is already running. This happens whether I change cluster.conf directly on the server (using the following method: https://pve.proxmox.com/wiki/Fencing) or from the HA tab in the web GUI.
This problem is critical because when Proxmox moves a VM/CT from one node to another there is an outage of a few seconds to a few minutes per VM/CT.
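
For reference, this is roughly how I apply the change on the server, following the wiki (commands from memory, so take this as a sketch rather than my exact history):

# work on a copy of the active config
cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
# edit it and increment config_version
nano /etc/pve/cluster.conf.new
# check the syntax before activating
ccs_config_validate -v -f /etc/pve/cluster.conf.new
# then activate the new version from the HA tab in the web GUI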

Moreover, my first problem makes my second one worse. Each time I need to shut down a VM/CT I have to update the cluster.conf file, and each time I do that Proxmox restarts everything and moves it from one node to another.

In conclusion, in my case the HA option of Proxmox causes unavailability, which makes me think I did something wrong somewhere.

Here is my cluster.conf file (some information has been changed or removed, but nothing relevant):
<?xml version="1.0"?>
<cluster config_version="115" name="***">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_server" ipaddr="***.***.***.***" name="fence_server_virt1" power_wait="5"/>
    <fencedevice agent="fence_server" ipaddr="***.***.***.***" name="fence_server_virt2" power_wait="5"/>
    <fencedevice agent="fence_server" ipaddr="***.***.***.***" name="fence_server_virt3" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="virt1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="fence_server_virt1"/>
        </method>
      </fence>
    </clusternode>
    <!-- same for the two other nodes -->
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="125"/>
    <!-- ... list of VMs/CTs ... -->
    <pvevm autostart="1" vmid="386"/>
  </rm>
</cluster>
 
To shut down a VM under HA you must use the stop button in the web GUI. This will shut down the VM and mark it as disabled in rgmanager.
 
To shut down a VM under HA you must use the stop button in the web GUI. This will shut down the VM and mark it as disabled in rgmanager.

OK, but isn't stop like a reset, rather than a clean shutdown like the shutdown button in the web GUI does?
 
No, for an HA-enabled VM stop works as I described. But the GUI is a little dangerous here, since if you press stop on a VM which is not HA enabled it will perform an unclean shutdown.
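
If you prefer the command line, the rgmanager tools should do the same thing; roughly like this (VMID 125 is just an example taken from your config):

# disable the HA service for the VM (clean shutdown, stays disabled)
clusvcadm -d pvevm:125
# do the maintenance, then re-enable it
clusvcadm -e pvevm:125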
 
OK, thank you for the explanation, it's clear now.
My first problem is now solved.

Any idea about my second problem of migration between nodes when the cluster.conf is being modified?
 
Any idea about my second problem of migration between nodes when the cluster.conf is being modified?
Not really. It looks like cluster communication somehow breaks when you update cluster.conf, which results in fencing.

You could try adding failover domain(s) to your cluster.conf and see if this helps. Search for failover in the wiki.
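
Something along these lines, as a sketch only (node names taken from your config above; I am not sure from memory whether the attribute on pvevm is exactly "domain", so check the wiki example):

<rm>
  <failoverdomains>
    <failoverdomain name="prefer_virt1" ordered="1" restricted="0" nofailback="1">
      <failoverdomainnode name="virt1" priority="1"/>
      <failoverdomainnode name="virt2" priority="2"/>
      <failoverdomainnode name="virt3" priority="3"/>
    </failoverdomain>
  </failoverdomains>
  <pvevm autostart="1" vmid="125" domain="prefer_virt1"/>
</rm>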
 
