problem with HA

dbloquel

New Member
Mar 28, 2014
Hi,

I set up my cluster to do fencing and I activated HA.

My first problem is that when I do a clean shutdown of a VM or a CT from the web interface, Proxmox restarts it automatically right after.
This forces me to remove the VM from the HA list, shut it down, do my work, and then reactivate HA.

Isn't Proxmox able to tell the difference between a crash and a clean shutdown, or did I do something wrong in my configuration?

My second problem is more critical. Each time I update cluster.conf, Proxmox moves all my HA VMs and CTs from one node to another, even though everything is already running. This happens whether I change cluster.conf directly on the server (using the following method: https://pve.proxmox.com/wiki/Fencing) or from the HA tab in the web GUI.
This problem is critical because when Proxmox moves a VM/CT from one node to another there is an outage of a few seconds to a few minutes per VM/CT.
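
For reference, this is roughly how I apply the change on the server, following the wiki (commands from memory, so take this as a sketch rather than my exact history):

# work on a copy of the active config
cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
# edit it and increment config_version
nano /etc/pve/cluster.conf.new
# check the syntax before activating
ccs_config_validate -v -f /etc/pve/cluster.conf.new
# then activate the new version from the HA tab in the web GUI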

Moreover, my first problem makes my second one worse. Each time I need to shut down a VM/CT I have to update the cluster.conf file, and each time I do that Proxmox restarts everything and moves it from one node to another.

In conclusion, in my case the HA option of Proxmox causes unavailability, which makes me think I did something wrong somewhere.

Here is my cluster.conf file (some information has been changed or removed, but nothing relevant):
<?xml version="1.0"?>
<cluster config_version="115" name="***">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_server" ipaddr="***.***.***.***" name="fence_server_virt1" power_wait="5"/>
    <fencedevice agent="fence_server" ipaddr="***.***.***.***" name="fence_server_virt2" power_wait="5"/>
    <fencedevice agent="fence_server" ipaddr="***.***.***.***" name="fence_server_virt3" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="virt1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="fence_server_virt1"/>
        </method>
      </fence>
    </clusternode>
    <!-- same for the two other nodes -->
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="125"/>
    <!-- ... list of VMs/CTs ... -->
    <pvevm autostart="1" vmid="386"/>
  </rm>
</cluster>
 
To shut down a VM under HA you must use the stop button in the web GUI. This will shut down the VM and mark it as disabled in rgmanager.
 
To shut down a VM under HA you must use the stop button in the web GUI. This will shut down the VM and mark it as disabled in rgmanager.

OK, but isn't stop like a reset, rather than a clean shutdown like the shutdown button in the web GUI does?
 
No, for an HA-enabled VM stop works as I described. But the GUI is a little dangerous here, since if you press stop on a VM which is not HA enabled it will perform an unclean shutdown.
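
If you prefer the command line, the rgmanager tools should do the same thing; roughly like this (VMID 125 is just an example taken from your config):

# disable the HA service for the VM (clean shutdown, stays disabled)
clusvcadm -d pvevm:125
# do the maintenance, then re-enable it
clusvcadm -e pvevm:125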
 
OK, thank you for the explanation, it's clear now.
My first problem is now solved.

Any idea about my second problem of migration between nodes when the cluster.conf is being modified?
 
Any idea about my second problem of migration between nodes when the cluster.conf is being modified?
Not really. It looks like cluster communication somehow breaks when you update cluster.conf, which results in fencing.

You could try adding failover domain(s) to your cluster.conf and see if this helps. Search for failover in the wiki.
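
Something along these lines, as a sketch only (node names taken from your config above; I am not sure from memory whether the attribute on pvevm is exactly "domain", so check the wiki example):

<rm>
  <failoverdomains>
    <failoverdomain name="prefer_virt1" ordered="1" restricted="0" nofailback="1">
      <failoverdomainnode name="virt1" priority="1"/>
      <failoverdomainnode name="virt2" priority="2"/>
      <failoverdomainnode name="virt3" priority="3"/>
    </failoverdomain>
  </failoverdomains>
  <pvevm autostart="1" vmid="125" domain="prefer_virt1"/>
</rm>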
 
