Any reason why a VM is not automatically added to HA?

kellogs · Aug 27, 2024

Coming from VMware: if a VM is created in vCenter, it automatically fails over to another node when its host goes down.

Is there any reason why this behaviour is not default with Proxmox?

BobhWasatch · Aug 27, 2024

I guess because most people don't want HA on all of their VMs. Also, you need more than two nodes for HA to work (see the documentation that is linked at the top of the PVE screen).

Maximiliano · Aug 27, 2024

Hello,

There are possible scenarios where HA can harm your production. Therefore it is not enabled by default.

If, for example, all the networks used by Corosync stop working, then all nodes will be fenced so that their VMs can be recovered on nodes with quorum. But since no node has quorum, the guests can't be migrated and the entire cluster will be rebooted.

Additionally, it is possible for a node to lose Corosync quorum without its guests having any kind of issue. In that case, automatic failover only adds downtime.

See our docs [1].

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#ha_manager_fencing
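
The fencing scenario above comes down to majority arithmetic, which can be sketched as follows. This is a minimal illustration, assuming each node has one vote and that quorum means a strict majority of all votes. The function name `partitions_with_quorum` is hypothetical and not part of any Proxmox or Corosync API.

```python
def partitions_with_quorum(total_nodes, partition_sizes):
    """Return the sizes of the partitions that hold a strict majority of votes.

    Assumption for this sketch: one vote per node.
    """
    if sum(partition_sizes) > total_nodes:
        raise ValueError("partitions contain more nodes than the cluster")
    needed = total_nodes // 2 + 1  # strict majority of all votes
    return [size for size in partition_sizes if size >= needed]


# One node of a 3-node cluster is cut off: the remaining two keep quorum
# and can recover the isolated node's HA guests.
one_node_isolated = partitions_with_quorum(3, [2, 1])

# All Corosync networks fail: every node only sees itself, so no partition
# has quorum and no node is left to recover the guests.
all_links_down = partitions_with_quorum(3, [1, 1, 1])

print(one_node_isolated)
print(all_links_down)
```

In the second case the returned list is empty, which matches the situation described above: every node is fenced and nothing is left to take over the guests.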

kellogs · Aug 29, 2024

Thank you guys for the information. I had set up a few VMs for which we forgot to turn on HA, and the host they were on was dead in the water. We experienced downtime because of this, but luckily there was a command line to migrate the VMID to another working node in the cluster, hence this question. So for HA to be stable, a stable Corosync network is a must (we have a pair of stacked switches), as is quorum (we have 15 nodes in total).

Blueloop · Aug 29, 2024

kellogs said:
Coming from VMware: if a VM is created in vCenter, it automatically fails over to another node when its host goes down.

Is there any reason why this behaviour is not default with Proxmox?

VMware does things its way and Proxmox its way. You might note that a Proxmox HA cluster allows you to set VMs as start on boot, which VMware does not.

Also note that you don't need a vCenter (yay), i.e. an orchestration appliance, and if you recall, you probably had several orchestration appliances, each guzzling shed loads of RAM, vCPUs and disc space.

I've been using VMware since 2.x and things have changed somewhat. There used to be GSX and ESX as well as ESXi. They did do rather well with VMFS, which turns out to have been a killer feature. MS Hyper-V clustering is ... a bit wank and a massive bodge, and Proxmox and co can't do snapshots on iSCSI shared storage. However, if you wave a decent Ceph cluster at Proxmox then you are golden.

You are almost certainly used to writing and following procedures, so note the differences and document what should be done. You also have way more control over how the nodes themselves work: it's a normal Linux box. Do be careful with that! The killer feature for me is that you get the full equivalent of Enterprise Plus out of the box. Open vSwitch is very tasty but, again, you must take care to understand how it works. DVS on VMware is lovely but seriously expensive. Whack on Tanzu (containers and that) and you will need to sell a kidney.

Take your time and get to grips with a new way of doing stuff. Try to rethink what you got used to with VMware and be open-minded. It's all rather liberating \o/

Falk R. · Aug 29, 2024

Blueloop said:
VMware does things its way and Proxmox its way. You might note that a Proxmox HA cluster allows you to set VMs as start on boot, which VMware does not.

This is not correct: vSphere can also start VMs automatically. Since you are required to configure this on every ESXi host, almost no one does it.

kellogs said:
Thank you guys for the information. I had set up a few VMs for which we forgot to turn on HA, and the host they were on was dead in the water. We experienced downtime because of this, but luckily there was a command line to migrate the VMID to another working node in the cluster, hence this question. So for HA to be stable, a stable Corosync network is a must (we have a pair of stacked switches), as is quorum (we have 15 nodes in total).

A dedicated network is generally recommended for Corosync.
I also have many setups without a dedicated network, but you should always have several redundant networks for a stable cluster.
With 15 nodes, you always have a quorum majority if up to 7 nodes fail. I don't see any need for an extra quorum.
With large clusters, you should always make sure that Corosync runs on a low-latency network.
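
The figure of 7 tolerated failures out of 15 nodes can be checked with a few lines of arithmetic. This is a rough sketch, assuming one vote per node and a strict-majority quorum rule. The helper names `quorum_votes` and `max_tolerated_failures` are made up for this example.

```python
def quorum_votes(total_nodes):
    """Votes needed for a strict majority (assumes one vote per node)."""
    return total_nodes // 2 + 1


def max_tolerated_failures(total_nodes):
    """Number of nodes that can fail while the remaining ones keep quorum."""
    return total_nodes - quorum_votes(total_nodes)


for nodes in (2, 3, 15):
    print(nodes, quorum_votes(nodes), max_tolerated_failures(nodes))
```

Under these assumptions a 15-node cluster needs 8 votes and survives 7 failed nodes, while a 2-node cluster survives none, which is why HA needs more than two nodes.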

esi_y · Aug 29, 2024

kellogs said:
So for HA to be stable, a stable Corosync network is a must (we have a pair of stacked switches)

The issue with stacking and/or LACP/MLAG (without BFD) is that the switchover can take more than 4 seconds, which is enough to start experiencing quorum loss on such a link. With HA guests, this will almost invariably result in watchdog reboots of the affected nodes. It is thus much better to have redundant Corosync links:

https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_redundancy
