When a cluster node is lost, is it possible to restart its VM on another node?

Nov 22, 2020
Hi,

I'm testing some edge cases with Proxmox VE 6.3: I have a cluster p1 with three nodes, node1, node2 and node3, which all use only a shared NFS (from another machine outside the cluster) for VM disk storage.

VM 100 is running on node1. node2 and node3 have no VM running.

Let's assume node1 fails (powered off because of a power supply failure) and won't be repaired immediately (no spare handy), so I want to manually start VM 100 on node2, but I don't want to remove node1 from the cluster (because it will be repaired in the coming days).

I noticed that node2 and node3 each have a copy of the VM 100 configuration in /etc/pve/nodes/node1/qemu-server/100.conf, and the disk is on shared NFS, so nothing is lost even if node1 is offline.

I didn't find an obvious way to do it via the proxmox VE WebUI, did I miss something?

Thanks!
 
The feature you are looking for is "High Availability". You can use it, as you have three nodes, which is the minimum.

Please read https://pve.proxmox.com/pve-docs/pve-admin-guide.html#chapter_ha_manager

Best regards

Hi, thanks for your answer.

I read that document, and since my time is limited I'm not sure I'll be able to test a realistic set of conditions with HA as proposed. That's why I asked how to "manually" restart a VM from a node that is known for sure to be failed/offline on another one.

On our current non-Proxmox hypervisor, no VMs autostart, and when a failure occurs (hypervisor hardware, network storage, network, out of resources, etc.) we can afford some time for a human to analyze the situation and decide what to do; our users are OK with reasonable downtimes (server hardware nowadays is quite plentiful and reliable).

I know from experience that recovering from a failed automatic HA restart (because of misconfiguration, as there are plenty of options to get wrong, or just an unanticipated failure mode) can be really painful and cause very long downtimes (restoring from backups because of corruption, etc.).

Hence I'm looking for a simpler manual process to recover from a failure.
 
Hi,
Hi, thanks for your answer.

I read that document, and since my time is limited I'm not sure I'll be able to test a realistic set of conditions with HA as proposed. That's why I asked how to "manually" restart a VM from a node that is known for sure to be failed/offline on another one.
If there are no local resources for the VM and nodeA is down, you can simply move the configuration file from /etc/pve/nodes/nodeA/qemu-server/<ID>.conf to /etc/pve/nodes/nodeB/qemu-server/<ID>.conf and then start it on node B.
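For reference, a minimal sketch of that manual recovery on the command line, using the node names and VM ID from this thread (node1 failed, node2 is the target, VM 100); only do this when you are certain node1 is really powered off and will stay off, otherwise you risk the same VM running twice against the shared storage:

Code:
# run on node2 (or any remaining quorate node); /etc/pve is the cluster filesystem
mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/qemu-server/100.conf
# then start the VM on its new node
qm start 100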

I know from experience that recovering from a failed automatic HA restart (because of misconfiguration, as there are plenty of options to get wrong, or just an unanticipated failure mode) can be really painful and cause very long downtimes (restoring from backups because of corruption, etc.).
I'm not sure how corruption would occur from using HA? I think corruption is much more likely to occur when a VM or its node crashes. When HA "steals" a VM to recover it, it basically does the same (moving the configuration file and starting it on the other node).
 
Hi,

If there are no local resources for the VM and nodeA is down, you can simply move the configuration file from /etc/pve/nodes/nodeA/qemu-server/<ID>.conf to /etc/pve/nodes/nodeB/qemu-server/<ID>.conf and then start it on node B.


I'm not sure how corruption would occur from using HA? I think corruption is much more likely to occur when a VM or its node crashes. When HA "steals" a VM to recover it, it basically does the same (moving the configuration file and starting it on the other node).
Do you plan to make this manageable through the GUI?

A failed node can occur anytime, and it would be more effective to handle this automatically, or to be able to switch the configuration through the GUI in case of a failed node, instead of using the CLI to move the configuration file.
 
Do you plan to make this manageable through the GUI?
Why? HA is already doing exactly what it is supposed to do ... and doing it for YEARS.

A failed node can occur anytime, and it would be more effective to handle this automatically, or to be able to switch the configuration through the GUI in case of a failed node, instead of using the CLI to move the configuration file.
The CLI approach is only needed if you have NOT set up HA properly. With HA set up properly, you don't need to worry about it ... again ... it has been like this for YEARS.
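For completeness, a rough sketch of putting a guest under HA from the CLI (the same can be done in the GUI under Datacenter -> HA); vm:100 is just the example VM ID from this thread:

Code:
# add VM 100 as an HA resource and request that it be kept running
ha-manager add vm:100 --state started
# show what the HA manager currently manages and its state
ha-manager status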
 
Why? HA is already doing exactly what it is supposed to do ... and doing it for YEARS.


The CLI approach is only needed if you have NOT set up HA properly. With HA set up properly, you don't need to worry about it ... again ... it has been like this for YEARS.
That is good to know then.
We have HA enabled on each VM and disabled failback, as we need to mount the GFS2 shared storage on each host manually and, per our policy, pinpoint why a node actually failed prior to bringing it back into service.
Three years ago we faced a similar issue, and I thought my team had to manually move the VM to another host.
 
I tried GFS2 years ago and it was not stable enough in our tests. We've been running a dedicated shared storage via LVM on 5 nodes for 8 years without any (storage) problems. We of course had a few node failures, but everything migrated perfectly to the other nodes and all services were up again after a few minutes.
 
I tried GFS2 years ago and it was not stable enough in our tests. We've been running a dedicated shared storage via LVM on 5 nodes for 8 years without any (storage) problems. We of course had a few node failures, but everything migrated perfectly to the other nodes and all services were up again after a few minutes.
Hi, what kind of shared storage do you refer to? Over the network, or was it SAS shared storage?
 
Hi, what kind of shared storage do you refer to? Over the network, or was it SAS shared storage?
I would prefer a shared storage with snapshot capability in PVE, but that is still a dream.

Depending on the hardware available, I almost exclusively used LVM (so block storage) in a cluster environment:
- first NBD decades ago
- then, for a couple of years, DRBD
- also tried SAS shared storage, but that is/was limited to two machines, so no "real" cluster.
- and finally SAN (mostly FC, but also iSCSI)
(I also played around with ZFS-over-iSCSI, but this is currently not available as an HA option due to my own hardware restrictions)

On top of that I also tried GFS and OCFS2 as filesystems, but I had regular crashes with them (though that was in 2015).
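As an illustration of that kind of block-storage setup, a shared LVM definition in /etc/pve/storage.cfg might look roughly like the sketch below; the storage ID san-lvm and the volume group vg_san are placeholders for whatever sits on the FC/iSCSI LUN:

Code:
lvm: san-lvm
        vgname vg_san
        content images,rootdir
        shared 1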
 
I've set up HA for 1 VM and 1 LXC to test (these are also my main servers).
PVE1 is my main server, on which these VM/LXC were running. I shut down PVE1 to see what would happen.
PVE2 launched my LXC and PVE3 launched my VM. All worked as expected.
Now I restarted PVE1 and expected that my LXC and VM would move back to this server, which didn't happen. I got the following log entries:

Code:
Dec 01 17:01:53 pve1 pmxcfs[1802]: [quorum] crit: quorum_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [quorum] crit: can't initialize service
Dec 01 17:01:53 pve1 pmxcfs[1802]: [confdb] crit: cmap_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [confdb] crit: can't initialize service
Dec 01 17:01:53 pve1 pmxcfs[1802]: [dcdb] crit: cpg_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [dcdb] crit: can't initialize service
Dec 01 17:01:53 pve1 pmxcfs[1802]: [status] crit: cpg_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [status] crit: can't initialize service

Not sure if this is the behaviour I should expect or not? And when will the VM/LXC move back to my main server?
 
Now I restarted PVE1 and expected that my LXC and VM would move back to this server, which didn't happen.
No, that is not expected. No logic in the world can anticipate what you want to do; therefore, you have to move them back manually.
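Moving them back by hand is just a normal migration once PVE1 is healthy again, either via the "Migrate" button in the GUI or roughly like this on the CLI (the IDs and node name are only examples for this thread's setup):

Code:
# live-migrate VM 100 back to pve1 (disks are on shared storage, so only RAM moves)
qm migrate 100 pve1 --online
# migrate container 101 back to pve1 (the container is stopped and restarted on the target)
pct migrate 101 pve1 --restart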
 
I've set up HA for 1 VM and 1 LXC to test (these are also my main servers).
PVE1 is my main server, on which these VM/LXC were running. I shut down PVE1 to see what would happen.
PVE2 launched my LXC and PVE3 launched my VM. All worked as expected.
Now I restarted PVE1 and expected that my LXC and VM would move back to this server, which didn't happen. I got the following log entries:

Code:
Dec 01 17:01:53 pve1 pmxcfs[1802]: [quorum] crit: quorum_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [quorum] crit: can't initialize service
Dec 01 17:01:53 pve1 pmxcfs[1802]: [confdb] crit: cmap_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [confdb] crit: can't initialize service
Dec 01 17:01:53 pve1 pmxcfs[1802]: [dcdb] crit: cpg_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [dcdb] crit: can't initialize service
Dec 01 17:01:53 pve1 pmxcfs[1802]: [status] crit: cpg_initialize failed: 2
Dec 01 17:01:53 pve1 pmxcfs[1802]: [status] crit: can't initialize service

Not sure if this is the behaviour I should expect or not? And when will the VM/LXC move back to my main server?
You need to create HA groups with different priorities (higher priority on PVE1). That way they will automatically fail back to the original node.
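A rough CLI equivalent of such a group (the group name, priorities, and resource IDs are only examples; the same can be configured in the GUI under Datacenter -> HA -> Groups):

Code:
# group that prefers pve1 (higher number = higher priority); failback stays enabled
ha-manager groupadd prefer-pve1 --nodes "pve1:2,pve2:1,pve3:1" --nofailback 0
# attach the HA resources to that group
ha-manager set vm:100 --group prefer-pve1
ha-manager set ct:101 --group prefer-pve1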

This log is maybe from when the node was starting and did not yet have quorum? (This is the pve-cluster service that manages the /etc/pve directory.)
If this log is not flooding and only appears at startup, you can ignore it.
 
