Proxmox4-HA-not-working...Feedback

Hi All!

Tonight the air conditioner broke down in the server room, and one of the three servers made an emergency shutdown. In the morning I saw that all the VMs from this server had migrated to the other two servers according to the HA-group settings.
A real failure situation occurred!
The Proxmox HA cluster showed that it works as expected!

P.S.
But after that server booted again, VM failback did not occur.


---
Best regards!
Gosha
 
But after that server booted again, VM failback did not occur.

You must explicitly configure the VM to let it automatically fail back; we didn't want to initiate a mass migration when the server comes back online, as this would probably do more harm than good.

What you can do to set up this behaviour is to add the VM to a group which contains only one node, namely the one where you want the VM to run.
Be sure that 'nofailback' and 'restricted' are _NOT_ ticked. Then the VM will still be migrated away on failure, and when the failed node comes back online a failback will be executed.
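As an illustration, such a single-node group could look roughly like the following in /etc/pve/ha/groups.cfg. This is only a sketch: the group name "failback-node1" is made up, and "pve22" stands in for whichever node the VM should prefer.

```
group: failback-node1
        nodes pve22
        nofailback 0
        restricted 0
```

With 'restricted' unset the VM can still run on other nodes while that node is down, and with 'nofailback' unset it migrates back once the node returns.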
 
Be sure that 'nofailback' and 'restricted' are _NOT_ ticked. Then the VM will still be migrated away on failure, and when the failed node comes back online a failback will be executed.

All my VMs' HA-groups are without 'nofailback' and 'restricted'.

---
Best regards!
Gosha
 
All my VMs' HA-groups are without 'nofailback' and 'restricted'.

Yes, but the option is purposely called "nofailback" and not "failback": leaving it unticked does not by itself make a fail-back happen, while ticking it prevents a fail-back.

At the moment you also need to configure the group to have only one preferred node, to be sure that the VM always fails back to this one if possible.
 
Yes, but the option is purposely called "nofailback" and not "failback": leaving it unticked does not by itself make a fail-back happen, while ticking it prevents a fail-back.

At the moment you also need to configure the group to have only one preferred node, to be sure that the VM always fails back to this one if possible.

In all my HA-groups, neither 'nofailback' nor 'restricted' is ticked. This is the default setting.
 
In all my HA-groups, neither 'nofailback' nor 'restricted' is ticked. This is the default setting.

Yes, I know :)

I was only saying that if you want failback of a service, you additionally need to add it to a group where only the preferred node is in it (in addition to the default settings).
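For example (again only a sketch with made-up names, assuming the /etc/pve/ha/resources.cfg syntax of this PVE version), the resource entry then simply points at that single-node group:

```
vm: 101
        group failback-node1
        state enabled
```

The group referenced here would be one whose 'nodes' list contains only the preferred node, with 'nofailback' and 'restricted' both unset.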

Like: ha_failback_group.png
 
...you need to additionally add it to a group where only the preferred node is in it (additional to the default settings).

I created a new group for failback to node 1:

pic1.png

and tried to add vm:101 to this group:

pic2.png

hm... maybe I somehow misunderstood... o_O
So one resource cannot be placed in two groups?
 
Try:
Code:
ha-manager disable vm:102
ha-manager enable vm:102

This worked.

Also, can you please attach the logs from the CRM master at that time (from your post I guess it's pve20)?
Maybe filter it a bit, something like:
Code:
journalctl -u pve-ha-crm.service -u pve-ha-lrm.service -u pve-cluster.service > journal-`date +%Y-%m-%d-%H%M%S`.log

I cannot reproduce such issues, so it's important to have this info so we can find and fix a possible bug, or help you with the configuration. Thanks.

I ran it on both pve20 (the survivor) and pve22 (the victim).
 
