PVE 5.2 HA Config

jagan

We are testing the PVE 5.2 HA feature in a lab setup. Has anyone set this up successfully?

Earlier we were using PVE 3.4 with SNMP fencing, but we understand that PVE 5.2 does not support external fencing. So we are trying to test with a hardware watchdog, but I couldn't find any clear documentation on setting this up.

Please help me: how do I configure watchdog fencing?

1. Like in PVE 3.4, do we need to edit the cluster config file to add the watchdog fencing details?
2. If not, how do we configure/load the watchdog?
3. What watchdog action should I set, power off or reboot? I think we should set the action to power off to prevent the fenced node from re-joining the cluster.
4. What is the recommended time to set in the watchdog for performing the above action?
5. If a hardware watchdog is not available, how do we set the fence time / reset time in the softdog?
6. Where can I set the fence wait time? With SNMP fencing we saw it take around 30 seconds to block the Ceph port.
7. Why do we have to enable "Power On" in the advanced power management settings of the board BIOS?
 
Thanks for your reply.
Yes, I read https://pve.proxmox.com/wiki/High_Availability and https://pve.proxmox.com/wiki/High_Availability_Cluster_4.x.

But I couldn't find answers to my basic questions, like:
1. Why do we need to enable the "Power On" state in advanced power management?
2. What should the hardware watchdog policy be, reboot or power down? To ensure the node does not come back online we have to set the power-off policy, right? But I have seen forum posts saying to set power cycle / reboot.
3. What is the softdog power policy, and how do I set the softdog timer to perform the fence action?
4. In the BIOS I have configured the watchdog to power cycle the server every 5 minutes, but the real fence wait time is 60 seconds. Will this cause any problems?

I am using an Asus Z11PA-D8 motherboard with an on-board ASMB9-iKVM management chip that lets me enable a watchdog timer in the BIOS.
I am trying to test the HA feature with this watchdog, so I want to understand from the basics how it works in the backend.

I have enabled WATCHDOG_MODULE=ipmi_watchdog in /etc/default/pve-ha-manager. Now I don't know how to check whether fencing is working properly.
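For reference, a minimal sketch of that setting (assuming the ipmi_watchdog module suits this board; a reboot is typically needed afterwards, since the watchdog device is opened at boot):

# /etc/default/pve-ha-manager
# Select the kernel watchdog module used for HA fencing.
# If this is absent or commented out, PVE falls back to the softdog.
WATCHDOG_MODULE=ipmi_watchdog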
 
1. Why do we need to enable the "Power On" state in advanced power management?
Where did you read this?

2. What should the hardware watchdog policy be, reboot or power down? To ensure the node does not come back online we have to set the power-off policy, right? But I have seen forum posts saying to set power cycle / reboot.
This is your preference and depends on the watchdog. Generally a power cycle does no harm: if the node really is not connected to the other nodes, it cannot get quorum (so no VMs get started there), and if it can reconnect, it rejoins the quorum and normal operation continues.

3. What is the softdog power policy, and how do I set the softdog timer to perform the fence action?
The default is power cycle, and the timeout is 60 seconds (not configurable).

4. In the BIOS I have configured the watchdog to power cycle the server every 5 minutes, but the real fence wait time is 60 seconds. Will this cause any problems?
What do you mean by "every 5 min to power cycle"?

You can test the fencing by simply disconnecting a host from the network; it should lose quorum and self-fence.
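A minimal sketch of that test, assuming console access to the node (the NIC name eno1 is just an example, adjust it to your setup):

# On the node under test (use the local console, not SSH over the link you are cutting):
ip link set eno1 down    # take the cluster NIC offline
# On another node, watch the cluster state:
pvecm status             # the disconnected node should drop out of quorum
# After roughly 60 seconds the isolated node should self-fence (reset),
# and its HA-managed VMs should be recovered on the remaining nodes.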
 
Where did you read this?

I read it in Mastering Proxmox, 3rd edition.
[screenshot: excerpt from the book]


This is your preference and depends on the watchdog. Generally a power cycle does no harm: if the node really is not connected to the other nodes, it cannot get quorum (so no VMs get started there), and if it can reconnect, it rejoins the quorum and normal operation continues.

What is "if it really is not connected" ? I didn't get you. When 3 nodes are connected in network, if one node is fenced (powercycle) it will automatically connect to other nodes when it comes online. If we powerdown instead of powercycle we eliminating the node permanently.

The default is power cycle, and the timeout is 60 seconds (not configurable).

So we should set the same 60 seconds for the hardware watchdog as well?

What do you mean by "every 5 min to power cycle"?

I have configured the board BIOS to power cycle the server every 5 minutes.
[screenshot: BIOS watchdog timer setting]

You can test the fencing by simply disconnecting a host from the network; it should lose quorum and self-fence.

You mean after 60 seconds the node will reboot?
 
Our HA stack calculates all timeouts etc. with a 60-second watchdog, so if you set it to 5 minutes, yes, this will cause problems.
 
Our HA stack calculates all timeouts etc. with a 60-second watchdog, so if you set it to 5 minutes, yes, this will cause problems.

But my board's watchdog has a minimum timeout of 5 minutes; I can't set it to 60 seconds. Is there any alternative way to use this watchdog for fencing?
Also, I couldn't find anywhere how to load the softdog in PVE. How can I check that the softdog timer is being reset? Do I need to start any service for the softdog to check the timeout?

As you said, the softdog does a power cycle, not a power off. In that case, if a failed node (server hang) comes back online after 60 seconds, what happens to the HA VMs?
 
But my board's watchdog has a minimum timeout of 5 minutes; I can't set it to 60 seconds. Is there any alternative way to use this watchdog for fencing?
I would use the softdog instead in that case.

Also, I couldn't find anywhere how to load the softdog in PVE. How can I check that the softdog timer is being reset? Do I need to start any service for the softdog to check the timeout?
The softdog is the default; if you do not configure anything else, it should be enabled.
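A few ways to check this, as a sketch (watchdog-mux is the PVE service that feeds /dev/watchdog; note that the watchdog is only armed once HA resources are active on the node):

lsmod | grep softdog             # is the software watchdog module loaded?
dmesg | grep -i watchdog         # kernel messages from watchdog initialization
systemctl status watchdog-mux    # PVE's watchdog multiplexer should be running
journalctl -b -u pve-ha-lrm      # the local resource manager logs when it arms the watchdog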

As you said, the softdog does a power cycle, not a power off. In that case, if a failed node (server hang) comes back online after 60 seconds, what happens to the HA VMs?
If your hanging node comes online again and can reconnect to the cluster, it will get updated from the cluster (e.g. where the VMs are now). If you defined this as a preferred node for a VM, the HA manager will live migrate the VMs back.
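A small sketch of such a "preferred node" setup using an HA group (the group and node names are made up; a higher priority number means more preferred):

# Create an HA group that prefers node1:
ha-manager groupadd prefer-node1 --nodes "node1:2,node2:1,node3:1"
# Put VM 100 under HA management within that group:
ha-manager add vm:100 --group prefer-node1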
 
I would use the softdog instead in that case.

Is the softdog as reliable as a hardware watchdog?


The softdog is the default; if you do not configure anything else, it should be enabled.

Where can I check that the softdog is running? Does the softdog start its timer when I start HA-enabled VMs?


If your hanging node comes online again and can reconnect to the cluster, it will get updated from the cluster (e.g. where the VMs are now). If you defined this as a preferred node for a VM, the HA manager will live migrate the VMs back.

So you mean that even if the fenced node comes back online (after 60 seconds) there is no impact on the VMs? After a successful fence the VMs will move to other nodes, and even if the node comes back online there will be no data corruption on the migrated VMs?

In PVE 3.4, using SNMP fencing with a switch, after fencing I would power on the fenced node outside the network and clear all cluster configurations before re-joining it. But here you are telling me that a failed node coming back online does not affect the migrated VMs.

Please help me.

Thanks for your replies.
 
So you mean that even if the fenced node comes back online (after 60 seconds) there is no impact on the VMs?
If it was fenced, it is already outside the quorate partition. To get back in, it has to re-establish cluster communication again; that should not do any harm.

After a successful fence the VMs will move to other nodes, and even if the node comes back online there will be no data corruption on the migrated VMs?
If a host suddenly goes offline there is no way to 'migrate'; instead, the VMs get restarted on another node. Your application/OS etc. should be able to handle such a scenario.

In PVE 3.4, using SNMP fencing with a switch, after fencing I would power on the fenced node outside the network and clear all cluster configurations before re-joining it. But here you are telling me that a failed node coming back online does not affect the migrated VMs.
The cluster stack changed significantly from 3.x to 4.x, so yes, this should no longer be necessary.
 
Thanks for the information.
If a failed node coming back online as-is does no harm to the HA VMs (after they restart on another host), and we do not need to clear the cluster configs to re-join the fenced node, why do we need the fencing concept at all?

I am confused. Can you please clearly explain how the new HA stack works compared to the old PVE 3.x with SNMP fencing and rgmanager?
 
I understood that if a node is permanently fenced (down), we should clear the HA resources (VMs) and configs before re-joining the same node to the cluster (outside the network).

But here the softdog does a power cycle (reboot) after the 60-second fence wait time, which means the node is not permanently eliminated from the cluster. In my scenario, if any node hangs for more than 60 seconds it will be fenced (rebooted) by the softdog. After the fence, the HA VMs restart on another node. Once the fenced node comes back online, won't the VMs still present on it (from before the fence) cause any harm?

We used to power on a fenced node outside the network to ensure it could not communicate with the other cluster nodes, so that duplicate resources would not write to the same disk image. But you are telling me nothing will happen even if it communicates with the others without clearing the VMs and cluster configs?
 
If a failed node coming back online as-is does no harm to the HA VMs (after they restart on another host), and we do not need to clear the cluster configs to re-join the fenced node, why do we need the fencing concept at all?
Fencing is necessary to make sure that nodes outside the quorate partition do not access shared resources.

An example:

Three hosts have access to the same VM storage (each to its own VMs), and one host loses its network connection to the other ones.

The remaining two hosts form a valid quorate partition and can act (like starting VMs).
Now the single host fences itself because it is not in a quorate partition; this makes sure the single host no longer accesses the VM disks.
After the appropriate timeout, the other hosts can be sure that the single host fenced itself and no longer accesses the VM disks, so they can start the VMs.
When the single host comes online again, it sees that the other nodes started the VMs and can rejoin the cluster without harm.
 
What is a quorate partition, and how can I configure it?
A quorate partition is the set of hosts that has quorum.
E.g. in a 5-node cluster, if at least 3 nodes can talk to each other, they are quorate and form the quorate partition.
You cannot configure this.
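You can, however, see the current quorum state on any node; a rough sketch of the relevant output (exact values depend on your cluster):

pvecm status
# In the output, look for lines roughly like:
#   Nodes:    3
#   Quorate:  Yes
# "Quorate: Yes" means this node is currently part of the quorate partition.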

Can we test PVE 5 HA in a nested environment with the softdog?
Sure.
 
Thanks, I will test it with nested virtualization and come back to you with results soon.

If the failed node comes back online within 60 seconds, the VMs will not restart on another node, right?

Is the softdog reliable enough for a production environment, since my hardware watchdog has a minimum timeout of 300 seconds?

Is there any plan to add external fencing in the future?
 
Hello,

I have installed a 3-node (pve 5.2-1) cluster and created a Ceph (Luminous) cluster on PVE. Each node has one 1.81 TB OSD disk.
In total the Ceph global size is 3 × 1.8 = 5.4 TB, but when I create a pool with a 3/2 replica I get a max. available space of 1.7 TB.
Also, when I create a VM on this storage, nothing shows up in the "Content" tab. Please check my screenshots; how can I fix this issue?

[screenshots: Ceph df output, number of OSDs, storage Content tab, VM hardware]
 
In total the Ceph global size is 3 × 1.8 = 5.4 TB, but when I create a pool with a 3/2 replica I get a max. available space of 1.7 TB.
Yes, it shows the real usable size (with a 3/2 replica, the data is stored 3 times).
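Rough arithmetic for the numbers above: the raw capacity is 3 × 1.81 TB ≈ 5.4 TB; with size=3 every object is written three times, so the usable capacity is about 5.4 / 3 ≈ 1.8 TB, and after Ceph's internal overhead and reserved headroom the pool reports roughly 1.7 TB as MAX AVAIL.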

Also, when I create a VM on this storage, nothing shows up in the "Content" tab. Please check my screenshots; how can I fix this issue?
Hmm, this should work. Are you on the newest version? What does pveversion -v say?
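If the versions look current, a couple of checks on the storage side, as a sketch (replace <storage-id> and <pool-name> with your own values):

pveversion -v              # confirm all PVE packages are up to date
pvesm list <storage-id>    # ask PVE to list the images on the RBD storage
rbd ls -p <pool-name>      # list the images directly in the Ceph pool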
 
