PVE 4 with HA

shafeeks

Well-Known Member
Hi

I am currently testing HA with PVE 4. I have installed 3 nodes plus 1 NFS server for storage. The 3 nodes are clustered and each node runs 2 VMs, as per below.
Cluster 1 (Node1) - VM100, VM101
Cluster 2 (Node2) - VM102, VM103
Cluster 3 (Node3) - VM104, VM105

All the disk images of the VM10x are on the NFS storage.

I have configured HA at the datacenter level. In that tab, I created a group named group1C3 with the nodes Cluster1, Cluster2 and Cluster3. Under HA -> Resources, I added VM104 and VM105 and assigned them to the group.
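For reference, the resulting configuration on my setup looks roughly like this (written from memory, so take the file paths and exact option names as a sketch rather than the literal contents):

/etc/pve/ha/groups.cfg

group: group1C3
        nodes Cluster1,Cluster2,Cluster3

/etc/pve/ha/resources.cfg

vm: 104
        group group1C3

vm: 105
        group group1C3

"ha-manager config" shows the same resources on my nodes.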

I would like to test how HA behaves when cluster 3 (Node3) goes down. A reboot of cluster 3 does not migrate VM104 and VM105 evenly to the remaining nodes. But if I change the group members to Cluster1 and Cluster2 and save, it immediately migrates VM104 to Cluster1 and VM105 to Cluster2. PVE 4 has soft fencing, so no configuration should be needed there.
Am I missing something in the configuration?

My expectation is that when Node3 goes down, all of its VMs are migrated evenly to the available nodes and move back to their initial node when Node3 comes up again. Is that right?

Thanks for your help on this.

Regards
Shafeek
 
Hi

I did another test this morning. I rebooted all nodes and stopped all VMs, then changed the order of the nodes to node2, node1, node3, saved and rebooted. I don't think that changing the order will help.

Once the cluster was up, I rebooted Node3 and this time a job was executed and all the VMs on Node3 were migrated to Node1 and Node2. In the log, I can see pve-ha-crm doing the migration and fencing Node3 and its VMs, as follows:

Jul 9 08:31:51 cluster1 pve-ha-crm[1094]: migrate service 'vm:104' to node 'cluster2' (running)
Jul 9 08:31:51 cluster1 pve-ha-crm[1094]: service 'vm:104': state changed from 'started' to 'migrate' (node = cluster3, target = cluster2)
Jul 9 08:31:51 cluster1 pve-ha-crm[1094]: migrate service 'vm:105' to node 'cluster1' (running)
Jul 9 08:31:51 cluster1 pve-ha-crm[1094]: service 'vm:105': state changed from 'started' to 'migrate' (node = cluster3, target = cluster1)
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: members: 1/721, 2/751
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: starting data syncronisation
Jul 9 08:34:52 cluster1 pmxcfs[721]: [status] notice: members: 1/721, 2/751
Jul 9 08:34:52 cluster1 pmxcfs[721]: [status] notice: starting data syncronisation
Jul 9 08:34:52 cluster1 corosync[1082]: [TOTEM ] A new membership (10.146.0.181:364) was formed. Members left: 3
Jul 9 08:34:52 cluster1 corosync[1082]: [QUORUM] Members[2]: 1 2
Jul 9 08:34:52 cluster1 corosync[1082]: [MAIN ] Completed service synchronization, ready to provide service.
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: received sync request (epoch 1/721/0000000A)
Jul 9 08:34:52 cluster1 pmxcfs[721]: [status] notice: received sync request (epoch 1/721/0000000A)
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: received all states
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: leader is 1/721
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: synced members: 1/721, 2/751
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: start sending inode updates
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: sent all (0) updates
Jul 9 08:34:52 cluster1 pmxcfs[721]: [dcdb] notice: all data is up to date
Jul 9 08:34:52 cluster1 pmxcfs[721]: [status] notice: received all states
Jul 9 08:34:52 cluster1 pmxcfs[721]: [status] notice: all data is up to date
Jul 9 08:35:01 cluster1 pve-ha-crm[1094]: node 'cluster3': state changed from 'online' => 'unknown'
Jul 9 08:35:26 cluster1 pveproxy[25202]: proxy detected vanished client connection
Jul 9 08:35:51 cluster1 pve-ha-crm[1094]: service 'vm:104': state changed from 'migrate' to 'fence'
Jul 9 08:35:51 cluster1 pve-ha-crm[1094]: service 'vm:105': state changed from 'migrate' to 'fence'
Jul 9 08:35:51 cluster1 pve-ha-crm[1094]: node 'cluster3': state changed from 'unknown' => 'fence'



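While the node is down, I follow what the CRM and LRM are doing from one of the remaining nodes with roughly these commands (names from memory, so double-check them on your side):

# ha-manager status                (current master, LRM states and the state of each vm: service)
# journalctl -f -u pve-ha-crm      (CRM log on the current master node)
# journalctl -f -u pve-ha-lrm      (LRM log on the node that receives the VMs)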
I repeated this process 2-3 more times to be sure that it keeps working, and it no longer works. I checked the logs and found the following:

Jul 9 09:35:14 cluster1 pve-ha-crm[1090]: service 'vm:104': state changed from 'started' to 'freeze'
Jul 9 09:35:14 cluster1 pve-ha-crm[1090]: service 'vm:105': state changed from 'started' to 'freeze'
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: members: 1/720, 2/746
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: starting data syncronisation
Jul 9 09:35:21 cluster1 pmxcfs[720]: [status] notice: members: 1/720, 2/746
Jul 9 09:35:21 cluster1 pmxcfs[720]: [status] notice: starting data syncronisation
Jul 9 09:35:21 cluster1 corosync[1078]: [TOTEM ] A new membership (10.146.0.181:412) was formed. Members left: 3
Jul 9 09:35:21 cluster1 corosync[1078]: [QUORUM] Members[2]: 1 2
Jul 9 09:35:21 cluster1 corosync[1078]: [MAIN ] Completed service synchronization, ready to provide service.
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: received sync request (epoch 1/720/00000006)
Jul 9 09:35:21 cluster1 pmxcfs[720]: [status] notice: received sync request (epoch 1/720/00000006)
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: received all states
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: leader is 1/720
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: synced members: 1/720, 2/746
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: start sending inode updates
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: sent all (0) updates
Jul 9 09:35:21 cluster1 pmxcfs[720]: [dcdb] notice: all data is up to date
Jul 9 09:35:21 cluster1 pmxcfs[720]: [status] notice: received all states
Jul 9 09:35:21 cluster1 pmxcfs[720]: [status] notice: all data is up to date
Jul 9 09:35:24 cluster1 pve-ha-crm[1090]: node 'cluster3': state changed from 'online' => 'unknown'



NB: When I switched Node3 back on, the VMs remained on Node1 and Node2 and did not migrate back to Node3. Do we need to do this manually?

PVE version 4.0 Beta 24

Thanks for your help!
 
Does it help if you define a HA group for those VMs?

No, it does not. If I do not define any HA group, nothing happens when a node goes down.

Moreover, I noticed the following:
1. I removed the VMs from the group, recreated the group with the 3 nodes and assigned the VMs to the group (see the command sketch below).
2. Rebooted node 3 to test HA.

It works once, i.e. it migrates all VMs evenly from node 3 to nodes 1 and 2. But for further testing I turned node 3 off more than 7 times, and it does not work anymore. In fact, I noticed that when the node goes down, the HA and fencing scripts are not executed once the HA Cluster Resource Manager daemon is stopped.
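Roughly, the CLI equivalent of step 1 that I used is the following (the exact ha-manager options may differ on the beta, so take it as a sketch):

# ha-manager remove vm:104
# ha-manager remove vm:105
(recreate group1C3 with the 3 nodes under Datacenter -> HA -> Groups)
# ha-manager add vm:104 --group group1C3
# ha-manager add vm:105 --group group1C3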

Another important thing is that when node 3 comes up again, the VMs belonging to node 3 do NOT migrate back to Node 3. They remain on nodes 1 and 2. Do we need to define a preferred node for the VMs so that they return to their respective nodes when those come back up?
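For example, I imagine something like the following in the group definition, with Cluster3 as the highest-priority node and nofailback left at 0 so that the services move back when it rejoins. This is just my guess from the documentation, I have not verified the syntax on the beta:

/etc/pve/ha/groups.cfg

group: group1C3
        nodes Cluster3:2,Cluster1:1,Cluster2:1
        nofailback 0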

Thanks

Shafeek
 
You need to wait up to 120 seconds until the HA manager moves the VMs (DLM timeout). Maybe you just need to wait a little bit longer?
 
Hi,

You need to wait up to 120 seconds until the HA manager moves the VMs (DLM timeout). Maybe you just need to wait a little bit longer?

I shut down the server for an hour, still without any result. I would like to draw your attention to the fact that when the server goes down, the HA script does not execute the migration to the other nodes (only one stop job of about 1m30s runs; the migration job of 3m5s does not execute).

Thanks

Shafeek
 
Hi,
how do you turn off node 3?
Because I can't reproduce the problem!

My setup is as you described it,
and what exactly do you mean by:
(only one stop job of about 1m30s runs; the migration job of 3m5s does not execute).
 
Hello,

Hi,
how do you turn off node 3?
Because I can't reproduce the problem!

I turn off Node3 by clicking on the node name and then on Shutdown in the PVE management panel. I have also tried the shutdown, halt and reboot commands.

My setup is as you described it,
and what exactly do you mean by

When executing the above shutdown or reboot commands, the node (for example node 3) goes down through the init scripts. During this process it runs 2 stop jobs:
1. PVE Local HA - Resources Manager Daemon (1m30s)
2. PVE Local HA - Cluster Resources Manager (3m5s) -> this one should migrate all the VMs evenly from the current node (node 3) to the available nodes (nodes 1 & 2). But it does not get executed and the cluster goes down; once it is down, these VMs are no longer accessible (see the commands below for how I check this).
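After the node is back up, I check whether the CRM/LRM stop jobs actually ran with roughly these commands (unit names as they appear on my nodes):

# systemctl status pve-ha-lrm pve-ha-crm
# grep -E 'pve-ha-(lrm|crm)' /var/log/syslog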

I can provide you access to the PVE manager if you wish to have a look.

Thanks

Shafeek
 
Hello,



When executing the above shutdown or reboot commands, the node (for example node 3) goes down through the init scripts. During this process it runs 2 stop jobs:
1. PVE Local HA - Resources Manager Daemon (1m30s)
2. PVE Local HA - Cluster Resources Manager (3m5s) -> this one should migrate all the VMs evenly from the current node (node 3) to the available nodes (nodes 1 & 2). But it does not get executed and the cluster goes down.

It should say:
1. "[***] A stop job is running for PVE Local HA Resources Manager Daemon (xx / 1min 30s)"
2. "[***] A stop job is running for PVE Local HA Cluster Resources Manager (xx / 3min 5s)"

This is what should allow the VM migrations to be executed. At the moment it does not do that.

Thanks

Shafeek
 
Hello,

No, a planned shutdown does not migrate VMs.

But in last week's testing, when I shut down one node, the VM migration was working. (See the log in the post above.)

I am using the self-fencing software watchdog, as these nodes have no hardware fencing through IPMI, iLO or other means.
Each node has only one network card connected to a switch, plus an NFS server.

In this case, how can I test the HA? Is there any service to stop or any particular thing to do so that I can test it?
Thanks

Shafeek
 
Hi,
you can test it like this:
- unplug the network.
- power off the node by pulling the power cord.
- use "systemctl poweroff -f"
- use "systemctl stop corosync.service"
 
Hi,

I tested HA by launching the command "systemctl stop corosync.service". It works great and the migration is done.

But I noticed that when I start corosync.service again, the quorum is rebuilt, but the VMs are not migrated back to the node. I checked the log of the node on which the service was stopped and I got the error messages below:


Jul 23 10:40:55 cluster2 pmxcfs[719]: [status] notice: starting data syncronisation
Jul 23 10:40:55 cluster2 pmxcfs[719]: [status] notice: received sync request (epoch 1/751/00000007)
Jul 23 10:40:55 cluster2 pmxcfs[719]: [dcdb] notice: received all states
Jul 23 10:40:55 cluster2 pmxcfs[719]: [dcdb] notice: leader is 1/751
Jul 23 10:40:55 cluster2 pmxcfs[719]: [dcdb] notice: synced members: 1/751, 3/698
Jul 23 10:40:55 cluster2 pmxcfs[719]: [dcdb] notice: waiting for updates from leader
Jul 23 10:40:56 cluster2 pmxcfs[719]: [dcdb] notice: update complete - trying to commit (got 10 inode updates)
Jul 23 10:40:56 cluster2 pmxcfs[719]: [dcdb] notice: all data is up to date
Jul 23 10:40:56 cluster2 pmxcfs[719]: [status] notice: received all states
Jul 23 10:40:56 cluster2 pmxcfs[719]: [status] notice: all data is up to date
Jul 23 10:40:58 cluster2 pve-ha-lrm[1102]: successfully aquired lock 'ha_agent_cluster2_lock'
Jul 23 10:40:58 cluster2 pve-ha-lrm[1102]: status change lost_agent_lock => active
Jul 23 10:40:58 cluster2 watchdog-mux[893]: exit watchdog-mux with active connections
Jul 23 10:40:58 cluster2 kernel: [67670.595657] watchdog watchdog0: watchdog did not stop!
Jul 23 10:41:08 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:41:18 cluster2 pveproxy[19307]: worker exit
Jul 23 10:41:18 cluster2 pveproxy[821]: worker 19307 finished
Jul 23 10:41:18 cluster2 pveproxy[821]: starting 1 worker(s)
Jul 23 10:41:18 cluster2 pveproxy[821]: worker 19945 started
Jul 23 10:41:18 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:41:27 cluster2 pvedaemon[18514]: worker exit
Jul 23 10:41:27 cluster2 pvedaemon[1084]: worker 18514 finished
Jul 23 10:41:27 cluster2 pvedaemon[1084]: starting 1 worker(s)
Jul 23 10:41:27 cluster2 pvedaemon[1084]: worker 19957 started
Jul 23 10:41:28 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:41:38 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:41:48 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:41:58 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:42:08 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:42:18 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:42:28 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:42:38 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:42:48 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:42:58 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:43:08 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:43:18 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe
Jul 23 10:43:28 cluster2 pve-ha-lrm[1102]: watchdog update failed - Broken pipe


Do you have any idea about this problem?

Thanks

Shafeek
 
What kind of watchdog do you use? The watchdog should reboot that node, but the log shows no reboot?
 
Hi,

What kind of watchdog do you use? The watchdog should reboot that node, but the log shows no reboot?

I am using the self-fencing software watchdog since the nodes are PCs.

I tested again this morning. Once the node that was brought down comes back up, the VMs do not come back to their original node and no reboot happens. I end up with these errors in the log:
Jul 24 14:17:17 node2 pve-ha-lrm[1104]: watchdog update failed - Broken pipe
Jul 24 14:17:27 node2 pve-ha-lrm[1104]: watchdog update failed - Broken pipe
Jul 24 14:17:37 node2 pve-ha-lrm[1104]: watchdog update failed - Broken pipe
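For completeness, this is what I check on that node around the watchdog (unit names as they appear here); I assume restarting the two services should re-open the watchdog connection, but I am not sure that is the proper fix:

# systemctl status watchdog-mux pve-ha-lrm
# journalctl -u watchdog-mux -u pve-ha-lrm --since today
# systemctl restart watchdog-mux pve-ha-lrm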


Thanks

Shafeek
 
Please try to find out why the watchdog does not reboot the node - any watchdog logs in syslog?
 
Please try to find out why the watchdog does not reboot the node - any watchdog logs in syslog?

Just before getting the watchdog broken pipe errors, I got this in the syslog:

Jul 24 14:07:56 node2 watchdog-mux[898]: client watchdog expired - disable watchdog updates
Jul 24 14:12:07 node2 watchdog-mux[898]: exit watchdog-mux with active connections
Jul 24 14:12:07 node2 kernel: [ 1441.792768] watchdog watchdog0: watchdog did not stop!
Jul 24 14:12:17 node2 pve-ha-lrm[1104]: watchdog update failed - Broken pipe
Jul 24 14:12:27 node2 pve-ha-lrm[1104]: watchdog update failed - Broken pipe
Jul 24 14:12:37 node2 pve-ha-lrm[1104]: watchdog update failed - Broken pipe

These are the only watchdog errors I found.

Thanks for your help

Shafeek
 
Hi Dietmar,

The node should reboot 60 seconds after that message. Are you sure it does not reboot?

I tested it twice after you told me this and waited for 5 minutes each time before replying.

I confirm that the node does not reboot after that message.

Thanks

Shafeek
 
I confirm that the node does not reboot after that message.

Really strange. Do you have a hardware watchdog (maybe IPMI)? If so, please try to use that.

To test/load the ipmi watchdog module you can use:

# modprobe ipmi_watchdog

 
