Fencing/HA with Dell iDRAC doesn't work when A/C power is unplugged

francisco.

Member
Sep 8, 2014
Madrid
Hi forum,

My cluster consists of 5 nodes. The fence devices and cluster.conf are fully configured, and fence_node works fine on the nodes.

I have a problem when I pull the A/C power directly from one node: the VMs are not migrated to another node. My cluster.conf:

<?xml version="1.0"?>
<cluster config_version="101" name="cluster">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<fencedevices>
<fencedevice agent="fence_idrac" ipaddr="10.0.19.1" login="admin" name="node1-idrac" passwd="XXXXXXXX"/>
<fencedevice agent="fence_idrac" ipaddr="10.0.19.2" login="admin" name="node2-idrac" passwd="XXXXXXXX"/>
<fencedevice agent="fence_idrac" ipaddr="10.0.19.3" login="admin" name="node3-idrac" passwd="XXXXXXXX"/>
<fencedevice agent="fence_idrac" ipaddr="10.0.19.4" login="admin" name="node4-idrac" passwd="XXXXXXXX"/>
<fencedevice agent="fence_idrac" ipaddr="10.0.19.5" login="admin" name="node5-idrac" passwd="XXXXXXXX"/>
</fencedevices>
<clusternodes>
<clusternode name="pve1" nodeid="1" votes="1">
<fence>
<method name="1">
<device action="off" name="node1-idrac"/>
</method>
</fence>
</clusternode>
<clusternode name="pve2" nodeid="2" votes="1">
<fence>
<method name="1">
<device action="off" name="node2-idrac"/>
</method>
</fence>
</clusternode>
<clusternode name="pve3" nodeid="3" votes="1">
<fence>
<method name="1">
<device action="off" name="node3-idrac"/>
</method>
</fence>
</clusternode>
<clusternode name="pve4" nodeid="4" votes="1">
<fence>
<method name="1">
<device action="off" name="node4-idrac"/>
</method>
</fence>
</clusternode>
<clusternode name="pve5" nodeid="5" votes="1">
<fence>
<method name="1">
<device action="off" name="node5-idrac"/>
</method>
</fence>
</clusternode>
</clusternodes>
<rm>
<failoverdomains>
<failoverdomain name="failover" nofailback="0" ordered="0" restricted="0">
<failoverdomainnode name="pve1"/>
</failoverdomain>
</failoverdomains>
<pvevm autostart="1" vmid="2040"/>
<pvevm autostart="1" vmid="2041"/>
<pvevm autostart="1" vmid="2042"/>
<pvevm autostart="1" vmid="2043"/>
<pvevm autostart="1" vmid="2044"/>
<pvevm autostart="1" vmid="2045"/>
<pvevm autostart="1" vmid="2046"/>
<pvevm autostart="1" vmid="2047"/>
<pvevm autostart="1" vmid="2048"/>
<pvevm autostart="1" vmid="2049"/>
<pvevm autostart="1" vmid="2050"/>
<pvevm autostart="1" vmid="2051"/>
<pvevm autostart="1" vmid="2052"/>
<pvevm autostart="1" vmid="2053"/>
<pvevm autostart="1" vmid="2054"/>
<pvevm autostart="1" vmid="2055"/>
<pvevm autostart="1" vmid="2056"/>
<pvevm autostart="1" vmid="2057"/>
<pvevm autostart="1" vmid="2058"/>
<pvevm autostart="1" vmid="2059"/>
<pvevm autostart="1" vmid="2060"/>
<pvevm autostart="1" vmid="2061"/>
<pvevm autostart="1" vmid="2062"/>
<pvevm autostart="1" vmid="2063"/>
<pvevm autostart="1" vmid="2064"/>
<pvevm autostart="1" vmid="2066"/>
<pvevm autostart="1" vmid="2067"/>
<pvevm autostart="1" vmid="2068"/>
<pvevm autostart="1" vmid="2069"/>
<pvevm autostart="1" vmid="2070"/>
<pvevm autostart="1" vmid="2071"/>
<pvevm autostart="1" vmid="2072"/>
<pvevm autostart="1" vmid="2073"/>
<pvevm autostart="1" vmid="2074"/>
<pvevm autostart="1" vmid="2075"/>
<pvevm autostart="1" vmid="2076"/>
<pvevm autostart="1" vmid="2077"/>
<pvevm autostart="1" vmid="2078"/>
<pvevm autostart="1" vmid="2079"/>
<pvevm autostart="1" vmid="2080"/>
<pvevm autostart="1" vmid="2081"/>
<pvevm autostart="1" vmid="2082"/>
<pvevm autostart="1" vmid="2083"/>
<pvevm autostart="1" vmid="2084"/>
<pvevm autostart="1" vmid="2085"/>
<pvevm autostart="1" vmid="2086"/>
<pvevm autostart="1" vmid="2087"/>
<pvevm autostart="1" vmid="2088"/>
<pvevm autostart="1" vmid="2089"/>
<pvevm autostart="1" vmid="2090"/>
<pvevm autostart="1" vmid="2091"/>
<pvevm autostart="1" vmid="2092"/>
<pvevm autostart="1" vmid="2093"/>
<pvevm autostart="1" vmid="2094"/>
<pvevm autostart="1" vmid="2095"/>
<pvevm autostart="1" vmid="2096"/>
<pvevm autostart="1" vmid="2097"/>
<pvevm autostart="1" vmid="2098"/>
</rm>
</cluster>


Regards.
 
Your fence device does not work if you unplug the power! (iDRAC needs power.)

And your VMs will migrate as soon as you plug in AC!

Christophe.

Thanks for your replies, guys.

So the best fence devices are "APC power" ones, no?

Another question: if I need to reboot, power off, or migrate a VM, do I have to disable HA for that VM?

In my cluster.conf I forgot to set the domain for each pvevm :rolleyes:
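
For reference, a minimal sketch of what adding the domain could look like, reusing the "failover" domain and VM 2040 from the cluster.conf above. It assumes pvevm accepts the same domain attribute as other rgmanager services, so check it against your Proxmox version. config_version also has to be increased whenever the file changes.

```xml
<rm>
  <failoverdomains>
    <failoverdomain name="failover" nofailback="0" ordered="0" restricted="0">
      <failoverdomainnode name="pve1"/>
    </failoverdomain>
  </failoverdomains>
  <!-- Assumption: domain="failover" ties this VM to the failover domain defined above -->
  <pvevm autostart="1" domain="failover" vmid="2040"/>
</rm>
```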
 
Thanks for your replies, guys.

So the best fence devices are "APC power" ones, no?

If your goal is a "demo mode" test, you can turn off the node with the power button; the iDRAC keeps power.
In real life, your fencing device needs power.
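
As an illustration of the APC idea, here is a hedged sketch of a second fence method for pve1 that falls back to a switched PDU when the iDRAC is unreachable. The device name node1-apc, the address 10.0.19.101, the login "apc", the outlet port="1" and config_version="102" are placeholders that do not come from this thread. fenced tries the methods in order, so method 2 is only used if method 1 fails.

```xml
<!-- Sketch only: other nodes and the <rm> section are omitted -->
<cluster config_version="102" name="cluster">
  <fencedevices>
    <fencedevice agent="fence_idrac" ipaddr="10.0.19.1" login="admin" name="node1-idrac" passwd="XXXXXXXX"/>
    <!-- Hypothetical switched PDU; address, login and name are placeholders -->
    <fencedevice agent="fence_apc" ipaddr="10.0.19.101" login="apc" name="node1-apc" passwd="XXXXXXXX"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="pve1" nodeid="1" votes="1">
      <fence>
        <!-- Method 1: iDRAC, which cannot answer once the node has lost all power -->
        <method name="1">
          <device action="off" name="node1-idrac"/>
        </method>
        <!-- Method 2: tried only if method 1 fails; port is the PDU outlet feeding pve1 -->
        <method name="2">
          <device action="off" name="node1-apc" port="1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>
```

On a node with two power supplies, both outlets would have to be switched off within the same method, otherwise the node stays up.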

Another question: if I need to reboot, power off, or migrate a VM, do I have to disable HA for that VM?

No, but keep in mind that you absolutely need quorum in your cluster.

Without quorum, you cannot start, move, or stop a VM.

Quorum is lost as soon as half or more of your cluster votes are gone. With 5 nodes at one vote each, for example, at least 3 must stay up.

Side effect: a booting cluster can NOT have quorum until half + 1 of the nodes are up and running. So HA-managed VMs will NOT start on the first nodes, even if "start on boot" is selected, until quorum is reached.

Christophe.
 
I have quorum, but I need to know whether I must disable HA for a VM before migrating it to another node.

Francisco.
 
No. You can migrate an HA-managed VM. You do not need to disable HA on that VM.

Christophe.

Christophe, I cannot do it. This is what happens when I run migrate, reboot, power off, or any other action from the web GUI with HA enabled:

Code:
Executing HA migrate for VM 9021 to node proxmox01
Trying to migrate pvevm:9021 to proxmox01...

This task never finishes.

Corosync.log

Code:
Dec 03 11:46:50 corosync [QUORUM] Members[5]: 1 2 3 4 5 6

Regards.
 
Well, something is probably wrong in your cluster config. Live migration of an HA-managed VM is a supported feature.

What does "clustat" say on each node?

Christophe.
 
Something strange is happening.

PROXMOX01
Code:
root@proxmox01:~# clustat 
Cluster Status for francisco @ Wed Dec  3 12:34:31 2014
Member Status: Quorate


 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 proxmox01                                                                1 Online, Local, rgmanager
 proxmox02                                                                2 Online, rgmanager
 proxmox03                                                                3 Online, rgmanager
 proxmox04                                                                4 Online, rgmanager
 proxmox05                                                                5 Online, rgmanager
 proxmox06                                                                6 Online, rgmanager


 Service Name                                                     Owner (Last)                                                     State         
 ------- ----                                                     ----- ------                                                     -----         
 pvevm:1001                                                       (none)                                                           stopped       
 pvevm:1002                                                       (none)                                                           stopped       
 pvevm:1003                                                       (none)                                                           stopped       
 pvevm:2035                                                       (none)                                                           stopped       
 pvevm:2036                                                       (none)                                                           stopped       
 pvevm:2038                                                       (none)                                                           stopped       
 pvevm:2039                                                       (none)                                                           stopped       
 pvevm:2040                                                       (none)                                                           stopped       
 pvevm:2041                                                       (none)                                                           stopped       
 pvevm:2042                                                       (none)                                                           stopped       
 pvevm:2043                                                       pve3                                                             starting      
 pvevm:2044                                                       (none)                                                           stopped       
 pvevm:2045                                                       (none)                                                           stopped       
 pvevm:2046                                                       (none)                                                           stopped       
 pvevm:2047                                                       (none)                                                           stopped       
 pvevm:2048                                                       (none)                                                           stopped       
 pvevm:2049                                                       (none)                                                           stopped       
 pvevm:2050                                                       (none)                                                           stopped       
 pvevm:2051                                                       (none)                                                           stopped       
 pvevm:2052                                                       (none)                                                           stopped       
 pvevm:2053                                                       (none)                                                           stopped       
 pvevm:2054                                                       (none)                                                           stopped       
 pvevm:2055                                                       (none)                                                           stopped       
 pvevm:2056                                                       (none)                                                           stopped       
 pvevm:2057                                                       (none)                                                           stopped       
 pvevm:2058                                                       (none)                                                           stopped       
 pvevm:2059                                                       (none)                                                           stopped       
 pvevm:2060                                                       (none)                                                           stopped       
 pvevm:2061                                                       (none)                                                           stopped       
 pvevm:2062                                                       (none)                                                           stopped       
 pvevm:2063                                                       (none)                                                           stopped       
 pvevm:2064                                                       (none)                                                           stopped       
 pvevm:2065                                                       (none)                                                           stopped       
 pvevm:2066                                                       pve2                                                             starting      
 pvevm:2067                                                       (none)                                                           stopped       
 pvevm:2068                                                       (none)                                                           stopped       
 pvevm:2069                                                       (none)                                                           stopped       
 pvevm:2070                                                       (none)                                                           stopped       
 pvevm:2071                                                       (none)                                                           recoverable   
 pvevm:2072                                                       (none)                                                           stopped       
 pvevm:2073                                                       (none)                                                           stopped       
 pvevm:2074                                                       (none)                                                           stopped       
 pvevm:2075                                                       pve2                                                             started       
 pvevm:2076                                                       (none)                                                           stopped       
 pvevm:2077                                                       (none)                                                           recoverable   
 pvevm:2078                                                       (none)                                                           recoverable   
 pvevm:2079                                                       pve2                                                             started       
 pvevm:2080                                                       (none)                                                           recoverable   
... (more VMs follow)

On Proxmox01 I have only 3 VMs. It is my failover domain.

PROXMOX02 through PROXMOX06 all show the same thing:

Code:
root@proxmox03:~# clustat 
Service states unavailable: Temporary failure; try again
Cluster Status for francisco @ Wed Dec  3 12:32:23 2014
Member Status: Quorate


 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 proxmox01                                                                1 Online
 proxmox02                                                                2 Online
 proxmox03                                                                3 Online, Local
 proxmox04                                                                4 Online
 proxmox05                                                                5 Online
 proxmox06                                                                5 Online

This is the latest version of Proxmox Enterprise:

Code:
proxmox-ve-2.6.32: 3.2-136 (running kernel: 2.6.32-32-pve)
pve-manager: 3.3-1 (running version: 3.3-1/a06c9f73)
pve-kernel-2.6.32-32-pve: 2.6.32-136
pve-kernel-2.6.32-30-pve: 2.6.32-130
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.1-34
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-23
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-9
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Why does only Proxmox01 show "rgmanager" in the clustat output?

All services are enabled on each node.

Regards.
 
Because it is not started on all other nodes...

Can you try to start rgmanager manually on proxmox02 to proxmox06?

Christophe.

It's started.

Code:
root@proxmox05:~# /etc/init.d/rgmanager status
rgmanager (pid 510789 510788) is running...

If the service is restarted, all resources are stopped and moved to other nodes. I do not want that right now because we are serving customers.

I will do it this evening.

Thanks, Chris.

Regards.
 
I cannot restart rgmanager; the restart never finishes.

I also have a problem when I stop cman.

Code:
root@proxmox01:# /etc/init.d/cman stop
Stopping cluster: 
   Leaving fence domain... found dlm lockspace /sys/kernel/dlm/rgmanager
fence_tool: cannot leave due to active systems
[FAILED]

What can I do?
 
