Hi,
I have configured a 3-node cluster with HA: two host nodes and a third "dumb" node that is only there for quorum. The two host nodes each have a fencing device; the quorum node doesn't have one.
Node1: quorum node
Node2: host 1
Node3: host 2
Here is the cluster.conf:
Code:
<?xml version="1.0"?>
<cluster config_version="9" name="local">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="fence2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node3" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="fence3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_test1" ipaddr="192.168.48.202" login="root" name="fence2" passwd="root"/>
    <fencedevice agent="fence_test1" ipaddr="192.168.48.203" login="root" name="fence3" passwd="root"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="Node2-Node3" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="node2" priority="10"/>
        <failoverdomainnode name="node3" priority="15"/>
      </failoverdomain>
    </failoverdomains>
    <pvevm autostart="1" vmid="100"/>
  </rm>
</cluster>
Here is the fence_test1 script. It is just for lab use; I only want to validate the fencing procedure. Later I will use iDRAC + a managed switch:
Code:
#!/bin/bash
# Dummy fence agent: it does not power anything off, it only mails me
# the arguments it received so I can check that fenced really calls it.
HOSTNAME=$(hostname)
date=$(date +"%T %d-%m-%Y")

if [ $# -gt 0 ]; then
    # Manual call with command-line arguments (quick test)
    address="test"
else
    # Real call from fenced: arguments arrive on stdin as key=value lines
    while read LINE; do
        # split the line on "=" ($param is quoted so blank lines don't break the test)
        param=$(echo "$LINE" | awk -F "=" '{print $1}')
        if [ "$param" = 'ipaddr' ]; then
            address=$(echo "$LINE" | awk -F "=" '{print $2}')
        elif [ "$param" = 'login' ]; then
            user=$(echo "$LINE" | awk -F "=" '{print $2}')
        elif [ "$param" = 'passwd' ]; then
            pass=$(echo "$LINE" | awk -F "=" '{print $2}')
        elif [ "$param" = 'nodename' ]; then
            name=$(echo "$LINE" | awk -F "=" '{print $2}')
        fi
    done
fi

echo "$date: Fenced $name ($address). User:$user, Pwd:$pass" | mail -s "$HOSTNAME has fenced $name" my_email@gmail.com
exit 0
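For reference, the agent can be tested by hand by feeding it the same key=value lines that fenced sends on stdin (the values are just the ones from my cluster.conf above):
Code:
# simulate the stdin arguments that fenced passes to the agent
printf 'ipaddr=192.168.48.203\nlogin=root\npasswd=root\nnodename=node3\n' | ./fence_test1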
Here are the package versions:
Code:
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2
Questions:
1/ Can my node1 be part of the fence domain without having a fencing device of its own?
2/ When I fence node3, the VM/CT are migrated to node2. Then I put node3 back online and take node2 offline. At that point the VM/CT are migrated not to node3 but to node1. Why is that? Node1 is not part of the failover domain. (A guess at a config change is sketched below, after the questions.)
3/ When a node restarts after being fenced, it tries to fence the other nodes. How is that possible, given that it does not have quorum?
4/ Ideally, when node2 is fenced, the VMs will migrate to node3. If node2 is still off and node3 then goes down, will node1 try to fence node3 too?
5/ Is it possible to increase the "dead timer"? I would like to tell the cluster that I can tolerate a node being down for X seconds before all the VMs are migrated. (A guess is sketched below as well.)
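For question 2, I notice that my pvevm entry does not reference the failover domain at all, so maybe that is the reason. A guess at what I would try (I am not sure the pvevm element actually accepts a domain attribute the way a normal rgmanager service does):
Code:
<rm>
  <failoverdomains>
    <failoverdomain name="Node2-Node3" nofailback="0" ordered="0" restricted="1">
      <failoverdomainnode name="node2" priority="10"/>
      <failoverdomainnode name="node3" priority="15"/>
    </failoverdomain>
  </failoverdomains>
  <!-- guess: bind the VM to the domain so it can only run on node2/node3 -->
  <pvevm autostart="1" vmid="100" domain="Node2-Node3"/>
</rm>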
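For question 5, from what I have read the "dead timer" corresponds to the corosync/cman token timeout, which can be set in cluster.conf with a totem element as a direct child of <cluster> (value in milliseconds). This is only a guess, I don't know whether raising it is supported on Proxmox:
Code:
<!-- guess: only declare a node dead after 30 s without the totem token -->
<totem token="30000"/>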
Any help would be great!
Thank you