two node cluster fence error

gogo3k

New Member
Dec 27, 2014
Hi, I've just installed Proxmox 3.3 on two servers and I want to build an HA cluster. I've read many websites and watched many video tutorials on YT, and I have a massive problem with fencing. My servers don't have dedicated fence devices (I use Tyan GT25). I created the cluster and it works fine, I see both nodes in the GUI, but when I try to validate cluster.conf I get many errors.
Code:
root@prox1:~# ccs_config_validate -l /etc/pve/cluster.conf.new
Relax-NG validity error : Extra element fencedevices in interleave
tempfile:4: element fencedevices: Relax-NG validity error : Element cluster failed to validate content
tempfile:19: element device: validity error : IDREF attribute name references an unknown ID "fenceB"
Configuration fails to validate

This is my cluster.conf.new:

Code:
<?xml version="1.0"?>
<cluster name="test" config_version="5">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1" expected_votes="1"/>


  <fencedevices>
  <fencedevice agent="fence_ilo" ipaddr="192.168.1.121" login="root" password="pass" name="fenceA"/>
  <fencedevice agent="fence_ilo" ipaddr="192.168.1.122" login="root" password="pass" name="fenceB"/>
  </fencedevices>

  

  <clusternodes>
  <clusternode name="prox1" votes="1" nodeid="1">
  
  <fence>
  <method name="1">
  <device name="fenceA" action="reboot"/>
  </method>
  </fence>
  
  </clusternode>

  <clusternode name="prox2" votes="1" nodeid="2">
 
  <fence>
  <method name="1">
  <device name="fenceB" action="reboot"/>
  </method>
  </fence>
  </clusternode>

</clusternodes>
</cluster>
My pveversion:
Code:
proxmox-ve-2.6.32: 3.2-136 (running kernel: 2.6.32-32-pve)
pve-manager: 3.3-1 (running version: 3.3-1/a06c9f73)
pve-kernel-2.6.32-32-pve: 2.6.32-136
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.1-34
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-23
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-5
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
I must have made a stupid mistake, but I don't see it. Sorry for my English, I'm still learning it.
 
I suggest you not use HA on a two-node cluster. It is better to use manual mode in such a setup.
 
Yes, I have the nodes in the hosts file.
@acidrop what do you mean? I need some VMs to run all the time with minimal downtime and without my intervention. The VMs must migrate automatically for me.
 
Then use a 3-node cluster! Even if the third node is just a "desktop" machine with no VMs running (probably even a Raspberry Pi could be used! but the work required makes a real node less expensive).
With 2 nodes, you risk that both get fenced or both remain active, depending on the failure. And that means downtime (nodes mutually fenced) or even data loss (the same VM active on both nodes).
 
In this environment there is no room for another machine, and all the workstations are in use. So I must make 2-node HA workable.
 
So there's no 100% reliable solution. See http://en.wikipedia.org/wiki/Two_Generals'_Problem . If you're really lucky, things work as expected. If you're not so lucky, the nodes cross-fence and go down at the same time. If you're unlucky, both nodes activate the same VM and you suffer data corruption (that's what you get by setting expected_votes="1"...)!

Try looking at a "quorum disk", or try installing a pseudo third node on a Raspberry Pi. Or, at the very least, use redundant communication between the nodes (e.g. via a serial port) to minimize the chance of the data corruption scenario.
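
If you go the quorum disk route, the rough shape (just a sketch; the device path and label below are placeholders for your storage, and you would also drop two_node="1" from your cman line) is to initialize a small shared LUN with mkqdisk:

Code:
# initialize a small shared LUN as the quorum disk (path is an example)
mkqdisk -c /dev/sdb1 -l proxmox_qdisk

and then declare it inside <cluster> in cluster.conf:

Code:
<quorumd votes="1" interval="1" tko="10" label="proxmox_qdisk"/>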
 
The nodes are connected to the switch via LACP, and for storage I have Ceph on the nodes, so there is some redundancy already. Can someone answer the problem from my first post?
 
Hello gogo3k

The nodes are connected to the switch via LACP, and for storage I have Ceph on the nodes, so there is some redundancy already. Can someone answer the problem from my first post?

The error in your config file is that you used

Code:
password="pass"

but the correct attribute name is

Code:
passwd="pass"

But, as already mentioned in the previous postings:

* a 2-node HA cluster is useless in that form - you need at least a quorum disk, or better, a small third node

* you wrote

My servers don't have fence devices

But HA is not possible without fencing!

Two possibilities:

- the simple one: leave the 2-node cluster as it is, without HA, and simplify your cluster.conf as follows:

Code:
<?xml version="1.0"?>
<cluster name="test" config_version="5">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>

  <clusternodes>
    <clusternode name="prox1" votes="1" nodeid="1"/>
    <clusternode name="prox2" votes="1" nodeid="2"/>
  </clusternodes>

</cluster>
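
After editing, validate the candidate file again and only then activate it (a sketch of the usual PVE 3.x workflow; remember to increase config_version on every change):

Code:
# validate the candidate config before activating it
ccs_config_validate -l /etc/pve/cluster.conf.new
# then activate the new version from the web GUI (Datacenter -> HA -> Activate)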

- the professional one: establish a third node and install a fence device; your cluster.conf will then look approximately like this:

Code:
<?xml version="1.0"?>
<cluster config_version="19" name="ims-cluster">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_ilo" ipaddr="192.168.1.10" login="user" name="fence" passwd="xxxx" power_wait="15"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="prox1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="fence" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="prox2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="fence" port="2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="prox3" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="fence" port="3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
<rm>
  <failoverdomains>
    <failoverdomain name="fodom" ordered="0" restricted="1">
      <failoverdomainnode name="prox1"/>
      <failoverdomainnode name="prox2"/>
    </failoverdomain>
  </failoverdomains>

  <pvevm autostart="1" vmid="100" domain="fodom"/>
  <pvevm autostart="1" vmid="101" domain="fodom"/>
  <pvevm autostart="1" vmid="102" domain="fodom"/>
  <pvevm autostart="1" vmid="103" domain="fodom"/>
</rm>
</cluster>
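
Once the new config is active on all nodes, you can verify membership and fencing state with the standard tools (output differs per setup):

Code:
pvecm status     # corosync membership and quorum
clustat          # rgmanager view of nodes and pvevm services
fence_tool ls    # fence domain membership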


Success!

Mr.Holmes
 
