Fence configuration

MartinL

Nov 21, 2014
I have a cluster with two Proxmox nodes and a quorum disk on an external storage server. Everything seems all right, but when I simulated some failures to see how the HA VMs behave, the VM was migrated only when the defective node was rebooted cleanly, without any failure, which is not really useful.
After investigating, I found problems with fencing.
The first node uses IPMI and the second uses iLO.
Here is the "fence_tool -n ls" output; everything looks good:
Code:
fence domain
member count  2
victim count  0
victim now    0
master nodeid 2
wait state    none
members       1 2 
all nodes
nodeid 1 member 1 victim 0 last fence master 2 how member
nodeid 2 member 1 victim 0 last fence master 1 how member
But in the "fence_check -vv" output, I found errors on both nodes' fence devices:
Code:
fence_check run at Fri Dec  5 16:50:24 CET 2014 pid: 15669
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Get node list: PM2 pm4
Testing PM2 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node PM2
Found 1 method(s) to test for node PM2
Testing PM2 method 1 status
Testing PM2 method 1: FAILED
status PM2 dev 0.0 agent fence_ipmilan result: status error
Testing pm4 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node pm4
Found 1 method(s) to test for node pm4
Testing pm4 method 1 status
Testing pm4 method 1: FAILED
status pm4 dev 0.0 agent fence_ilo result: status error
cleanup: 5

Here is "/etc/pve/cluster.conf"
Code:
<?xml version="1.0"?>
<cluster config_version="15" name="IMEREOS">
  <cman expected_votes="3" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <quorumd allow_kill="0" interval="1" label="pm_quorum" tko="10" votes="1"/>
  <totem token="54000"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="172.20.0.121" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxxxxxxx" power_wait="5"/>
    <fencedevice agent="fence_ilo" ipaddr="172.20.0.141" login="Administrator"  name="iloPM4" passwd="xxxxxxxxxxxxxx"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="PM2" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pm4" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="iloPM4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="0" vmid="107"/>
    <pvevm autostart="1" vmid="102"/>
  </rm>
</cluster>
I have never used fencing before, so ... :confused:
 
I found some of my mistakes:
-the "ipmitool" package was missing on the second node
-I used "fence_ilo" instead of "fence_ilo3"
-the "gnutls-bin" package was missing on both nodes (it is used by fence_ilo)

Fencing now works fine for the node that uses IPMI, but not for the one that uses iLO:
Code:
root@PM2:~# fence_ilo3 -l Administrator -p xxxxxxxx -a 1xxxxxxxx1 -o status -vv
INFO:root:Executing: /usr/bin/ipmitool -I lanplus -H 1xxxxxxxxx1 -U Administrator -P xxxxxxxx -C 0 -p 623 -L ADMINISTRATOR chassis power status

DEBUG:root:1  Error in open session response message : invalid authentication algorithm

Error: Unable to establish IPMI v2 / RMCP+ session
Unable to get Chassis Power Status


ERROR:root:Failed: Unable to obtain correct plug status or plug is not available
 

Change -C 0 to -C 1
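
To make this stick in cluster.conf rather than only on the command line: -C is the cipher suite option that ends up on the ipmitool command line, and the fence_ipmilan family of agents (fence_ilo3 included) should accept a matching "cipher" attribute. A minimal sketch of the iLO device line under that assumption, reusing the device name, address and login from the config posted above; the attribute name is an assumption, so check "fence_ilo3 -h" on your version before relying on it:

```xml
<!-- Assumption: cipher="1" is the cluster.conf equivalent of "-C 1" on the command line -->
<!-- agent switched from fence_ilo to fence_ilo3, as described in the follow-up post -->
<fencedevice agent="fence_ilo3" cipher="1" ipaddr="172.20.0.141" login="Administrator" name="iloPM4" passwd="xxxxxxxxxxxxxx"/>
```

Increase config_version when you change cluster.conf, then run "fence_check -vv" again to see whether the status test for pm4 passes.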