I have a cluster with two Proxmox nodes and a quorum disk hosted on an external storage server. Everything seems all right, but when I simulated some failures to see how a VM under HA behaves, the VM was migrated only once the failed node had been rebooted and came back up healthy, which is not really useful.
After investigating, I found problems with fencing. I have never used fencing before, so ...
The first node uses IPMI and the second uses iLO.
"fence_tool -n ls" looks good, but "fence_check -vv" reports an error on the fence device of both nodes. Both outputs follow, then my "/etc/pve/cluster.conf".
Here is the "fence_tool -n ls" output:
Code:
fence domain
member count 2
victim count 0
victim now 0
master nodeid 2
wait state none
members 1 2
all nodes
nodeid 1 member 1 victim 0 last fence master 2 how member
nodeid 2 member 1 victim 0 last fence master 1 how member
Here is the "fence_check -vv" output:
Code:
fence_check run at Fri Dec 5 16:50:24 CET 2014 pid: 15669
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Get node list: PM2 pm4
Testing PM2 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node PM2
Found 1 method(s) to test for node PM2
Testing PM2 method 1 status
Testing PM2 method 1: FAILED
status PM2 dev 0.0 agent fence_ipmilan result: status error
Testing pm4 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node pm4
Found 1 method(s) to test for node pm4
Testing pm4 method 1 status
Testing pm4 method 1: FAILED
status pm4 dev 0.0 agent fence_ilo result: status error
cleanup: 5
Here is "/etc/pve/cluster.conf"
Code:
<?xml version="1.0"?>
<cluster config_version="15" name="IMEREOS">
  <cman expected_votes="3" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <quorumd allow_kill="0" interval="1" label="pm_quorum" tko="10" votes="1"/>
  <totem token="54000"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="172.20.0.121" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxxxxxxx" power_wait="5"/>
    <fencedevice agent="fence_ilo" ipaddr="172.20.0.141" login="Administrator" name="iloPM4" passwd="xxxxxxxxxxxxxx"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="PM2" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pm4" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="iloPM4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="0" vmid="107"/>
    <pvevm autostart="1" vmid="102"/>
  </rm>
</cluster>
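To see which options each node's fence agent would receive from this configuration, here is a minimal Python sketch. It rests on assumptions: that the attributes of a `<fencedevice>` are simply combined with those of the `<device>` entry that references it (per-node values winning), and that the action can be forced to `status` so nothing gets powered off. The function name `fence_options` and the embedded `CLUSTER_CONF` copy (trimmed, XML declaration omitted) are made up for illustration and are not part of any Proxmox or cluster tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf above; passwords are the same placeholders.
CLUSTER_CONF = """<cluster config_version="15" name="IMEREOS">
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="172.20.0.121" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxxxxxxx" power_wait="5"/>
    <fencedevice agent="fence_ilo" ipaddr="172.20.0.141" login="Administrator" name="iloPM4" passwd="xxxxxxxxxxxxxx"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="PM2" nodeid="1" votes="1">
      <fence><method name="1"><device name="ipmi1"/></method></fence>
    </clusternode>
    <clusternode name="pm4" nodeid="2" votes="1">
      <fence><method name="1"><device action="reboot" name="iloPM4"/></method></fence>
    </clusternode>
  </clusternodes>
</cluster>"""


def fence_options(conf_xml, action="status"):
    """Return {node name: [(agent, {option: value}), ...]} for each fence device."""
    root = ET.fromstring(conf_xml)
    # Index the shared device definitions by their name attribute.
    devices = {d.get("name"): dict(d.attrib) for d in root.iter("fencedevice")}
    result = {}
    for node in root.iter("clusternode"):
        entries = []
        for ref in node.iter("device"):
            opts = dict(devices[ref.get("name")])  # shared device settings
            opts.update(ref.attrib)                # per-node settings (assumed to win)
            agent = opts.pop("agent")              # selects the agent binary
            opts.pop("name", None)                 # internal reference, not an agent option
            opts["action"] = action                # force a harmless status query
            entries.append((agent, opts))
        result[node.get("name")] = entries
    return result


if __name__ == "__main__":
    for node, entries in fence_options(CLUSTER_CONF).items():
        for agent, opts in entries:
            print(f"# {node}: options for {agent}")
            for key, value in sorted(opts.items()):
                print(f"{key}={value}")
```

As far as I know, fence agents accept such `key=value` lines on standard input, so feeding the printed block for one node to the matching agent by hand should show the real error behind "status error". Note that the sketch replaces the `action="reboot"` set on pm4's device with `status`.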