I'll put all the places I've been and the things I've already tried at the end of this message, but the tl;dr is: I _THINK_ I have fencing set up correctly, but I keep getting the error below.
As you can see from the output, I don't really get a good idea of exactly what is failing.
Code:
root@moxie1:/etc/pve# fence_node -vv moxie3
fence moxie3 dev 0.0 agent fence_drac5 result: error from agent
agent args: nodename=moxie3 agent=fence_drac5 cmd_prompt=admin1-> ipaddr=10.14.0.12 login=user passwd=pass secure=1
fence moxie3 failed
From the syslog I can see this:
Code:
Feb 26 21:37:57 moxie1 fence_drac5: Parse error: Ignoring unknown option 'nodename=moxie3
Feb 26 21:38:03 moxie1 fence_node[41998]: fence moxie3 failed
From the command line, it works:
Code:
fence_drac5 -x -l user -p pass -a 10.14.0.12 -o status -n 1 -v -c "admin1->"
user@10.14.0.12's password:
/admin1-> racadm serveraction powerstatus
Server power status: ON
/admin1->Status: ON
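One difference between the two runs is how the options reach the agent: the command line above uses flags, while the "agent args:" line suggests fence_node hands the agent key=value pairs. Below is a minimal sketch for replaying that form by hand. It assumes fence_drac5 reads one key=value pair per line on stdin, as its man page describes, and that action=status is the accepted key for a harmless status query. build_agent_stdin is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Sketch: rebuild the agent's stdin input from the "agent args:" line that
# fence_node -vv prints. build_agent_stdin is a hypothetical helper name.
build_agent_stdin() {
    args=$1
    action=${2:-status}   # default to a status query rather than a reboot
    # One key=value pair per line. Assumes no value contains a space.
    printf '%s\n' "$args" | tr ' ' '\n'
    # Assumption: the agent accepts "action" as the stdin key for -o.
    printf 'action=%s\n' "$action"
}

ARGS='nodename=moxie3 agent=fence_drac5 cmd_prompt=admin1-> ipaddr=10.14.0.12 login=user passwd=pass secure=1'

# Only call the real agent when it is installed; otherwise just show the input.
if command -v fence_drac5 >/dev/null 2>&1; then
    build_agent_stdin "$ARGS" | fence_drac5
else
    build_agent_stdin "$ARGS"
fi
```

If the piped form fails while the flag form succeeds, the problem is more likely in how the options are parsed than in the connection to the DRAC.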
Excerpt from my cluster.conf file:
Code:
<?xml version="1.0"?>
<cluster name="SuperCluster" config_version="8">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>
<fencedevices>
<fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="10.14.0.10" login="user" name="node1-drac" passwd="pass" secure="1"/>
<fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="10.14.0.11" login="user" name="node2-drac" passwd="pass" secure="1"/>
<fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="10.14.0.12" login="user" name="node3-drac" passwd="pass" secure="1"/>
</fencedevices>
<clusternodes>
<clusternode name="moxie1" votes="1" nodeid="1">
<fence>
<method name="1">
<device name="node1-drac"/>
</method>
</fence>
</clusternode>
<clusternode name="moxie2" votes="1" nodeid="2">
<fence>
<method name="1">
<device name="node2-drac"/>
</method>
</fence>
</clusternode>
continues
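Since the excerpt is cut off before the moxie3 entry, one thing that can be ruled out mechanically is a clusternode pointing at a fence device name that is not defined. A rough sketch follows, assuming one XML element per line as in the excerpt and that the file lives at /etc/pve/cluster.conf. check_fence_refs is a hypothetical helper name, and this is a text-level check, not a real XML validation.

```shell
#!/bin/sh
# Sketch: report <device name="..."/> references that have no matching
# <fencedevice ... name="..."/> definition. Assumes one element per line,
# as in the excerpt. check_fence_refs is a hypothetical helper name.
check_fence_refs() {
    conf=$1
    # Names defined in <fencedevice .../> lines.
    defined=$(sed -n 's/.*<fencedevice .*name="\([^"]*\)".*/\1/p' "$conf")
    status=0
    # Names referenced from <device .../> lines inside <method> blocks.
    for ref in $(sed -n 's/.*<device .*name="\([^"]*\)".*/\1/p' "$conf"); do
        if ! printf '%s\n' "$defined" | grep -qxF "$ref"; then
            echo "undefined fence device: $ref"
            status=1
        fi
    done
    return $status
}

# The path is an assumption based on the /etc/pve prompt shown earlier.
CONF=${1:-/etc/pve/cluster.conf}
if [ -r "$CONF" ]; then
    check_fence_refs "$CONF" && echo "all fence device references resolve"
fi
```

A clean result only says the names line up. It says nothing about whether the agent accepts the attributes themselves.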
This URL said to put a timeout in there:
http://forum.proxmox.com/threads/14162-Error-fencing-proxmox-cluster-node
I tried that, as you can see below, but I can't tell whether it actually slowed anything down. I also couldn't find any XML documentation indicating that this is even a valid parameter.
Code:
<fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="10.14.0.12" login="user" name="node3-drac" passwd="pass" secure="1" timeout='35'/>
This URL said that there is a bug in the code, and it is referencing the wrong array value:
http://forum.proxmox.com/threads/15423-ProxMox-3-0-IMS-fencing-problem
I have modified the code as per the instructions, with no effect.
I also can't prove that fencing_snmp.py is even being called, since I am using a Drac.
I have updated all my software to the latest versions as of 2/25/14.
I have verified the client is actually talking to the iDrac (watching SSH traffic from the host to the target).
I tried setting secure="0" to see if I could force telnet and view the comms between the two boxes, but it still only goes over SSH, even when SSH is disabled on the Drac and telnet is available.
I set this all up with these instructions:
https://pve.proxmox.com/wiki/Fencing
That includes the modifications for iDrac6, which I have performed.
Usernames, passwords and some IP addresses have been changed to protect the non-functional.
I'm open to suggestions, and if that fails, I'll accelerate my purchase of a license to get dedicated support for this issue. It just seems to me that I shouldn't be the first person to encounter this on reasonably popular hardware.
Thanks for your suggestions.