Two Node Cluster error command clusvcadm -e pvevm failed: exit code 1

Alfio

Renowned Member
Jul 29, 2013
Hi, I'm trying to set up a two-node cluster to demonstrate Proxmox to students in my country. I am working with the following scenario:

A two-node cluster running Proxmox 2.3-12.
One node running Openfiler 2.3.

I have already set up the two-node cluster following http://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster, but I don't want to use DRBD. I want to use an iSCSI connection instead.

I added a LUN to my cluster, and both servers in the cluster can see it. After that I copied the configuration (cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new), edited /etc/pve/cluster.conf.new, and increased the version by one each time I made changes. When I change it on the primary server, the configuration gets copied to the second server.
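The copy-and-increment step described above can be sketched as a small shell helper. This is only an illustration under assumptions: `bump_config_version` is a hypothetical name, not a Proxmox tool, and it assumes the `config_version` attribute sits on the `<cluster>` element as in the configuration posted below.

```shell
# Sketch: copy cluster.conf to a new file and increment config_version by one.
# bump_config_version is a hypothetical helper, not part of Proxmox.
bump_config_version() {
    src=$1
    dst=$2
    # Extract the current version number from the <cluster ...> element.
    current=$(sed -n 's/.*<cluster[^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$src" | head -n 1)
    # Give up if no version attribute was found.
    [ -n "$current" ] || return 1
    next=$((current + 1))
    # Write the copy with the incremented version; the source stays untouched.
    sed "s/config_version=\"$current\"/config_version=\"$next\"/" "$src" > "$dst"
}

# Example call with the paths used in this post:
# bump_config_version /etc/pve/cluster.conf /etc/pve/cluster.conf.new
```

The helper only prepares the new file; any other edits to cluster.conf.new still have to be made by hand.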

I created a VM and added it to HA, but when I try to start the machine an error occurs:

clusvcadm -e pvevm failed: exit code 1

I know fence devices are important. I'm trying to do the fencing with a manual script because this is just for demonstration and testing purposes. Here are my configuration files:

root@proxmox01:~# cat /etc/hostname
proxmox01

root@proxmox02:/etc/pve# cat /etc/hostname
proxmox02


/etc/pve/cluster.conf



<?xml version="1.0"?>
<cluster config_version="9" name="cluster">
  <cman two_node="1" expected_votes="1"> </cman>
  <fencedevices>
    <fencedevice agent="fence_manual" name="human"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="proxmox01" nodeid="1" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="proxmox01"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="proxmox02" nodeid="2" votes="1">
      <fence>
        <method name="single">
          <device name="human" nodename="proxmox02"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="0" vmid="101"/>
  </rm>
</cluster>



Any help will be appreciated
 
There is no agent called 'fence_manual' (or did you create one?)

Is there any hint in syslog or /var/log/cluster/rgmanager.log?
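One way to look for such hints is to filter the logs for lines that mention the cluster daemons. The following is a minimal sketch; `cluster_log_hints` is a hypothetical helper name, the search pattern is an assumption about what is worth looking at, and /var/log/syslog is assumed to be the syslog path on the node.

```shell
# Sketch: print log lines that mention rgmanager or fencing.
# cluster_log_hints is a hypothetical helper name; adjust the pattern as needed.
cluster_log_hints() {
    # -h drops file name prefixes, -i ignores case, -E enables alternation.
    grep -hiE 'rgmanager|fence' "$@"
}

# Example call with the files mentioned in this thread:
# cluster_log_hints /var/log/syslog /var/log/cluster/rgmanager.log
```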
 
Hi, thanks for your answer. Here is /var/log/cluster/rgmanager.log from both servers:

root@proxmox01:~# cat /var/log/cluster/rgmanager.log
Jul 28 18:59:42 rgmanager Waiting for quorum to form
Jul 28 19:11:23 rgmanager Quorum formed

root@proxmox02:/etc/pve# cat /var/log/cluster/rgmanager.log
Jul 28 21:49:38 rgmanager Waiting for quorum to form
Jul 28 21:54:42 rgmanager Quorum formed

As for the agent called 'fence_manual', I thought it was a valid parameter. I saw it in an example and copied it.

While doing some research, I found this script:

#!/bin/bash
###############################################################################
#
# Possible parameters from STDIN described below
#
# action        = operation (on, off, reboot, monitor, list, status)
#                 on      - needs to be implemented -> WOL
#                 off     - not supported, will be automatic hardware reset
#                 reboot  - will be CTRL+ALT+DEL
#                 monitor - not supported
#                 list    - not supported
#                 status  - not supported
# option        = obsolete parameter - use action instead
#                 not parsed
# ipaddr        = hostname or ip
# login         = username or login name
# passwd        = password
# passwd_script = for script outside of cluster config
#                 not implemented
# port          = if port needs to be specified
#                 not implemented
# nodename      = if agent fences by node name, choose between nodename and
#                 port - preference although is port
#                 not parsed
#
###############################################################################

# define some variables
timestamp=$(date)
logfile=/var/log/fence_hetzner.log

# First check if we are running from command line
if [ $# -gt 0 ]
then
    while getopts "a:l:p:o:" opt
    do
        # parse valid arguments
        case $opt in
            a) address=$OPTARG ; echo "$address" ;;
            l) user=$OPTARG ; echo "$user" ;;
            p) pass=$OPTARG ; echo "$pass" ;;
            o) action=$OPTARG ; echo "$action" ;;
            # description needs to be implemented
            *) echo "Usage: $0 -a -l -p -o" ; exit 1 ;;
        esac
    done
# ok, so we are getting parameters from fenced
else
    while read -r LINE
    do
        # split input by =, and parse arguments
        param=$(echo "$LINE" | awk -F "=" '{print $1}')
        case $param in
            ipaddr) address=$(echo "$LINE" | awk -F "=" '{print $2}') ;;
            login) user=$(echo "$LINE" | awk -F "=" '{print $2}') ;;
            passwd) pass=$(echo "$LINE" | awk -F "=" '{print $2}') ;;
            action) action=$(echo "$LINE" | awk -F "=" '{print $2}') ;;
        esac
    done
fi

# translate action to hetzner webservice supported
case $action in
    off) action=hw ;;
    reboot) action=sw ;;
esac

# write command to be issued to logfile
echo "[$timestamp] Issuing: curl -u $user:$pass https://robot-ws.your-server.de/reset/$address -d type=$action" >> "$logfile"

# function to parse return code from webservice
check_http_response() {
    if [ "$1" = 200 ]
    then
        echo "[$timestamp] STATUS 200 - Reset ok" >> "$logfile"
        return 0
    else
        # write some basic error message to logfile
        case $1 in
            400) echo "[$timestamp] ERROR 400 - Invalid Input" >> "$logfile" ;;
            404) echo "[$timestamp] ERROR 404 - Server with $address not found" >> "$logfile" ;;
            409) echo "[$timestamp] ERROR 409 - Manual reset already active" >> "$logfile" ;;
            500) echo "[$timestamp] ERROR 500 - Reset failed due to internal error" >> "$logfile" ;;
        esac
        return 1
    fi
}

# issue our constructed call to parse function
check_http_response "$(curl --silent -o /dev/null -w '%{http_code}' -u "$user:$pass" "https://robot-ws.your-server.de/reset/$address" -d "type=$action")"

# exit with return code defined by function
exit $?
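According to its own comments, the script above receives its parameters as key=value lines on standard input when it is not started from the command line. A minimal sketch of that parsing step, using the shell's own field splitting instead of awk, might look like this. `parse_fence_args` is a hypothetical name and the sample values (192.0.2.10, demo, secret) are made up for illustration.

```shell
# Sketch: parse key=value lines from stdin into shell variables.
# parse_fence_args is a hypothetical name; the keys mirror the script above.
parse_fence_args() {
    # Split each line at the first "="; the rest of the line goes into value.
    while IFS='=' read -r key value
    do
        case $key in
            ipaddr) address=$value ;;
            login)  user=$value ;;
            passwd) pass=$value ;;
            action) action=$value ;;
        esac
    done
}

# Example: feed made-up parameters on stdin, the way the script expects them.
parse_fence_args <<EOF
ipaddr=192.0.2.10
login=demo
passwd=secret
action=reboot
EOF
echo "address=$address user=$user action=$action"
```

Because the function reads from a here-document rather than a pipe, it runs in the current shell and the variables remain set after it returns.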

But I have no clue how to implement it. Is there a way to build a test cluster without hardware fencing devices?

I was looking at http://pve.proxmox.com/wiki/Fencing#Enable_fencing_on_all_nodes and read that you can do fencing with an SNMP-managed switch. This works by sending the switch the proper command to disable one or more of its ports. With no route left to the shared storage, the fenced node can no longer start a VM or CT on it. But I'm running my test in a virtual environment on my laptop.

My question again: is there a way to build a test cluster without hardware fencing devices? Or can someone point me to a way to implement the manual script?

Thanks in advance.
 
Hi dietmar, thanks for your answer. I kept looking into it and you were right about the fence agent. I changed it to fence_ack_manual and everything is working as I expected. For example, I disconnected server 01, ran fence_ack_manual on server 02, and the VMs started on the second server. Works great. Thanks for the help.
 
