Proxmox VE 3.0 Clustering

asaguru

Hi,
I am trying to create a Proxmox cluster between 3 nodes. The cluster forms and is quorate, but I cannot get rgmanager to start, and I am not able to start the VMs on the nodes either. Could you please help me with this?
Please also point me to documentation on how to create failover domains and set up automatic failover.

root@pvnode2:~# clustat
Cluster Status for prodpve @ Mon Jul 8 18:04:53 2013
Member Status: Quorate

 Member Name                 ID   Status
 ------ ----                 ----  ------
 pvnode1                      1   Online
 pvnode2                      2   Online, Local
 pvnode3                      3   Online

root@pvnode2:~# /etc/init.d/rgmanager status
rgmanager is stopped
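
For reference, on a Proxmox VE 3.x node rgmanager has to be started separately on every node once cman is up; a minimal check sequence (only a sketch, assuming a standard install) would be:

# make sure cman (corosync, fencing) is running before rgmanager
/etc/init.d/cman status

# start rgmanager on this node and verify it
/etc/init.d/rgmanager start
/etc/init.d/rgmanager status

# once rgmanager has joined, clustat should show an "rgmanager" flag next to the node
clustat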

root@pvnode2:~# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="5" name="prodpve">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <clusternodes>
    <clusternode name="pvnode1" nodeid="1" votes="1"/>
    <clusternode name="pvnode2" nodeid="2" votes="1"/>
    <clusternode name="pvnode3" votes="1" nodeid="3"/>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>
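
On the failover-domain question: in a cman/rgmanager cluster.conf, failover domains are declared inside the <rm> section and then referenced from the service entries. The following is only a hedged sketch; the domain name and priorities are placeholders, and the domain attribute on pvevm should be verified against the Proxmox HA documentation before use (and remember to bump config_version whenever cluster.conf is changed):

<rm>
  <!-- ordered, unrestricted domain: prefer pvnode1, then pvnode2, then pvnode3 -->
  <failoverdomains>
    <failoverdomain name="prefer_node1" ordered="1" restricted="0" nofailback="0">
      <failoverdomainnode name="pvnode1" priority="1"/>
      <failoverdomainnode name="pvnode2" priority="2"/>
      <failoverdomainnode name="pvnode3" priority="3"/>
    </failoverdomain>
  </failoverdomains>
  <!-- bind an HA-managed VM to that domain -->
  <pvevm autostart="1" vmid="100" domain="prefer_node1"/>
</rm>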
 
Hi Tom,
Thanks for your reply.
I have added fencing, but now I still cannot start the VM on the node.

<?xml version="1.0"?>
<cluster config_version="7" name="test">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.1" login="admin" name="node1-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.2" login="admin" name="node2-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.3" login="admin" name="node3-drac" passwd="XXXX" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="pvnode1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pvnode2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="node2-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pvnode3" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="node3-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
  </rm>
</cluster>
----------------error log----------------------
UPID:pvnode1:00005FD0:00654D92:51DBE5BF:qmstart:100:root@pam: 1 51DBE5BF No accelerator found!
UPID:pvnode1:00005F64:0065444B:51DBE5A7:qmstart:100:root@pam: 1 51DBE5A7 No accelerator found!
UPID:pvnode1:00005BF4:00652FF4:51DBE573:qmstart:100:root@pam: 1 51DBE573 No accelerator found!
UPID:pvnode1:0000589C:0065213F:51DBE54E:qmcreate:100:root@pam: 1 51DBE54E OK
UPID:pvnode1:0000587F:00650ED6:51DBE51F:qmdestroy:100:root@pam: 1 51DBE51F OK
UPID:pvnode1:0000586E:0065089B:51DBE50F:qmcreate:100:root@pam: 1 51DBE50F OK
UPID:pvnode1:00005821:0064D69E:51DBE48F:qmdestroy:100:root@pam: 1 51DBE48F OK
UPID:pvnode1:000057D5:006499BB:51DBE3F3:qmstart:100:root@pam: 1 51DBE3F3 No accelerator found!
UPID:pvnode1:00005758:00645E24:51DBE35A:qmstart:100:root@pam: 1 51DBE35A No accelerator found!
UPID:pvnode1:00005740:00645D33:51DBE358:hastart:100:root@pam: 1 51DBE35F command 'clusvcadm -e pvevm:100 -m pvnode1' failed: exit code 255
UPID:pvnode1:00005728:00645AF9:51DBE352:qmstart:100:root@pam: 1 51DBE352 No accelerator found!
UPID:pvnode1:000056E3:00645902:51DBE34D:qmstart:100:root@pam: 1 51DBE34D No accelerator found!
UPID:pvnode1:0000569D:00645735:51DBE349:qmstart:100:root@pam: 1 51DBE349 No accelerator found!
UPID:pvnode1:00005658:00645580:51DBE344:qmstart:100:root@pam: 1 51DBE344 No accelerator found!
UPID:pvnode1:0000562E:00644F4D:51DBE334:qmstart:100:root@pam: 1 51DBE334 No accelerator found!
UPID:pvnode1:000052BD:0063D0D1:51DBE1F0:hastart:100:root@pam: 1 51DBE1F0 command 'clusvcadm -e pvevm:100 -m pvnode1' failed: exit code 1
UPID:pvnode1:00000D0D:0003651D:51DAEB1A:qmstart:100:root@pam: 1 51DAEB1B No accelerator found!
UPID:pvnode1:00000D09:000362BC:51DAEB14:qmcreate:100:root@pam: 1 51DAEB15 OK
UPID:pvnode1:00000957:0000067A:51DAE27A:startall::root@pam: 1 51DAE27A OK
-------------------------------
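
As a side note on the repeated "No accelerator found!" entries above: this is the message qm prints when KVM hardware virtualization (Intel VT-x / AMD-V) is not available on the host, usually because it is disabled in the BIOS or the kvm modules are not loaded. A quick, hedged check on each node:

# count CPU flags for hardware virtualization; 0 means VT-x/AMD-V is off or unsupported
egrep -c '(vmx|svm)' /proc/cpuinfo

# check that the kvm kernel modules are loaded
lsmod | grep kvm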
 
Hi Tom,
Thanks for your help.
Now I can start the VM, but I am trying to do HA: I just powered off one server (node2) and the VM is not starting up on another node. Also, I am creating multiple VMs, but they are all starting only on node2.
<?xml version="1.0"?>
<cluster config_version="10" name="test">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.1" login="admin" name="node1-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.2" login="admin" name="node2-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.3" login="admin" name="node3-drac" passwd="XXXX" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="pvnode1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pvnode2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="node2-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pvnode3" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="node3-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>


------------------
root@pvnode2:~# clustat
Cluster Status for testpv @ Thu Jul 11 11:07:49 2013
Member Status: Quorate

 Member Name                 ID   Status
 ------ ----                 ----  ------
 pvnode2                      1   Online, Local, rgmanager
 pvnode3                      2   Online, rgmanager
 pvnode1                      3   Online, rgmanager

 Service Name                Owner (Last)    State
 ------- ----                ----- ------    -----
 pvevm:100                   pvnode2         started
 pvevm:101                   pvnode2         started

Again, thanks for your help. If we fix this HA issue, I can move some of my production VMware servers to Proxmox.
 
If you power off a server, you also disable the fencing device on that server, as you are using an integrated device.

To be able to do such a test, you need an independent fencing device, e.g. APC power fencing.
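
For illustration only, an external power-fencing entry using the stock fence_apc agent could look roughly like the sketch below; the PDU address, outlet port and credentials are placeholders, and the exact attributes depend on the PDU model:

<fencedevices>
  <fencedevice agent="fence_apc" ipaddr="10.0.1.100" login="apc" name="apc-pdu" passwd="XXXX"/>
</fencedevices>

<clusternode name="pvnode1" nodeid="1" votes="1">
  <fence>
    <method name="power">
      <!-- outlet 1 on the PDU powers pvnode1 -->
      <device name="apc-pdu" port="1"/>
    </method>
  </fence>
</clusternode>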
 
I have a problem with my Proxmox install after trying to enable clustering. I followed the wiki to enable clustering, and now the primary server has permission problems on the /etc/pve folder. This is preventing backups of QEMU servers, as it fails to write the temp file in the nodes folder.
Is there a way to correct the permissions? See the screenshot attached.
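
One thing worth checking here (a hedged suggestion, not a confirmed diagnosis): /etc/pve is the pmxcfs cluster filesystem, and it becomes read-only whenever the node has no quorum, which looks like a permission problem to backup jobs. For example:

# show cluster membership and quorum state as seen by this node
pvecm status

# pmxcfs is provided by the pve-cluster service; restart it after fixing the cluster config
/etc/init.d/pve-cluster restart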

 
Hi Tom,
I am trying to achieve HA: if one server goes down, all the VMs on that server need to start on another node.
I got your point about independent fencing. If I don't have one, how can I configure HA?
The problems I am facing with my current setup are that HA does not work, and all the VMs are currently running on one node even though I created them on different nodes.
I even tried to move them to another node using clusvcadm -r pvevm:101 -m pvnode3; VM 101 was created on pvnode3.

-----
# tail /var/log/cluster/rgmanager.log
Jul 11 16:00:24 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:25 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:26 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:27 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:28 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:28 rgmanager Service pvevm:101 is stopped
Jul 11 16:00:32 rgmanager #70: Failed to relocate pvevm:101; restarting locally
Jul 11 16:00:32 rgmanager Recovering failed service pvevm:101
Jul 11 16:00:32 rgmanager [pvevm] Move config for VM 101 to local node
Jul 11 16:00:33 rgmanager Service pvevm:101 started
---------------------
error from console

Executing HA migrate for VM 101 to node pvnode3
Trying to migrate pvevm:101 to pvnode3...Failure
TASK ERROR: command 'clusvcadm -M pvevm:101 -m pvnode3' failed: exit code 255


I am using an NFS share for the VM disks. Is there any failover group I have to add, and if so, how do I add it? I am expecting your help on this. I need to demonstrate HA; only then can I move Proxmox into the testing phase for production.
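
When a relocation keeps failing like this, one hedged way to narrow it down is to drive the move from the shell and watch both nodes; the "Failed to relocate ... restarting locally" message usually means the target node refused to start the service, so its log matters as much as the local one:

# current placement of the HA services
clustat

# try to relocate VM 101 to pvnode3, then re-check
clusvcadm -r pvevm:101 -m pvnode3
clustat

# also follow the rgmanager log on the target node (pvnode3), not only on the source
tail -f /var/log/cluster/rgmanager.log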


Regards,
Alagu.S
 
No thread hijacking please. Open a new thread.
 
Hi Tom,
Any advice and help on my issue would be appreciated.
 
There could be several reasons for this; find it out (read the logs, etc.).

Debugging an HA setup is complex, especially for beginners.
 
Hi Tom,

I have 3 Dell servers (pvnode1, pvnode2, pvnode3) and a 50 GB NFS share, and I am trying to create HA between these 3 nodes. I am creating VMs on pvnode2 and pvnode3, but when I create a VM guest on these nodes, it will only start on pvnode2, not on pvnode3.

When I try to move the VM guest to pvnode3, I get the error below in the web interface.
--------------------------
Executing HA migrate for VM 101 to node pvnode3
Trying to migrate pvevm:101 to pvnode3...Failure
TASK ERROR: command 'clusvcadm -M pvevm:101 -m pvnode3' failed: exit code 255
----------------------------
Below is my cluster.conf:
<?xml version="1.0"?>
<cluster config_version="10" name="test">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.1" login="admin" name="node1-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.2" login="admin" name="node2-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="10.0.1.3" login="admin" name="node3-drac" passwd="XXXX" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="pvnode1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pvnode2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="node2-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pvnode3" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="node3-drac" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>

Do I need to configure a failover domain? If so, please let me know how to configure it in Proxmox.
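
Besides the failover-domain question, one common reason a relocation is refused is that the target node cannot see the VM's storage. A quick, hedged check that the NFS storage is defined cluster-wide and reachable from every node:

# storage definitions live in the cluster-wide config
cat /etc/pve/storage.cfg

# run on each node: the NFS storage should show up as active with free space
pvesm status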
My clustat output:

root@pvnode2:~# clustat
Cluster Status for testpv @ Thu Jul 11 17:47:30 2013
Member Status: Quorate

 Member Name                 ID   Status
 ------ ----                 ----  ------
 pvnode2                      1   Online, Local, rgmanager
 pvnode3                      2   Online, rgmanager
 pvnode1                      3   Online, rgmanager

 Service Name                Owner (Last)    State
 ------- ----                ----- ------    -----
 pvevm:100                   pvnode2         started
 pvevm:101                   pvnode2         started

--------------------corosync.log
Jul 10 18:32:43 corosync [CLM ] CLM CONFIGURATION CHANGE
Jul 10 18:32:43 corosync [CLM ] New Configuration:
Jul 10 18:32:43 corosync [CLM ] r(0) ip(10.0.0.7)
Jul 10 18:32:43 corosync [CLM ] r(0) ip(10.0.0.229)
Jul 10 18:32:43 corosync [CLM ] Members Left:
Jul 10 18:32:43 corosync [CLM ] Members Joined:
Jul 10 18:32:43 corosync [CLM ] CLM CONFIGURATION CHANGE
Jul 10 18:32:43 corosync [CLM ] New Configuration:
Jul 10 18:32:43 corosync [CLM ] r(0) ip(10.0.0.4)
Jul 10 18:32:43 corosync [CLM ] r(0) ip(10.0.0.7)
Jul 10 18:32:43 corosync [CLM ] r(0) ip(10.0.0.229)
Jul 10 18:32:43 corosync [CLM ] Members Left:
Jul 10 18:32:43 corosync [CLM ] Members Joined:
Jul 10 18:32:43 corosync [CLM ] r(0) ip(10.0.0.4)
Jul 10 18:32:43 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jul 10 18:32:43 corosync [QUORUM] Members[3]: 1 2 3
Jul 10 18:32:43 corosync [QUORUM] Members[3]: 1 2 3
Jul 10 18:32:44 corosync [CPG ] chosen downlist: sender r(0) ip(10.0.0.229) ; members(old:2 left:0)
Jul 10 18:32:44 corosync [MAIN ] Completed service synchronization, ready to provide service.
Jul 10 18:35:08 corosync [QUORUM] Members[3]: 1 2 3
Jul 11 08:54:29 corosync [TOTEM ] Retransmit List: 10cf1
Jul 11 08:54:29 corosync [TOTEM ] Retransmit List: 10cf2
--------rgmanager log
Jul 11 16:00:24 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:25 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:26 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:27 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:28 rgmanager [pvevm] Task still active, waiting
Jul 11 16:00:28 rgmanager Service pvevm:101 is stopped
Jul 11 16:00:32 rgmanager #70: Failed to relocate pvevm:101; restarting locally
Jul 11 16:00:32 rgmanager Recovering failed service pvevm:101
Jul 11 16:00:32 rgmanager [pvevm] Move config for VM 101 to local node
Jul 11 16:00:33 rgmanager Service pvevm:101 started

I have provided only the logs that reported errors, not all the success logs.

I would appreciate your valuable time and help with this issue.

Regards,
Alagu.S
 
