Two node active/passive cluster with DRBD. Is fencing necessary?

acidrop

Renowned Member
Jul 17, 2012
Hello,

I have a two-node cluster set up with two DRBD resources in active/passive mode. This is a test setup.
The first DRBD resource (r0) is primary on node A and the second (r1) is primary on node B.
I have created two different VMs, one on each node, located on DRBD r0 and r1 (LVM) respectively.
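
For reference, each DRBD resource is defined roughly like this (the device, backing disk and replication addresses below are placeholders, not my real values):

Code:
# /etc/drbd.d/r0.res - sketch only, all values are placeholders
resource r0 {
        protocol C;
        on proxmox1 {
                device    /dev/drbd0;
                disk      /dev/sdb1;
                address   10.0.0.1:7788;
                meta-disk internal;
        }
        on proxmox2 {
                device    /dev/drbd0;
                disk      /dev/sdb1;
                address   10.0.0.2:7788;
                meta-disk internal;
        }
}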

The problem is that if I power off node A, for example, promote r0 as primary on node B, and then try to manually move the VM config files from node A to node B with:

Code:
mv /etc/pve/nodes/proxmox1/qemu-server/100.conf /etc/pve/nodes/proxmox2/qemu-server/

and I get:

Code:
mv: cannot move `100.conf' to `/etc/pve/nodes/proxmox2/qemu-server/100.conf': Device or resource busy
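
For context, the full take-over I am attempting on node B is roughly the following (the LVM volume group name is just a placeholder for whatever sits on top of r0):

Code:
# on node B, after node A has been powered off
drbdadm primary r0          # promote the DRBD resource
vgchange -ay <vg-on-r0>     # activate the LVM volume group on r0 (placeholder name)
mv /etc/pve/nodes/proxmox1/qemu-server/100.conf /etc/pve/nodes/proxmox2/qemu-server/
qm start 100                # start the VM on node B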

I also noticed that rgmanager is not running on the nodes, even after restarting the service.
In syslog I get:

Code:
Feb  4 13:22:53 proxmox2 pmxcfs[1364]: [status] crit: cpg_send_message failed: 9

Code:
root@proxmox1:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M   3744   2013-02-04 12:55:24  proxmox1
   2   M   3744   2013-02-04 12:55:24  proxmox2
root@proxmox1:~# clustat
Cluster Status for pvecluster1 @ Mon Feb  4 13:34:39 2013
Member Status: Quorate


 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 proxmox1                                                            1 Online, Local
 proxmox2                                                            2 Online

Code:
root@proxmox1:~# pveversion -v
pve-manager: 2.2-32 (pve-manager/2.2/3089a616)
running kernel: 2.6.32-17-pve
proxmox-ve-2.6.32: 2.2-83
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-14-pve: 2.6.32-74
pve-kernel-2.6.32-17-pve: 2.6.32-83
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-34
qemu-server: 2.0-71
pve-firmware: 1.0-21
libpve-common-perl: 1.0-40
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.3-7
ksm-control-daemon: 1.1-1

My cluster.conf:

Code:
<?xml version="1.0"?>
<cluster config_version="5" name="pvecluster1">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <clusternodes>
    <clusternode name="proxmox1" nodeid="1" votes="1"/>
    <clusternode name="proxmox2" nodeid="2" votes="1"/>
  </clusternodes>
  <rm/>
</cluster>

Is it necessary to set up fencing in this scenario?

Thank you

 
I doubt that active/passive can be configured. Our HA stack cannot manage DRBD active/passive.
 
Actually, I don't want the VMs to be managed by HA automatically. I just want to be able, if a node fails, to manually move the VMs located on its DRBD resource to the other node.
Is that possible?
 
Actually, I don't want the VMs to be managed by HA automatically. I just want to be able, if a node fails, to manually move the VMs located on its DRBD resource to the other node.
Is that possible?
Yes, if you reach quorum again.
Without quorum /etc/pve is write-protected, so you can't move config files.
With "pvecm expected 1" you should be able to move the configs on the remaining node as well.

But why not active/active? You will have much more fun with live migration... Normally, live migration is often used instead of recovering a failed node!

Udo
 
Yes, if you reach quorum again.
Without quorum /etc/pve is write-protected, so you can't move config files.
With "pvecm expected 1" you should be able to move the configs on the remaining node as well.

Thank you! That did the trick...

But why not active/active? You will have much more fun with live migration... Normally, live migration is often used instead of recovering a failed node!

Udo

I don't have a fencing mechanism for now, so I don't want to risk data loss. I have also tried active/active and it's fun though :)
Am I safe with active/passive without fencing, or is there still a possibility of losing data?
 
Thank you! That did the trick...



I don't have a fencing mechanism for now, so I don't want to risk data loss. I have also tried active/active and it's fun though :)
Am I safe with active/passive without fencing, or is there still a possibility of losing data?

Hi,
don't forget to change expected back to 2 again after the failed node has been recovered.
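
I.e. once both nodes are joined again:

Code:
pvecm expected 2     # restore the normal expected votes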

Without a quorum disk, fencing will not work in a two-node cluster, because the first thing that happens is that quorum is lost.

You need to do things manually, and if you know what you are doing, active/active is not a higher risk than active/passive.
What is risky is running two nodes with expected votes = 1!! In that case you can write to the same data from both machines (if one node is not really dead and you (or HA) start the VM on the other node).
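
Before you promote anything on the remaining node it is worth checking that the peer is really gone, e.g. (just a suggestion from me, not a replacement for fencing):

Code:
cat /proc/drbd       # overall DRBD status
drbdadm cstate r0    # connection state - the peer should not show as Connected
drbdadm role r0      # should report e.g. Secondary/Unknown before you promote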

Udo