Two-node active/passive cluster with DRBD. Is fencing necessary?

acidrop

Renowned Member
Jul 17, 2012
Hello,

I have a two-node cluster setup with two DRBD resources in active/passive mode. This is a test setup.
The first DRBD resource (r0) is primary on node A and the second (r1) is primary on node B.
I have created two different VMs, one on each node, located on DRBD r0 and r1 (LVM) respectively.

The problem: if I power off, for example, node A, promote r0 as primary on node B, and then try to manually move the VM config file from node A to node B with:

Code:
mv /etc/pve/nodes/proxmox1/qemu-server/100.conf /etc/pve/nodes/proxmox2/qemu-server/

I get:
Code:
mv: cannot move `100.conf' to `/etc/pve/nodes/proxmox2/qemu-server/100.conf': Device or resource busy

I also noticed that rgmanager is not running on the nodes, even if I restart the service.
In syslog I also get:

Code:
Feb  4 13:22:53 proxmox2 pmxcfs[1364]: [status] crit: cpg_send_message failed: 9

Code:
root@proxmox1:~# pvecm n
Node  Sts   Inc   Joined               Name
   1   M   3744   2013-02-04 12:55:24  proxmox1
   2   M   3744   2013-02-04 12:55:24  proxmox2
root@proxmox1:~# clustat
Cluster Status for pvecluster1 @ Mon Feb  4 13:34:39 2013
Member Status: Quorate


 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 proxmox1                                                            1 Online, Local
 proxmox2                                                            2 Online

Code:
root@proxmox1:~# pveversion -v
pve-manager: 2.2-32 (pve-manager/2.2/3089a616)
running kernel: 2.6.32-17-pve
proxmox-ve-2.6.32: 2.2-83
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-14-pve: 2.6.32-74
pve-kernel-2.6.32-17-pve: 2.6.32-83
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-34
qemu-server: 2.0-71
pve-firmware: 1.0-21
libpve-common-perl: 1.0-40
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.3-7
ksm-control-daemon: 1.1-1

My cluster.conf :

Code:
<?xml version="1.0"?>
<cluster config_version="5" name="pvecluster1">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <clusternodes>
    <clusternode name="proxmox1" nodeid="1" votes="1"/>
    <clusternode name="proxmox2" nodeid="2" votes="1"/>
  </clusternodes>
  <rm/>
</cluster>

Is it necessary to set up fencing in this scenario?
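If it is, I assume it would go into the same cluster.conf, roughly along these lines - just a sketch with a hypothetical fence_ipmilan device, made-up addresses/credentials, untested (and config_version would need to be bumped):

Code:
  <clusternode name="proxmox1" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="ipmi1"/>
      </method>
    </fence>
  </clusternode>
  <!-- same for proxmox2, referencing "ipmi2" -->
  <fencedevices>
    <fencedevice agent="fence_ipmilan" name="ipmi1" ipaddr="192.168.1.101" login="admin" passwd="secret"/>
    <fencedevice agent="fence_ipmilan" name="ipmi2" ipaddr="192.168.1.102" login="admin" passwd="secret"/>
  </fencedevices>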

Thank you

 
I doubt that active/passive can be configured. Our HA stack cannot manage DRBD active/passive.
 
Actually, I don't want the VMs to be managed by HA automatically. I just want to be able, if a node fails, to manually move the VM located on its DRBD resource to the other node.
Is that possible?
 
Yes, if you reach quorum again.
Without quorum /etc/pve is write-protected - so you can't move config files.
With "pvecm expected 1" you should be able to move the configs on the remaining node as well.

But why not active/active? You will have much more fun with live migration... Normally, live migration is used far more often than recovering from a failed node!
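With active/active (both DRBD resources primary on both nodes), a planned move of a running guest is just something like:

Code:
qm migrate 100 proxmox2 -online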

Udo
 
Thank you! That did the trick...

I don't have a fencing mechanism for now, so I do not want to risk data loss. I have also tried active/active and it's fun though :)
Am I safe with active/passive without fencing, or is there still a possibility of losing data?
 
Hi,
don't forget to change expected votes back to 2 again after the failed node is resolved.

Without a quorum disk, fencing will not work with a 2-node cluster - because the first thing that happens is losing quorum.

You need to intervene manually, and if you know what you are doing, active/active is not a higher risk than active/passive.
What is risky is running with two nodes and expected votes = 1!! In that case you can write to the same data from both machines (if one node is not really dead and you (or HA) start the VM on the other node).
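Once the failed node is back and has rejoined, something like:

Code:
pvecm expected 2      # back to the normal expected votes
pvecm status          # verify quorum and votes look sane again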

Udo
 
