DRBD or Ceph Storage on a Two Node Proxmox HA Cluster?

TJ101

New Member
Mar 24, 2014
Hi all,


I would appreciate some advice :)


I have been looking at the following link:-


http://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster


This link tells you how to set up a two-node Proxmox High Availability cluster using DRBD (for network-based RAID 1 replicated storage in Primary/Primary mode, with LVM on top).


I've always been a bit wary of using DRBD after a few problems with split brain.


Now with the launch of Proxmox 3.2 we have the ability to use Ceph storage.



Can we use Ceph on just two nodes?


If so, has anyone tried this and can offer some inputs?
 
Don't be wary of DRBD; split-brain is a non-issue with a little planning and giving up a tiny bit of flexibility.
The DRBD wiki article suggests having two DRBD volumes.
The VMs you want running on node A would be stored on one DRBD volume, and the VMs for node B on the other.
When a split-brain happens, you just tell DRBD to discard the data on the node that was not running the VMs stored on that volume, and it will resync without issue.
It is also pretty rare to have a split-brain.
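For reference, the recovery boils down to a couple of drbdadm commands on the side you are throwing away. This is just a sketch (the resource name r0 is an example, and the exact syntax differs slightly between DRBD versions, so check the wiki for the steps that match your install):
Code:
# on the node whose copy you are discarding (the one that was NOT
# running the VMs stored on this resource):
drbdadm secondary r0
drbdadm -- --discard-my-data connect r0

# on the surviving node, if its connection dropped to StandAlone:
drbdadm connect r0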

Ceph is great too, but in my testing DRBD performs much better on the same hardware.
I also think Ceph might be a better choice if you want to use the HA feature in Proxmox.
If a split-brain happened and, before it was repaired, HA kicked in and started the VM on the other node, that VM would be running on stale data. This is the only problem with DRBD and HA that I know of.

Hope that helps
 
Hi



Thanks for the very useful input.



I followed this tutorial, http://pve.proxmox.com/wiki/DRBD, to create the DRBD storage, which resulted in just one DRBD volume.


After reading your comments I would like to split the DRBD volumes, one for each node.


It looks like I will need to remove the existing LVM volume from Proxmox, remove DRBD, and split my physical disk into two partitions.



Do you think this is a worthwhile exercise?


Can you provide any input on how to safely reverse the tutorial steps above and start again? :)



On a related topic, which I have raised in the thread below, I have a problem with iSCSI and a third quorum disk when using two nodes.



Please see:-



http://forum.proxmox.com/threads/18...vailability-cluster-problems-with-Quorum-Disk



iscsiadm -m node -T iqn.BLAHBLAH -p IPOFiSCSISERVER -l is not persistent on Node B and does not survive a reboot.
When I check the /etc/iscsi/iscsid.conf on Node A, I see it is set to node.startup=automatic, whereas on Node B this is set to manual.



Presumably this setting has been changed by the Proxmox Cluster manager?



I therefore created the iscsi target in the Proxmox web interface and assigned it to both Proxmox Cluster nodes.



When the Proxmox cluster starts on Node A, the iSCSI target is logged in before Proxmox checks for the quorum disk, thanks to node.startup = automatic.



However, on Node B, Proxmox checks for the quorum disk before the iSCSI target is logged in, due to node.startup = manual, so the storage only becomes available after the initial check.



After Node B boots, if I run /etc/init.d/cman reload I can then see the quorum disk when running clustat.



Is it a problem if I change node.startup = manual to automatic on Node B?
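If it is safe, I am guessing the per-target way to make the login stick (rather than editing iscsid.conf globally) would be something like the command below, using the same IQN and portal as above, though I am not sure whether the cluster manager would simply overwrite it again:
Code:
# update the stored record for this target so it is logged in automatically at boot
iscsiadm -m node -T iqn.BLAHBLAH -p IPOFiSCSISERVER --op update -n node.startup -v automatic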
 
After reading your comments I would like to split the DRBD volumes, one for each node.
It looks like I will need to remove the existing LVM volume from Proxmox, remove DRBD, and split my physical disk into two partitions.
Do you think this is a worthwhile exercise?

This section on the wiki explains why it would be best to have two volumes and how to recover from a split-brain when you have one or two volumes.
http://pve.proxmox.com/wiki/DRBD#Recovery_from_communication_failure

IMHO it is worth the effort to start over and create two volumes. It will take less time to start over than recovering from two split-brains when using a single DRBD volume.
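To give you an idea, the two resources could look roughly like this. The hostnames, partitions, IPs and ports below are placeholders, and the wiki's DRBD article has the full set of recommended net/startup options for dual-primary use:
Code:
# /etc/drbd.d/r0.res -- VMs that normally run on node A
resource r0 {
    protocol C;
    startup { become-primary-on both; }
    net     { allow-two-primaries; }
    on nodeA {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on nodeB {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}

# /etc/drbd.d/r1.res -- VMs that normally run on node B
# identical except: resource r1, device /dev/drbd1, disk /dev/sdb2,
# and a different port (e.g. 7789) in both address lines

You would then put a separate LVM volume group on /dev/drbd0 and /dev/drbd1 and add them as two storage entries in Proxmox.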
 
Hi, could you offer a procedure to safely undo a single DRBD volume created by following the guide?
 
Hi,


Further to my last post I think I have worked out a procedure to remove the existing DRBD resource and will post my results once completed.


I am however faced with another problem.


Before proceeding I need to remove a VM currently using the DRBD storage.


Even though I have shut down the VM, I cannot remove it.


When I try to remove the VM, I see VM XXX destroy in the tasks log with an Error: unexpected status.


When I look at the output of the task I see:-


trying to aquire cfs lock 'storage-drbd-storage' ...TASK ERROR: got lock request timeout


I can start and stop the VM normally.

How do I resolve this?
 
If you used internal metadata when setting up DRBD (you did if you followed the wiki), then you will most likely need to zero the end of the partition before you can set up a new DRBD volume there. When you split the area into two partitions, the one that contains the end of the original partition will be the one with the problem. Alternatively, it may be possible to use a 'force' option when you run into the existing-metadata problem.
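If you do run into that, something along these lines should clear it. The partition and resource names are only examples, so triple-check you are pointing at the right device before zeroing anything:
Code:
# before repartitioning, if the old resource is still defined in /etc/drbd.d/,
# I believe drbdadm can wipe the metadata directly:
drbdadm wipe-md r0

# or, after repartitioning, zero the tail of the second partition, since
# internal metadata lives at the very end of the original device:
SECTORS=$(blockdev --getsz /dev/sdb2)
dd if=/dev/zero of=/dev/sdb2 bs=1M seek=$(( SECTORS / 2048 - 128 )) count=128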

Regarding your cfs lock error, do you have quorum? I believe that error is thrown when /etc/pve is read-only, which happens when quorum is lost.
This will tell you if you have quorum or not:
Code:
clustat
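If I remember right, pvecm status will show the quorum state as well:
Code:
pvecm status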
 
Hiya,

What if I remove the LVM storage from Proxmox, then DRBD (by shutting down the service), and then remove the underlying physical disk partition?
 
http://www.drbd.org/users-guide/ch-internals.html#s-internal-meta-data

DRBD metadata is stored at the end of the device (a partition, in your case).
You can remove the storage from Proxmox, stop DRBD, and remove the partition containing DRBD, but the data will still be there.
When you make two new partitions, the second one will have the old DRBD metadata at the end of it (assuming you create the partitions in the same physical space as the single partition you had before).
That leftover metadata will cause some issues when trying to set up DRBD.
This is not a big deal, but when some commands fail because of the existing metadata, you will know what the problem is and how to fix it.
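Since you asked for a procedure, this is roughly how I would go about it. The storage ID, volume group name and resource names are just guesses at what you used when following the wiki, so adjust them to match your setup, and make sure no VM is still using the volume first:
Code:
# 1. remove the LVM storage entry from Proxmox (your error message suggests
#    the storage ID is 'drbd-storage')
pvesm remove drbd-storage

# 2. deactivate and remove the LVM layer sitting on top of DRBD
#    (assuming your volume group is named drbdvg -- use whatever you named it)
vgchange -an drbdvg
vgremove drbdvg
pvremove /dev/drbd0

# 3. take the resource down and stop DRBD on BOTH nodes
drbdadm down r0
service drbd stop

# 4. repartition the disk into two partitions (e.g. /dev/sdb1 and /dev/sdb2),
#    clear any leftover metadata as discussed above, write two resource files
#    (r0 and r1), one per partition, then bring them up per the wiki:
drbdadm create-md r0
drbdadm create-md r1
drbdadm up r0
drbdadm up r1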
 
