OpenVZ on DRBD: how to failover?

Stephan J.

Apr 24, 2012
Hello,

First off, thanks for this amazing software. It's quite a pleasure to work with it, even though my head is spinning from learning all this new stuff.

We have an HA cluster with three nodes. Two of the nodes will be running containers and should be able to take over each other's containers in case of failure.
I set up DRBD partitions on these two nodes that replicate to each other. So far so good.
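
(For reference, the replication state can be checked with the standard DRBD tools; the resource name here is just an example:)

Code:
cat /proc/drbd          # overall connection and sync state
drbdadm role host       # e.g. Primary/Secondary
drbdadm cstate host     # should be Connected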

What I can't wrap my head around and can't find an answer to is: how do I (automatically) fail over a VZ container that is stored on a DRBD partition?
Can this not be done? I've searched for hours but can't find an answer.

What I tried/am trying:
DRBD primary/primary:
This doesn't seem to work, because clustered filesystems don't work with OpenVZ. I tried GlusterFS 3.2, but the container won't start and it seems quite slow.

DRBD primary/secondary:
I changed cluster.conf, and DRBD partition failover works, but the container migration fails. How can I add a resource to cluster.conf that migrates the VZ configuration and starts the container after DRBD has successfully failed over?

I'd really appreciate some input, even just a link to a manual. I'll gladly RTFM, but right now I have a bit of information overload and my head is steaming.

Thanks a lot and kind regards

-Stephan
 
To answer myself:
I have figured it out in the meantime. I edited /etc/pve/cluster.conf with the appropriate resources and can now fail over manually (via the shell) and automatically. Reading the Red Hat Cluster docs helped.
Migration via the web interface does not work, unfortunately, but I guess we can live with that. Automatic failover is the important thing.
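
For the manual part, the relocation is done from the shell with the rgmanager tools, roughly like this (service and node names are the ones from my config):

Code:
clustat                               # show cluster, node and service status
clusvcadm -r service:ha_host -m s02   # relocate the HA service (DRBD, mount and container) to node s02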

Thanx Proxmox. Love it!
 
Can you post more details on your setup? For instance, the scripts?

Also, are you using primary/primary DRBD?
 
I didn't change any scripts, but here is our cluster.conf with an example service (three nodes, two of which share DRBD/containers and provide failover for each other). Each VZ container gets its own DRBD partition (pve LVM -> DRBD partition -> ext4 -> VZ) and is restarted on the other node if anything goes wrong. I can also relocate with clusvcadm -r service:ha_host -m s02. DRBD is set up active/passive.

It would, of course, be cool to have live migration, but I haven't figured out how to do that. Feedback/Input is very welcome.

Code:
<?xml version="1.0"?>
<cluster config_version="67" name="pve-cluster">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="xx.xx.xx.xx" lanplus="1" login="ipmi" name="ipmi1" passwd="xxx" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="xx.xx.xx.xx" lanplus="1" login="ipmi" name="ipmi2" passwd="xxx" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="xx.xx.xx.xx" lanplus="1" login="ipmi" name="ipmi3" passwd="xxx" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="s01" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="s02" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="s03" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="fo_s01" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="s01" priority="1"/>
        <failoverdomainnode name="s02" priority="2"/>
      </failoverdomain>
      <failoverdomain name="fo_s02" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="s02" priority="1"/>
        <failoverdomainnode name="s01" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources/>
    <service autostart="1" domain="fo_s01" exclusive="0" name="ha_host" recovery="relocate">
      <drbd name="res_host" resource="host">
        <fs device="/dev/drbd/by-res/host" fstype="ext4" mountpoint="/mnt/vz-host" name="fs_host" options="noatime">
          <pvevm autostart="0" vmid="101"/>
        </fs>
      </drbd>
    </service>
  </rm>
</cluster>
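
For reference, the matching DRBD resource definition looks roughly like this; the device names, backing LVs and IP addresses below are placeholders, not our real values:

Code:
# /etc/drbd.d/host.res (sketch only)
resource host {
  protocol C;
  on s01 {
    device    /dev/drbd0;
    disk      /dev/pve/vz-host;   # LV on the pve volume group backing this container
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on s02 {
    device    /dev/drbd0;
    disk      /dev/pve/vz-host;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}

DRBD's udev rules create /dev/drbd/by-res/host for this resource, which is what the <fs> element in cluster.conf mounts.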

regards -Stephan
 
Thanks for the reply. I guess it is not possible with my current setup, then?

If I configure the pvevm resource outside the service, migration fails (also when I add depend="service:ha_host" to the <pvevm..> element).
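
I.e. roughly this layout (just a sketch, using the names from the example config above):

Code:
<rm>
  <!-- failoverdomains and resources as before -->
  <service autostart="1" domain="fo_s01" exclusive="0" name="ha_host" recovery="relocate">
    <drbd name="res_host" resource="host">
      <fs device="/dev/drbd/by-res/host" fstype="ext4" mountpoint="/mnt/vz-host" name="fs_host" options="noatime"/>
    </drbd>
  </service>
  <!-- pvevm moved outside the service; this is the variant where migration fails for me -->
  <pvevm autostart="1" vmid="101" depend="service:ha_host"/>
</rm>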
 

Stephan,

can you please clarify whether it is necessary to put each container into its own partition? It is a waste of disk space.

Thanks

Geejay
 
Hi Geejay,

Yes, unfortunately, it is necessary IMHO, because each DRBD partition is its own resource in the cluster. If a container needs to be started on another node, its DRBD resource also needs to be moved, and only the partition for that particular container should move, without affecting all the others.
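
As an illustration, a second container would simply get its own DRBD resource and its own service block in cluster.conf (the vmid and names here are made up):

Code:
<service autostart="1" domain="fo_s02" exclusive="0" name="ha_web" recovery="relocate">
  <drbd name="res_web" resource="web">
    <fs device="/dev/drbd/by-res/web" fstype="ext4" mountpoint="/mnt/vz-web" name="fs_web" options="noatime">
      <pvevm autostart="0" vmid="102"/>
    </fs>
  </drbd>
</service>

That way the cluster can relocate ha_web without touching ha_host.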

I would very much prefer to have all containers on one large partition, but that does not seem to be feasible AFAIK. I'm always open to suggestions :)

regards

-Stephan
 
If a container needs to be started on another node, its DRBD resource also needs to be moved, and only the partition for that particular container should move, without affecting all the others.

Of course, if it is your requirement to move individual containers, then you have to do that. We are currently using a setup with DRBD and Heartbeat, without Proxmox, and have all containers in one partition. Usually it is not necessary to move containers; when it is, one can copy them with rsync.
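
For the rare case where a container does have to move, something along these lines is enough (sketch only; vmid, hostname and paths are the Debian/OpenVZ defaults and need adjusting):

Code:
# stop the container, copy its private area and config to the other node, start it there
vzctl stop 101
rsync -aH --numeric-ids /var/lib/vz/private/101/ othernode:/var/lib/vz/private/101/
rsync -a /etc/vz/conf/101.conf othernode:/etc/vz/conf/
ssh othernode vzctl start 101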

Geejay
 
Stephan,
I don't understand it. Can you please explain how your secondary node becomes primary in case of failure?
As far as I have tested, the Red Hat cluster manager cannot promote the secondary node to primary.

Please explain how this works technically within the Red Hat cluster manager.

Thanks

Geejay
 
