OpenVZ on DRBD: how to failover?

Stephan J.

Apr 24, 2012
Hello,

First off, thanks for this amazing software. It's quite a pleasure to work with it, even though my head is spinning from learning all this new stuff.

We have an HA cluster with three nodes. Two of the nodes will be running containers and should be able to take over each other's containers in case of failure.
I set up DRBD partitions on these two nodes that replicate to each other. So far so good.
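
(For reference, the replication state can be checked with the standard DRBD tools; the resource name here is just an example:)

Code:
cat /proc/drbd          # overall connection and sync state
drbdadm role host       # e.g. Primary/Secondary
drbdadm cstate host     # should be Connected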

What I can't wrap my head around and can't find an answer to is: how do I (automatically) fail over a VZ container that is stored on a DRBD partition?
Can this not be done? I've searched for hours but can't find an answer.

What I tried/am trying:
DRBD primary/primary:
This doesn't seem to work, because clustered filesystems don't work with OpenVZ. I tried GlusterFS 3.2, but the container won't start and it seems quite slow.

DRBD primary/secondary:
I changed cluster.conf, and DRBD partition failover works, but the container migration fails. How can I add a resource to cluster.conf that migrates the VZ configuration and starts the container after DRBD has successfully failed over?

I'd really appreciate some input, even just a link to a manual. I'll gladly RTFM, but right now I have a bit of information overload and my head is steaming.

Thanks a lot and kind regards

-Stephan
 
To answer myself:
I have figured it out in the meantime. I edited /etc/pve/cluster.conf with the appropriate resources and can now fail over manually (via the shell) and automatically. Reading the Red Hat Cluster docs helped.
Migration via the web interface does not work, unfortunately, but I guess we can live with that. Automatic failover is the important thing.
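
For the manual part, the relocation is done from the shell with the rgmanager tools, roughly like this (service and node names are the ones from my config):

Code:
clustat                               # show cluster, node and service status
clusvcadm -r service:ha_host -m s02   # relocate the HA service (DRBD, mount and container) to node s02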

Thanx Proxmox. Love it!
 
Can you post more details on your setup? For instance, the scripts?

Also, are you using primary/primary DRBD?
 
I didn't change any scripts, but here is our cluster.conf with an example service (three nodes, two of which share DRBD/containers and provide failover for each other). Each VZ container gets its own DRBD partition (pve LVM -> DRBD partition -> ext4 -> VZ) and is restarted on the other node if anything goes wrong. I can also relocate with clusvcadm -r service:ha_host -m s02. DRBD is set up active/passive.

It would, of course, be cool to have live migration, but I haven't figured out how to do that. Feedback/Input is very welcome.

Code:
<?xml version="1.0"?>
<cluster config_version="67" name="pve-cluster">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="xx.xx.xx.xx" lanplus="1" login="ipmi" name="ipmi1" passwd="xxx" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="xx.xx.xx.xx" lanplus="1" login="ipmi" name="ipmi2" passwd="xxx" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="xx.xx.xx.xx" lanplus="1" login="ipmi" name="ipmi3" passwd="xxx" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="s01" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="s02" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="s03" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="fo_s01" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="s01" priority="1"/>
        <failoverdomainnode name="s02" priority="2"/>
      </failoverdomain>
      <failoverdomain name="fo_s02" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="s02" priority="1"/>
        <failoverdomainnode name="s01" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources/>
    <service autostart="1" domain="fo_s01" exclusive="0" name="ha_host" recovery="relocate">
      <drbd name="res_host" resource="host">
        <fs device="/dev/drbd/by-res/host" fstype="ext4" mountpoint="/mnt/vz-host" name="fs_host" options="noatime">
          <pvevm autostart="0" vmid="101"/>
        </fs>
      </drbd>
    </service>
  </rm>
</cluster>
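
For reference, the matching DRBD resource definition looks roughly like this; the device names, backing LVs and IP addresses below are placeholders, not our real values:

Code:
# /etc/drbd.d/host.res (sketch only)
resource host {
  protocol C;
  on s01 {
    device    /dev/drbd0;
    disk      /dev/pve/vz-host;   # LV on the pve volume group backing this container
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on s02 {
    device    /dev/drbd0;
    disk      /dev/pve/vz-host;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}

DRBD's udev rules create /dev/drbd/by-res/host for this resource, which is what the <fs> element in cluster.conf mounts.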

regards -Stephan
 
Thanks for the reply. I guess it is not possible with my current setup, then?

If I configure the pvevm resource outside the service, migration fails (also when I add depend="service:ha_host" to the <pvevm..> element).
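
I.e. roughly this layout (just a sketch, using the names from the example config above):

Code:
<rm>
  <!-- failoverdomains and resources as before -->
  <service autostart="1" domain="fo_s01" exclusive="0" name="ha_host" recovery="relocate">
    <drbd name="res_host" resource="host">
      <fs device="/dev/drbd/by-res/host" fstype="ext4" mountpoint="/mnt/vz-host" name="fs_host" options="noatime"/>
    </drbd>
  </service>
  <!-- pvevm moved outside the service; this is the variant where migration fails for me -->
  <pvevm autostart="1" vmid="101" depend="service:ha_host"/>
</rm>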
 

Stephan,

can you please clarify whether it is necessary to put each container into its own partition? It is a waste of disk space.

Thanks

Geejay
 
Hi Geejay,

Yes, unfortunately, it is necessary IMHO, because each DRBD partition is its own resource in the cluster. If a container needs to be started on another node, its DRBD resource also needs to be moved, and only the partition for that particular container should move, without affecting all the others.
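
As an illustration, a second container would simply get its own DRBD resource and its own service block in cluster.conf (the vmid and names here are made up):

Code:
<service autostart="1" domain="fo_s02" exclusive="0" name="ha_web" recovery="relocate">
  <drbd name="res_web" resource="web">
    <fs device="/dev/drbd/by-res/web" fstype="ext4" mountpoint="/mnt/vz-web" name="fs_web" options="noatime">
      <pvevm autostart="0" vmid="102"/>
    </fs>
  </drbd>
</service>

That way the cluster can relocate ha_web without touching ha_host.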

I would very much prefer to have all containers on one large partition, but that does not seem to be feasible AFAIK. I'm always open to suggestions :)

regards

-Stephan
 
If a container needs to be started on another node, its DRBD resource also needs to be moved, and only the partition for that particular container should move, without affecting all the others.

Of course, if it is your requirement to move individual containers, then you have to do that. We are currently using a setup with DRBD and Heartbeat, without Proxmox, and have all containers in one partition. Usually it is not necessary to move containers; when it is, one can copy them with rsync.
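
For the rare case where a container does have to move, something along these lines is enough (sketch only; vmid, hostname and paths are the Debian/OpenVZ defaults and need adjusting):

Code:
# stop the container, copy its private area and config to the other node, start it there
vzctl stop 101
rsync -aH --numeric-ids /var/lib/vz/private/101/ othernode:/var/lib/vz/private/101/
rsync -a /etc/vz/conf/101.conf othernode:/etc/vz/conf/
ssh othernode vzctl start 101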

Geejay
 
Stephan,
I don't understand it. Can you please explain how your secondary node becomes primary in case of failure?
As far as I have tested, the Red Hat cluster manager cannot promote the secondary node to primary.

Please explain how this works technically within the Red Hat cluster manager.

Thanks

Geejay
 
