HA Failover domains and services

chrisjunkie

New Member
Hi guys,

I've had Proxmox running for about a week now - love the interface! I have gone about modifying the way HA works with two nodes according to this amazing tutorial I stumbled across before finding Proxmox: https://alteeve.com/w/Red_Hat_Cluster_Service_2_Tutorial

Anyway, I have fencing, failover domains, clustered LVM + DRBD and services all working very nicely. When a node boots, DRBD is started, waits for the second node, and then they both start CLVM and go on their merry way. My problem is stopping the VMs from being pulled back onto a recovering node before its storage is up, in the case where one host died but the other stayed alive.

Scenario:
1) Purposely crash node1: echo c > /proc/sysrq-trigger
2) Node2 restarts Node1's VMs locally on itself after fencing and restarting node1
3) Node1 comes back up and Node2 tries to migrate node1's VMs back BEFORE node1's storage has come up
4) The migration never finishes... (but the VM never crashes; it stays running on the host it was already on)

When doing this with libvirt, one just needs to have resources laid out like this:
Code:
<resources>
            <script file="/etc/init.d/drbd" name="drbd"/>
            <script file="/etc/init.d/clvmd" name="clvmd"/>
            <script file="/etc/init.d/gfs2" name="gfs2"/>
            <script file="/etc/init.d/libvirtd" name="libvirtd"/>
</resources>
  ....
<service name="storage_an01" autostart="1" domain="only_an01" exclusive="0" recovery="restart">
                        <script ref="drbd">
                                <script ref="clvmd">
                                        <script ref="gfs2">
                                                <script ref="libvirtd"/>
                                        </script>
                                </script>
                        </script>
 </service>
This ensures that libvirt can only start, and therefore only accept migrations, AFTER drbd and clvmd have started (ignore GFS2; we don't use that in Proxmox). What I want is something similar, but I can't for the life of me figure out which PVE init script does the VM side of things.
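
To illustrate the kind of ordering I'm after on the Proxmox side, here's a rough, untested sketch. I don't know whether the pvevm resource agent honours rgmanager's service-level depend/depend_mode attributes (that part is pure assumption on my end), so treat it as pseudo-config rather than something I've run:

Code:
<!-- UNTESTED sketch: the idea is that VM 100 only starts/migrates once the
     storage service (drbd + clvm) is running on the target node -->
<pvevm autostart="0" vmid="100" domain="vmx01_primary" exclusive="0" recovery="restart"
       max_restarts="2" restart_expire_time="600"
       depend="service:storage_vmx01" depend_mode="soft"/>
The only other thing I can think of is nesting the pvevm resources under the drbd/clvm script resources the way the libvirt example above does, but I don't know whether rgmanager allows that for pvevm either.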


For some more info, here's cluster.conf with hostnames changed etc:

Code:
<?xml version="1.0"?>
<cluster config_version="13" name="cluster01">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_ilo" ipaddr="192.168.200.201" login="fence" name="vmx01fence" passwd="ClusterFence"/>
    <fencedevice agent="fence_ilo" ipaddr="192.168.200.202" login="fence" name="vmx01fence" passwd="ClusterFence"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="vmx01" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="vmx01fence"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vmx01" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="vmx01fence"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_daemon post_join_delay="60"/>
  <rm log_level="5">
    <resources>
      <script file="/etc/init.d/drbd" name="drbd"/>
      <script file="/etc/init.d/clvm" name="clvm"/>
    </resources>
    <failoverdomains>
      <failoverdomain name="only_vmx01" nofailback="1" ordered="0" restricted="1">
        <failoverdomainnode name="vmx01"/>
      </failoverdomain>
      <failoverdomain name="only_vmx01" nofailback="1" ordered="0" restricted="1">
        <failoverdomainnode name="vmx01"/>
      </failoverdomain>
      <failoverdomain name="vmx01_primary" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="vmx01" priority="1"/>
        <failoverdomainnode name="vmx01" priority="2"/>
      </failoverdomain>
      <failoverdomain name="vmx01_primary" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="vmx01" priority="2"/>
        <failoverdomainnode name="vmx01" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <service autostart="1" domain="only_vmx01" exclusive="0" name="storage_vmx01" recovery="restart">
      <script ref="drbd">
        <script ref="clvm"/>
      </script>
    </service>
    <service autostart="1" domain="only_vmx01" exclusive="0" name="storage_vmx01" recovery="restart">
      <script ref="drbd">
        <script ref="clvm"/>
      </script>
    </service>
    <pvevm autostart="0" vmid="100" domain="vmx01_primary" exclusive="0" recovery="restart" max_restarts="2" restart_expire_time="600"/>
    <pvevm autostart="0" vmid="101" domain="vmx01_primary" exclusive="0" recovery="restart" max_restarts="2" restart_expire_time="600"/>
    <pvevm autostart="0" vmid="102" domain="vmx01_primary" exclusive="0" recovery="restart" max_restarts="2" restart_expire_time="600"/>
  </rm>
</cluster>

pveversion -v:
Code:
pve-manager: 2.0-45 (pve-manager/2.0/8c846a7b)
running kernel: 2.6.32-10-pve
proxmox-ve-2.6.32: 2.0-63
pve-kernel-2.6.32-10-pve: 2.6.32-63
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.88-2pve2
clvm: 2.02.88-2pve2
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-1
pve-cluster: 1.0-25
qemu-server: 2.0-28
pve-firmware: 1.0-15
libpve-common-perl: 1.0-21
libpve-access-control: 1.0-17
libpve-storage-perl: 2.0-15
vncterm: 1.0-2
vzctl: 3.0.30-2pve2
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-8
ksm-control-daemon: 1.1-1

Can anyone PLEASE shed some light? I am pulling my hair out here :p Is this sort of functionality to be expected in future Proxmox releases?

Cheers
Chris
