Two-node HA cluster: fencing "rebooting loop"

megap

New Member
Oct 1, 2014
Hello forum.

I'm configuring a two node HA cluster with DRBD.

For the fencing devices I'm using the iDRAC7 on both nodes.
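In case it matters, each iDRAC can be tested by calling the fence agent by hand, something like the line below. The flags are from memory of the fence_drac5 man page, so double-check them on your version; -x should correspond to secure="1" and -c to cmd_prompt.

Code:
# manual test of node1's iDRAC (adjust flags to your fence_drac5 version)
fence_drac5 -a 192.168.1.211 -l root -p root -x -c "admin1->" -o status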

My cluster.conf file is:

Code:
 cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="19" name="cluster">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;"  ipaddr="192.168.1.211" login="root" name="node1-drac"  passwd="root" secure="1"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;"  ipaddr="192.168.1.212" login="root" name="node2-drac"  passwd="root" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1-drac"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="node2-drac"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>

My cluster status (clustat output) is:

Code:
Cluster Status for cluster @ Thu Apr 30 11:48:51 2015
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node1                                                           1 Online, Local, rgmanager
 node2                                                           2 Online, rgmanager

 Service Name                                              Owner (Last)                                              State
 ------- ----                                              ----- ------                                              -----
 pvevm:100                                                 node1                                                 started
 pvevm:101                                                 node1                                                 started

My fence_tool ls output is:

Code:
fence_tool ls
fence domain
member count  2
victim count  0
victim now    0
master nodeid 2
wait state    none
members       1 2

I also added acpi=off to the kernel command line by editing /boot/grub/grub.cfg with nano.
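As far as I know /boot/grub/grub.cfg is regenerated by update-grub, so an edit made there directly may not survive a kernel update. The persistent way on Debian would be something like the sketch below (the existing GRUB_CMDLINE_LINUX_DEFAULT value on your system may differ).

Code:
# /etc/default/grub -- append acpi=off to the existing kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet acpi=off"

# regenerate /boot/grub/grub.cfg
update-grub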

When I disconnect the ethernet cable from node1, fencing works as expected: node1 is rebooted and the VMs are migrated to node2. The problem is that when node1 comes back online, node2 gets fenced and rebooted and the VMs migrate back to node1, and then the same thing happens again in the other direction. Both nodes end up in an endless fencing/reboot loop.

How could I solve this fencing reboot loop without using a quorum disk?
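In case it helps to discuss something concrete, this is what I was thinking of trying next. It is only an untested sketch: post_join_delay is a fence_daemon option that should give a rebooted node time to rejoin the fence domain before anyone gets fenced, and delay is a generic fence-agents option (I have not verified that fence_drac5 honours it) meant to let one node win a simultaneous fence race. config_version would need to be bumped before activating.

Code:
<!-- untested sketch based on the existing cluster.conf -->
<cluster config_version="20" name="cluster">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <!-- wait 60s after a node joins the fence domain before fencing anyone -->
  <fence_daemon post_join_delay="60"/>
  <fencedevices>
    <!-- delay on node1's device only: fencing node1 is postponed 15s,
         so node1 should win if both nodes try to fence each other at once -->
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="192.168.1.211" login="root" name="node1-drac" passwd="root" secure="1" delay="15"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="192.168.1.212" login="root" name="node2-drac" passwd="root" secure="1"/>
  </fencedevices>
  <!-- clusternodes and rm sections unchanged -->
</cluster>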

Thank you in advance
 
Is there no way to do this with only 2 nodes?

Hi.

Does anybody know if it is possible to create a 3-node cluster where 2 nodes share a DRBD disk and the VMs, and the 3rd node is used only for quorum and one local VM?

thanks.
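To make that more concrete, what I have in mind is roughly the sketch below (untested): drop two_node="1" so the third node just contributes a vote, and pin the shared VMs to the DRBD pair with a restricted failover domain. I am not sure whether pvevm accepts the domain attribute the same way rgmanager's stock vm resource does, so treat that part as an assumption.

Code:
<!-- untested sketch: with 3 nodes, two_node is dropped and expected_votes becomes 3 -->
<cman expected_votes="3" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<rm>
  <failoverdomains>
    <!-- restricted="1": the HA VMs may only run on the two DRBD nodes -->
    <failoverdomain name="drbdnodes" restricted="1" ordered="0" nofailback="1">
      <failoverdomainnode name="node1"/>
      <failoverdomainnode name="node2"/>
    </failoverdomain>
  </failoverdomains>
  <pvevm autostart="1" vmid="100" domain="drbdnodes"/>
  <pvevm autostart="1" vmid="101" domain="drbdnodes"/>
  <!-- node3's local VM is simply not defined as an HA resource -->
</rm>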
 
