HA Cluster IP no longer accessible after outage

cscracker

Jun 13, 2015
I have a 4-node HA cluster that has been working great. Last night a power issue (a malfunctioning UPS) took my cluster network down, and all nodes lost quorum. After I restored the network, everything came back up fine, except that I can no longer reach the cluster IP (192.168.1.25, the HAClusterIP service in the config below).

Each node works fine on its own address, they all talk to each other, and none of them shows any visible errors. I can migrate machines and everything looks happy, so I'm not sure what to check next. Just in case, I rebooted each node in the cluster one at a time, and the cluster IP is still not accessible. Where should I look from here?

Here's my cluster.conf (passwords changed):
Code:
root@pve1:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="7" name="c6100-cluster-1">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="10.99.99.21" lanplus="1" login="root" name="ipmi1" passwd="asdf" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.99.99.22" lanplus="1" login="root" name="ipmi2" passwd="asdf" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.99.99.23" lanplus="1" login="root" name="ipmi3" passwd="asdf" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.99.99.24" lanplus="1" login="root" name="ipmi4" passwd="asdf" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="pve1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pve2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pve3" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pve4" nodeid="4" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <service autostart="1" exclusive="0" name="HAClusterIP" recovery="relocate">
      <ip address="192.168.1.25"/>
    </service>
    <pvevm autostart="1" vmid="105"/>
    <pvevm autostart="1" vmid="103"/>
  </rm>
</cluster>
 
Yes, all the services listed in the web interface are running on all servers, as well as pveproxy and pvedaemon.
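
For anyone who wants to double-check which addresses the resource manager is supposed to bring up, here is a rough sketch that pulls the `<ip>` entries out of each `<service>` under `<rm>`. The function name `ha_service_ips` and the trimmed `SAMPLE` string are made up for illustration; on a real node you would read `/etc/pve/cluster.conf` instead.

```python
import xml.etree.ElementTree as ET


def ha_service_ips(conf_xml):
    """Map each <service> name under <rm> to the list of its <ip> addresses.

    Hypothetical helper for illustration; not part of any Proxmox tooling.
    """
    root = ET.fromstring(conf_xml)
    services = {}
    for svc in root.findall("./rm/service"):
        services[svc.get("name")] = [ip.get("address") for ip in svc.findall("ip")]
    return services


# Trimmed copy of the <rm> section from the cluster.conf above.
SAMPLE = """<?xml version="1.0"?>
<cluster config_version="7" name="c6100-cluster-1">
  <rm>
    <service autostart="1" exclusive="0" name="HAClusterIP" recovery="relocate">
      <ip address="192.168.1.25"/>
    </service>
    <pvevm autostart="1" vmid="105"/>
    <pvevm autostart="1" vmid="103"/>
  </rm>
</cluster>
"""

if __name__ == "__main__":
    for name, addresses in ha_service_ips(SAMPLE).items():
        print(name, addresses)
```

The addresses it prints can then be compared against `ip addr show` on each node to see whether any node actually holds 192.168.1.25 right now.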