How likely is "split-brain" when a cluster is not using DRBD?

Gringo.Frenzy
Sep 11, 2013
We have a two-node cluster (PVE 2.2) that was set up purely for the convenience of managing both nodes from a single interface.
We do not have any VMs managed by HA, so I was wondering: would I ever suffer from the "split-brain" effect if I set "Expected Votes = 1" on a permanent basis?

I need to remove Node 2 from the cluster so that I can rebuild it on PVE 3.4, and then I will either manually move all the VM configs and images across, or rebuild the servers from scratch.
This is likely to take me a few days, so I'd like to leave Node 1 operational while I do this. (I don't want to do an in-place upgrade because we have an I/O bottleneck that I believe is caused by the PVE configuration.)
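
From what I've read so far (please correct me if I've misunderstood), the usual way to keep the surviving node quorate during this kind of work is to lower the expected votes at runtime:

Code:
root@Node-1:~# pvecm expected 1

My understanding is that this tells cman to treat a single vote as quorate, so /etc/pve stays writable while Node 2 is away, but that it does not survive a restart of the cluster stack. Hence my question about making it permanent.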

Thanks for any advice.
OK, so I did some more digging into my cluster config. This is what I found:

1) DRBD is NOT in use on either of my servers: there is no /etc/drbd.conf file, there are no drbd devices in /dev, and when I run 'vgscan' the only volume group I see is "pve":

Code:
root@Node-1:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2

root@Node-2:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2

I know that none of our VMs are configured for failover, so maybe this isn't a surprise.
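
For completeness, two more checks that, if I understand them correctly, confirm the same thing (no drbd kernel module loaded, and no status file):

Code:
root@Node-1:~# lsmod | grep drbd      # no output: the drbd module is not loaded
root@Node-1:~# cat /proc/drbd         # /proc/drbd only exists while the module is loaded
cat: /proc/drbd: No such file or directory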

2) As a (perhaps reckless) test, I decided to pull the network cable from the back of Node-2 while connected to the web GUI on Node-1.
This doesn't seem to cause any noticeable problems, but as a side effect all VMs vanish from the web GUI.
(The nodes remain visible, but the only items still shown below them are the mounted NFS shares.)
When I plug the network connection back in, the VMs return to the Datacenter view within a few minutes.
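
My guess (please correct me) is that this is a quorum issue rather than anything to do with the VMs themselves: with two votes expected and no 'two_node=1', the surviving node drops to one vote, loses quorum, and /etc/pve goes read-only. If I understand the tooling correctly, that should show up in the vote counts while the cable is out:

Code:
root@Node-1:~# pvecm status
# compare the "Expected votes" and "Total votes" lines here; with Node-2
# unplugged I would expect 2 expected votes but only 1 total vote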


3) We seem to have a very basic (incomplete?) cluster.conf file:

Code:
<?xml version="1.0"?>
<cluster name="Cluster01" config_version="3">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>

  <clusternodes>
    <clusternode name="Node-2" votes="1" nodeid="1"/>
    <clusternode name="Node-1" votes="1" nodeid="2"/>
  </clusternodes>

</cluster>

Note that there is no defined fencing, and no 'two_node=1' parameter.
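
For reference, the cman documentation describes a dedicated two-node mode; if I've read it correctly, marking the cluster like this lets either node stay quorate on its own (though it is really meant to be paired with working fencing, which we don't have):

Code:
<cman two_node="1" expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>

I have not applied this; I'm just quoting what I understand the standard form to be.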

So my question is this:
Given that we are NOT using DRBD
AND there is no defined fencing
AND I will be formatting Node-2 very soon (all the VMs on it are disposable test machines)
AND that I will format Node-1 as soon as Node-2 has been rebuilt and had VMs transferred to it
WHAT would happen if I assigned 2 quorum votes to Node-1 and then took Node-2 offline?
Would the VMs remain visible in the web GUI?
Would I still be able to start, stop, and back them up?
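
In case it helps anyone answer, this is how I would expect to assign the extra vote, based on the cluster.conf above; as I understand it, the config_version also has to be bumped for the change to be accepted (again, correct me if this is wrong):

Code:
<?xml version="1.0"?>
<cluster name="Cluster01" config_version="4">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>

  <clusternodes>
    <clusternode name="Node-2" votes="1" nodeid="1"/>
    <!-- Node-1 now holds 2 of the 3 total votes, so quorum (2) is met without Node-2 -->
    <clusternode name="Node-1" votes="2" nodeid="2"/>
  </clusternodes>

</cluster>

By my arithmetic that gives 3 votes in total and a quorum of 2, which Node-1 meets on its own.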
