3-Node Cluster Setup with non-switched interfaces

akaramanlidis
Mar 13, 2018
Good morning everyone,

I just configured the back interfaces (10GbE) of my three new servers.
To provide enough redundancy and availability, I connected the three servers in the following way:

Hardware Config:
- Every node has two 10GbE SFP+ ports that should be used only for cluster communication, live migration, etc. These ports are not switched.
- NFS traffic goes through 2 x Gigabit interfaces (yes, they are slower, but in our setup that actually makes sense at the moment).
- Traffic for the internet, the Proxmox frontend, and other services goes through 2 x Gigabit interfaces as well.

Let's say the nodes are named node1, node2, node3.
The 10GbE ports are connected in the following way:

node1 is directly attached to node2 via TwinAx cables. They communicate in 10.0.0.0/28.
node1 is directly attached to node3 via TwinAx cables. They communicate in 10.0.0.16/28.
node3 is directly attached to node2 via TwinAx cables. They communicate in 10.0.0.32/28.
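On node1, for example, the two 10GbE ports end up configured roughly like this (a sketch of the relevant part of /etc/network/interfaces; the netmasks follow from the /28 subnets above):

# node1, 10GbE mesh ports only (sketch)
auto ens1f0
iface ens1f0 inet static
    address 10.0.0.1
    netmask 255.255.255.240   # 10.0.0.0/28, direct link to node2 (10.0.0.2)

auto ens1f1
iface ens1f1 inet static
    address 10.0.0.17
    netmask 255.255.255.240   # 10.0.0.16/28, direct link to node3 (10.0.0.18)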

Now I'm at the point of creating the cluster, but I ran into a little logic problem here.

If I create the cluster from node1 (10.0.0.1 on ens1f0), I have to use the IP 10.0.0.2 (node2, ens1f1) to join node2 to the cluster. If I want to join node3 (10.0.0.18 on ens1f0), the source IP will be 10.0.0.17 (node1, ens1f1).
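In other words, the join would look roughly like this (only a sketch; the cluster name is just a placeholder):

# on node1 (10.0.0.1 / 10.0.0.17)
pvecm create democluster

# on node2 -- node1 is only reachable at 10.0.0.1 from here
pvecm add 10.0.0.1

# on node3 -- node1 is only reachable at 10.0.0.17 from here
pvecm add 10.0.0.17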

Now I would have a working cluster that Proxmox could work with.
But I can't imagine how quorum could possibly be reliable.
If node3 looks for node2 (or node2 for node3) at the initially configured IP for the Proxmox cluster, they shouldn't be able to communicate: they cannot reach each other via the IPs stored in the node "database", because only node1 can reach both node2 and node3 that way.
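To make the problem concrete, the node "database" I mean is the nodelist in corosync.conf, which would end up looking something like this (only a sketch; node IDs and which of node1's two addresses gets recorded are guesses):

nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.1     # or 10.0.0.17 -- node1 has one address per link
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.0.2     # node3 has no interface in 10.0.0.0/28
  }
  node {
    name: node3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.0.0.18    # node2 has no interface in 10.0.0.16/28
  }
}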

In a switched environment this obviously isn't a problem at all.

For better understanding, I'll attach a picture with the cabling and IPs.
[Attachment: proxmox2.png]
Is it even possible to set up a Proxmox HA cluster with that kind of network cabling / layout?
Or, more basically: is it possible to set up a 3-node cluster without a switched network environment?

Thanks in advance,
Alex
 
Hi,

For this:

node1 is directly attached to node2 via TwinAx cables. They communicate in 10.0.0.0/28.
node1 is directly attached to node3 via TwinAx cables. They communicate in 10.0.0.16/28.
node3 is directly attached to node2 via TwinAx cables. They communicate in 10.0.0.32/28.




Basically you have triangle cabling between nodes 1, 2 and 3. My opinion is YES, you could, if you run OSPF on each node. Each node will have two links (with or without different OSPF priorities/costs) to the other nodes. So if any link breaks, OSPF will stop using the broken link and re-route the traffic over the second link (within milliseconds). When the broken link comes back up, everything returns to how it was before the failure.
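Roughly like this on each node (only a sketch with FRR/Quagga; router ID, costs and interface names are just examples, and every node needs IP forwarding enabled so it can relay traffic for the other two if their direct link fails):

# node1, /etc/frr/frr.conf (sketch) -- similar config on node2/node3
interface ens1f0
 ip ospf cost 10            # direct link to node2
!
interface ens1f1
 ip ospf cost 10            # direct link to node3
!
router ospf
 ospf router-id 10.0.0.1
 network 10.0.0.0/28 area 0
 network 10.0.0.16/28 area 0
!
# plus net.ipv4.ip_forward=1 (sysctl) on every node, so a node can relay traffic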
 
That did work the way I expected it to.
Thanks!

Yes, it works, but you will not have any routing redundancy. If you use the routing method, OSPF will give you network path redundancy (within milliseconds), even if you move a VM from node x to node y (online migration) and one network link breaks in the middle of that process.

By the way, you can buy a cheap 10 Gb switch (16 ports).
 
I don't really understand why I would need routing redundancy when using bond-mode broadcast.
It should be fault-tolerant that way.
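For context, the bond on each node looks roughly like this (a sketch of /etc/network/interfaces; the address is just an example -- with the broadcast bond all three nodes share one subnet on bond0):

# 10GbE mesh ports bonded in broadcast mode (sketch)
iface ens1f0 inet manual
iface ens1f1 inet manual

auto bond0
iface bond0 inet static
    address 10.15.15.1        # example; one address per node in a shared subnet
    netmask 255.255.255.0
    bond-slaves ens1f0 ens1f1
    bond-mode broadcast
    bond-miimon 100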

The company that I need to build this setup for doesn't have the resources to buy or lease a bunch of switches.
So we made the decision to buy two TP-Link T1700-28TQ. For our needs that was the best option.
The only systems that need 10GbE for FRONT traffic are our NFS cluster and the backup storage system.
BACK traffic is only live migration. And that really hurts, because the change rate in our RAM is very high.
1 Gbit/s is barely capable of copying the RAM to another hypervisor because of that massive amount of changes.

Our storage system isn't really good either: slow SATA disks, no cache, DRBD protocol C and slow HPE RAID controllers.
I plan to improve that as soon as I can. But at the moment, even one hypervisor can push the storage system to its limits.
 
I was talking about the routing variant that is described here (not bonding):

https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server

For about $400 you can buy a MikroTik CRS317-1G-16S+RM (16 x 10 Gbit SFP+ ports).

And I was talking about "Method 1" described at the same link.
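The routed variant boils down to something like this on each node (only a sketch with made-up addresses, not the exact config from the wiki -- check the link for the real thing):

# node1 (sketch) -- same address on both mesh ports, one /32 route per peer
auto ens1f0
iface ens1f0 inet static
    address 10.15.15.50
    netmask 255.255.255.0
    up ip route add 10.15.15.51/32 dev ens1f0     # node2 via the direct link
    down ip route del 10.15.15.51/32

auto ens1f1
iface ens1f1 inet static
    address 10.15.15.50
    netmask 255.255.255.0
    up ip route add 10.15.15.52/32 dev ens1f1     # node3 via the direct link
    down ip route del 10.15.15.52/32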

Yeah, we could have bought two of those switches, but then we would have run into the problem of not having any RJ45 ports for other infrastructure in other VLANs. As I said, the only services that need switched interfaces are NFS from the storage servers, which take 2 SFP+ ports per switch, and the backup nodes in case we need to restore something.
Two TP-Link T1700-28TQ were the best choice for us in terms of price, features and usability.
 
It simply doesn't have to work with more nodes, because we won't have more than 3 nodes running.

The only problem was that we had 90% of the memory allocated with an active/passive setup and a small tiebreaker.
Now, with 3 normal servers, we have 256 GB of usable RAM instead of 128 GB, and we can upgrade to 344 GB without any issues.
That should be enough for a long time.
And besides, if we really need more than 3 nodes, we'll budget for new 10GbE switches as well.
But this setup should cover us for at least 2 years.

After that our leasing contract ends anyway, so I really don't see the need to buy a couple of 10GbE switches at the moment.
 

I was looking for advice on a technical level... how would you do that if you had to connect 5 or 6 nodes with dual-port 10GbE interfaces without a switch?
 
Okay, in that case it is not possible.

You would need either a switch (two for redundancy) or at least 3 dual-port NICs per server (which will be more expensive than switches).
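Just to put numbers on that (plain full-mesh arithmetic, nothing Proxmox-specific): a full mesh of n nodes needs n - 1 ports per node and n * (n - 1) / 2 point-to-point links in total. For 6 nodes that means 5 ports per node -- hence at least 3 dual-port NICs, with one port spare -- and 15 direct TwinAx links altogether.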
 
