[SOLVED] Separate Cluster Network Question

noja

Active Member
Feb 2, 2021
So I've had intermittent issues over a long time with one of my four nodes being fenced and going unresponsive. I think I've narrowed it down to some basic network issues that I've never really dealt with, and I'm trying to figure out best practices.

I am told that "storage communication should never be on the same network as corosync!" - and generally speaking, that's exactly what I've done with all of my nodes: lots of traffic, all on the same 172.16.120.0/24 network.

I've tried to rectify this issue, but I'm not sure if what I've done will be sufficient.

My cluster nodes all have two nics - one 1G and one 2.5G. Initially, all traffic went through the 2.5G nics. To try and separate out the corosync network, I moved the management interface from the 2.5G nic to a new VLAN-tagged bridge on the 1G nic (I followed this post).

Now, I believe that all of my corosync traffic will be moving across that second 1G nic while VMs and data traffic remain on the 2.5G nic.

Will this setup achieve that goal of not having storage communication on the same network as corosync? The VMs and management interface are still on the 172.16.120.0 network, but management is on a completely different nic.

If it helps, here is my /etc/pve/corosync.conf:
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: prox4
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 172.16.120.36
  }
  node {
    name: prox5
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 172.16.120.37
  }
  node {
    name: prox7
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.16.120.34
  }
  node {
    name: pve1z440
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.16.120.39
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: congregation
  config_version: 12
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

And here is /etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

iface enp1s0 inet manual

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet manual
        bridge-ports enp1s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr2
iface vmbr2 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr2v120
iface vmbr2v120 inet static
        address 172.16.120.36/24
        gateway 172.16.120.1
        bridge-ports eno1.120
        bridge-stp off
        bridge-fd 0

source /etc/network/interfaces.d/*

And finally, here is what a node generally looks like:

[screenshot: node summary page]
 
With that config, corosync will use 172.16.120.x for cluster communication, since that address is set on vmbr2v120, which in turn uses eno1.
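For reference, you can confirm which addresses corosync is actually using on each node with the standard corosync/PVE tools (run on any cluster node):

```shell
# Show the local knet link status and the address each corosync link is bound to
corosync-cfgtool -s

# Show cluster membership, quorum state, and the ring addresses PVE knows about
pvecm status
```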

Will this setup achieve that goal of not having storage communication on the same network as corosync? The VMs and management interface are still on the 172.16.120.0 network, but management is on a completely different nic.
If you have VMs running on 172.16.120.X, and access the PVE nodes using 172.16.120.X then no it will not.

Do you have a VLAN-managed switch, and are the nodes connected to the same switch? If so, the easiest solution would be to put management on its own VLAN on the 1G NIC and corosync (on its own non-routed VLAN) also on the 1G NIC, and then separate out VM traffic and storage traffic (depending on what kind of storage you are using) onto the 2.5G NIC.
Example config below:
Rich (BB code):
auto nic0
iface nic0 inet manual
#MGMT+COROSYNC

auto nic1
iface nic1 inet manual
#VM+STORAGE+MIGRATION

auto vmbr0
iface vmbr0 inet manual
        bridge-ports nic1
        bridge-stp off
        bridge-fd 0
#VM-BRIDGE

auto storage
iface storage inet static
        address 172.16.60.11/24
        mtu 9000
        vlan-id 60
        vlan-raw-device nic1
#STORAGE(NFS)

auto coro0
iface coro0 inet static
        address 172.16.26.11/24
        vlan-id 26
        vlan-raw-device nic0
#COROSYNC

auto mgmt
iface mgmt inet static
        address 172.16.24.11/24
        gateway 172.16.24.1
        vlan-id 24
        vlan-raw-device nic0
#MGMT

You can then also, if you want, use the storage network or any other network that you run VMs on as a secondary link for corosync, e.g.:
Rich (BB code):
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: arx
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.16.26.11
    ring1_addr: 172.16.60.11
  }
  node {
    name: pax
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.16.26.12
    ring1_addr: 172.16.60.12
  }
  node {
    name: via
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 172.16.26.13
    ring1_addr: 172.16.60.13
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: imperium
  config_version: 9
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
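One caveat when adding ring1_addr entries on a live cluster: always bump config_version, and edit the file under /etc/pve so pmxcfs syncs it to all nodes. A cautious workflow (as recommended in the PVE admin guide) looks like:

```shell
# Work on a copy, never the live file directly
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new

# Add the ring1_addr lines and the second interface block,
# and increment config_version
nano /etc/pve/corosync.conf.new

# Moving it into place makes pmxcfs propagate it cluster-wide
mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf
```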
 
  • Like
Reactions: Johannes S
Hey @AntonJ , I never replied to your post, but I wanted to say thank you for taking the time. You pushed me to read more and more on this, and I've been tinkering with networking a lot more. While I've finally grasped the networking side, my issue has been not knowing for sure how to transition my existing network setup to the more recommended setup you've described. I think I'm at the point now of just saving all my VMs and containers, wiping the hosts, and starting fresh. It's honestly probably for the best, considering all the crap I've done to these hosts over the years.

In the meantime, I grabbed a few old PCs I had around and created a new cluster with separate corosync/storage networks from the beginning, just to practice. I also moved migration to its own VLAN. The idea of running corosync on a non-routed VLAN was something I'd never considered, and it has been working well for me on the new cluster so far.
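Since corosync is far more sensitive to latency than bandwidth, it's worth sanity-checking the dedicated link. A simple check from one node to its peers (IPs here are from the test cluster below) is just sustained pings; round-trip times should stay in the low single-digit millisecond range:

```shell
# From proxlab1, check latency to the other nodes on the corosync VLAN
ping -c 100 -q 172.16.19.56
ping -c 100 -q 172.16.19.57
```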

For the test cluster, here is /etc/network/interfaces (again, I have a 2.5G nic and a 1G nic):
Code:
auto lo
iface lo inet loopback

auto nic0
iface nic0 inet manual
        mtu 9000
#2.5G

iface nic1 inet manual
#1G

auto vmbr0
iface vmbr0 inet static
        address 172.16.50.55/24
        gateway 172.16.50.1
        bridge-ports nic0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 9000

auto onegig
iface onegig inet manual
        bridge-ports nic1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto coro
iface coro inet static
        address 172.16.19.55/24
        vlan-id 1002
        vlan-raw-device onegig
#corosync

auto migration
iface migration inet static
        address 10.0.1.55/24
        vlan-id 1001
        vlan-raw-device vmbr0
#fastnic - 1001

auto storage
iface storage inet static
        address 10.20.20.56/24
        mtu 9000
        vlan-id 20
        vlan-raw-device vmbr0

source /etc/network/interfaces.d/*

Also, here is corosync.conf

Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxlab1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.16.19.55
    ring1_addr: 10.0.1.55
  }
  node {
    name: proxlab2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.16.19.56
    ring1_addr: 10.0.1.56
  }
  node {
    name: proxlab3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 172.16.19.57
    ring1_addr: 10.0.1.57
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testlab
  config_version: 3
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

Again, thanks for taking the time to reply! It was really helpful.
 
@noja Looks good, well done!

The setup is similar to the one we use at $JOB serving hundreds of VMs, so it's quite well-tested.

If I were to adjust something in your setup, I would not use bridge-vlan-aware yes and would instead use SDN to create the networks needed for the guests. It's easier to manage SDN than to manage VLAN tags on each and every VM.
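For anyone curious what the SDN approach looks like on disk, here is a minimal sketch (the zone/vnet names are made up; normally you would create these via Datacenter → SDN in the GUI and then apply the configuration):

```
# /etc/pve/sdn/zones.cfg
vlan: lanzone
        bridge vmbr0

# /etc/pve/sdn/vnets.cfg
vnet: vnet50
        zone lanzone
        tag 50
```

The vnet then shows up as a selectable bridge ("vnet50") when configuring a guest NIC, so the VLAN tag lives in one place instead of on every VM.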

Keep on learning!

BR
Anton
 