[SOLVED] Corosync's redundant ring goes faulty every 15s

swe

New Member
Sep 10, 2021
Hi,

We have a 3-node cluster with 2 corosync rings, using this configuration:

Code:
logging {
  debug: off
  to_syslog: yes
}
nodelist {
  node {
    name: proxmox-1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.1.1.40
    ring1_addr: 10.15.15.50
  }
  node {
    name: proxmox-2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.1.1.50
    ring1_addr: 10.15.15.51
  }
  node {
    name: proxmox-3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.1.1.60
    ring1_addr: 10.15.15.52
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: el-gordo
  config_version: 3
  ip_version: ipv4
  secauth: on
  version: 2
  rrp_mode: passive
  interface {
    ringnumber: 0
  }
  interface {
    ringnumber: 1
  }
}

The log files report roughly every 15 seconds that ring 1 is faulty.

Code:
Sep 10 15:10:00 proxmox-3 corosync[2953]: error   [TOTEM ] Marking ringid 1 interface 10.15.15.52 FAULTY
Sep 10 15:10:00 proxmox-3 corosync[2953]:  [TOTEM ] Marking ringid 1 interface 10.15.15.52 FAULTY
Sep 10 15:10:01 proxmox-3 corosync[2953]: notice  [TOTEM ] Automatically recovered ring 1
Sep 10 15:10:01 proxmox-3 corosync[2953]:  [TOTEM ] Automatically recovered ring 1
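
To check whether the faults really recur on a fixed period, the FAULTY messages can be extracted and the gaps between them measured. The following is only a minimal sketch: the function names are made up for illustration, the regular expression is written against the exact message format shown above, and the year has to be passed in because syslog timestamps do not carry one. It also assumes an English locale for the month abbreviation.

```python
import re
from datetime import datetime

# Matches the "Marking ringid N interface A.B.C.D FAULTY" message shown above.
# Both log variants (with and without the "error" severity tag) match.
FAULTY_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+corosync\[\d+\]:"
    r".*Marking ringid (?P<ring>\d+) interface (?P<addr>\S+) FAULTY"
)


def faulty_events(lines, year=2021):
    """Return sorted, de-duplicated (timestamp, host, ring, addr) tuples.

    `year` is an assumption supplied by the caller; syslog lines omit it.
    """
    events = set()
    for line in lines:
        m = FAULTY_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        # The set collapses the duplicated log line into a single event.
        events.add((ts, m["host"], int(m["ring"]), m["addr"]))
    return sorted(events)


def intervals_seconds(events):
    """Seconds between consecutive FAULTY events."""
    times = [e[0] for e in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

`faulty_events` accepts any iterable of lines, for example an open log file. If the intervals cluster tightly around one value, that points to a periodic cause rather than random link errors.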

In other threads I found here, the speed of the network connection (the poster used 100 Mbit and 1 Gbit links) was said to be the problem.

The networks we use are a gigabit one (working without problems) and a 40 Gbit one, which is the failing one.

Does anybody have an idea?
 
Does the same NIC (IP) fail every time?
Have you checked your switch for port errors?
 
3 nodes connected directly with 1 subnet? This cannot work.
You need 3 subnets:
1. Srv1 to Srv2
2. Srv2 to Srv3
3. Srv3 to Srv1

When Srv1 talks to Srv2, the connection to Srv3 fails, because only Srv2 is reachable over the active adapter. This behaves as expected.
 
Sorry, maybe I explained it badly.
Ring1:
Each server has two 40G ports, connected like you wrote.
server1 -> Server2 and server3
The addresses of the servers are 10.15.15.50, 10.15.15.51 and 10.15.15.52

Ring0:
Each server is connected to an unmanaged switch.
IPs: 10.1.1.40, 10.1.1.50 and 10.1.1.60

Both networks are working without problems, except for the corosync ring, which fails.
 
You probably have a bond over the two network ports.
When Srv1 talks to Srv2 over port1, it also tries to reach Srv3 over this port.
This leads to such errors and bad performance.

You have 2 possibilities:
1. Use Open vSwitch and enable spanning tree, because the cabling forms a loop. The disadvantage is that one link is then blocked. E.g. if Srv1 --> Srv3 is blocked, that traffic always goes via Srv2.

2. Dissolve the bond and set up separate subnets, so that each server always knows over which adapter it reaches the right server.

Solution 2 is also recommended for many other HCI solutions like Microsoft S2D.
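
As an illustration of solution 2, the per-link subnets for a three-node full mesh can be planned with a few lines of Python. This is only a sketch under assumptions: the function name `plan_mesh_links` is hypothetical, and the `10.15.15.0/24` pool and the `/30` prefix are picked purely for the example, not taken from the actual setup.

```python
import ipaddress
from itertools import combinations


def plan_mesh_links(nodes, supernet="10.15.15.0/24", prefix=30):
    """Assign one small subnet to each point-to-point link of a full mesh.

    `supernet` and `prefix` are illustrative assumptions; any pool with
    enough non-overlapping subnets works the same way.
    """
    pool = ipaddress.ip_network(supernet).subnets(new_prefix=prefix)
    plan = {}
    # One link per unordered node pair: 3 nodes -> 3 links.
    for (a, b), net in zip(combinations(nodes, 2), pool):
        first, second = list(net.hosts())[:2]
        plan[(a, b)] = {"subnet": net, a: first, b: second}
    return plan


if __name__ == "__main__":
    for link, info in plan_mesh_links(["Srv1", "Srv2", "Srv3"]).items():
        print(link, info)
```

Because every link gets its own subnet, the routing table on each server unambiguously selects the adapter that leads to the intended peer.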
 
Okay, I think I understand the problem with the bonding.
Does changing the network configuration as you suggested in solution 2 lead to additional rings in corosync.conf?
 
Hi,

I've just been doing some reading on my own.
What I wrote above is correct for a normal network.

The Totem Redundant Ring Protocol can do more, and works the way Token Ring used to.

You can leave your IP configuration as it is.
I have attached a correct example configuration.

Code:
totem {
  version: 2
  secauth: on
  threads: 0
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 10.0.0.0
    mcastaddr: 226.94.1.1
    mcastport: 5405
    ttl: 1
  }
  interface {
    ringnumber: 1
    bindnetaddr: 172.16.0.0
    mcastaddr: 226.94.1.2
    mcastport: 5407
    ttl: 1
  }
}
 