I have a new Proxmox cluster undergoing testing.
The nodes are connected via a 10GbT switch, with multiple VLANs trunked to each node.
Because multicast does not work correctly on the switches the networking group uses (I don't control the network), I specified "transport: udpu" in /etc/pve/corosync.conf when creating the cluster, and I have verified that the setting is present in /etc/corosync/corosync.conf on all nodes.
The cluster is quorate:
Quorum information
------------------
Date: Thu Apr 19 19:02:32 2018
Quorum provider: corosync_votequorum
Nodes: 7
Node ID: 0x00000001
Ring ID: 1/548
Quorate: Yes
Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 7
Quorum: 4
Flags: Quorate
I haven't noticed any problems with nodes going rogue and losing communication with the cluster. However, syslog on each host is filling with one error and two alerts *each second*:
Apr 19 18:31:44 node03 corosync[33602]: error [TOTEM ] Digest does not match
Apr 19 18:31:44 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
Apr 19 18:31:44 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
Apr 19 18:31:45 node03 corosync[33602]: error [TOTEM ] Digest does not match
Apr 19 18:31:45 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
Apr 19 18:31:45 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
Apr 19 18:31:46 node03 corosync[33602]: error [TOTEM ] Digest does not match
Apr 19 18:31:46 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
Apr 19 18:31:46 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
Apr 19 18:31:47 node03 corosync[33602]: error [TOTEM ] Digest does not match
Apr 19 18:31:47 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
Apr 19 18:31:47 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
Apr 19 18:31:48 node03 corosync[33602]: error [TOTEM ] Digest does not match
Apr 19 18:31:48 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
Apr 19 18:31:48 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
This is generating ~660K log lines each day, which seems a bit... wrong.
# grep TOTEM /var/log/syslog.1 | wc -l
664758
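To see how those lines split by severity, a small shell function along these lines might help. It is only a sketch: count_totem_by_level is a made-up helper name, and it assumes the traditional syslog layout shown above, where the severity is the sixth whitespace-separated field.

```shell
# Hypothetical helper: tally corosync TOTEM lines by severity.
# Assumes "Mon DD HH:MM:SS host corosync[pid]: level [TOTEM ] msg" lines.
count_totem_by_level() {
    # $1: path to a syslog-style file, e.g. /var/log/syslog.1
    grep -F 'corosync[' "$1" \
        | grep -F '[TOTEM ]' \
        | awk '{ print $6 }' \
        | sort \
        | uniq -c \
        | awk '{ print $2 "=" $1 }'
}
```

Running count_totem_by_level /var/log/syslog.1 prints one level=count line per severity. If syslog.1 covers exactly one day, 664758 lines works out to about 7.7 per second, which is more than the three per second in the excerpt, so the split may show whether other TOTEM messages are being counted as well.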
I tried turning logging to syslog off in corosync.conf, but it still logs to syslog.
Proxmox is the only place I use corosync. How should I diagnose this problem?
# cat /etc/corosync/corosync.conf
logging {
  debug: off
  to_syslog: no
}
nodelist {
  node {
    name: node01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.1.45
  }
  node {
    name: node02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.1.46
  }
  node {
    name: node03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.1.47
  }
  node {
    name: node04
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.1.48
  }
  node {
    name: node05
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.1.49
  }
  node {
    name: node06
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 192.168.1.50
  }
  node {
    name: node07
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 192.168.1.51
  }
}
quorum {
  provider: corosync_votequorum
}
totem {
  cluster_name: TESTING
  config_version: 7
  interface {
    bindnetaddr: 192.168.1.45
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
}
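Since secauth is on, packets are authenticated with a shared key, so one thing that may be worth checking is whether the rejected packets come from hosts outside this nodelist. As a starting point, the sketch below pulls the ring0_addr values out of the config so they can be compared against sender addresses seen in a packet capture. list_ring_addrs and is_cluster_addr are hypothetical helper names, and the one-key-per-line layout is assumed from the file above.

```shell
# Hypothetical helper: print every ring0_addr from a corosync.conf-style file.
list_ring_addrs() {
    # $1: path to the config, e.g. /etc/corosync/corosync.conf
    awk '$1 == "ring0_addr:" { print $2 }' "$1"
}

# Hypothetical helper: succeed if the address is one of the configured nodes.
is_cluster_addr() {
    # $1: config path, $2: IPv4 address to look up (matched as a whole line)
    list_ring_addrs "$1" | grep -qxF "$2"
}
```

For example, is_cluster_addr /etc/corosync/corosync.conf 192.168.1.47 && echo member would print "member" for node03 with the config above. An address that is not in the list would point toward traffic from somewhere other than these seven nodes.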