corosync totem retransmit and cluster problem

mmenaz

Renowned Member
Jun 25, 2009
Northern east Italy
In a 3-node cluster (2 nodes with drbd9 storage and one for quorum), I see corosync problems like those below 4-5 times a day.
Sometimes I enter the web GUI and see the other nodes in red; pvecm status says everything is ok, but if I dare run a command, e.g. "qm list", it hangs and there is no way to kill it and regain console control (ok, I use screen, but...).
The cluster becomes unmanageable and I have to reboot the nodes, but AFAIU the VMs are then killed on timeout rather than shut down cleanly.
The VMs use vmbr1 and DRBD a separate 10Gbit link, so vmbr0 is practically dedicated to cluster traffic.
With these tests
# omping -c 10000 -i 0.001 -F -q 192.168.1.3 192.168.1.4 192.168.1.5
# omping -c 600 -i 1 -q 192.168.1.3 192.168.1.4 192.168.1.5
I get 0% loss on each node.
I've used the enterprise repo and kept upgrading until the temporary drbd9 licensing problems and Proxmox's decision to no longer support it (btw, the customer did not want to renew the license, because I don't dare to upgrade and switch to the separate DRBD repositories, nor can I tell them about this problem, which was very infrequent in the past... so I'm almost stuck here).
Looking at older logs, the problem seems much more frequent since I added a second Win2012R2 VM with the latest virtio drivers (they now have two 2008R2 and two 2012R2 VMs). That VM currently has no activity, since it has yet to be configured for its final use, so it can't be putting any special pressure on the system (btw, all the VMs run on the same node, prox01).
Any idea?

Code:
root@prox01:~# pveversion -v
proxmox-ve: 4.3-71 (running kernel: 4.4.21-1-pve)
pve-manager: 4.3-9 (running version: 4.3-9/f7c6f0cd)
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.4.21-1-pve: 4.4.21-71
pve-kernel-4.2.8-1-pve: 4.2.8-41
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-46
qemu-server: 4.0-92
pve-firmware: 1.1-10
libpve-common-perl: 4.0-79
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-68
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.3-12
pve-qemu-kvm: 2.7.0-4
pve-container: 1.0-80
pve-firewall: 2.0-31
pve-ha-manager: 1.0-35
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.5-1
lxcfs: 2.0.4-pve2
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80
drbdmanage: 0.97.3-1

Node 01 IP .5, daemon.log

Aug  7 11:39:54 prox01 rrdcached[2317]: flushing old values
Aug  7 11:39:54 prox01 rrdcached[2317]: rotating journals
Aug  7 11:39:54 prox01 rrdcached[2317]: started new journal /var/lib/rrdcached/journal/rrd.journal.1502098794.845477
Aug  7 11:39:54 prox01 rrdcached[2317]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1502091594.845507
Aug  7 12:09:54 prox01 smartd[2260]: Device: /dev/sdb [SAT], SMART Usage Attribute: 190 Temperature_Case changed from 66 to 67
Aug  7 12:09:54 prox01 smartd[2260]: Device: /dev/sdc [SAT], SMART Usage Attribute: 190 Temperature_Case changed from 66 to 67
Aug  7 12:28:19 prox01 corosync[2539]:  [TOTEM ] FAILED TO RECEIVE
Aug  7 12:28:21 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.5:23016) was formed. Members left: 3 2
Aug  7 12:28:21 prox01 corosync[2539]:  [TOTEM ] Failed to receive the leave message. failed: 3 2
Aug  7 12:28:21 prox01 corosync[2539]:  [TOTEM ] JOIN or LEAVE message was thrown away during flush operation.
Aug  7 12:28:21 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:23020) was formed. Members joined: 3 2
Aug  7 12:28:21 prox01 corosync[2539]:  [QUORUM] Members[3]: 3 2 1
Aug  7 12:28:21 prox01 corosync[2539]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 12:39:54 prox01 rrdcached[2317]: flushing old values
Aug  7 12:39:54 prox01 rrdcached[2317]: rotating journals
Aug  7 12:39:54 prox01 rrdcached[2317]: started new journal /var/lib/rrdcached/journal/rrd.journal.1502102394.845514
Aug  7 12:39:54 prox01 rrdcached[2317]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1502095194.845499
Aug  7 12:39:54 prox01 pmxcfs[2401]: [dcdb] notice: data verification successful
Aug  7 13:39:54 prox01 pmxcfs[2401]: [dcdb] notice: data verification successful


Node 02 IP .4, daemon.log
Aug  7 11:31:20 prox02 rrdcached[2456]: flushing old values
Aug  7 11:31:20 prox02 rrdcached[2456]: rotating journals
Aug  7 11:31:20 prox02 rrdcached[2456]: started new journal /var/lib/rrdcached/journal/rrd.journal.1502098280.037047
Aug  7 11:31:20 prox02 rrdcached[2456]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1502091080.036995
Aug  7 11:39:54 prox02 pmxcfs[2538]: [dcdb] notice: data verification successful
Aug  7 12:01:20 prox02 smartd[2406]: Device: /dev/bus/0 [megaraid_disk_03] [SAT], SMART Usage Attribute: 190 Temperature_Case changed from 67 to 68
Aug  7 12:28:19 prox02 corosync[2643]:  [TOTEM ] A new membership (192.168.1.3:23012) was formed. Members left: 1
Aug  7 12:28:19 prox02 corosync[2643]:  [TOTEM ] Failed to receive the leave message. failed: 1
Aug  7 12:28:19 prox02 pmxcfs[2538]: [dcdb] notice: members: 2/2538, 3/1526
Aug  7 12:28:19 prox02 pmxcfs[2538]: [dcdb] notice: starting data syncronisation
Aug  7 12:28:19 prox02 corosync[2643]:  [QUORUM] Members[2]: 3 2
Aug  7 12:28:19 prox02 corosync[2643]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 12:28:19 prox02 pmxcfs[2538]: [dcdb] notice: cpg_send_message retried 1 times
Aug  7 12:28:19 prox02 pmxcfs[2538]: [status] notice: members: 2/2538, 3/1526
Aug  7 12:28:19 prox02 pmxcfs[2538]: [status] notice: starting data syncronisation
Aug  7 12:28:21 prox02 corosync[2643]:  [TOTEM ] A new membership (192.168.1.3:23016) was formed. Members
Aug  7 12:28:21 prox02 corosync[2643]:  [QUORUM] Members[2]: 3 2
Aug  7 12:28:21 prox02 corosync[2643]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: received sync request (epoch 2/2538/00000004)
Aug  7 12:28:21 prox02 pmxcfs[2538]: [status] notice: received sync request (epoch 2/2538/00000004)
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: received all states
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: leader is 2/2538
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: synced members: 2/2538, 3/1526
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: start sending inode updates
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: sent all (0) updates
Aug  7 12:28:21 prox02 corosync[2643]:  [TOTEM ] A new membership (192.168.1.3:23020) was formed. Members joined: 1
Aug  7 12:28:21 prox02 corosync[2643]:  [QUORUM] Members[3]: 3 2 1
Aug  7 12:28:21 prox02 corosync[2643]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: cpg_send_message retried 1 times
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: all data is up to date
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: members: 1/2401, 2/2538, 3/1526
Aug  7 12:28:21 prox02 pmxcfs[2538]: [dcdb] notice: starting data syncronisation
Aug  7 12:28:21 prox02 pmxcfs[2538]: [status] notice: members: 1/2401, 2/2538, 3/1526
Aug  7 12:28:21 prox02 pmxcfs[2538]: [status] notice: queue not emtpy - resening 5 messages
Aug  7 12:31:20 prox02 rrdcached[2456]: flushing old values
Aug  7 12:31:20 prox02 rrdcached[2456]: rotating journals

Node 03 IP .3, daemon.log
Aug  7 11:22:14 prox03 rrdcached[1411]: flushing old values
Aug  7 11:22:14 prox03 rrdcached[1411]: rotating journals
Aug  7 11:22:14 prox03 rrdcached[1411]: started new journal /var/lib/rrdcached/journal/rrd.journal.1502097734.862596
Aug  7 11:22:14 prox03 rrdcached[1411]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1502090534.862603
Aug  7 11:39:54 prox03 pmxcfs[1526]: [dcdb] notice: data verification successful
Aug  7 12:22:14 prox03 rrdcached[1411]: flushing old values
Aug  7 12:22:14 prox03 rrdcached[1411]: rotating journals
Aug  7 12:22:14 prox03 rrdcached[1411]: started new journal /var/lib/rrdcached/journal/rrd.journal.1502101334.862523
Aug  7 12:22:14 prox03 rrdcached[1411]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1502094134.862528
Aug  7 12:28:16 prox03 corosync[1572]:  [TOTEM ] Retransmit List: 10d55a
Aug  7 12:28:16 prox03 corosync[1572]:  [TOTEM ] Retransmit List: 10d55a 10d55b
Aug  7 12:28:16 prox03 corosync[1572]:  [TOTEM ] Retransmit List: 10d55a 10d55b 10d55c
[... many more TOTEM retransmit messages with the same timestamp ...]
Aug  7 12:28:19 prox03 corosync[1572]:  [TOTEM ] Retransmit List: 10d55a 10d55b 10d55c 10d55d 10d55e 10d55f 10d560 10d561 10d562 10d563 10d564 10d565 10d566
Aug  7 12:28:19 prox03 corosync[1572]:  [TOTEM ] Retransmit List: 10d55a 10d55b 10d55c 10d55d 10d55e 10d55f 10d560 10d561 10d562 10d563 10d564 10d565 10d566
Aug  7 12:28:19 prox03 corosync[1572]:  [TOTEM ] A new membership (192.168.1.3:23012) was formed. Members left: 1
Aug  7 12:28:19 prox03 corosync[1572]:  [TOTEM ] Failed to receive the leave message. failed: 1
Aug  7 12:28:19 prox03 pmxcfs[1526]: [dcdb] notice: members: 2/2538, 3/1526
Aug  7 12:28:19 prox03 pmxcfs[1526]: [dcdb] notice: starting data syncronisation
Aug  7 12:28:19 prox03 pmxcfs[1526]: [status] notice: members: 2/2538, 3/1526
Aug  7 12:28:19 prox03 pmxcfs[1526]: [status] notice: starting data syncronisation
Aug  7 12:28:19 prox03 corosync[1572]:  [QUORUM] Members[2]: 3 2
Aug  7 12:28:19 prox03 corosync[1572]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 12:28:21 prox03 corosync[1572]:  [TOTEM ] A new membership (192.168.1.3:23016) was formed. Members
Aug  7 12:28:21 prox03 corosync[1572]:  [QUORUM] Members[2]: 3 2
Aug  7 12:28:21 prox03 corosync[1572]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: received sync request (epoch 2/2538/00000004)
Aug  7 12:28:21 prox03 pmxcfs[1526]: [status] notice: received sync request (epoch 2/2538/00000004)
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: received all states
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: leader is 2/2538
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: synced members: 2/2538, 3/1526
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: all data is up to date
Aug  7 12:28:21 prox03 corosync[1572]:  [TOTEM ] A new membership (192.168.1.3:23020) was formed. Members joined: 1
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: members: 1/2401, 2/2538, 3/1526
Aug  7 12:28:21 prox03 pmxcfs[1526]: [dcdb] notice: starting data syncronisation
Aug  7 12:28:21 prox03 pmxcfs[1526]: [status] notice: members: 1/2401, 2/2538, 3/1526
Aug  7 12:28:21 prox03 pmxcfs[1526]: [status] notice: queue not emtpy - resening 5 messages
Aug  7 12:28:21 prox03 corosync[1572]:  [QUORUM] Members[3]: 3 2 1
Aug  7 12:28:21 prox03 corosync[1572]:  [MAIN  ] Completed service synchronization, ready to provide service.
Aug  7 13:22:14 prox03 rrdcached[1411]: flushing old values
Aug  7 13:22:14 prox03 rrdcached[1411]: rotating journals
 
When the cluster loses its quorum, you should see this with pvecm status and pvecm nodes (one or more nodes missing). Do all of your pve hosts have the same package versions installed?
 
When the cluster loses its quorum, you should see this with pvecm status and pvecm nodes (one or more nodes missing). Do all of your pve hosts have the same package versions installed?
Yes, exactly the same version of everything. I've always updated the nodes at the same time (a quick way to double-check this is sketched below).
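Just for reference, a quick way to compare the package versions across the nodes, assuming root SSH works between them (the temp file names are only an illustration), would be something like:
Code:
# collect the package versions from each node locally, then diff them pair by pair
for h in prox01 prox02 prox03; do ssh root@$h pveversion -v > /tmp/pveversion-$h.txt; done
diff /tmp/pveversion-prox01.txt /tmp/pveversion-prox02.txt
diff /tmp/pveversion-prox01.txt /tmp/pveversion-prox03.txt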
I've also just checked the first node (at 16:52) and found that I had some weird things today too
(that log file starts at Aug 6 06:39:54).
Code:
root@prox01:~# grep -i totem /var/log/daemon.log | tail
Aug  8 14:58:20 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.4:24088) was formed. Members
Aug  8 14:58:22 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.4:24092) was formed. Members
Aug  8 14:58:24 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.4:24096) was formed. Members
Aug  8 14:58:26 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.4:24100) was formed. Members
Aug  8 14:58:26 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:24104) was formed. Members joined: 3
Aug  8 15:13:09 prox01 corosync[2539]:  [TOTEM ] FAILED TO RECEIVE
Aug  8 15:13:11 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.5:24108) was formed. Members left: 3 2
Aug  8 15:13:11 prox01 corosync[2539]:  [TOTEM ] Failed to receive the leave message. failed: 3 2
Aug  8 15:15:05 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:24316) was formed. Members joined: 3
Aug  8 15:15:05 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:24320) was formed. Members joined: 2
and also
Code:
root@prox01:~# grep pmxcfs /var/log/daemon.log | grep crit
Aug  8 15:13:11 prox01 pmxcfs[2401]: [dcdb] crit: received write while not quorate - trigger resync
Aug  8 15:13:11 prox01 pmxcfs[2401]: [dcdb] crit: leaving CPG group
Aug  8 15:15:05 prox01 pmxcfs[2401]: [dcdb] crit: ignore sync request from wrong member 2/2538
Aug  8 15:15:05 prox01 pmxcfs[2401]: [status] crit: ignore sync request from wrong member 2/2538
 
Do you use an extra network for the cluster/corosync traffic? Maybe there is just too much load on that network?
 
Do you use an extra network for the cluster/corosync traffic? Maybe there is just too much load on that network?
As I said, all the VM traffic is on vmbr1 and the DRBD traffic is on a separate 10Gbit NIC, so vmbr0 is used only for cluster traffic and (very occasional) GUI management.
vmbr0/eth0 on all 3 nodes is a Gbit connection to a Gbit switch (mii-tool confirms this: "eth0: negotiated 1000baseT-FD flow-control, link ok").
Also, I'm a bit scared by how badly things go when the cluster collapses... even though I have all the VMs running on a single node, things get so bad that I have to reboot. Last time a "hanging" backup task pushed the server load to 16, I was unable to really kill the vzdump process (not through the GUI, nor by killing it directly, since I then ended up with zombie processes), and the only solution was to reboot.
The second time I had cluster issues and tried to fix them, every "qm" command hung, restarting the pve services gave errors / timed out / never came back, and I had to reboot (all VMs reported an abrupt termination, so no clean shutdown was performed, maybe due to the pve services not working).
Just in case, is there a direct kvm command I can issue to shut down a VM if "qm shutdown ID" hangs/does not work? (They have the guest agent installed.) See the sketch below for the kind of thing I had in mind.
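Purely as a sketch of what I'm thinking of (I'm not sure it's safe or even correct): talking directly to the per-VM QMP socket with socat, assuming the socket sits at /var/run/qemu-server/<vmid>.qmp and nothing else is currently holding it:
Code:
# hypothetical example, use with care: ask the guest for an ACPI shutdown via QMP
VMID=100   # example VM id
echo -e '{"execute":"qmp_capabilities"}\n{"execute":"system_powerdown"}' \
  | socat - UNIX-CONNECT:/var/run/qemu-server/$VMID.qmp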
 
  • How much time do omping replies need to complete? (especially when you see the "failed to receive" messages)
  • On what interface are you doing backups? (network overload?)
  • What is the mtu between your nodes?
  • Can you please post the corosync.conf and syslog?
 
Q.: How much time do omping replies need to complete? (especially when you see the "failed to receive" messages)
A.: since the problem does not happen at predictable times, I can't test during a "failed to receive" event without some luck.
I've tested now (and everything is fine at the moment) with these results:
Code:
# omping -c 10000 -i 0.001 -F -q 192.168.1.3 192.168.1.4 192.168.1.5

root@prox01:~# omping -c 10000 -i 0.001 -F -q 192.168.1.3 192.168.1.4 192.168.1.5
192.168.1.3 : waiting for response msg
192.168.1.4 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.4 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.4 : waiting for response msg
192.168.1.4 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.3 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.3 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.4 : given amount of query messages was sent
192.168.1.3 : given amount of query messages was sent

192.168.1.3 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.071/0.297/0.726/0.094
192.168.1.3 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.091/0.317/0.729/0.093
192.168.1.4 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.075/0.210/0.641/0.053
192.168.1.4 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.079/0.226/0.647/0.053

root@prox02:~# omping -c 10000 -i 0.001 -F -q 192.168.1.3 192.168.1.4 192.168.1.5
192.168.1.3 : waiting for response msg
192.168.1.5 : waiting for response msg
192.168.1.5 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.3 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.3 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.5 : given amount of query messages was sent
192.168.1.3 : given amount of query messages was sent

192.168.1.3 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.072/0.288/1.779/0.118
192.168.1.3 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.088/0.301/1.787/0.117
192.168.1.5 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.048/0.171/0.473/0.065
192.168.1.5 : multicast, xmt/rcv/%loss = 10000/9988/0% (seq>=13 0%), min/avg/max/std-dev = 0.056/0.183/1.011/0.064

root@prox03:~# omping -c 10000 -i 0.001 -F -q 192.168.1.3 192.168.1.4 192.168.1.5
192.168.1.4 : waiting for response msg
192.168.1.5 : waiting for response msg
192.168.1.5 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.4 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.4 : waiting for response msg
192.168.1.4 : server told us to stop
192.168.1.5 : waiting for response msg
192.168.1.5 : server told us to stop

192.168.1.4 :   unicast, xmt/rcv/%loss = 9680/9680/0%, min/avg/max/std-dev = 0.059/0.203/0.667/0.059
192.168.1.4 : multicast, xmt/rcv/%loss = 9680/9673/0% (seq>=8 0%), min/avg/max/std-dev = 0.074/0.212/0.674/0.059
192.168.1.5 :   unicast, xmt/rcv/%loss = 9854/9854/0%, min/avg/max/std-dev = 0.057/0.125/0.349/0.040
192.168.1.5 : multicast, xmt/rcv/%loss = 9854/9847/0% (seq>=8 0%), min/avg/max/std-dev = 0.064/0.132/0.726/0.041

# omping -c 600 -i 1 -q 192.168.1.3 192.168.1.4 192.168.1.5

root@prox01:~# omping -c 600 -i 1 -q 192.168.1.3 192.168.1.4 192.168.1.5
192.168.1.3 : waiting for response msg
192.168.1.4 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.4 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.4 : waiting for response msg
192.168.1.4 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.3 : waiting for response msg
192.168.1.3 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.4 : given amount of query messages was sent
192.168.1.3 : given amount of query messages was sent

192.168.1.3 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.130/0.389/0.697/0.063
192.168.1.3 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.180/0.406/0.702/0.061
192.168.1.4 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.105/0.224/0.343/0.036
192.168.1.4 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.120/0.243/0.370/0.038
root@prox01:~#

root@prox02:~# omping -c 600 -i 1 -q 192.168.1.3 192.168.1.4 192.168.1.5
192.168.1.3 : waiting for response msg
192.168.1.5 : waiting for response msg
192.168.1.5 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.3 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.3 : waiting for response msg
192.168.1.3 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.5 : given amount of query messages was sent
192.168.1.3 : given amount of query messages was sent

192.168.1.3 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.114/0.368/0.661/0.114
192.168.1.3 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.124/0.384/0.678/0.112
192.168.1.5 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.075/0.211/0.404/0.067
192.168.1.5 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.088/0.224/0.418/0.066
root@prox02:~#

root@prox03:~# omping -c 600 -i 1 -q 192.168.1.3 192.168.1.4 192.168.1.5
192.168.1.4 : waiting for response msg
192.168.1.5 : waiting for response msg
192.168.1.5 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.4 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.1.4 : given amount of query messages was sent
192.168.1.5 : given amount of query messages was sent

192.168.1.4 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.101/0.245/0.457/0.053
192.168.1.4 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.111/0.254/0.469/0.054
192.168.1.5 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.073/0.182/0.305/0.043
192.168.1.5 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.080/0.192/0.317/0.043
root@prox03:~#

Q.: On what interface are you doing backups? (network overload?)
A.: I back up to a local disk, and a NAS backs up some VM content over vmbr1 (eth1), so separately from the corosync traffic. Also, backups run at night, while I have problems at random hours, especially during the day.

Q.: What is the mtu between your nodes?
A.: default 1500 on all 3 nodes
Code:
root@prox01:~# ip link
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:1e:67:94:b5:74 brd ff:ff:ff:ff:ff:ff
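To double-check that full 1500-byte frames actually pass end to end (and not just that the interfaces are set to 1500), one could run something like the following between the nodes; 1472 = 1500 minus the 28 bytes of IP/ICMP headers. This is just a sketch, not output from my cluster:
Code:
# ping with the don't-fragment bit set and a payload that fills a 1500-byte frame
ping -M do -s 1472 -c 5 192.168.1.4
ping -M do -s 1472 -c 5 192.168.1.3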

Q.: Can you please post the corosync.conf and syslog?
A.: sure, thanks
Code:
root@prox01:~# cat /etc/corosync/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: prox01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: prox01
  }

  node {
    name: prox02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: prox02
  }

  node {
    name: prox03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: prox03
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: cldueuff
  config_version: 3
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 192.168.1.5
    ringnumber: 0
  }

}
also
Code:
root@prox01:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.1.5 prox01.mydomain.it prox01 pvelocalhost

192.168.1.4 prox02.mydomain.it prox02
192.168.1.3 prox03.mydomain.it prox03

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

For syslog see attached file.
Thanks a lot!
 

Attachments

  • syslog.txt (575.7 KB)
Judging from the provided syslog, check host prox03, as it appears that this host is flapping. Maybe there is a different network config or the network card/driver has issues.
 
Judging from the provided syslog, check host prox03, as it appears that this host is flapping. Maybe there is a different network config or the network card/driver has issues.
Just out of curiosity, how can you tell? I mean, grepping I find no "prox03" and only 2 lines with its IP (192.168.1.3), which are:
Code:
Aug  9 11:34:03 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:24356) was formed. Members joined: 3
...
Aug  9 12:09:16 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:24368) was formed. Members joined: 3
I need to learn how to interpret cluster-related log messages :)
Btw, let's say prox03 is defective but the 2 remaining nodes survive and have quorum; then I should see prox01 and prox02 green when I log in to the prox01 GUI, and "qm list" or whatever should keep working... instead it seems the whole cluster collapses and becomes unmanageable.
Is that normal?
 
Just out of curiosity, how can you tell? I mean, grepping I find no "prox03" and only 2 lines with its IP (192.168.1.3), which are:
You can see the corosync nodeid in the join/leave messages.

In the logs you can see that corosync retransmits its cluster messages before the node leaves and rejoins the cluster. Since I can only see prox03 leaving and joining in the log, I suspect there is a problem with that node or with the connection to it.

Check out this link for more on the topic: https://www.hastexo.com/resources/hints-and-kinks/whats-totem-retransmit-list-all-about-corosync/
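To make the nodeid mapping concrete, re-reading one of the log lines you already posted: in the membership line below, the "3" after "Members joined:" is the corosync nodeid, which your corosync.conf maps to prox03 (nodeid: 3).
Code:
Aug  9 11:34:03 prox01 corosync[2539]:  [TOTEM ] A new membership (192.168.1.3:24356) was formed. Members joined: 3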
 
Hi, I am running a 2-node cluster with DRBD on PVE 3.4 and I am also getting huge numbers of "Totem Retransmit" entries in the syslogs.
Sometimes one of my nodes stops responding (freezes): I can't ping the node's IP and the monitor/keyboard do not respond. Every time I have to reboot the node manually to bring it back online.

I am running both nodes in production and can't reinstall the entire setup.

Please help me if anyone has a solution to this problem.

Thanks in advance.
 
@jagan, please do not hijack other threads, open a new one.
 
