pve 4.4: Cluster nodes flapping in GUI from green to red and vice versa

northe

Hi folks,
I'm seeing weird behaviour while setting up a brand-new cluster.
In the 5-node PVE 4.4-18 cluster, all nodes start after a reboot in the green, available state in the GUI. After a minute or so, node by node turns from green to a red cross until all nodes appear to be offline. It takes another minute or so, and node by node returns to green until all nodes are OK again.
According to the GUI and pvecm status, the HA cluster never leaves its healthy state, /etc/pve is still writeable, and all nodes are listed in /etc/pve/.members. I can't find any hint in the logs. The date is 100% in sync across all nodes.
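
(For reference, these are roughly the checks behind that statement, run on one node:)

Code:
pvecm status                                # quorum/membership as corosync sees it
touch /etc/pve/test && rm /etc/pve/test     # only works while the node is quorate
cat /etc/pve/.members                       # node list as pmxcfs sees it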

My first thought was multicast, but all 5 nodes have joined the 239.192.145.115 multicast group.
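(For reference, a quick way to double-check the joined groups on a node; vmbr21 is assumed to be the bridge carrying the cluster network here:)

Code:
ip maddr show dev vmbr21 | grep 239.192     # the corosync group should be listed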

Some snippets from the logs and the config files:

# dis igmp-snooping group vlan 21
Total 2 entries.
----------
VLAN 21: Total 2 entries.
(0.0.0.0, 232.43.211.234)
Host slots (0 in total):
Host ports (1 in total):
BAGG5 (00:03:08)
(0.0.0.0, 239.192.145.115)
Host slots (0 in total):
Host ports (5 in total):
BAGG1 (00:03:12)
BAGG2 (00:03:04)
BAGG3 (00:03:04)
BAGG4 (00:03:12)
BAGG5 (00:03:13)

----------
proxmox-ve: 4.4-95 (running kernel: 4.4.79-1-pve)
pve-manager: 4.4-18 (running version: 4.4-18/ef2610e8)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.79-1-pve: 4.4.79-95
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-52
qemu-server: 4.0-112
pve-firmware: 1.1-11
libpve-common-perl: 4.0-96
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-101
pve-firewall: 2.0-33
pve-ha-manager: 1.0-41
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
ceph: 10.2.9-1~bpo80+1

----------
# /etc/pve/.clusterlog (no entries other than those listed below)
, "msg": "successful auth for user 'root@pam'"},
{"uid": 101, "time": 1506171300, "pri": 6, "tag": "pvedaemon", "pid": 1085, "node": "node1708vm-5", "user": "root@pam", "msg": "successful auth for user 'root@pam'"}

----------
# /var/log/daemon.log
Sep 23 20:55:01 node1708vm-4 pmxcfs[2281]: [status] notice: received log
Sep 23 20:56:41 node1708vm-4 pvedaemon[14733]: worker exit
Sep 23 20:56:41 node1708vm-4 pvedaemon[2530]: worker 14733 finished
Sep 23 20:56:41 node1708vm-4 pvedaemon[2530]: starting 1 worker(s)
Sep 23 20:56:41 node1708vm-4 pvedaemon[2530]: worker 15883 started
Sep 23 20:58:05 node1708vm-4 pveproxy[3010]: worker exit
Sep 23 20:58:05 node1708vm-4 pveproxy[16101]: worker 3010 finished
Sep 23 20:58:05 node1708vm-4 pveproxy[16101]: starting 1 worker(s)
Sep 23 20:58:05 node1708vm-4 pveproxy[16101]: worker 16998 started
Sep 23 20:58:39 node1708vm-4 pvestatd[2479]: status update time (600.276 seconds)
----------
# cat /etc/pve/corosync.conf
logging {
debug: off
to_syslog: yes
}
nodelist {
node {
name: node1708vm-5
nodeid: 5
quorum_votes: 1
ring0_addr: node1708vm-5
}
node {
name: node1708vm-3
nodeid: 3
quorum_votes: 1
ring0_addr: node1708vm-3
}
node {
name: node1708vm-4
nodeid: 4
quorum_votes: 1
ring0_addr: node1708vm-4
}
node {
name: node1708vm-2
nodeid: 2
quorum_votes: 1
ring0_addr: node1708vm-2
}
node {
name: node1708vm-1
nodeid: 1
quorum_votes: 1
ring0_addr: node1708vm-1
}
}
quorum {
provider: corosync_votequorum
}
totem {
cluster_name: CLUSTER01
config_version: 7
ip_version: ipv4
secauth: on
version: 2
token: 4000 # <--- thought this would help, but it does not.
interface {
bindnetaddr: 10.0.21.10
ringnumber: 0
}

}
-------

# cat /etc/network/interfaces
auto lo
iface lo inet loopback
auto eth0 inet manual
auto eth1 inet manual
auto eth2 inet manual
auto eth3 inet manual
auto eth4 inet manual
auto eth5 inet manual

auto bond20
iface bond20 inet static
address 10.0.20.10
netmask 255.255.255.0
slaves eth0 eth1 eth4 eth5
bond-miimon 100
bond-mode 802.3ad
auto bond0
iface bond0 inet manual
slaves eth2 eth3
bond-miimon 100
bond-mode 802.3ad
auto bond0.21
iface bond0.21 inet manual
vlan-raw-device bond0
auto bond0.100
iface bond0.100 inet manual
vlan-raw-device bond0
auto bond0.101
iface bond0.101 inet manual
vlan-raw-device bond0
auto bond0.170
iface bond0.170 inet manual
vlan-raw-device bond0
auto bond0.180
iface bond0.180 inet manual
vlan-raw-device bond0
auto bond0.190
iface bond0.190 inet manual
vlan-raw-device bond0
auto bond0.200
iface bond0.200 inet manual
vlan-raw-device bond0

auto vmbr21
iface vmbr21 inet static
address 10.0.21.10
netmask 255.255.255.0
bridge_ports bond0.21
bridge_stp off
bridge_fd 0
auto vmbr99
iface vmbr99 inet static
bridge_ports bond0.99
bridge_stp off
bridge_fd 0
auto vmbr100
iface vmbr100 inet static
bridge_ports bond0.100
bridge_stp off
bridge_fd 0
auto vmbr101
iface vmbr101 inet static
bridge_ports bond0.101
bridge_stp off
bridge_fd 0
auto vmbr170
iface vmbr170 inet static
bridge_ports bond0.170
bridge_stp off
bridge_fd 0
auto vmbr180
iface vmbr180 inet static
bridge_ports bond0.180
bridge_stp off
bridge_fd 0
auto vmbr190
iface vmbr190 inet static
bridge_ports bond0.190
bridge_stp off
bridge_fd 0
auto vmbr200
iface vmbr200 inet static
bridge_ports bond0.200
bridge_stp off
bridge_fd 0
auto eth6
iface eth6 inet static
address 192.168.0.166
netmask 255.255.255.0
gateway 192.168.0.251
----
cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
# Admin LAN
192.168.0.166 node1708-1.coro.company.de node1708-1 pvelocalhost

## Proxmox Cluster ##
10.0.21.10 node1708vm-1.vm.company.de node1708vm-1
10.0.21.20 node1708vm-2.vm.company.de node1708vm-2
10.0.21.30 node1708vm-3.vm.company.de node1708vm-3
10.0.21.40 node1708vm-4.vm.company.de node1708vm-4
10.0.21.50 node1708vm-5.vm.company.de node1708vm-5

# Ceph network is 10.0.20.0/24

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
---------
# cat /etc/pve/.members (at the time when all nodes have a red cross)
{
"nodename": "node1708vm-1",
"version": 18,
"cluster": { "name": "CLUSTER01", "version": 7, "nodes": 5, "quorate": 1 },
"nodelist": {
"node1708vm-1": { "id": 1, "online": 1, "ip": "10.0.21.10"},
"node1708vm-2": { "id": 2, "online": 1, "ip": "10.0.21.20"},
"node1708vm-3": { "id": 3, "online": 1, "ip": "10.0.21.30"},
"node1708vm-4": { "id": 4, "online": 1, "ip": "10.0.21.40"},
"node1708vm-5": { "id": 5, "online": 1, "ip": "10.0.21.50"}
}
}
-----------
# pvecm status (at this moment 10.0.21.10 is shown as offline/red in the GUI)
Quorum information
------------------
Date: Sat Sep 23 21:40:31 2017
Quorum provider: corosync_votequorum
Nodes: 5
Node ID: 0x00000001
Ring ID: 1/412
Quorate: Yes

Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 5
Quorum: 3
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.21.10 (local)
0x00000002 1 10.0.21.20
0x00000003 1 10.0.21.30
0x00000004 1 10.0.21.40
0x00000005 1 10.0.21.50

--------
journalctl -xe
Sep 23 21:17:54 node1708vm-1 pvestatd[2553]: status update time (600.290 seconds)
Sep 23 21:25:02 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 23 21:27:29 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 23 21:27:55 node1708vm-1 pvestatd[2553]: status update time (600.309 seconds)
Sep 23 21:36:47 node1708vm-1 rrdcached[2195]: flushing old values
Sep 23 21:36:47 node1708vm-1 rrdcached[2195]: rotating journals
Sep 23 21:36:47 node1708vm-1 rrdcached[2195]: started new journal /var/lib/rrdcached/journal/rrd.journal.1506195407.632967
Sep 23 21:36:47 node1708vm-1 rrdcached[2195]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1506188207.632967
Sep 23 21:36:47 node1708vm-1 pmxcfs[2324]: [dcdb] notice: data verification successful
Sep 23 21:37:55 node1708vm-1 pvestatd[2553]: status update time (600.279 seconds)
Sep 23 21:40:02 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 23 21:42:29 node1708vm-1 pmxcfs[2324]: [status] notice: received log

-------
lspci | grep -i ethernet
03:00.0 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)
03:00.1 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
81:00.0 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)
81:00.1 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)
83:00.0 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)
83:00.1 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)

I have no clue where to look for the cause of this strange issue and would be happy about any hint.

Thanks in advance.

Juergen
 
Hi,

Do you have an NFS storage?
If so, I would check whether it is available all the time.
A red node can also mean that a storage is not working properly,
or more precisely that pvestatd is hanging or reporting problems, which in 99% of cases is related to an NFS problem.
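
(A quick sketch of what I mean, using the standard PVE tools:)

Code:
pvesm status                # lists all configured storages and whether they are active
systemctl status pvestatd   # pvestatd should be running, not hanging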
 
Does this only happen once after boot, or does it continue?
 
It continues.
All nodes are up after a reboot, and then one by one they fail (within less than a minute) until all nodes are red. It takes about 3 minutes until the first node turns green again. The second node turns green after about 40 seconds, nodes 3 and 4 follow quickly (about 10 seconds), and the last one takes a little longer (~30 seconds). The sequence of failing and recovering always seems to be the same; I don't know whether that is interesting.
There is no interface up/down event seen on the switch.
 
Can you send the ceph.conf

and also the ceph status?
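
(For example, straight from one node; just a sketch:)

Code:
cat /etc/pve/ceph.conf
ceph status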
 
ceph.conf

[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.0.20.0/24
filestore xattr use omap = true
fsid = 61a56a49-a8c0-4b53-8c27-073b9c6d7713
keyring = /etc/pve/priv/$cluster.$name.keyring
osd journal size = 5120
osd pool default min size = 1
public network = 10.0.20.0/24
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd crush update on start = false
[mon.node1708-2]
host = node1708-2
mon addr = 10.0.20.20:6789
[mon.node1708-1]
host = node1708-1
mon addr = 10.0.20.10:6789
[mon.node1708-3]
host = node1708-3
mon addr = 10.0.20.30:6789

ceph status
cluster 61a56a49-a8c0-4b53-8c27-073b9c6d7713
health HEALTH_OK
monmap e7: 3 mons at {node1708-1=10.0.20.10:6789/0,node1708-2=10.0.20.20:6789/0,node1708-3=10.0.20.30:6789/0}
election epoch 52, quorum 0,1,2 node1708-1,node1708-2,node1708-3
osdmap e1096: 40 osds: 40 up, 40 in
flags sortbitwise,require_jewel_osds
pgmap v56562: 2660 pgs, 3 pools, 56000 MB data, 14003 objects
167 GB used, 363 TB / 363 TB avail
2660 active+clean
 
Please run omping simultaneously on 3 nodes on the cluster network:

Code:
omping -c 10000 -i 0.001 -F -q node1 node2 node3
 
Is this okay?
#1
10.0.21.20 : unicast, xmt/rcv/%loss = 9015/9015/0%, min/avg/max/std-dev = 0.022/0.036/0.154/0.011
10.0.21.20 : multicast, xmt/rcv/%loss = 9015/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
10.0.21.30 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.024/0.066/0.136/0.024
10.0.21.30 : multicast, xmt/rcv/%loss = 10000/1/99%, min/avg/max/std-dev = 0.063/0.063/0.063/0.000

#2
10.0.21.10 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.022/0.041/0.126/0.012
10.0.21.10 : multicast, xmt/rcv/%loss = 10000/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
10.0.21.30 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.022/0.075/0.145/0.025
10.0.21.30 : multicast, xmt/rcv/%loss = 10000/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000

#3
10.0.21.10 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.024/0.045/0.135/0.015
10.0.21.10 : multicast, xmt/rcv/%loss = 10000/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
10.0.21.20 : unicast, xmt/rcv/%loss = 9272/9272/0%, min/avg/max/std-dev = 0.022/0.041/0.128/0.009
10.0.21.20 : multicast, xmt/rcv/%loss = 9272/2/99%, min/avg/max/std-dev = 0.036/0.052/0.068/0.023
 
That means your multicast is not able to deliver packets within 1 ms from node to node,
which means your network is slow.
Please run this test without the -i and -F parameters:
Code:
omping -c 10000 -q node1 node2 node3
Can you also send the last hour of the syslog?
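
(For the syslog, something like this should do; just a sketch using journalctl:)

Code:
journalctl --since "1 hour ago" > /tmp/syslog-last-hour.txt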
 
For the Proxmox cluster network, each node has a dual-port 10 Gbit NIC in link aggregation (mode=4).
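
(For reference, the 802.3ad state of the bond can be checked like this; bond0 is assumed to be the bond carrying the cluster VLAN:)

Code:
cat /proc/net/bonding/bond0    # shows the LACP partner and per-slave state
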
lspci -vs 03:00.0
03:00.0 Ethernet controller: Broadcom Limited BCM57840 NetXtreme II 10 Gigabit Ethernet (rev 11)
Subsystem: QLogic Corp. Device e3f2
Physical Slot: 3
Flags: bus master, fast devsel, latency 0, IRQ 28
Memory at c5000000 (64-bit, prefetchable) [size=8M]
Memory at c4800000 (64-bit, prefetchable) [size=8M]
Memory at c5810000 (64-bit, prefetchable) [size=64K]
Expansion ROM at c7380000 [disabled] [size=512K]
Capabilities: [48] Power Management version 3
Capabilities: [50] Vital Product Data
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-
Capabilities: [ac] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [13c] Device Serial Number xx-xx-xx-xx-xx-xx-xx-xx
Capabilities: [150] Power Budgeting <?>
Capabilities: [160] Virtual Channel
Capabilities: [1b8] Alternative Routing-ID Interpretation (ARI)
Capabilities: [220] #15
Capabilities: [300] #19
Kernel driver in use: bnx2x

I had to Ctrl-C the omping; it seems to run forever.
#1
10.0.21.20 : unicast, xmt/rcv/%loss = 336/336/0%, min/avg/max/std-dev = 0.038/0.111/0.176/0.029
10.0.21.20 : multicast, xmt/rcv/%loss = 336/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
10.0.21.30 : unicast, xmt/rcv/%loss = 335/335/0%, min/avg/max/std-dev = 0.037/0.097/0.148/0.022
10.0.21.30 : multicast, xmt/rcv/%loss = 335/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
#2
10.0.21.10 : unicast, xmt/rcv/%loss = 334/334/0%, min/avg/max/std-dev = 0.047/0.100/0.158/0.021
10.0.21.10 : multicast, xmt/rcv/%loss = 334/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
10.0.21.30 : unicast, xmt/rcv/%loss = 337/337/0%, min/avg/max/std-dev = 0.036/0.099/0.150/0.021
10.0.21.30 : multicast, xmt/rcv/%loss = 337/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
#3
10.0.21.10 : unicast, xmt/rcv/%loss = 334/334/0%, min/avg/max/std-dev = 0.050/0.115/0.170/0.027
10.0.21.10 : multicast, xmt/rcv/%loss = 334/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
10.0.21.20 : unicast, xmt/rcv/%loss = 336/336/0%, min/avg/max/std-dev = 0.030/0.083/0.149/0.024
10.0.21.20 : multicast, xmt/rcv/%loss = 336/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000


I thought I would have to tar the syslog, but it's too short. The lines with the NTP timeouts can be ignored, because I have an NTP server installed and running that points to a local time server.
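
(The timeouts above come from systemd-timesyncd still trying the Debian pool; a minimal sketch of pointing it at the local time server instead, the address below is only a placeholder:)

Code:
# /etc/systemd/timesyncd.conf  (placeholder address below)
[Time]
NTP=192.168.0.x

# then restart the daemon:
systemctl restart systemd-timesyncd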

# for x in node1708vm-1 node1708vm-2 node1708vm-3 node1708vm-4 node1708vm-5 ; do echo -n "Date for $x"; ssh $x date; done
Date for node1708vm-1Mon Sep 25 16:17:36 CEST 2017
Date for node1708vm-2Mon Sep 25 16:17:37 CEST 2017
Date for node1708vm-3Mon Sep 25 16:17:37 CEST 2017
Date for node1708vm-4Mon Sep 25 16:17:37 CEST 2017
Date for node1708vm-5Mon Sep 25 16:17:37 CEST 2017


Sep 25 15:10:33 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:13:02 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:17:01 node1708vm-1 CRON[17855]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Sep 25 15:19:10 node1708vm-1 pvestatd[2553]: status update time (600.296 seconds)
Sep 25 15:25:33 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:28:03 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:29:10 node1708vm-1 pvestatd[2553]: status update time (600.285 seconds)
Sep 25 15:36:47 node1708vm-1 pmxcfs[2324]: [dcdb] notice: data verification successful
Sep 25 15:36:47 node1708vm-1 rrdcached[2195]: flushing old values
Sep 25 15:36:47 node1708vm-1 rrdcached[2195]: rotating journals
Sep 25 15:36:47 node1708vm-1 rrdcached[2195]: started new journal /var/lib/rrdcached/journal/rrd.journal.1506346607.633000
Sep 25 15:36:47 node1708vm-1 rrdcached[2195]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1506339407.633061
Sep 25 15:37:26 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 193.175.73.151:123 (0.debian.pool.ntp.org).
Sep 25 15:37:37 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 193.175.73.151:123 (0.debian.pool.ntp.org).
Sep 25 15:37:37 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 85.14.245.16:123 (0.debian.pool.ntp.org).
Sep 25 15:37:47 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 85.14.245.16:123 (0.debian.pool.ntp.org).
Sep 25 15:37:47 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 85.236.36.4:123 (0.debian.pool.ntp.org).
Sep 25 15:37:57 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 85.236.36.4:123 (0.debian.pool.ntp.org).
Sep 25 15:37:57 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 46.4.77.168:123 (0.debian.pool.ntp.org).
Sep 25 15:38:07 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 46.4.77.168:123 (0.debian.pool.ntp.org).
Sep 25 15:38:07 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 213.95.200.107:123 (1.debian.pool.ntp.org).
Sep 25 15:38:18 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 213.95.200.107:123 (1.debian.pool.ntp.org).
Sep 25 15:38:18 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 213.136.86.203:123 (1.debian.pool.ntp.org).
Sep 25 15:38:28 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 213.136.86.203:123 (1.debian.pool.ntp.org).
Sep 25 15:38:28 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 80.241.208.120:123 (1.debian.pool.ntp.org).
Sep 25 15:38:38 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 80.241.208.120:123 (1.debian.pool.ntp.org).
Sep 25 15:38:38 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 188.68.36.203:123 (1.debian.pool.ntp.org).
Sep 25 15:38:48 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 188.68.36.203:123 (1.debian.pool.ntp.org).
Sep 25 15:38:48 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 78.46.37.25:123 (2.debian.pool.ntp.org).
Sep 25 15:38:59 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 78.46.37.25:123 (2.debian.pool.ntp.org).
Sep 25 15:38:59 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 213.95.200.109:123 (2.debian.pool.ntp.org).
Sep 25 15:39:09 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 213.95.200.109:123 (2.debian.pool.ntp.org).
Sep 25 15:39:09 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 217.79.179.106:123 (2.debian.pool.ntp.org).
Sep 25 15:39:11 node1708vm-1 pvestatd[2553]: status update time (600.304 seconds)
Sep 25 15:39:19 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 217.79.179.106:123 (2.debian.pool.ntp.org).
Sep 25 15:39:19 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 146.0.32.144:123 (2.debian.pool.ntp.org).
Sep 25 15:39:29 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 146.0.32.144:123 (2.debian.pool.ntp.org).
Sep 25 15:39:29 node1708vm-1 systemd-timesyncd[1353]: Using NTP server [2001:1640:11a::2]:123 (2.debian.pool.ntp.org).
Sep 25 15:39:29 node1708vm-1 systemd-timesyncd[1353]: Using NTP server [2003:a:87f:c37c::1]:123 (2.debian.pool.ntp.org).
Sep 25 15:39:29 node1708vm-1 systemd-timesyncd[1353]: Using NTP server [2a01:4f8:190:226e::1]:123 (2.debian.pool.ntp.org).
Sep 25 15:39:29 node1708vm-1 systemd-timesyncd[1353]: Using NTP server [2a01:4f8:121:4e4::123]:123 (2.debian.pool.ntp.org).
Sep 25 15:39:30 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 131.234.137.63:123 (3.debian.pool.ntp.org).
Sep 25 15:39:40 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 131.234.137.63:123 (3.debian.pool.ntp.org).
Sep 25 15:39:40 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 62.116.162.126:123 (3.debian.pool.ntp.org).
Sep 25 15:39:50 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 62.116.162.126:123 (3.debian.pool.ntp.org).
Sep 25 15:39:50 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 88.99.126.167:123 (3.debian.pool.ntp.org).
Sep 25 15:40:00 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 88.99.126.167:123 (3.debian.pool.ntp.org).
Sep 25 15:40:00 node1708vm-1 systemd-timesyncd[1353]: Using NTP server 87.118.124.35:123 (3.debian.pool.ntp.org).
Sep 25 15:40:10 node1708vm-1 systemd-timesyncd[1353]: Timed out waiting for reply from 87.118.124.35:123 (3.debian.pool.ntp.org).
Sep 25 15:40:33 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:43:03 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:49:11 node1708vm-1 pvestatd[2553]: status update time (600.305 seconds)
Sep 25 15:55:33 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:58:03 node1708vm-1 pmxcfs[2324]: [status] notice: received log
Sep 25 15:59:11 node1708vm-1 pvestatd[2553]: status update time (600.321 seconds)
Sep 25 16:09:12 node1708vm-1 pvestatd[2553]: status update time (600.285 seconds)
Sep 25 16:10:35 node1708vm-1 pmxcfs[2324]: [status] notice: received log
 
Your multicast is not working (properly).

You say your corosync is working.

So I think you have something like storm control or loop protection active on the switch.

Now you have two options: fix your network or use unicast instead of multicast.
Unicast should be OK for 5 nodes, but you have to test it.

By the way, token should not be changed; if you have problems, increase token_coefficient instead.
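
(A minimal sketch of what that could look like in the totem section of corosync.conf; the values are only examples, and transport: udpu is the unicast option mentioned above:)

Code:
totem {
  ...
  # per-node addition to the token timeout (default 650 ms):
  token_coefficient: 1000
  # or switch corosync to unicast (UDPU) instead of multicast:
  # transport: udpu
}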
 
Thank you Wolfgang. I am taking a deeper look into the switch configuration to see whether there is any kind of rate limiting or a wrong port type. Since I am at an early stage of the setup, perhaps I'm lucky enough to start again with 5.1 :) ...yes, multicast must work.
 
Hello Wolfgang, I fixed the multicast issue, but the nodes keep on flapping.
I repeated the test
omping -c 10000 -i 0.001 -F -q 10.0.21.10 10.0.21.20 10.0.21.30
and it looks much better now:
#1
10.0.21.20 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.022/0.038/0.148/0.012
10.0.21.20 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.024/0.039/0.150/0.014
10.0.21.30 : unicast, xmt/rcv/%loss = 9736/9736/0%, min/avg/max/std-dev = 0.021/0.049/0.165/0.018
10.0.21.30 : multicast, xmt/rcv/%loss = 9736/9736/0%, min/avg/max/std-dev = 0.024/0.051/0.167/0.021
#2
10.0.21.10 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.021/0.039/0.119/0.012
10.0.21.10 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.024/0.039/0.123/0.013
10.0.21.30 : unicast, xmt/rcv/%loss = 9872/9872/0%, min/avg/max/std-dev = 0.022/0.054/0.161/0.018
10.0.21.30 : multicast, xmt/rcv/%loss = 9872/9872/0%, min/avg/max/std-dev = 0.024/0.057/0.154/0.019
#3
10.0.21.10 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.021/0.044/0.120/0.014
10.0.21.10 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.023/0.046/0.132/0.014
10.0.21.20 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.020/0.039/0.128/0.012
10.0.21.20 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.024/0.044/0.152/0.015
 
Try to restart all corosync services and also the pvestatd service on the cluster.
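
(On each node, something like this; service names as on a standard PVE 4 install:)

Code:
systemctl restart corosync.service
systemctl restart pvestatd.service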
 
Before I sent you the picture, I rebooted all 5 nodes. What is also strange: I can still see the statistics of a failed node (uptime, CPU usage, RAM usage, ...).
 
Have you removed the token entry in the corosync.conf?
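
(If not: the usual way to edit it on PVE is to copy the file, change the copy, bump config_version, and move it back; a rough sketch:)

Code:
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
# edit corosync.conf.new: drop the token line, increase config_version by 1
mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf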
 
Please send me the output of these commands from a red node.

Code:
corosync-quorumtool
ps faxel
 
