All servers from cluster reboot after one server reboots

richinbg

Member
Oct 2, 2017
Hello,
I am running a three-node cluster on Proxmox 4.4.
Every time I reboot one of those nodes, all nodes start to reboot, and it takes half an hour to an hour until the cluster has quorum again and the nodes stop rebooting over and over ...

Is this something I am doing wrong, or is it expected to take this long?
I can provide information about the configuration etc. if required; just tell me what you need.

Thanks.
 
Hello,
Thanks for your reply.

Here are the results:
Code:
pveversion -v
proxmox-ve: 4.4-96 (running kernel: 4.4.79-1-pve)
pve-manager: 4.4-18 (running version: 4.4-18/ef2610e8)
pve-kernel-4.4.79-1-pve: 4.4.79-95
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.44-1-pve: 4.4.44-84
pve-kernel-4.4.67-1-pve: 4.4.67-92
pve-kernel-4.4.76-1-pve: 4.4.76-94
pve-kernel-4.4.83-1-pve: 4.4.83-96
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-53
qemu-server: 4.0-112
pve-firmware: 1.1-11
libpve-common-perl: 4.0-96
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.9.0-5~pve4
pve-container: 1.0-101
pve-firewall: 2.0-33
pve-ha-manager: 1.0-41
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
openvswitch-switch: 2.6.0-2
ceph: 10.2.9-1~bpo80+1
root@ ~ # pvecm status
Quorum information
------------------
Date:             Wed Oct  4 09:52:44 2017
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1/21376
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.7.4.11 (local)
0x00000002          1 10.7.4.12
0x00000003          1 10.7.4.13

10.7.4.12 :   unicast, xmt/rcv/%loss = 450/450/0%, min/avg/max/std-dev = 0.040/0.108/0.892/0.062
10.7.4.12 : multicast, xmt/rcv/%loss = 450/447/0%, min/avg/max/std-dev = 0.047/0.118/0.770/0.055
10.7.4.13 :   unicast, xmt/rcv/%loss = 448/448/0%, min/avg/max/std-dev = 0.057/0.160/0.842/0.068
10.7.4.13 : multicast, xmt/rcv/%loss = 448/445/0%, min/avg/max/std-dev = 0.086/0.193/0.918/0.061

All three nodes have the same multicast address:
corosync-cmapctl -g totem.interface.0.mcastaddr
totem.interface.0.mcastaddr (str) = 239.192.104.2

Attached logs for omping -m 239.192.104.2 10.7.4.11 10.7.4.12 10.7.4.13.
I currently cannot provide an error log since I would need to reboot the machines.
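
For reference, the longer multicast tests I ran look roughly like this (node addresses are from our setup, and the flags are the ones I remember from the Proxmox multicast notes, so please double-check them); they have to be started on all three nodes at the same time:

Code:
# quick burst test
omping -c 10000 -i 0.001 -F -q 10.7.4.11 10.7.4.12 10.7.4.13

# ~10 minute test, useful to spot IGMP snooping querier timeouts
omping -c 600 -i 1 -q 10.7.4.11 10.7.4.12 10.7.4.13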

I also noticed that after a reboot, the nodes start to regain quorum, but they already boot up all the VMs even before that... Normally this setup works: we tested that if I unplug the network cable from one machine, the VMs that are in HA mode get moved to another node within 2 minutes.
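
If it helps, this is roughly how I check the HA state on a node (output omitted here):

Code:
# cluster and resource state as the HA manager sees it
ha-manager status

# which VMs are configured as HA resources
ha-manager config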
 

Attachments

  • omping_multicast.txt (31.3 KB)
Do you use multiple switches/paths for your corosync network? It looks to me as if (R)STP is blocking one path, and until it is unblocked, all nodes reboot (fencing).
 
Just one 10G switch is used, and it should indeed be configured correctly.
We are using bonding, too.
 
Yes, we are using that, and in general it works. It is just that when one server reboots, it takes literally an hour until all of them are back in sync again.
 
I guess it might have to do with RSTP: it might simply block traffic on some links until the conditions are right and the blocking is lifted. You can see this on your switches when you reboot one PVE host.
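
To see whether a blocked link is really the trigger, you could watch corosync on one of the surviving nodes while you reboot another one. A rough sketch (unit names as on a stock PVE 4.x install):

Code:
# ring / link status as corosync currently sees it
corosync-cfgtool -s

# follow membership changes and token timeouts live during the reboot
journalctl -f -u corosync -u pve-ha-lrm -u pve-ha-crm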
 
OK, well, I guess I will have a look at that on the switch the next time a node is rebooted.

Besides that, even though quorum has not yet been established, the VMs already start booting. Should this not behave differently?
I would expect that no VMs are started before quorum has been established.
Additionally, the strange thing is that even once quorum is established, the machines get rebooted a couple of times before they stay stable :(
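
To pin down whether the VMs really come up before quorum, I could run something like this throwaway loop on one node right after boot and compare the timestamps (log path and interval are arbitrary):

Code:
#!/bin/sh
# log quorum state and running VMs every 5 seconds
while true; do
    date
    pvecm status | grep -E 'Quorate|Total votes'
    qm list | awk '$3 == "running" {print $1, $2}'
    sleep 5
done >> /var/log/quorum-vm-trace.log 2>&1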
 
Just one 10G switch is used, and it should indeed be configured correctly.
You are using only one switch, so I assume you have one NIC with 2 ports in a bond and all services are running over it?

Based on that assumption, what you are seeing is possibly caused by Ceph recovery (plus other traffic) while one node reboots, which interferes with corosync's traffic. In the end, quorum is lost on all servers. On startup corosync has no timestamp, so it doesn't know when its token ran out; it establishes quorum and starts the VMs, until soon after it loses quorum again.
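
A common way around this is to give corosync its own physical network (or at least its own VLAN) instead of sharing the bond with Ceph and VM traffic. A rough sketch, assuming a spare NIC eth2 and a new 10.7.5.0/24 cluster network (addresses are made up for illustration; editing /etc/pve/corosync.conf also requires bumping config_version and restarting corosync on all nodes):

Code:
# /etc/network/interfaces (per node, adjust the host part of the address)
auto eth2
iface eth2 inet static
    address 10.7.5.11
    netmask 255.255.255.0

# /etc/pve/corosync.conf, totem section
totem {
  ...
  interface {
    ringnumber: 0
    bindnetaddr: 10.7.5.0
  }
}

The ring0_addr entries in the nodelist (or the hostnames they resolve to) would have to point to the new network as well.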
 
Sorry for the late reply, I have not yet had time to look into this further.
If your assumption is correct, how could I prevent it? By having a second switch?
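
And if we do add a second switch, would a redundant ring (RRP) in corosync 2.x be the right direction, so that one blocked or failed path no longer takes the whole cluster down? A very rough corosync.conf sketch of what I have in mind (the second 10.7.6.0/24 network is invented, and every node would need a matching ring1_addr):

Code:
totem {
  ...
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 10.7.4.0
  }
  interface {
    ringnumber: 1
    bindnetaddr: 10.7.6.0
  }
}

nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.7.4.11
    ring1_addr: 10.7.6.11
  }
  ...
}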
 
Based on that assumption, what you are seeing is possibly caused by Ceph recovery (plus other traffic) while one node reboots, which interferes with corosync's traffic.

Hello,

Could you please give us more details on this corosync interference problem?
When a node is booting, what could interfere with corosync?
Especially since corosync uses authentication!
I need to understand this problem because I have a similar issue.
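
From what I have read, corosync declares the membership broken when the totem token does not come back within the token timeout (1000 ms by default, if I remember correctly), so anything that delays cluster traffic for that long, e.g. a saturated bond during Ceph recovery or an STP reconvergence, can trigger fencing even though the node itself is fine. Is that the right picture? And is raising the timeout in the totem section (example value only, it obviously just hides a network problem) ever a sane workaround?

Code:
# /etc/pve/corosync.conf, totem section
totem {
  ...
  token: 5000    # milliseconds to wait for the token before declaring a membership change
}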
 
