Quorum Error 500

May 9, 2019
Good day,

I am unable to create a post for my issue below.

I am having issues with a quorum 500 error and would like some assistance.


Primary Node (Host)

Quorum information
------------------
Date: Tue Jun 18 08:09:17 2019
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1/12
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.10.2 (local)


root@node1 ~ # systemctl status corosync
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2019-06-15 01:16:42 SAST; 3 days ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 24669 (corosync)
Tasks: 2 (limit: 4915)
Memory: 44.8M
CPU: 49min 20.724s
CGroup: /system.slice/corosync.service
└─24669 /usr/sbin/corosync -f

Jun 15 01:21:07 cloud02-pbx corosync[24669]: warning [CPG ] downlist left_list: 1 received
Jun 15 01:21:07 cloud02-pbx corosync[24669]: notice [QUORUM] This node is within the non-primary component and will NOT provide any services.
Jun 15 01:21:07 cloud02-pbx corosync[24669]: notice [QUORUM] Members[1]: 1
Jun 15 01:21:07 cloud02-pbx corosync[24669]: notice [MAIN ] Completed service synchronization, ready to provide service.
Jun 15 01:21:07 cloud02-pbx corosync[24669]: [TOTEM ] A new membership (10.0.10.2:12) was formed. Members left: 2
Jun 15 01:21:07 cloud02-pbx corosync[24669]: [TOTEM ] Failed to receive the leave message. failed: 2
Jun 15 01:21:07 cloud02-pbx corosync[24669]: [CPG ] downlist left_list: 1 received
Jun 15 01:21:07 cloud02-pbx corosync[24669]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Jun 15 01:21:07 cloud02-pbx corosync[24669]: [QUORUM] Members[1]: 1
Jun 15 01:21:07 cloud02-pbx corosync[24669]: [MAIN ] Completed service synchronization, ready to provide service.



Second Node

Quorum information
------------------
Date: Tue Jun 18 08:09:21 2019
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2/1108
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 10.0.10.3 (local)


root@node22:~# systemctl status corosync
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2019-06-15 01:49:42 SAST; 3 days ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 1690 (corosync)
Tasks: 2 (limit: 11059)
Memory: 43.1M
CPU: 47min 6.818s
CGroup: /system.slice/corosync.service
└─1690 /usr/sbin/corosync -f

Jun 15 01:54:02 server2 corosync[1690]: [QUORUM] Members[1]: 2
Jun 15 01:54:02 server2 corosync[1690]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 15 01:54:03 server2 corosync[1690]: notice [TOTEM ] A new membership (10.0.10.3:1108) was formed. Members
Jun 15 01:54:03 server2 corosync[1690]: warning [CPG ] downlist left_list: 0 received
Jun 15 01:54:03 server2 corosync[1690]: notice [QUORUM] Members[1]: 2
Jun 15 01:54:03 server2 corosync[1690]: notice [MAIN ] Completed service synchronization, ready to provide service.
Jun 15 01:54:03 server2 corosync[1690]: [TOTEM ] A new membership (10.0.10.3:1108) was formed. Members
Jun 15 01:54:03 server2 corosync[1690]: [CPG ] downlist left_list: 0 received
Jun 15 01:54:03 server2 corosync[1690]: [QUORUM] Members[1]: 2
Jun 15 01:54:03 server2 corosync[1690]: [MAIN ] Completed service synchronization, ready to provide service.


OMPING and Ping Between nodes

root@Node1 ~ # omping -c 10000 -i 1 -q 10.0.10.2 10.0.10.3
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : response message never received


root@Node1~# ping 10.0.10.3
PING 10.0.10.3 (10.0.10.3) 56(84) bytes of data.
64 bytes from 10.0.10.3: icmp_seq=1 ttl=64 time=0.220 ms
64 bytes from 10.0.10.3: icmp_seq=2 ttl=64 time=0.221 ms
64 bytes from 10.0.10.3: icmp_seq=3 ttl=64 time=0.285 ms
64 bytes from 10.0.10.3: icmp_seq=4 ttl=64 time=0.246 ms
64 bytes from 10.0.10.3: icmp_seq=5 ttl=64 time=0.343 ms
64 bytes from 10.0.10.3: icmp_seq=6 ttl=64 time=0.268 ms
64 bytes from 10.0.10.3: icmp_seq=7 ttl=64 time=0.236 ms
64 bytes from 10.0.10.3: icmp_seq=8 ttl=64 time=0.279 ms
^C
--- 10.0.10.3 ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7158ms
rtt min/avg/max/mdev = 0.220/0.262/0.343/0.040 ms
 
root@Node1 ~ # omping -c 10000 -i 1 -q 10.0.10.2 10.0.10.3
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : response message never received

I guess you only ran omping on one of the nodes? It needs to be run in parallel on all nodes at the same time.

Please post the results of both omping commands from https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_cluster_network.
Please use code tags for command-line output.
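For reference, the two omping invocations from that section of the admin guide look like the following. The IPs 10.0.10.2 and 10.0.10.3 are taken from the outputs above; substitute your own node addresses, and start each command on both nodes at the same time:

```shell
# Quick multicast check: 10000 packets at 1 ms intervals (-F allows flooding)
omping -c 10000 -i 0.001 -F -q 10.0.10.2 10.0.10.3

# Longer check (~10 minutes) to catch IGMP snooping querier timeouts
omping -c 600 -i 1 -q 10.0.10.2 10.0.10.3
```

If multicast loss shows up only in the second, longer run, that usually points to an IGMP snooping querier problem on the switch.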

Thanks!
 
root@Node2:~# omping -c 10000 -i 1 -q 10.0.10.3 10.0.10.2
10.0.10.2 : waiting for response msg
10.0.10.2 : joined (S,G) = (*, 232.43.211.234), pinging


root@Node1 ~ # omping -c 10000 -i 1 -q 10.0.10.2 10.0.10.3
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : joined (S,G) = (*, 232.43.211.234), pinging
 
omping -c 10000 -i 0.001 -F -q 10.0.10.2 10.0.10.3


Code:
root@Node2:~# omping -c 10000 -i 0.001 -F -q 10.0.10.2 10.0.10.3
10.0.10.2 : waiting for response msg
10.0.10.2 : joined (S,G) = (*, 232.43.211.234), pinging
10.0.10.2 : given amount of query messages was sent

10.0.10.2 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.090/0.196/0.350/0.031
10.0.10.2 : multicast, xmt/rcv/%loss = 10000/9985/0% (seq>=16 0%), min/avg/max/std-dev = 0.093/0.202/0.357/0.031

root@cloud02-pbx ~ # omping -c 10000 -i 0.001 -F -q 10.0.10.2 10.0.10.3
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : joined (S,G) = (*, 232.43.211.234), pinging
10.0.10.3 : waiting for response msg
10.0.10.3 : server told us to stop

10.0.10.3 :   unicast, xmt/rcv/%loss = 9619/9619/0%, min/avg/max/std-dev = 0.092/0.207/0.345/0.030
10.0.10.3 : multicast, xmt/rcv/%loss = 9619/9619/0%, min/avg/max/std-dev = 0.093/0.211/0.355/0.030

omping -c 600 -i 1 -q 10.0.10.2 10.0.10.3

Code:
root@Node2:~# omping -c 600 -i 1 -q 10.0.10.2 10.0.10.3
10.0.10.2 : waiting for response msg
10.0.10.2 : joined (S,G) = (*, 232.43.211.234), pinging
10.0.10.2 : given amount of query messages was sent

10.0.10.2 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.161/0.265/0.420/0.046
10.0.10.2 : multicast, xmt/rcv/%loss = 600/599/0% (seq>=2 0%), min/avg/max/std-dev = 0.165/0.277/0.436/0.047

root@Node1: ~ # omping -c 600 -i 1 -q 10.0.10.2 10.0.10.3
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : waiting for response msg
10.0.10.3 : joined (S,G) = (*, 232.43.211.234), pinging
10.0.10.3 : given amount of query messages was sent

10.0.10.3 :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.130/0.264/0.391/0.034
10.0.10.3 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.134/0.272/0.407/0.036
 
Also,

I changed the default SSH port, so when I want to migrate a VM/CT it gives me an error because port 22 is disabled. How can I change the port the cluster communicates on via SSH? Also, the root user is disabled for SSH.

Note: a fake IP is used below. The IP shown is the external IP, not the separated cluster network IP. Is this correct?

Code:
2019-06-18 13:51:29 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=server2' root@123.3.23.169 /bin/true
2019-06-18 13:51:29 ssh: connect to host 123.3.23.169 port 22: Connection refused
2019-06-18 13:51:29 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted
 
I changed the default SSH port, so when I want to migrate a VM/CT it gives me an error because port 22 is disabled. How can I change the port the cluster communicates on via SSH? Also, the root user is disabled for SSH.
This is not really supported for PVE, since it relies on connecting between cluster nodes with SSH keys on the default port.
* You could disable root logins with a password by setting root access to without-password (see `man sshd_config`).
* For the different port, you might have luck specifying the alternative port in either root's SSH config or in the system-wide one (`man ssh_config`).
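A minimal per-host override in root's SSH config might look like the following. This is a sketch; the hostname `server2` matches the HostKeyAlias in the migration log above, while port `2222` is a placeholder for whatever port you actually moved sshd to:

```
# /root/.ssh/config on the source node (port 2222 is a placeholder)
Host server2
    Port 2222
    User root
```

Keep in mind this is untested territory for PVE: some tooling may build the SSH command line itself and ignore the config file.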

However, as said, this is not a supported setup and thus not a widely tested one.

As an alternative, you could consider setting up a dedicated network for your corosync and migration traffic and configure SSH to listen on port 22 there (and still enable PermitRootLogin without-password).
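For the migration-traffic part of that, a dedicated migration network can be configured cluster-wide in `/etc/pve/datacenter.cfg`. A sketch, assuming the 10.0.10.0/24 subnet from the quorum outputs above is your separate cluster network:

```
# /etc/pve/datacenter.cfg (subnet is an assumption based on the node IPs above)
migration: secure,network=10.0.10.0/24
```

With that in place, migrations connect over the given network instead of the external IPs shown in your error log.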

hope this helps!

P.S. please open a new thread for a new topic!
 
