Hi,
I have a two-node Proxmox cluster with a QDevice for quorum, all connected to the same switch and working well.
Code:
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.10.11 (local)
0x00000002          1    A,V,NMW 192.168.10.12
0x00000000          1            Qdevice
I shut down one node (192.168.10.12) and the cluster still works well:
Code:
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.10.11 (local)
0x00000002          1         NR 192.168.10.12
0x00000000          1            Qdevice
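For reference, here is a minimal sketch of the vote arithmetic as I understand it, assuming the usual majority rule (more than half of the expected votes must be present). The function names are made up for illustration and are not part of any Proxmox or corosync API.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(present_votes: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the majority threshold."""
    return present_votes >= quorum_threshold(expected_votes)


# Assumption: two nodes with one vote each plus one QDevice vote.
EXPECTED = 3

print(quorum_threshold(EXPECTED))  # votes needed for quorum
print(is_quorate(2, EXPECTED))     # one node + QDevice
print(is_quorate(1, EXPECTED))     # one node alone
```

With one node down, the remaining node plus the QDevice still hold 2 of 3 votes, which would explain why the cluster keeps working at this step.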
I then move 192.168.10.12 to another switch and power it on. 192.168.10.11 can ping 192.168.10.12 and vice versa. In the Proxmox UI the 192.168.10.12 node shows a question mark, but the cluster status looks fine:
Code:
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.10.11 (local)
0x00000002          1    A,V,NMW 192.168.10.12
0x00000000          1            Qdevice
After about a minute the cluster dies. The active master node 192.168.10.11 drops the SSH connection, becomes unreachable, and the GUI stops responding. According to pvecm status the cluster is still up, as shown above, but the node is dead. After a few minutes I am able to SSH to the node again, but the GUI is still down and I have to power off both nodes using:
Code:
systemctl --force --force poweroff
When I turn 192.168.10.11 back on, the GUI comes back to life and I see 192.168.10.12 as down (red X), but the cluster status doesn't list it at all:
Code:
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.10.11 (local)
0x00000000          1            Qdevice
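To make the difference between two snapshots easier to spot, here is a rough sketch that parses the membership rows and reports which names disappeared. It assumes the row layout shown in the outputs above (node ID, votes, optional QDevice flags, name); `parse_membership` and `missing_names` are hypothetical helpers, not part of pvecm.

```python
def parse_membership(text: str) -> dict:
    """Map each member name to its node ID, votes and QDevice flags."""
    rows = {}
    for line in text.splitlines():
        tokens = line.split()
        # Only rows that start with a hex node ID are membership entries.
        if len(tokens) < 3 or not tokens[0].startswith("0x"):
            continue
        nodeid, votes, rest = tokens[0], int(tokens[1]), tokens[2:]
        if len(rest) == 1:
            # The QDevice row has no flags column, only a name.
            flags, name = "", rest[0]
        else:
            # Node rows: flags, then the name (a trailing "(local)" is ignored).
            flags, name = rest[0], rest[1]
        rows[name] = {"nodeid": nodeid, "votes": votes, "flags": flags}
    return rows


def missing_names(before: str, after: str) -> list:
    """Names present in the first snapshot but absent from the second."""
    return sorted(set(parse_membership(before)) - set(parse_membership(after)))


BEFORE = """Nodeid Votes Qdevice Name
0x00000001 1 A,V,NMW 192.168.10.11 (local)
0x00000002 1 A,V,NMW 192.168.10.12
0x00000000 1 Qdevice"""

AFTER = """Nodeid Votes Qdevice Name
0x00000001 1 A,V,NMW 192.168.10.11 (local)
0x00000000 1 Qdevice"""

print(missing_names(BEFORE, AFTER))
```

Run against the outputs from before the move and after the reboot, this reports 192.168.10.12 as the only member that vanished from the list.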
When I connect 192.168.10.12 back to the original switch, everything works well! Both nodes are recognized and all is good.
Attached is the syslog of 192.168.10.11. I didn't find anything meaningful, but I don't know what to look for.
I would appreciate any hints or suggestions for further debugging, because I am quite stuck.
Thank you!