3 Node cluster - Icons disappear - Can't change VM settings

fpdragon

Hi,

Once again, it seems that I messed up something in my config.
In the past I have had instabilities with my corosync/cluster config: once because of a bad network cable, and once because the SSH keys were out of sync.

This time the problem seems to be different.

I can ping all 3 nodes.
Web UI login works on node1 and node3.
On node 2 I get an error with
Code:
"Login failed. Please try again"

The strange thing is that the icons on the left tree menu are only shown for node1.
1700812350863.png
I have seen this before, but the icons always came back after a few seconds. This time they stay missing.

I can open the different views and settings of all nodes, but on node 2 and node 3 I can't change anything from the GUI.
1700812533024.png

I checked the /etc/corosync/corosync.conf and /etc/pve/corosync.conf.
Both look fine and the same.

Here is the output of pvecm status:
Code:
Cluster information
-------------------
Name:             HomeCluster
Config Version:   13
Transport:        knet
Secure auth:      on


Quorum information
------------------
Date:             Fri Nov 24 08:57:02 2023
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1.796
Quorate:          Yes


Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate


Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.1.254.254 (local)
0x00000002          1 10.1.254.253
0x00000003          1 10.1.254.251

Syslog shows nothing unusual, at least from my perspective.

Hope you can help.
Thanks.
 

Hi,
please share the journal since boot for all 3 nodes. You can generate it on each node by running journalctl -b > "$(hostname)-journal.txt". What was done before the issues appeared? Please also try restarting pvestatd on each node via systemctl restart pvestatd.service and see if this has an effect. Also, what is the output of pvesh get /cluster/resources on all 3 nodes?
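
If it helps, the three steps above can be bundled into one small script to run on each node. This is only a rough sketch: the function name collect_diagnostics, the "run" argument and the output directory are illustrative choices, not part of any Proxmox tooling. The commands themselves are the ones mentioned above.

```shell
#!/usr/bin/env bash
# Sketch: gather the diagnostics requested above on the node where it runs.
# collect_diagnostics and its output-directory argument are hypothetical names.

collect_diagnostics() {
    local outdir="${1:-.}"
    local node
    node="$(hostname)"
    mkdir -p "$outdir" || return 1

    # Journal since the current boot, one file per node.
    journalctl -b > "$outdir/$node-journal.txt"

    # Restart the Proxmox VE status daemon, as suggested above.
    systemctl restart pvestatd.service

    # Cluster resources as seen from this node.
    pvesh get /cluster/resources > "$outdir/$node-resources.txt"

    echo "Wrote $outdir/$node-journal.txt and $outdir/$node-resources.txt"
}

# Only act when explicitly asked to, e.g.: bash collect.sh run /tmp/diag
if [ "${1:-}" = "run" ]; then
    collect_diagnostics "${2:-.}"
fi
```

Run it as root on every node and attach the resulting text files. Note that it restarts pvestatd each time, so compare the GUI before and after.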
 
Hmmm...
I rebooted node 2 and node 3 and suddenly everything works fine again.

Learned nothing today. -_-
 
Do you have multiple networks between the nodes? If so, try to see if you can ping them should you run into the same issue. It might be possible that the network for Corosync is still working, but the Mgmt network doesn't. In that case, access to the API would not work, which can look a lot like what you saw.
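
A quick way to run that check is a small loop that pings every address of every node and reports which ones answer. This is a minimal sketch, assuming the Linux iputils ping (the -W timeout flag); check_nodes is just an illustrative name, and you would pass the addresses of each network yourself.

```shell
#!/usr/bin/env bash
# Sketch: ping each given address once and report whether it answered.
# check_nodes is a hypothetical helper name, not a Proxmox command.

check_nodes() {
    local addr failed=0
    for addr in "$@"; do
        # -c 1: send one echo request; -W 2: wait at most 2 seconds for a reply.
        if ping -c 1 -W 2 "$addr" > /dev/null 2>&1; then
            echo "$addr reachable"
        else
            echo "$addr UNREACHABLE"
            failed=1
        fi
    done
    # Non-zero exit status if at least one address did not answer.
    return "$failed"
}

# Only ping when addresses are passed on the command line.
if [ "$#" -gt 0 ]; then
    check_nodes "$@"
fi
```

For example, with the Corosync addresses from the pvecm status output above: bash check_nodes.sh 10.1.254.254 10.1.254.253 10.1.254.251. If the nodes have a separate management network, add those addresses as well and run it from each node.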
 
Hi Aaron,
Thank you again for your patience during training. xD (can recommend)

I was able to ping all nodes from each other.
This is my home setup, so there is no special networking: just one NIC per node, with Corosync and management traffic on the same network.

However, it works again, so maybe it was just a networking hiccup? Hard to say...

Before the problem appeared, I had reinstalled node 2 because I wanted to switch from LVM to ZFS. The reinstall and the rejoin of the freshly set up node went well, but a few days later the icons disappeared, without any further changes to the setup.
 
