GUI login via AD is no longer working

pvecm status on a running host:

root@vmhost03:~# pvecm status
Cluster information
-------------------
Name: XXXXXXX
Config Version: 39
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Oct 17 16:15:50 2023
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000010
Ring ID: 10.5e76
Quorate: No

Votequorum information
----------------------
Expected votes: 17
Highest expected: 17
Total votes: 1
Quorum: 9 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000010 1 10.100.XX.XXX (local)
 
journalctl -xe on the running host:

░░ The unit session-1784557.scope has successfully entered the 'dead' state.
Oct 17 16:35:37 vmhost03 systemd-logind[1150]: Session 1784557 logged out. Waiting for processes to exit.
Oct 17 16:35:37 vmhost03 systemd-logind[1150]: Removed session 1784557.
░░ Subject: Session 1784557 has been terminated
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░ Documentation: sd-login(3)
░░
░░ A session with the ID 1784557 has been terminated.
Oct 17 16:35:38 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 80
Oct 17 16:35:39 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 90
Oct 17 16:35:40 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 100
Oct 17 16:35:40 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retried 100 times
Oct 17 16:35:40 vmhost03 pmxcfs[1643]: [status] crit: cpg_send_message failed: 6
Oct 17 16:35:40 vmhost03 pve-firewall[1744]: firewall update time (30.052 seconds)
Oct 17 16:35:41 vmhost03 corosync[1724]: [KNET ] link: host: 6 link: 0 is down
Oct 17 16:35:41 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 10
Oct 17 16:35:42 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 20
Oct 17 16:35:42 vmhost03 corosync[1724]: [KNET ] rx: host: 12 link: 0 is up
Oct 17 16:35:43 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 30
Oct 17 16:35:44 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 40
Oct 17 16:35:45 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 50
Oct 17 16:35:46 vmhost03 corosync[1724]: [KNET ] link: host: 10 link: 0 is down
Oct 17 16:35:46 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 60
Oct 17 16:35:47 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 70
Oct 17 16:35:48 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 80
Oct 17 16:35:48 vmhost03 corosync[1724]: [KNET ] rx: host: 10 link: 0 is up
Oct 17 16:35:49 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 90
Oct 17 16:35:50 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 100
Oct 17 16:35:50 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retried 100 times
Oct 17 16:35:50 vmhost03 pmxcfs[1643]: [status] crit: cpg_send_message failed: 6
Oct 17 16:35:51 vmhost03 pmxcfs[1643]: [status] notice: cpg_send_message retry 10
 
We removed /etc/corosync just before that ... but this could not be the root of the issue. This host is just not in a good state now. Might it really not be better or quicker to rebuild the cluster from scratch?
 
When we copied the files back to vmhost01 from another running host, this did not resolve the issue on vmhost01. But maybe this host is really out of order now, since we have been doing all our troubleshooting on it.
 
pvecm status on a running host:

root@vmhost03:~# pvecm status
This means that corosync does not see the other nodes - even on the "working node"
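For reference, what corosync itself sees can be checked on each node with the standard corosync tools (a generic sketch, not specific to this setup):

# show the local node's knet link status towards every other node
corosync-cfgtool -s
# show quorum state and current membership
corosync-quorumtool -s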

do pings (with appropriate sizes) work on the 10.100.xx.155 interfaces?
I do assume that xx is the same for all nodes and that this is a /24 in the configuration?
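For example, to rule out MTU/fragmentation problems, something like the following (the address is a placeholder; repeat between all node pairs):

# full-size ping with the don't-fragment bit set
# 1472 bytes payload + 28 bytes IP/ICMP headers = 1500 (standard MTU)
ping -M do -s 1472 -c 3 10.100.xx.155
# if the cluster links use jumbo frames (MTU 9000), also test:
ping -M do -s 8972 -c 3 10.100.xx.155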
 
When we copied the files back to vmhost01 from another running host, this did not resolve the issue on vmhost01. But maybe this host is really out of order now, since we have been doing all our troubleshooting on it.
you could also unplug that host - but if all nodes don't see each other, I'm not sure the issue is really with one single node vs. somewhere in the network between them (was there maybe a change on the switches that connect the nodes on the 10.100.xx.yy interfaces?)
 
Could you tell me what we would have to do if we stopped clustering (corosync) on all the nodes and then started it host by host? And what would we have to copy to host vmhost01 so that it comes back as a member?
 
you could also unplug that host - but if all nodes don't see each other, I'm not sure the issue is really with one single node vs. somewhere in the network between them (was there maybe a change on the switches that connect the nodes on the 10.100.xx.yy interfaces?)

the hosts are seeing each other. Communication is not a problem
 
we also copied /etc/pve and /etc/corosync to vmhost01, but starting pve-cluster is not possible.
did you also restart corosync - what did it log after restarting?
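For example (plain systemd tooling, nothing Proxmox-specific):

systemctl restart corosync
# then check what corosync logged since the restart:
journalctl -u corosync --since "5 minutes ago"
# or follow it live from a second shell while restarting:
journalctl -fu corosync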

the hosts are seeing each other. Communication is not a problem
The corosync logs from the "working node" indicate that there is some issue (corosync works with UDP - so just a ping does not ensure that corosync works fine - the logs after restarting corosync should maybe give you a hint)
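To actually exercise UDP between the nodes, omping (a separate package) sends UDP between its own instances on each node and can serve as a rough check - a sketch, run on all nodes in parallel; vmhost02 is a placeholder name:

# list every node's cluster address/name; run simultaneously on each node
omping -c 600 -i 1 -q vmhost01 vmhost02 vmhost03
# also verify corosync is listening on its UDP port (5405 by default):
ss -ulpn | grep corosync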

Could you tell me what we would have to do if we stopped clustering (corosync) on all the nodes and then started it host by host? And what would we have to copy to host vmhost01 so that it comes back as a member?
a) I would first focus on the other nodes - once they see each other and form a quorum (you'd need 9/17 votes for that), you can take a look at node vmhost01. Fixing it should work by starting pmxcfs in local mode, copying a working corosync.conf from another node to /etc/corosync/corosync.conf and /etc/pve/corosync.conf, stopping pmxcfs in local mode, and restarting corosync and pve-cluster (a.k.a. pmxcfs).
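Roughly, that sequence on vmhost01 would look like this (a sketch of the steps above; the source node vmhost03 is just an example):

systemctl stop pve-cluster corosync
pmxcfs -l                      # start pmxcfs in local mode
scp root@vmhost03:/etc/corosync/corosync.conf /etc/corosync/corosync.conf
cp /etc/corosync/corosync.conf /etc/pve/corosync.conf
killall pmxcfs                 # stop the local-mode pmxcfs again
systemctl start corosync pve-cluster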
 
