[SOLVED] Proxmox VE - Login failed. Please try again.

Same here: in a cluster of 2 nodes, if one is gone, I cannot log in to the web GUI of the other node. Using "pvecm expected 1" solved the problem, but it is totally unacceptable, because I set up the 2-node cluster as a failsafe.
 
We had a 3-node Ceph cluster, and the customer couldn't log in when the corosync links weren't cabled up. That shouldn't be a thing, in my opinion, as the GUI is served via a different link.
 
I have the same problem (not being able to log in to the GUI if the second node of a 2-node cluster is down).
"pvecm expected 1" is not working for me; it returns:

Code:
Unable to set expected votes: CS_ERR_INVALID_PARAM

That's when both nodes are on. As soon as I shut down the second node, I can run "pvecm expected 1" and log in on node 1.
Would it make sense to have an automatic mechanism that detects when a node is the last one up in a cluster and automatically sets the expected votes to one in that specific case?

Code:
pvecm status
Cluster information
-------------------
Name:             Maison
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Mar 14 21:38:39 2022
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.8d
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.32.50.4 (local)
0x00000002          1 10.32.50.5
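Edit: from what I can tell (I may be wrong here), corosync refuses to set expected votes lower than the votes currently present, which would explain why the command only succeeds once the second node is actually down:

Code:
# with both nodes up, Total votes is 2, so "pvecm expected 1"
# fails with CS_ERR_INVALID_PARAM; check the current counts first:
pvecm status | grep -E 'Expected votes|Total votes'

# once only this node is left (Total votes: 1), this succeeds:
pvecm expected 1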
 
Just to add another +1 to this: I also have a (very basic) cluster of 2 nodes. If one of them is down, I can't log into the other.

I don't know yet why the 2nd node goes offline (that's a new and separate issue), but when it's offline, logins generally don't work on the other node.

I can log in just fine via SSH on the node that's available, and I know the password is set correctly, since the web login resumes working once the missing node comes back online.

Code:
root@proxmox:~# pvecm status
Cluster information
-------------------
Name:             BHB
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Tue Mar 29 10:32:43 2022
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1.95
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.3.5 (local)
 
Confirmed with my 9-node cluster. I took all nodes offline for cleaning, adding SSDs, and updates, then restarted them one at a time to confirm the changes, and now I cannot use the GUI (the login page loads, but I get "Login failed. Please try again" when logging in with the main root user and the correct password), whereas it works fine on the console or via SSH. Is there a significant reason to limit access to the PVE GUI when quorum is not present?
 
Authentication via the GUI requires access to /etc/pve because of the auth keys; if they need rotation, write access is required. Logins might keep working for a few minutes or even hours after loss of quorum, but there's no guaranteed time.
If you're using TFA, access to /etc/pve is ALWAYS required.

So if you need to shut down half or more of the nodes in your cluster and still need access to the GUI, connect via SSH and run pvecm expected <val> with the number of still-active nodes.
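A concrete example, with illustrative numbers: if only 3 nodes of a 5-node cluster remain up and the web GUI stops accepting logins, on one of the surviving nodes:

Code:
# log in via SSH, then lower the expected vote count to the
# number of nodes that are still online (3 in this example):
pvecm expected 3

# verify that this partition is quorate again:
pvecm status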
 
Just ran into this issue after having shut down a host and recycled its parts. Something got messed up in my cluster manager, because pvecm expected 1 did not let me log back into the host via the web GUI. I think it was because I had configured a QDevice, although I could not remove the QDevice either, because of this error:
Code:
root@pve1:~# pvecm qdevice remove
All nodes must be online! Node pve2 is offline, aborting.

Ultimately I had to do as @mokaz suggested, which was to destroy the cluster config on the still-online node and re-create the cluster once I could get back into the web GUI:
Code:
# stop the cluster filesystem and corosync
systemctl stop pve-cluster corosync
# start pmxcfs in local mode so /etc/pve stays writable without quorum
pmxcfs -l
# remove the corosync configuration
rm -rf /etc/corosync/*
rm /etc/pve/corosync.conf
# stop the local-mode pmxcfs and restart the cluster service standalone
killall pmxcfs
systemctl start pve-cluster
After all that, I'm back in business, without any downtime on any VMs/CTs.
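For completeness: re-creating the cluster afterwards is the standard procedure; roughly the following, where "MyCluster" and the IP are placeholders:

Code:
# on the surviving node, create a fresh cluster:
pvecm create MyCluster

# later, on each node that should (re)join, pointing at a member of the new cluster:
pvecm add 192.168.1.5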
 
FWIW, I just encountered a similar issue. After updating a cluster from PVE 6 to 7, everything was fine at first. Then, after attempting to restore a CT, I started getting errors on one node, then losing quorum on the whole cluster (a complete collapse, with all nodes going into high CPU usage on the corosync process).

By sequentially stopping/restarting corosync on the nodes, I was able to narrow the issue down to that node specifically. To troubleshoot with a GUI, I attempted to log in from the web and got the dreaded failed-login error (SSH was fine), with both PAM and PVE realms.
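The stop/restart sequence was roughly this, run on one node at a time while watching the others:

Code:
# stop corosync on the candidate node...
systemctl stop corosync
# ...then watch quorum and corosync CPU usage on the remaining nodes
# (e.g. with "pvecm status" and "top") before bringing it back:
systemctl start corosync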

Then I attempted

Code:
pvecm updatecerts

and this worked: I could log into the node via the web UI!
I then restarted the node's corosync... no more quorum collapse either.

So if you encounter the issue, this could be something to try.
 
This is an old thread, but I'd like to add another +1.

Just registered to say that I had the same issue today. I am new to Proxmox and just created a second node in the cluster in my homelab yesterday. The second node is on a server, so to save power I shut it down yesterday, and today I was unable to log into the Proxmox web UI.

There is nothing stored on the second node; I was just experimenting with Proxmox clusters. No two-factor authentication either.

As soon as I powered the second server back on, I was able to log into the Proxmox web UI. Not sure if this is intentional, but it's not fun for newbies like me.

The second server is quite old, so keeping it on all the time is going to be quite expensive. I guess you gotta pay to play :)
 
That's not how a cluster works. You need at least three machines, and a majority of them needs to be running at all times.
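For reference, a commonly suggested way to give a 2-node setup that third vote without a third full node is an external QDevice on a small spare machine. A rough sketch, where 192.168.1.10 stands in for the QDevice host:

Code:
# on the external QDevice host (a Debian-based machine):
apt install corosync-qnetd

# on every cluster node:
apt install corosync-qdevice

# on one cluster node, pointing at the QDevice host:
pvecm qdevice setup 192.168.1.10

# pvecm status should now show an additional qdevice vote
pvecm status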
 
Yes: a Proxmox cluster is a very powerful and complex high-availability tool, with shared storage for configurations, heartbeats, etc. Unfortunately, it's also the only way to log in to several Proxmox nodes, migrate machines, and manage multiple independent nodes from a shared interface. Unless you have three or more machines, or a very special scenario where high availability is so important that you can dedicate significant resources to configuring and monitoring your machines, a cluster is probably not the best tool, even if the other functionality sounds attractive.

The future feature "Cross cluster authentication mechanism", listed on the Proxmox VE roadmap for a few years now, might allow some of that functionality without creating a cluster.
 
Sorry to revive an old thread, but I just experienced the same issue after enabling MFA on one of the nodes. I did just as bparvez mentioned and shut down the other node for the night (I had to do some maintenance), causing the first node to keep failing authentication. It started working again shortly after rebooting the second node, but it scared the crap out of me temporarily.

It makes a bit more sense after reading this thread and understanding how MFA affects things, though.
Authentication via the GUI requires access to /etc/pve because of the auth keys; if they need rotation, write access is required. Logins might keep working for a few minutes or even hours after loss of quorum, but there's no guaranteed time.
If you're using TFA, access to /etc/pve is ALWAYS required.

So if you need to shut down half or more of the nodes in your cluster and still need access to the GUI, connect via SSH and run pvecm expected <val> with the number of still-active nodes.
So if I have understood this right: with a total of two nodes, if one goes down, I can solve this issue by connecting through SSH and setting the expected value to 1?
 
Yes, that's how to do it generally. But PVE 7.3 has a solution for this now [0]:
Code:
    Only require write-access (quorum) to TFA config for recovery keys.

    All other TFA methods only need read-access to the config. This makes it possible to login to a node, which is not in the quorate partition, even if your user has TFA configured.


[0] https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_7.3
 
I have a system with a fresh install of PVE 7.2.1. I can log in from the console and SSH, but cannot log in to the web UI. I tried a complete re-install and still have the same problem. Anyone have any thoughts?
 
Yes, that's how to do it generally. But PVE 7.3 has a solution for this now [0]:
Code:
    Only require write-access (quorum) to TFA config for recovery keys.

    All other TFA methods only need read-access to the config. This makes it possible to login to a node, which is not in the quorate partition, even if your user has TFA configured.


[0] https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_7.3
Hello, I am having the same problem, running two freshly installed nodes on PVE 7.3-3. After one of them was turned off, the PVE 7.3 fix didn't help me: I couldn't log in to the GUI and still had to use "pvecm expected 1".
 
I have a very similar issue. I have two Proxmox nodes, and sometimes on node 2 I get "Failed, can't connect to server" if I try to run an update or access the shell or any console on that node. What I have to do is update my certs first on node 1 by running "pvecm updatecerts", and then do the exact same thing on node 2 by logging in through an SSH session.

I have to do this at least once a week! I am not sure what is going on, and although I have a workaround, it is VERY annoying. Does anyone have any clue as to what may be causing this? For now I have set up cron jobs to update the certs, which is fine for two nodes, I guess, but if I had more nodes it would be impractical. Any solution?
 
I believe I have found my issue. I checked the ssh_known_hosts files located in /etc/ssh, and they had duplicate entries for my second node, which caused conflicts. I removed the offending entries matching the hostname and IP address on the master node (node 1) and ran "pvecm updatecerts --force", which propagated the change to node 2. The offending duplicate entry is no longer present, and now I am able to connect via the PVE web UI.
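In case it helps anyone else, a rough way to spot such duplicates, assuming the default /etc/ssh/ssh_known_hosts path:

Code:
# list host names/IPs that appear in more than one entry:
cut -d' ' -f1 /etc/ssh/ssh_known_hosts | tr ',' '\n' | sort | uniq -d

# after deleting the offending lines, re-propagate the file:
pvecm updatecerts --force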
 
Hello, I have this issue: a fresh install of proxmox-ve_7.3-1.iso, no login via web or SSH, login only on the physical server.
So I logged into the physical server as root and changed the password with passwd; now I can log in via web and SSH.
Maybe this is a bug?
I hope that you will find this information useful.
 
Hello again. There is something else I had to do in order to fix the issue on my side: I added a cron job that runs every hour on the leader server. Just add the following, and it will propagate updated certs to all the other nodes:

Code:
0 * * * * pvecm updatecerts --force

Not sure if this is the true solution to the issue, but it works; I haven't had any issues in over 3 months.
 
