I enabled an LDAP realm, failed a login during testing and now the GUI won't load

Safely2974

New Member
Aug 10, 2023
Hi Most Excellent Proxmox Forum Users,

We have a 3-server cluster, doing all the fun stuff like Ceph in the back end.

Here's what I did to break this:
  • created an LDAP Server entry (we're using JumpCloud)
  • tested a user login against that realm
Immediately it killed the GUI of that server. I stupidly thought it was something on my end, so I figured I'd try on the second server and again, it immediately killed the GUI. I might be dumb but at least I'm smart enough not to try 3 times.

Here's what I've done so far:
  • I still have SSH access, and the 3rd server in the cluster has full access to the other servers
  • pveproxy is active, and restarting does not restore the GUI
  • pvedaemon is active, and restarting does not restore the GUI
  • When I'm logged into the server over SSH I can "telnet 192.168.this.server 8006" and it answers
  • We get no response from a web browser on the two servers that have a dead GUI
Some other pertinent details:
  • pveproxy access.log continues to log API requests from this server, but does not show my attempts to reach the GUI
  • "ss -an | grep LISTEN | grep 8006" reports a LISTEN on port 8006
    • this matches the fact I can telnet to this port directly from the server
  • when I restarted both pveproxy and pvedaemon, I saw that both the parent and the worker PIDs changed
We don't use Proxmox to restrict inbound access (so there's no /etc/default/pveproxy file) as we have a 3rd party hardware firewall handling the security.
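
The telnet and ss checks above only show that something accepts TCP connections on 8006; they don't show whether pveproxy actually answers. A minimal sketch of a probe that separates the two, assuming Python 3 is available on the node (the function name `probe` and the timeout value are my own, not anything Proxmox ships):

```python
import socket
import ssl


def probe(host, port=8006, timeout=5.0):
    """Return (tcp_ok, tls_ok).

    tcp_ok: the port accepted a TCP connection (what telnet shows).
    tls_ok: a TLS handshake also completed within the timeout, i.e. the
            service behind the port is actually answering.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return (False, False)

    # Certificate checks are disabled on purpose: the question here is only
    # whether the service responds, not whether its certificate is trusted.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with ctx.wrap_socket(sock, server_hostname=host):
            return (True, True)
    except OSError:  # includes timeouts and ssl.SSLError
        sock.close()
        return (True, False)
```

Calling `probe("<node address>")` from another machine and from the node itself should give `(True, False)` if the port is open but the proxy workers are stuck, and `(True, True)` once the GUI is reachable again.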

I don't want to reboot the server, because we have workloads that have to stay running. Can anyone recommend a good next troubleshooting step?



Okay, so in the time it took me to write this question, the GUI for our first server came back. I now have a second and a third question:
  • what logs should I look into to figure out what Proxmox was doing during this interim period?
  • what the heck happened?
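
For the log question, one place I could start is the systemd journal for the two services around the time of the outage. This is only a sketch, assuming pveproxy and pvedaemon log to the journal under those unit names; the helper name `journal_cmd` and the 10-minute window are made up for illustration:

```python
from datetime import datetime, timedelta


def journal_cmd(outage_start, minutes=10, units=("pveproxy", "pvedaemon")):
    """Build a journalctl command covering the outage window.

    outage_start: datetime when the GUI stopped responding (local time).
    minutes:      how long after that to keep reading (hypothetical default).
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    until = outage_start + timedelta(minutes=minutes)
    cmd = ["journalctl", "--no-pager"]
    for unit in units:
        cmd += ["-u", unit]
    cmd += ["--since", outage_start.strftime(fmt),
            "--until", until.strftime(fmt)]
    return cmd
```

The returned list can be passed to `subprocess.run(...)` on the affected node, with `outage_start` set to whenever the failed LDAP login was attempted.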
 
I can confirm this is a reproducible problem.

To reproduce the issue I add an LDAP Server under Datacenter/Permissions/Realms. It doesn't seem to matter whether the configuration works (my LDAP configuration isn't working, but I also tested against a half-finished config and it still breaks).

The trigger is when I attempt to authenticate against this realm. I find that the server does not call out to LDAP (I don't see any log events on the LDAP server, nor do I see LDAP packets passing the 3rd party hardware firewall). After 1 or maybe 2 attempts the GUI crashes on that server and no longer responds until I restart pveproxy and pvedaemon, and even then it takes 5-6 minutes for it to come back.

pveversion says pve-manager/7.4-17/513c62be (running kernel: 5.15.108-1-pve)
 
