[SOLVED] Web Login Error: After a few days of PVE 9 running, logins to the PVE Web UI fail (401 error) until I reboot PVE

hive

I'm running a standalone PVE 9.0.11 on a remote host connected via Tailscale (no ports open to the internet).

After the server has been running for a few days, when I try to log in to the web UI, I get a 401 authentication error.
I've noticed that when this occurs, my nightly backups to a remote PBS machine have also stopped arriving.

I can still SSH to the box (using Tailscale SSH), and I can su to root using the same root password that fails to authenticate in the web UI.
If I reset my root password and restart the PVE box, logging in to the web UI starts working again.

Has anyone else experienced this? Do you have a fix?
What log files would you suggest I look at to try to diagnose the root cause?
 
OK, after running for nearly 2 days, logins on the web UI started failing again.
This time I just rebooted the PVE server (without changing the root password), and I can log in again after the reboot (no other changes).
Does anyone have any suggestions for identifying the root cause (from logs or otherwise) and resolving this?
 
I'd greatly appreciate any tips or pointers to get to the bottom of what is going on.
What logs would you suggest I look at to try to understand this?

My web login gets disabled again after ~2 days. Here's what I see:
- 1 nightly set of backups to PBS succeeds
- On the 2nd night, the backups do not get sent to PBS
- The next morning, web logins fail with "Authentication failure (401)"
- A reboot resolves the issue, and logins on the web interface work again
 
Hello, hive.
Your case sounds very strange. From the description, I doubt that the cause lies in Proxmox itself.

I don't know Tailscale or whether it has any impact, but I would start with the simplest setup possible. That is, I would check whether logging in from the local network works when logging in via Tailscale fails.

On the logs: you can run journalctl and look for any unusual entries.
About the failing backup jobs: I suggest reading the job logs. You should be able to see the task starting and what happens after that.
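
As a rough illustration of the journalctl suggestion, here is a minimal Python sketch that reads error-priority journal entries and keeps the lines that match a few keywords. The keyword list, the function names, and the "2 days ago" window are assumptions made for this example, not an official list of PVE error strings.

```python
import subprocess

# Illustrative keywords only (an assumption); adjust them to what you are hunting for.
SUSPICIOUS = ("no space left on device", "authentication failure", "failed")


def suspicious_lines(lines, keywords=SUSPICIOUS):
    """Return the lines that contain any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]


def read_journal(since="2 days ago"):
    """Return journal entries of priority 'err' or worse as a list of lines."""
    cmd = ["journalctl", "--no-pager", "-p", "err", "--since", since]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # journalctl is not installed on this machine.
        return []
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in suspicious_lines(read_journal()):
        print(line)
```

Run it as root on the PVE host so the full system journal is readable. Filtering by priority first keeps the output short; drop `-p err` if nothing relevant shows up.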

Good luck! :)
 
It looks like the root disk filled up: too many files in /var/lib/vz/dump. I've pruned the backups from there (those were from before I set up PBS).
Thanks @Onslow for the help tracking this down. Hopefully that was my only issue with this PVE setup.
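
For anyone who wants to check for the same condition, here is a minimal Python sketch that reports how full the root filesystem is and lists the largest files in the dump directory. The function names and the limit of 5 files are assumptions for this example; only the /var/lib/vz/dump path comes from this thread.

```python
import os
import shutil


def percent_used(path="/"):
    """Return the used share of the filesystem holding path, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(directory, limit=5):
    """Return (size_bytes, path) for the biggest regular files directly in directory."""
    entries = []
    try:
        with os.scandir(directory) as it:
            for entry in it:
                if entry.is_file(follow_symlinks=False):
                    size = entry.stat(follow_symlinks=False).st_size
                    entries.append((size, entry.path))
    except FileNotFoundError:
        # The directory does not exist on this machine.
        return []
    return sorted(entries, reverse=True)[:limit]


if __name__ == "__main__":
    print(f"/ is {percent_used('/'):.1f}% full")
    for size, path in largest_files("/var/lib/vz/dump"):
        print(f"{size / 1e9:8.2f} GB  {path}")
```

The percentage may differ slightly from what `df` prints, because `df` accounts for blocks reserved for root. This only reports; it does not delete anything, so pruning old dumps is still a manual decision.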
 