Hi,
It is really annoying that an unavailable storage will prevent the PVE webUI from working. This isn't an isolated case: both of my nodes have this problem, today with PVE 7.3 and back then with PVE 6.X as well.
All my nodes use local ZFS/LVM-thin pools to store their data. But I also have an SMB share on a NAS for ISOs and LXC templates (I don't want to waste space on the small, expensive enterprise SSDs for those, as I only need them once when creating new guests) and multiple PBS instances with multiple namespaces, so each PVE node has 7-8 PBS storages added. All PVE nodes are unclustered, because I want to be able to shut all of them down except one to save electricity.
But the way PVE currently works makes that setup really painful. The problem is basically this: as soon as a single storage becomes unavailable, the webUI gets so unresponsive that it is unusable. Here are some examples:
1.) Login won't work (my password safe's browser extension types in the credentials, so they are correct):
After 20-30 seconds it will fail with this:
2.) Guests show no statistics, or everything takes an eternity to load (that LXC has, by the way, been running for days):
Then it will fail with "Broken pipe (596)":
3.) You basically can't use the webUI at all. Whenever I do something like changing an option of a guest or node, it fails; maybe every 10th or 20th try works. So I open my SSH client, connect to the server and change that option or do that task via the CLI, as that still works on the first try and is faster than retrying 10 or 20 times in the webUI. Here, for example, I tried to start a container:
And it will time out:
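Doing the same thing over SSH goes through on the first try, for example (the VMID is just a placeholder for the container in the screenshot):
Code:
pct start <vmid>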
All of the above can easily be fixed by disabling the unavailable storages. For example, I can run
pvesm set --disable 1 PBS_Manual_BackupNAS && pvesm set --disable 1 PBS_Weekly_BackupNAS && pvesm set --disable 1 PBS_Manual_MainNAS && pvesm set --disable 1 PBS_Weekly_MainNAS
through SSH to disable my four unavailable PBS storages, wait one minute, and everything works perfectly fine again: login works, graphs work, no timeouts when doing anything, ...
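Re-enabling them once the storage is reachable again is just the reverse, for example:
Code:
pvesm set --disable 0 PBS_Manual_BackupNAS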
In my opinion, an unavailable storage shouldn't make the webUI unusable. Especially as none of my guests requires any of the SMB/PBS storages to be online 24/7 for normal operation:
- When I shut down my NAS (it draws 80W, so I can't run it 24/7) and forget to disable the SMB/PBS storages it hosts first -> webUI stops working...
- When I shut down my physical backup server (it only runs a few hours per week) and forget to disable the PBS storages first -> webUI stops working...
- When my internet connection randomly has a problem and the Tuxis offsite PBS server can't be reached -> webUI stops working...
- When I shut down my PBS VM, NAS VM or router VM (which also happens during the weekly stop-mode backups) -> webUI stops working...
This might not be that problematic in a datacenter where everything should run 24/7 without a single point of failure, but in a homelab it is quite frustrating.
I know that I could work with hook scripts for some scenarios, so that a PBS storage is only enabled while a backup task is running. But even with that, it is still annoying that I have to enable these PBS storages again whenever I want to restore a backup.
And I'm not sure if there is a hook for enabling/disabling the ISO/template SMB storage when creating a new VM/LXC.
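For the backup case, something like this vzdump hook script is what I have in mind (just a rough sketch: the storage ID is one of my examples, the path is made up, and it would be wired up via the "script" option in /etc/vzdump.conf):
Code:
#!/bin/bash
# /root/pbs-storage-hook.sh (example path), referenced via "script: /root/pbs-storage-hook.sh"
# in /etc/vzdump.conf. vzdump calls the script with the current phase as first argument.
phase="$1"

case "$phase" in
    job-start)
        # enable the PBS storage only for the duration of the backup job
        pvesm set --disable 0 PBS_Weekly_BackupNAS
        ;;
    job-end|job-abort)
        # disable it again so an offline PBS doesn't stall pvestatd and the webUI
        pvesm set --disable 1 PBS_Weekly_BackupNAS
        ;;
esac

exit 0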
And the spammed logs are also not great for SSD wear, especially as all logs get written again with sync writes to an Elasticsearch DB (ok, I could create a rule to filter those out before sending them to the log server, see the sketch after the log excerpt, but that still leaves a lot of wear on the system SSDs):
Code:
Dec 6 16:02:19 j3710 pvedaemon[1050137]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:20 j3710 pvestatd[3323]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:20 j3710 pvedaemon[1127436]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:22 j3710 pvestatd[3323]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:22 j3710 pvedaemon[1006603]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:22 j3710 pvedaemon[1050137]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:22 j3710 pvedaemon[1127436]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:23 j3710 pveproxy[1249548]: proxy detected vanished client connection
Dec 6 16:02:23 j3710 pvestatd[3323]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:23 j3710 pvedaemon[1006603]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:24 j3710 pvestatd[3323]: status update time (6.953 seconds)
Dec 6 16:02:24 j3710 pveproxy[1166883]: proxy detected vanished client connection
Dec 6 16:02:25 j3710 pvedaemon[1127436]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:25 j3710 pvedaemon[1050137]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:27 j3710 pvedaemon[1006603]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:27 j3710 pvedaemon[1127436]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:27 j3710 pvedaemon[1006603]: problem with client ::ffff:192.168.43.70; Broken pipe
Dec 6 16:02:28 j3710 pveproxy[1264927]: proxy detected vanished client connection
Dec 6 16:02:28 j3710 pvedaemon[1050137]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:30 j3710 pvestatd[3323]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:30 j3710 pvedaemon[1006603]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:30 j3710 pvedaemon[1127436]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:30 j3710 pvedaemon[1050137]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:31 j3710 pvestatd[3323]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:33 j3710 pvedaemon[1127436]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:33 j3710 pvedaemon[1006603]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:33 j3710 pvestatd[3323]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:33 j3710 pvedaemon[1050137]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:33 j3710 pveproxy[1249548]: proxy detected vanished client connection
Dec 6 16:02:34 j3710 pvestatd[3323]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:34 j3710 pvedaemon[1050137]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:34 j3710 pvedaemon[1006603]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:35 j3710 pvestatd[3323]: status update time (7.945 seconds)
Dec 6 16:02:36 j3710 pvedaemon[1127436]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:36 j3710 pveproxy[1166883]: proxy detected vanished client connection
Dec 6 16:02:36 j3710 pveproxy[1264927]: proxy detected vanished client connection
Dec 6 16:02:38 j3710 pvedaemon[1006603]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:38 j3710 pvedaemon[1127436]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:38 j3710 pvedaemon[1050137]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:39 j3710 pvedaemon[1050137]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:39 j3710 pvedaemon[1006603]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:39 j3710 pveproxy[1166883]: proxy detected vanished client connection
Dec 6 16:02:41 j3710 pvedaemon[1127436]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:41 j3710 pvestatd[3323]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:42 j3710 pvedaemon[1050137]: PBS_Manual_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:42 j3710 pvedaemon[1006603]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:42 j3710 pvestatd[3323]: PBS_Weekly_MainNAS: error fetching datastores - 500 Can't connect to 192.168.49.8:8007 (No route to host)
Dec 6 16:02:43 j3710 pveproxy[1249548]: proxy detected vanished client connection
Dec 6 16:02:44 j3710 pvedaemon[1127436]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:44 j3710 pvedaemon[1050137]: PBS_Weekly_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:44 j3710 pvestatd[3323]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
Dec 6 16:02:44 j3710 pvedaemon[1006603]: PBS_Manual_BackupNAS: error fetching datastores - 500 Can't connect to 192.168.43.75:8007 (No route to host)
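Such a filter rule could look roughly like this, assuming the logs are shipped with rsyslog (the filename is just an example; a different log shipper would need a different rule, and it doesn't help with the local journal writes):
Code:
# /etc/rsyslog.d/10-drop-pbs-noise.conf (example filename)
# drop the repeated storage errors before the local-file and forwarding rules see them
# (journald will still log them, so some local SSD wear remains)
:msg, contains, "error fetching datastores" stop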
It would be really nice if the webUI could continue working with an unavailable storage. Maybe an unavailable storage could be temporarily disabled after X failed tries, and while it is temporarily disabled, PVE could poll it only every 5 or 15 minutes instead of every few seconds, and automatically enable it again once it comes back online.
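As a stopgap, something in the direction of this cron script (run e.g. every 5 minutes) is roughly what I mean; the storage ID and IP are examples from my setup, and a real version should check the current state first instead of rewriting storage.cfg on every run:
Code:
#!/bin/bash
# enable/disable a PBS storage depending on whether its host answers on port 8007
STORAGE="PBS_Manual_BackupNAS"
HOST="192.168.43.75"
PORT=8007

if timeout 3 bash -c "exec 3<>/dev/tcp/$HOST/$PORT" 2>/dev/null; then
    pvesm set --disable 0 "$STORAGE"   # PBS reachable -> (re-)enable the storage
else
    pvesm set --disable 1 "$STORAGE"   # PBS down -> disable it so the webUI stays responsive
fi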
The way it is now, it is really inconvenient to have to use the SSH client all the time and to disable/enable dozens of storages every day.