pvestatd often reports storage offline with CIFS

Hi,

We very often see pvestatd erroneously report that a CIFS storage (we use it for backups and ISOs) is offline:

root@gar-ha-cfw01a:/usr/share/perl5/PVE/Storage# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2018-09-25 09:47:51 CEST; 11min ago
Process: 5231 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
Main PID: 5524 (pvestatd)
Tasks: 1 (limit: 4915)
Memory: 97.6M
CPU: 9.821s
CGroup: /system.slice/pvestatd.service
└─5524 pvestatd

Sep 25 09:56:33 gar-ha-cfw01a pvestatd[5524]: storage 'pve-iso' is not online
Sep 25 09:56:35 gar-ha-cfw01a pvestatd[5524]: storage 'pve-cfw01a-bck' is not online
Sep 25 09:56:55 gar-ha-cfw01a pvestatd[5524]: storage 'pve-iso' is not online
Sep 25 09:57:13 gar-ha-cfw01a pvestatd[5524]: storage 'pve-iso' is not online
Sep 25 09:57:55 gar-ha-cfw01a pvestatd[5524]: storage 'pve-cfw01a-bck' is not online
Sep 25 09:58:05 gar-ha-cfw01a pvestatd[5524]: storage 'pve-cfw01a-bck' is not online
Sep 25 09:58:13 gar-ha-cfw01a pvestatd[5524]: storage 'pve-cfw01a-bck' is not online
Sep 25 09:58:24 gar-ha-cfw01a pvestatd[5524]: storage 'pve-cfw01a-bck' is not online
Sep 25 09:59:06 gar-ha-cfw01a pvestatd[5524]: storage 'pve-iso' is not online
Sep 25 09:59:14 gar-ha-cfw01a pvestatd[5524]: storage 'pve-cfw01a-bck' is not online
root@gar-ha-cfw01a:/usr/share/perl5/PVE/Storage#

The server is running the current version from the Enterprise repository:

Kernel Version Linux 4.15.18-4-pve #1 SMP PVE 4.15.18-23 (Thu, 30 Aug 2018 13:04:08 +0200)
PVE Manager Version pve-manager/5.2-9/4b30e8f9


The reason is probably that logging in to the CIFS server (Samba with a Windows domain behind it) sometimes takes longer than 2 seconds:

time /usr/bin/smbclient //gar-sv-pool03b/pool03b_pve-bck-cfw01a -d 0 -m smb3 -U pve-backup -A /etc/pve/priv/pve-cfw01a-bck.cred -W AD -c echo 1 0
Domain=[AD] OS=[] Server=[]
echo <num> <data>

real 0m2.187s
user 0m0.031s
sys 0m0.004s
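
A single measurement only shows one slow login. To get a feel for how often the check goes over the limit, the command can be run several times in a loop. The following is a minimal sketch, not part of PVE: `count_slow_runs` is a made-up helper name, and it assumes bash and GNU `date` (for `%N`), which a standard PVE host should have.

```shell
#!/bin/bash
# Hypothetical helper (not part of PVE): run a command several times and
# print how many runs took longer than a limit given in milliseconds.
# Usage: count_slow_runs LIMIT_MS RUNS COMMAND [ARGS...]
count_slow_runs() {
    local limit_ms=$1 runs=$2
    shift 2
    local slow=0 i start end elapsed_ms
    for ((i = 0; i < runs; i++)); do
        start=$(date +%s%N)                 # nanoseconds since epoch (GNU date)
        "$@" >/dev/null 2>&1 || true        # only the duration matters here
        end=$(date +%s%N)
        elapsed_ms=$(( (end - start) / 1000000 ))
        if (( elapsed_ms > limit_ms )); then
            slow=$((slow + 1))
        fi
    done
    echo "$slow"
}
```

For example, `count_slow_runs 2000 20 /usr/bin/smbclient //gar-sv-pool03b/pool03b_pve-bck-cfw01a -d 0 -m smb3 -U pve-backup -A /etc/pve/priv/pve-cfw01a-bck.cred -W AD -c 'echo 1 0'` prints how many of 20 logins took longer than 2 seconds.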


Changing the timeout in /usr/share/perl5/PVE/Storage/CIFSPlugin.pm from 2 to 5 seconds and restarting pvestatd helped.

So please change the timeout accordingly, or make it configurable.
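
To illustrate why the timeout value matters, here is a small sketch that mimics a connection check with a hard time limit using `timeout` from GNU coreutils. This is not the code from CIFSPlugin.pm; `check_online` is a hypothetical name, and the only assumption is that a check which does not finish within the limit is treated as "offline".

```shell
#!/bin/bash
# Hypothetical sketch (not the PVE implementation): report "online" only if
# the given command succeeds within TIMEOUT_S seconds, otherwise "offline".
# Usage: check_online TIMEOUT_S COMMAND [ARGS...]
check_online() {
    local timeout_s=$1
    shift
    # timeout(1) kills the command and returns non-zero once the limit is hit
    if timeout "$timeout_s" "$@" >/dev/null 2>&1; then
        echo online
    else
        echo offline
    fi
}
```

With a login that takes about 2.2 seconds, as measured above, a 2-second limit would report "offline" while a 5-second limit would report "online", even though the server is reachable in both cases.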
 
Thank you for this thread! I have two domain controllers (primary and secondary), and every time my primary needed to restart or shut down, I would lose access to the CIFS shares (ISO repo and backup).

I am really glad I found this post, as changing the timeout from 2 to 5 fixed my issue! I am surprised more people with multiple domain controllers don't run into the same issue.