pvestatd storage is not online

Kei

Well-Known Member
May 29, 2016
Hello,
I'm seeing this error message in /var/log/syslog on all nodes of my cluster:

Feb 21 12:25:39 pve1 pvestatd[1986]: storage 'HEZE_NFS_VM' is not online
Feb 21 12:25:42 pve1 pvestatd[1986]: storage 'HEZE_NFS_Clones' is not online
Feb 21 12:25:44 pve1 pvestatd[1986]: storage 'HEZE_NFS_ISO' is not online
Feb 21 12:25:46 pve1 pvestatd[1986]: storage 'HEZE_NFS_BKP' is not online
Feb 21 12:25:46 pve1 pvestatd[1986]: status update time (8.155 seconds)
Feb 21 12:27:49 pve1 pvestatd[1986]: storage 'HEZE_NFS_ISO' is not online
Feb 21 12:27:51 pve1 pvestatd[1986]: storage 'HEZE_NFS_BKP' is not online
Feb 21 12:27:53 pve1 pvestatd[1986]: storage 'HEZE_NFS_Clones' is not online
Feb 21 12:27:55 pve1 pvestatd[1986]: status update time (7.970 seconds)
Feb 21 12:28:49 pve1 pvestatd[1986]: storage 'HEZE_NFS_Clones' is not online
Feb 21 12:28:51 pve1 pvestatd[1986]: storage 'HEZE_NFS_VM' is not online
Feb 21 12:28:53 pve1 pvestatd[1986]: storage 'HEZE_NFS_BKP' is not online
Feb 21 12:28:55 pve1 pvestatd[1986]: storage 'HEZE_NFS_ISO' is not online
Feb 21 12:28:55 pve1 pvestatd[1986]: status update time (8.182 seconds)
Feb 21 12:28:59 pve1 pvestatd[1986]: storage 'HEZE_NFS_ISO' is not online
Feb 21 12:29:01 pve1 pvestatd[1986]: storage 'HEZE_NFS_BKP' is not online
Feb 21 12:29:19 pve1 pvestatd[1986]: storage 'HEZE_NFS_ISO' is not online
Feb 21 12:29:21 pve1 pvestatd[1986]: storage 'HEZE_NFS_BKP' is not online
Feb 21 12:29:23 pve1 pvestatd[1986]: status update time (6.124 seconds)

This event repeats at short or long intervals (1-30 minutes) and might happen every couple of days.
I don't believe the network is to blame. At most, this could be a problem on the NAS side, but the connection seems to be fine, even during the timespan in which these errors are generated.
Also, only NFS has this problem, while iSCSI is fine, even though the storage server is the same.
Is it possible to troubleshoot these errors in depth? If so, where should I begin to look?
 
I am having this exact same error after a Proxmox update. I have changed nothing on any of my storage devices, and one day they just randomly stopped working. Kinda useless without my NFS mounts (with all my VM data) being accessible. Hopefully someone can lead us in the right direction towards a fix.
 

we are using NFS very heavily in our (test) environment and have not been able to reproduce this issue. the online check does nothing more than "/sbin/showmount --no-headers --exports NFSSERVER" with a timeout of two seconds. so either it takes too long, or your connection to the NFS server is flaky.
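
To see how long the check takes from a node, it can be reproduced by hand. The following is a minimal sketch, not the pvestatd code itself: it runs the showmount command quoted above with the same 2-second timeout and reports the elapsed time. The function names (`check_online`, `showmount_cmd`, `probe`) and the probing interval are assumptions made for illustration.

```python
import subprocess
import time


def showmount_cmd(server):
    """Build the command quoted in the thread for a given NFS server."""
    return ["/sbin/showmount", "--no-headers", "--exports", server]


def check_online(cmd, timeout=2.0):
    """Run cmd and return (online, elapsed_seconds).

    online is False if the command times out, exits non-zero,
    or the binary cannot be found.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        online = result.returncode == 0
    except (subprocess.TimeoutExpired, FileNotFoundError):
        online = False
    return online, time.monotonic() - start


def probe(server, rounds=10, interval=10.0):
    """Repeat the check and print one line per attempt (hypothetical helper)."""
    for _ in range(rounds):
        online, elapsed = check_online(showmount_cmd(server))
        state = "online" if online else "NOT online"
        print(f"{time.strftime('%H:%M:%S')} {server}: {state} ({elapsed:.3f}s)")
        time.sleep(interval)
```

Calling `probe("NFSSERVER")` on an affected node (with the real server name substituted) would show whether the replies are merely slow, close to the 2-second limit, or missing altogether.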
 
the online check does nothing more than "/sbin/showmount --no-headers --exports NFSSERVER" with a timeout of two seconds
Thank you for this information. It will be useful for troubleshooting, especially because the VLAN where the NFS server resides is supposed to be mostly idle, so I'm very curious to understand why the server is not responding fast enough.

You're right Manu, I should have started this thread under network/firewalling. My bad.

Kinda useless without my NFS mounts(with all my VM data) being accessible.
So you can confirm that you actually have VMs down due to this? Maybe my disconnections are too short to cause actual problems on my VMs.
However, are you using FreeNAS by any chance?
 
with a timeout of two seconds
Wait, are you saying that "status update time (8.155 seconds)" means that instead of replying within 2 seconds, the server took over 8 seconds to do so?
 

2 seconds is just for the NFS online check (which is killed after 2 seconds), the log talks about the whole "status update" which includes all storages (and prints that warning if not done in <= 5 seconds ). the latter can happen, but it is usually a sign that either a storage or the hypervisor node itself is overloaded. if you frequently see this message, there is most likely room for improvement somewhere..
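
The numbers in the first log excerpt are consistent with this: four NFS storages are reported offline roughly 2 seconds apart, and the update is reported as taking about 8 seconds. A small sketch of that arithmetic follows, under the assumption (suggested by the log timestamps, not confirmed here) that the checks run one after another. The constant and function names are made up for illustration.

```python
CHECK_TIMEOUT = 2.0   # seconds, per NFS online check (from the post above)
WARN_THRESHOLD = 5.0  # seconds, whole status update (from the post above)


def status_update_time(check_durations, other_work=0.0):
    """Estimate the total update time for sequential checks.

    Each check is capped at CHECK_TIMEOUT because it is killed after that.
    other_work is any additional time spent outside the checks.
    """
    return other_work + sum(min(d, CHECK_TIMEOUT) for d in check_durations)


def would_warn(total):
    """True if the update took longer than the warning threshold."""
    return total > WARN_THRESHOLD


# Four unreachable storages: each check runs into the 2-second timeout.
all_down = status_update_time([float("inf")] * 4)
print(all_down, would_warn(all_down))

# Four healthy storages answering in 50 ms each.
all_up = status_update_time([0.05] * 4)
print(round(all_up, 2), would_warn(all_up))
```

So an 8-second status update does not mean a single reply took 8 seconds; it is what four timed-out checks in a row would add up to.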
 
so either it takes too long, or your connection to the NFS server is flaky.
Fair enough. I think I can test this by sniffing the traffic under normal conditions and when this error occurs, to see what differs on the network. Btw, is it possible to see output for all the responses to this online check, that is to say, also the responses from the NFS server that did not cause a timeout?

EDIT: I rephrase my question. If I wanted it to print a message even if the update has completed within 5 seconds, which script should I work on?
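
For lining up a packet capture with the errors, it may help to pull the relevant timestamps out of syslog first. The sketch below only assumes the two pvestatd line formats shown in the log excerpts of this thread; the function name `summarize` and its return shape are illustrative choices.

```python
import re
from collections import Counter

# Matches e.g.:
# Feb 21 12:25:39 pve1 pvestatd[1986]: storage 'HEZE_NFS_VM' is not online
OFFLINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+pvestatd\[\d+\]: "
    r"storage '([^']+)' is not online"
)
# Matches e.g.:
# Feb 21 12:25:46 pve1 pvestatd[1986]: status update time (8.155 seconds)
UPDATE_RE = re.compile(r"pvestatd\[\d+\]: status update time \(([\d.]+) seconds\)")


def summarize(lines):
    """Return (offline_counts, first_seen, update_times) for pvestatd lines.

    offline_counts: Counter of storage name -> number of 'not online' lines
    first_seen:     storage name -> timestamp string of its first occurrence
    update_times:   list of reported status update durations in seconds
    """
    offline_counts = Counter()
    first_seen = {}
    update_times = []
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            stamp, _host, storage = match.groups()
            offline_counts[storage] += 1
            first_seen.setdefault(storage, stamp)
            continue
        match = UPDATE_RE.search(line)
        if match:
            update_times.append(float(match.group(1)))
    return offline_counts, first_seen, update_times
```

It can be fed an open file, for example `with open("/var/log/syslog") as f: print(summarize(f))`, and the timestamps in `first_seen` can then be used as filters in the capture.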
 
line 523f of /usr/share/perl5/PVE/Service/pvestatd.pm print that information to syslog, you can comment out line 524 and add a ; to the end of 523 to temporarily print it always. but note that the update happens every 10 seconds - so this will produce a lot of log lines ;)
 

Thank you fabian, I will try that asap.
 
So I am getting this error as well. I am on PVE 5.0.23, and I believe Debian is at 9.1. This install was from your ISO. The following is from the syslog:
Oct 3 05:04:57 cfpve1 pvestatd[1030]: storage 'ISOs' is not online
Oct 3 05:05:00 cfpve1 systemd[1]: Starting Proxmox VE replication runner...
Oct 3 05:05:00 cfpve1 pvedaemon[1068]: storage 'ISOs' is not online
Oct 3 05:05:01 cfpve1 systemd[1]: Started Proxmox VE replication runner.
Oct 3 05:05:07 cfpve1 pvestatd[1030]: storage 'ISOs' is not online
Oct 3 05:05:10 cfpve1 pvedaemon[1070]: <root@pam> successful auth for user 'root@pam'
Oct 3 05:05:17 cfpve1 pvestatd[1030]: storage 'ISOs' is not online
Oct 3 05:05:28 cfpve1 pvestatd[1030]: storage 'ISOs' is not online
Oct 3 05:05:37 cfpve1 pvestatd[1030]: storage 'ISOs' is not online
Oct 3 05:05:48 cfpve1 pvestatd[1030]: storage 'ISOs' is not online
Oct 3 05:05:57 cfpve1 pvestatd[1030]: storage 'ISOs' is not online

This goes all the way to the end of the log.

I have attached a few screenshots of what I did. I am not sure where to go from here; any help/advice would be greatly appreciated.

Thanks,
Michael
 

Attachments: Promox_Stor6.png, Proxmox_Stor7.png

I'm having the same issue with NFS and a QNAP NAS. I upgraded from Proxmox 3.x, which had no issues at all with the same NFS share on the same NAS device. No network changes either. Any help with this will be appreciated.
 
