After losing NFS storage, the load average grew on all cluster nodes and some nodes rebooted. Why?

Dec 25, 2017
After we lost our NFS storage, the load average (loadavg) grew on all nodes in the cluster, and some nodes rebooted. Why?

Why would a high load average cause a node to reboot?
 
If you lose an NFS server, and it is mounted with option "hard" (the default), processes that attempt to use the mount point will be stuck until the server comes back. This will cause the load average to increase as more and more processes are waiting on the missing server. Basically, losing an NFS mount under default settings is like having a hard drive stop responding to commands.
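To see this effect on a node, you can compare the load average with the number of processes in uninterruptible sleep (state "D"), which Linux counts toward the load average along with runnable processes. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function names are made up for illustration and are not part of any Proxmox tool.

```python
import os


def parse_state(stat_line: str) -> str:
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so take the first field after the last ')'.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_uninterruptible(proc_root: str = "/proc") -> int:
    """Count processes currently in state D (uninterruptible sleep)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are PIDs
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                if parse_state(f.read()) == "D":
                    count += 1
        except (OSError, IndexError):
            # The process exited during the scan or the line was malformed.
            continue
    return count


def report(proc_root: str = "/proc") -> str:
    """Summarize the load average next to the number of D-state processes."""
    loadavg_path = os.path.join(proc_root, "loadavg")
    if not os.path.exists(loadavg_path):
        return "no loadavg file found (not a Linux /proc?)"
    with open(loadavg_path) as f:
        one, five, fifteen = f.read().split()[:3]
    blocked = count_uninterruptible(proc_root)
    return f"loadavg {one} {five} {fifteen}, processes in D state: {blocked}"


if __name__ == "__main__":
    print(report())
```

If the D-state count climbs together with the load average while CPU usage stays low, the load is most likely coming from processes blocked on I/O, such as a hard-mounted NFS share whose server is gone.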

I don't know why it would necessarily reboot the server though. Maybe it has something to do with what was stored there?

ETA: Another possibility is that you're using a watchdog and the process that was supposed to poke it got stuck.
 
Thank you!

I don’t use a watchdog, so I don’t understand why the reboot happened. Is it safe to use the soft option for the backup storage?
 
Many programs, especially scripts, don't bother to check for write failures, so data can be lost if the server goes down while it is mounted with the soft option. Only you can decide if that's acceptable. I think Proxmox backup will tell you when a backup failed, so for that use case it might make sense. But if it fails very often, eventually the backup you need will be missing because the server was down at the time. So it just moves the problem around; it doesn't fix it.
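For your own scripts, checking for write failure could look roughly like the sketch below. It assumes that on a soft mount a timed-out request comes back to the program as an I/O error (OSError in Python), which may only surface at flush, fsync, or close time. The function name write_checked and the path in the usage note are hypothetical.

```python
import os


def write_checked(path: str, data: bytes) -> bool:
    """Write data to path and return False if any step reports an I/O error."""
    try:
        with open(path, "wb") as f:
            f.write(data)
            # Push Python's buffer to the kernel, then ask the kernel to
            # push it to storage, so a delayed error shows up here rather
            # than being silently dropped.
            f.flush()
            os.fsync(f.fileno())
        # Leaving the "with" block closes the file; an error raised by
        # close() is also caught below.
        return True
    except OSError as exc:
        print(f"write to {path} failed: {exc}")
        return False
```

A script would then call something like write_checked("/mnt/backup/dump.bin", payload) and stop or retry when it returns False, instead of assuming the data arrived.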
 
