Unreachable NFS share causes load 1 on PVE host

vidahon820

New Member
Oct 15, 2024
Hi,

I have an NFS mount on my PVE host to my Synology NAS:

Code:
root@pve:~# cat /etc/pve/storage.cfg

nfs: Synology_Video
        export /volume1/video
        path /mnt/pve/Synology_Video
        server 192.168.1.11
        content iso
        options vers=4,soft
        prune-backups keep-all=1

to serve videos to an unprivileged LXC running Jellyfin. The use case is: start the Synology via WoL, wait until it is ready and PVE picks up the NFS share, watch a movie via Jellyfin, shut down the Synology.

Shortly after the Synology has shut down, the load on the PVE host rises until it reaches 1, and it stays there until either the Synology comes back online or I reboot the PVE host.

Code:
root@pve:~# journalctl -f
Feb 14 13:11:04 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:11:05 pve kernel: nfs: server 192.168.1.11 not responding, timed out
Feb 14 13:11:15 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:11:24 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:11:33 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:11:46 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:11:55 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:12:04 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:12:13 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:12:26 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:12:35 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:12:44 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:12:53 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:13:06 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:13:15 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:13:24 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:13:28 pve kernel: nfs: server 192.168.1.11 not responding, timed out
Feb 14 13:13:33 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:13:45 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:13:55 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:14:04 pve pvestatd[21847]: storage 'Synology_Video' is not online
Feb 14 13:14:10 pve kernel: nfs: server 192.168.1.11 not responding, timed out

The load does not seem to come from any process I can figure out in htop. What is going on and how can I fix this?

Thanks
 
The load does not seem to come from any process I can figure out in htop.
You will not always find one. On Linux, the load average counts not only processes that are running or waiting for the CPU, but also processes in uninterruptible sleep (D state), which usually means they are blocked on I/O. A load of 1 therefore means that, on average, one process was runnable or blocked; it does not mean that the CPU is doing something useful. You can have a load value of 1000 and everything is smooth, or a load value of 20 and your system is crawling. Hanging mounts are known to have such an impact on the load value: a process waiting on the unreachable NFS share sits in D state and counts toward the load without using any CPU, which is why htop shows nothing busy.
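To check whether blocked processes explain the load, you can list everything that is currently in D state. This is only a minimal sketch: the function names `filter_d_state` and `list_blocked` are made up for illustration, and it assumes the procps version of `ps` that ships with Debian/PVE.

```bash
#!/bin/bash
# Keep only lines whose first column (the process state) starts with "D".
# The ps header line starts with "S", so it is dropped as well.
filter_d_state() {
  awk '$1 ~ /^D/'
}

# Print state, PID, kernel wait channel and command name of every process
# that is currently in uninterruptible sleep.
list_blocked() {
  ps -eo state,pid,wchan:32,comm | filter_d_state
}

# Usage (after sourcing this file):
#   cat /proc/loadavg   # current 1, 5 and 15 minute load averages
#   list_blocked        # processes that count toward the load while blocked
```

If a process shows up here while the NAS is off and its wait channel points at NFS or RPC code, the hanging mount is the likely source of the load.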
 
Thanks for your replies. I went with the following workaround, so I can keep my workflow:

I added a custom script on my Synology that runs on shutdown. It connects to the PVE host via SSH and runs these commands:
Bash:
#!/bin/bash

systemctl stop pvestatd.service
sleep 2
umount -fl /mnt/pve/Synology_Video
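For anyone who wants to reproduce this, the SSH call on the Synology side could look roughly like the sketch below. It is not the exact script from this setup: the target `root@pve`, the key-based login and the names `PVE_HOST`, `SSH_BIN`, `REMOTE_CMDS` and `release_share` are assumptions for illustration.

```bash
#!/bin/bash
# Hypothetical values, adjust to your setup. Key-based login is assumed.
PVE_HOST="${PVE_HOST:-root@pve}"
SSH_BIN="${SSH_BIN:-ssh}"   # can be overridden, e.g. SSH_BIN=echo for a dry run

# The same commands as in the script above, joined into one remote command.
REMOTE_CMDS='systemctl stop pvestatd.service; sleep 2; umount -fl /mnt/pve/Synology_Video'

release_share() {
  # BatchMode prevents a password prompt from stalling the shutdown;
  # ConnectTimeout limits the wait if the PVE host cannot be reached.
  "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$PVE_HOST" "$REMOTE_CMDS"
}

# Usage: call release_share from the Synology shutdown hook.
```

Running it once with `SSH_BIN=echo` prints the command line instead of connecting, which is a cheap way to check the quoting before relying on it during a shutdown.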

I also added a second script that runs every minute via crontab and restarts the pvestatd service if it is not running:
Bash:
#!/bin/bash

SERVICENAME="pvestatd"

# is-active exits with 0 if the service is running
systemctl is-active --quiet "$SERVICENAME"
STATUS=$?

if [[ "$STATUS" -ne 0 ]]; then
  # wait a little before bringing the service back up
  sleep 20
  echo "Service '$SERVICENAME' is not currently running... Restarting now..."
  systemctl restart "$SERVICENAME"
fi

Maybe not the most elegant solution, but it does work.