NFS Hanging/Shares Inaccessible on 4.15.17-3-pve

trystan

I have a three-node cluster. The 4.15.17-3-pve kernel is freezing NFS client access on one host (Supermicro X8DT3).

The only difference on this server is a balance-rr bond that connects directly to the NFS server (a fully updated Proxmox host).

Rolling back to the previous 4.15.17-2-pve kernel solves the NFS hanging.

Interestingly, when I first boot with the 4.15.17-3-pve kernel I can cd into the mount and browse the NFS share; the system only freezes when I boot a VM.

Here's the storage config on the client:
Code:
nfs: vm
        export /rpool/data
        path /mnt/pve/vm
        server sionis-nfs
        content images,rootdir
        maxfiles 8
        nodes cyrus,lucius
        options vers=4.2,async,hard,tcp,noatime

Here's the export on the server:
Code:
/rpool/data 172.16.8.54(rw,async,no_root_squash,no_subtree_check,crossmnt)
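
The options in the storage config are what the client requests, not necessarily what the mount ends up with. A minimal sketch for listing the effective options of every NFS mount, assuming the usual /proc/mounts layout; the function name nfs_mount_options is made up for illustration:

```shell
#!/bin/sh
# Print "mountpoint: options" for each NFS mount in a mounts table.
# Defaults to /proc/mounts; pass another file to inspect a saved copy.
# (nfs_mount_options is a hypothetical helper name, not a Proxmox tool.)
nfs_mount_options() {
    mounts_file="${1:-/proc/mounts}"
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk '$3 ~ /^nfs4?$/ { print $2 ": " $4 }' "$mounts_file"
}
```

Running nfs_mount_options on the affected node under both kernels and comparing the output would show whether vers, proto, or the hard/soft setting differ between them.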
 
Check your log files (journal/syslog) and performance monitoring. From your description, it seems that your NFS (or network) is too slow.
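
As a rough sketch of what to look for in the logs: the filter below keeps only the kernel messages that typically accompany an NFS stall (the "server ... not responding" notice and the hung-task warning). The function name nfs_stall_lines is an assumption, and the patterns are a starting point rather than a complete list:

```shell
#!/bin/sh
# Read log lines on stdin and keep the ones that usually indicate an NFS
# stall: the client's "not responding" notice and hung-task warnings.
# (nfs_stall_lines is a hypothetical helper name.)
nfs_stall_lines() {
    grep -iE 'nfs: server .* not responding|blocked for more than [0-9]+ seconds'
}
```

Typical use would be journalctl -k -b | nfs_stall_lines for the current boot, or journalctl -k -b -1 | nfs_stall_lines for the previous boot if the node had to be reset, provided the journal is persistent.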
 
Alwin, while you might be correct, can you explain why a slightly older kernel version would work just fine?
He stated: "Rolling back to the previous 4.15.17-2-pve kernel solves the NFS hanging."
 
I noticed a lot of commits for Intel NIC module changes in the new kernel build.

The network is a 3x 1 Gb bonded direct attach with round robin: literally three 1 Gb ports directly connected to three 1 Gb ports between two Proxmox hosts.

On the newer kernel I'm unable to even run df -h because of the system freeze.
 
pve-kernel (4.15.17-12) unstable; urgency=medium

* backport fix for SUN NICs when used with Open vSwitch

* update and re-enable out-of-tree Intel ethernet drivers (e1000e, igb, ixgbe)

-- Proxmox Support Team <support@proxmox.com> Fri, 08 Jun 2018 11:18:32 +0200

pve-kernel (4.15.17-10) unstable; urgency=medium

* update to Ubuntu-4.15.0-22.24

* update ZFS to 0.7.9-pve1

-- Proxmox Support Team <support@proxmox.com> Tue, 22 May 2018 11:15:44 +0200
There was a change back to the out-of-tree (OOT) Intel NIC drivers. This may well have an influence. Try a different bond algorithm, such as active-backup, to see if this solves the problem. Under load, balance-rr will send packets out of order, and that might cause issues, depending on the software.
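
Before and after changing the algorithm, it helps to confirm which mode the bond is really running in. A minimal sketch, assuming the bond is named bond0 (the name is not given in this thread) and that the bonding driver exposes its status under /proc/net/bonding; bond_mode is a made-up helper name:

```shell
#!/bin/sh
# Print the active bonding mode as reported by the kernel bonding driver.
# Defaults to bond0, which is an assumption; pass another status file if
# the bond has a different name.
# (bond_mode is a hypothetical helper name.)
bond_mode() {
    bond_file="${1:-/proc/net/bonding/bond0}"
    # The status file contains a line such as:
    #   Bonding Mode: load balancing (round-robin)
    sed -n 's/^Bonding Mode: //p' "$bond_file"
}
```

With balance-rr this should report "load balancing (round-robin)", and after the switch "fault-tolerance (active-backup)". On a Debian-style /etc/network/interfaces the mode is normally set with a bond-mode active-backup line in the bond stanza (some configs spell it bond_mode). To get a rough idea of whether reordering is actually happening, nstat -az | grep -i reorder lists the TCP reordering counters.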
 
I'll give a different algo a try when I can.

I was/am on the older kernel getting great speeds over round robin with NFS though.