[SOLVED] nfs mounts using wrong source ip/interface

6uellerbpanda

since upgrading to pve 6.1 (it was working fine with 6.0) we have the problem that nfs mounts use a random source ip/interface and not the one in the same vlan.

our current config looks like this:

pve-manager/6.1-7/13e58d5e (running kernel: 5.0.21-5-pve)

Code:
# /etc/pve/storage.cfg
nfs: sto-nas-02_pvmfs
      server 10.0.11.2
      export /mnt/tank/pvmfs
      path /mnt/pve/sto-nas-02_pvmfs
      content images, vztmpl, rootdir, backup
      maxfiles 1
      options noatime,noacl,sync

nfs: sto-nas-01_pvmfs
      server 10.0.11.1
      export /mnt/tank/pvmfs
      path /mnt/pve/sto-nas-01_pvmfs
      content images, vztmpl, rootdir, backup
      maxfiles 1
      options noatime,noacl,sync


Code:
# /etc/network/interfaces

## STORAGE
auto enp9s0.11
iface enp9s0.11 inet static
    address 10.0.11.3
    netmask 255.255.255.128
    
## MGMT
iface enp9s0 inet manual
auto vmbr0
iface vmbr0 inet static
    address 10.0.100.33
    netmask 255.255.255.0
    gateway  10.0.100.254
    bridge_vlan_aware yes
    bridge_ports enp9s0
    bridge_stp off
    bridge_fd 0


and here is the mount output:
Code:
10.0.11.1:/mnt/tank/pvmfs on /mnt/pve/sto-nas-01_pvmfs type nfs4 (rw,noatime,sync,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.11.3,local_lock=none,addr=10.0.11.1)
10.0.11.2:/mnt/tank/pvmfs on /mnt/pve/sto-nas-02_pvmfs type nfs4 (rw,noatime,sync,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.100.33,local_lock=none,addr=10.0.10.20)

as you can see the first mount (sto-nas-01_pvmfs) works as expected and uses the correct interface/ip for the vlan.
the second one however does not - it is using the interface/ip of the bridge although the node has an interface sitting in the same vlan.

we have multiple pve servers with the same network config accessing the same storage, and on some it is exactly the other way around, on others all mounts are correct, and on others none are - it seems totally random.

there's an nfsv4 mount option, clientaddr, which I think could solve the problem, but as storage.cfg is a cluster-wide file I can't put a node-specific ip address in there as a mount option.
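for reference, this is roughly what I mean by setting it manually (the mount point /mnt/test is just a placeholder):

Code:
# example only - pin the client source address for a manual nfsv4 mount
mount -t nfs4 -o noatime,sync,clientaddr=10.0.11.3 10.0.11.2:/mnt/tank/pvmfs /mnt/test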


does anybody have the same problem or know a workaround for this?
 
This is strange, since your network config looks right.

please post:
* the output of `ip route`
* the output of `ip route get 10.0.11.2`
* the output of `ip route get 10.0.11.1`
* the output of `showmount -e 10.0.11.2`
* the output of `showmount -e 10.0.11.1`
on that node

check the `dmesg`/journal of the node for hints as to what's going on.
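for example, something simple like (just the obvious greps):

Code:
dmesg | grep -i nfs
journalctl -b | grep -iE 'nfs|mount'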

also check and compare the nfs configuration of both storage boxes

is there anything specific to the setup? what is the storage box?

I hope this helps!
 
@Stoiko Ivanov thanks for your time

here you go:

Code:
root@hv-vm-01:/root# ip route
default via 10.0.100.254 dev vmbr0 onlink
10.0.11.0/25 dev enp9s0.11 proto kernel scope link src 10.0.11.3
10.0.12.0/28 dev enp1s0f0 proto kernel scope link src 10.0.12.1
10.0.100.0/24 dev vmbr0 proto kernel scope link src 10.0.100.33

root@hv-vm-01:/root# ip route get 10.0.11.2
10.0.11.2 dev enp9s0.11 src 10.0.11.3 uid 0 
    cache 
root@hv-vm-01:/root# ip route get 10.0.11.1
10.0.11.1 dev enp9s0.11 src 10.0.11.3 uid 0 
    cache

# we added the additional subnets as a workaround
root@hv-vm-01:/root# showmount -e 10.0.11.2
/mnt/tank/pvmfs                     10.0.20.0,10.0.11.0,10.0.100.0

root@hv-vm-01:/root# showmount -e 10.0.11.1
/mnt/tank/pvmfs                     10.0.11.0

check the `dmesg`/journal of the node for hints as to what's going on.
dmesg and the log files reveal nothing

also check and compare the nfs configuration of both storage boxes
they're identical

is there anything specific to the setup? what is the storage box?
network config and everything else is the same on all servers
storage is freenas

what I forgot to mention is that I see the same issue when I try to mount it manually from the terminal.
adding the "clientaddr" option, though, works... so I don't think it is a pve-specific problem.
 
the routing looks correct.

I just noticed that you're running a quite outdated kernel - could you try installing the latest updates and reboot (into a current 5.4 kernel)?
 
isn't it a problem with the addr?

Code:
local_lock=none,addr=10.0.10.20
 
hm - tbh not sure where this would come from - in my experience connecting to an ip only uses the routing table for information, and NFS over TCP never did any magic negotiation (though the last time I had a problem with an NFS mount is also quite a while ago).

Things I would check:
* do you have any kind of masquerading/NATing rules on the system? (freenas or PVE)
* what kind of equipment is between freenas and PVE?
* does this equipment maybe do something which causes the issue?
* does the problem still persist if you change the NFS version?

once this is ruled out I would probably start tcpdumping while running a mount command and check the pcap file with wireshark (this quite often provides the necessary hint) - and then use `strace -f` on the mount command to see where the IP from the other network comes into play.
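something along these lines (the interface, server ip, mount point and file names are only placeholders, adjust them to your setup):

Code:
# check for masquerading/NAT rules first
iptables -t nat -S
# capture traffic to the storage box while mounting by hand
tcpdump -i any -s 0 -w nfs-mount.pcap host 10.0.11.2 &
strace -f -o mount.strace mount -t nfs4 -o noatime,sync 10.0.11.2:/mnt/tank/pvmfs /mnt/test
kill %1
# open nfs-mount.pcap in wireshark and grep mount.strace for connect()/bind() calls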

I hope this helps!
 
A colleague just suggested that you should also try switching to NFS v3 -
it seems nfsv4 also relies on DNS (forward and reverse lookups) for access control.
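a quick manual test would be something like this (the mount point is again just a placeholder):

Code:
# check whether the problem goes away with nfs v3
mount -t nfs -o vers=3,noatime,sync 10.0.11.2:/mnt/tank/pvmfs /mnt/test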
 
so I upgraded to 6.2 (5.4.60-1-pve) yesterday and the outcome is the same.

I also made an strace and a tcpdump - see the attached tar.

what's interesting is that the initial nfs session goes via the correct interface, and then everything beginning with SECINFO_NO_NAME (according to the rfc this handles the security negotiation between client and server) is handled via a different interface.

I compared the output with the nfs server that works (10.0.11.1) and the flow is the same - except for the change of interfaces.

I also found out that the clientaddr mount option is ignored/not used with nfs 4.1 anymore because it now uses a session-based model - https://bugzilla.redhat.com/show_bug.cgi?id=1821903

out of curiosity I only allowed the storage subnet 10.0.11.x on freenas for that nfs share (as it was in the beginning) and surprisingly it was able to mount it without any problems, even though the mount output on pve showed it was using the ip from the other subnet.

imho this shouldn't work, unless the nfs server is fine as long as the nfs EXCHANGE_ID comes from an allowed subnet and everything after that doesn't matter anymore.

I also tried to mount it with vers=4.0, which worked perfectly fine, so I guess this is an nfs 4.1 specific problem.
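if anyone wants to pin the version cluster-wide in the meantime, adding it to the options line in storage.cfg should do it (I haven't rolled this out myself, so treat it as a sketch):

Code:
# example only - force nfs 4.0 for this storage via the options line
nfs: sto-nas-02_pvmfs
      server 10.0.11.2
      export /mnt/tank/pvmfs
      path /mnt/pve/sto-nas-02_pvmfs
      content images, vztmpl, rootdir, backup
      maxfiles 1
      options noatime,noacl,sync,vers=4.0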


next maintenance window I will reboot the nfs server in question to rule out a problem on the freenas side - the fact that mounting worked from a non-allowed subnet makes me think the problem lies on the other end...
 

Attachments

  • hv-vm-02_dump.tar.gz
    10.7 KB
a small update and the solution:

a few weeks ago we upgraded freenas but it was still mounting with the wrong ip/interface.

today we updated proxmox to 6.4-13 with 5.4.151-1 kernel and that solved the problem without any further adjustment.
so I guess in the end it was a kernel bug, but I'm too lazy to look it up.
 
I have the same issue here.

When using the mount option -o nfsvers=4.1 on the command line it works fine. When using -o nfsvers=4.2 on the command line the wrong ip address is used.

Another note: the OS of the NFS server was recently upgraded (Debian 10 -> 11).

As a workaround for now, I added the following options line to /etc/pve/storage.cfg:

Code:
nfs: backup-pxc01
      export /backup/nfs/proxmox/pxc01
      path /mnt/pve/backup-pxc01
      server 192.168.208.41
      content backup
      options vers=4.1
      nodes px01,px03,px15
The running kernel is 5.13.19-6-pve on all nodes. Maybe a reboot fixes this too.
 