Cannot mount NFS directory but truenas box is reachable and can mount via shell

RogueJesus

Hey, so pretty much what it says in the title. I have 2 PVE hosts, pvemain [10.5.0.10] and pvecubi [10.5.0.2] (both with DHCP reservations). pvemain runs a TrueNAS VM on 10.51.0.5 (on a VXLAN network, if at all relevant); pvecubi is a clean Proxmox install.

Trying to create an NFS directory on pvecubi throws storage 'isos-remote' is not online (500). However, I can both 1) ping 10.51.0.5 and 2) mount the share directly via mount -t nfs 10.51.0.5:/mnt/tank/isos test_mount/ (as a sanity check I also wrote and ran a .sh file from the mounted dir).

Showmount returns the same error on both pvemain and pvecubi, *even though I have successfully created an NFS directory via the Proxmox UI on pvemain*.
Code:
root@pvecubi:~# showmount -e 10.51.0.5
clnt_create: RPC: Program not registered

I found this topic mentioning the culprit could be hostname resolution, but I cannot see anything wrong with it. Here are the hosts files:

pvemain:
Code:
127.0.0.1 localhost.localdomain localhost
10.5.0.10 pvemain.local pvemain
10.5.0.2  pvecubi.local pvecubi
10.51.0.5 truenas.local truenas

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

pvecubi:
Code:
127.0.0.1 localhost.localdomain localhost
10.5.0.2 pvecubi.local pvecubi
10.5.0.10 pvemain.local pvemain
10.51.0.5 truenas.local truenas

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

truenas VM:
Code:
127.0.0.1    truenas.local truenas
127.0.0.1    localhost
10.5.0.2    pvecubi.local pvecubi
10.5.0.10    pvemain.local pvemain
# The following lines are desirable for IPv6 capable hosts
::1    localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
# STATIC ENTRIES

Possibly relevant logs on pvecubi:
Code:
root@pvecubi:~# dmesg | grep -i nfs
[    4.351992] RPC: Registered tcp NFSv4.1 backchannel transport module.
[  184.078375] NFS: Registering the id_resolver key type

Code:
root@pvecubi:~# journalctl -xe | grep -i nfs
Mar 13 01:37:45 pvecubi kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
Mar 13 01:37:45 pvecubi systemd[1]: rpc-gssd.service - RPC security service for NFS client and server was skipped because of an unmet condition check (ConditionPathExists=/etc/krb5.keytab).
Mar 13 01:37:45 pvecubi systemd[1]: Reached target nfs-client.target - NFS client services.
░░ Subject: A start job for unit nfs-client.target has finished successfully
░░ A start job for unit nfs-client.target has finished successfully.
Mar 13 01:37:47 pvecubi systemd[1]: Starting rpc-statd-notify.service - Notify NFS peers of a restart...
Mar 13 01:37:47 pvecubi systemd[1]: Started rpc-statd-notify.service - Notify NFS peers of a restart.
Mar 13 01:40:44 pvecubi kernel: NFS: Registering the id_resolver key type
Mar 13 01:40:44 pvecubi nfsrahead[1888]: setting /root/test_mount readahead to 128
Mar 13 01:58:00 pvecubi nfsrahead[6793]: setting /root/test_mount readahead to 128

Please advise, I am a bit lost on how to debug this
 
Hey @bbgeek17, thanks. But I don't get it: if a network issue were making the NFS server unreachable, how come I can mount it from the PVE shell?

I have noticed TrueNAS also shows it as an active session, so it's only Proxmox's UI that cannot reach it. [Attachment: 1741871667956.png]

Edit: to rule out network issues, I have added wildcard rules to my firewall (OPNsense), allowing all traffic in/out.
 
Because you are not predicating your "mount" on a successful run of "rpcinfo" and "showmount".

If you were to run showmount -e 10.51.0.5 && mount 10.51.0.5:/myexport /mnt/export, it would fail.
That is the logic used inside PVE.
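To make the "probe first, mount second" idea concrete, here is a minimal sketch. It is not PVE's actual code; the function name and the PROBE_CMD / MOUNT_CMD override variables are made up for illustration so the flow can be tried without a real NFS server.

```shell
# Hypothetical helper: run a reachability probe first, mount only if it succeeds.
# PROBE_CMD and MOUNT_CMD are invented override hooks for experimenting;
# they are not PVE settings.
probe_then_mount() {
    ptm_server="$1"; ptm_export="$2"; ptm_target="$3"

    # Word splitting of the command string is intentional here.
    if ! ${PROBE_CMD:-showmount -e} "$ptm_server" >/dev/null 2>&1; then
        echo "storage is not online: probe of $ptm_server failed" >&2
        return 1
    fi

    # Only reached when the probe succeeded.
    ${MOUNT_CMD:-mount -t nfs} "$ptm_server:$ptm_export" "$ptm_target"
}
```

With the defaults, probe_then_mount 10.51.0.5 /mnt/tank/isos /root/test_mount would stop at the showmount step on a host where showmount fails, even though a plain mount -t nfs on the same host works.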


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Got you. Could you please point me in the right direction regarding "allowing rpc traffic" through? And how come pvemain [10.5.0.10] could reach the NFS mount via the UI? Even though it is hosting the TrueNAS VM [10.51.0.5], it's still on a different subnet and should follow a similar route as pvecubi (edit: and the showmount command also fails on pvemain, even though, again, the UI can mount it).

Code:
root@pvemain:~# showmount -e 10.51.0.5
clnt_create: RPC: Program not registered
 
The fact that one of the hosts can successfully send/receive RPCs indicates that the issue is likely not in the NAS configuration.

My guess is that the traffic does not flow the way you think it should. The quickest path to resolution would be to run "tcpdump" on both sides and draw conclusions from what is observed during the requests.
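As a rough starting point for such a capture, the sketch below builds a filter limited to RPC/NFS traffic between one client and the NAS. The helper name is made up; ports 111 (rpcbind) and 2049 (NFS) are the standard ones, but mountd usually listens on a dynamic port and is not covered by this filter.

```shell
# Hypothetical helper: print a capture filter for rpcbind (111) and NFS (2049)
# traffic between one client and one server.
nfs_capture_filter() {
    printf 'host %s and host %s and (port 111 or port 2049)\n' "$1" "$2"
}

# Example use (needs root); run it on both the client and the NAS
# while retrying the failing request:
#   tcpdump -ni any "$(nfs_capture_filter 10.5.0.2 10.51.0.5)"
```

Comparing the two captures side by side shows whether requests and replies arrive on both ends or get dropped somewhere in between.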


 
Well, I have done so and the traffic seems to be reaching the machines:

NAS box
Code:
12:26:04.626420 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [S], seq 2298163604, win 64240, options [mss 1460,sackOK,TS val 2015254598 ecr 0,nop,wscale 7], length 0
12:26:04.626454 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [S.], seq 1760227987, ack 2298163605, win 64308, options [mss 1410,sackOK,TS val 2860897434 ecr 2015254598,nop,wscale 7], length 0
12:26:04.627866 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [.], ack 1, win 502, options [nop,nop,TS val 2015254599 ecr 2860897434], length 0
12:26:04.627866 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [P.], seq 1:93, ack 1, win 502, options [nop,nop,TS val 2015254600 ecr 2860897434], length 92
12:26:04.627901 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [.], ack 93, win 502, options [nop,nop,TS val 2860897436 ecr 2015254600], length 0
12:26:04.628151 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [P.], seq 1:33, ack 93, win 502, options [nop,nop,TS val 2860897436 ecr 2015254600], length 32
12:26:04.629379 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [.], ack 33, win 502, options [nop,nop,TS val 2015254601 ecr 2860897436], length 0
12:26:04.629523 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [F.], seq 93, ack 33, win 502, options [nop,nop,TS val 2015254601 ecr 2860897436], length 0
12:26:04.629549 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [F.], seq 33, ack 94, win 502, options [nop,nop,TS val 2860897437 ecr 2015254601], length 0
12:26:04.629690 IP 10.5.0.2.857 > 10.51.0.5.111: UDP, length 88
12:26:04.629777 IP 10.51.0.5.111 > 10.5.0.2.857: UDP, length 28
12:26:04.631092 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [.], ack 34, win 502, options [nop,nop,TS val 2015254603 ecr 2860897437], length 0
12:26:14.150898 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [S], seq 838504682, win 64240, options [mss 1460,sackOK,TS val 2015264122 ecr 0,nop,wscale 7], length 0
12:26:14.150926 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [S.], seq 1203095224, ack 838504683, win 64308, options [mss 1410,sackOK,TS val 2860906959 ecr 2015264122,nop,wscale 7], length 0
12:26:14.152368 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [.], ack 1, win 502, options [nop,nop,TS val 2015264124 ecr 2860906959], length 0
12:26:14.152369 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [P.], seq 1:93, ack 1, win 502, options [nop,nop,TS val 2015264124 ecr 2860906959], length 92
12:26:14.152418 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [.], ack 93, win 502, options [nop,nop,TS val 2860906960 ecr 2015264124], length 0
12:26:14.152624 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [P.], seq 1:33, ack 93, win 502, options [nop,nop,TS val 2860906960 ecr 2015264124], length 32
12:26:14.153800 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [.], ack 33, win 502, options [nop,nop,TS val 2015264125 ecr 2860906960], length 0
12:26:14.153806 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [F.], seq 93, ack 33, win 502, options [nop,nop,TS val 2015264125 ecr 2860906960], length 0
12:26:14.153839 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [F.], seq 33, ack 94, win 502, options [nop,nop,TS val 2860906961 ecr 2015264125], length 0
12:26:14.153981 IP 10.5.0.2.900 > 10.51.0.5.111: UDP, length 88
12:26:14.154065 IP 10.51.0.5.111 > 10.5.0.2.900: UDP, length 28
12:26:14.155198 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [.], ack 34, win 502, options [nop,nop,TS val 2015264127 ecr 2860906961], length 0

pvecubi
Code:
19:26:04.626689 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [S], seq 2298163604, win 64240, options [mss 1460,sackOK,TS val 2015254598 ecr 0,nop,wscale 7], length 0
19:26:04.628386 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [S.], seq 1760227987, ack 2298163605, win 64308, options [mss 1410,sackOK,TS val 2860897434 ecr 2015254598,nop,wscale 7], length 0
19:26:04.628405 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [.], ack 1, win 502, options [nop,nop,TS val 2015254599 ecr 2860897434], length 0
19:26:04.628533 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [P.], seq 1:93, ack 1, win 502, options [nop,nop,TS val 2015254600 ecr 2860897434], length 92
19:26:04.629923 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [.], ack 93, win 502, options [nop,nop,TS val 2860897436 ecr 2015254600], length 0
19:26:04.629990 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [P.], seq 1:33, ack 93, win 502, options [nop,nop,TS val 2860897436 ecr 2015254600], length 32
19:26:04.629996 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [.], ack 33, win 502, options [nop,nop,TS val 2015254601 ecr 2860897436], length 0
19:26:04.630012 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [F.], seq 93, ack 33, win 502, options [nop,nop,TS val 2015254601 ecr 2860897436], length 0
19:26:04.630099 IP 10.5.0.2.857 > 10.51.0.5.111: UDP, length 88
19:26:04.631681 IP 10.51.0.5.111 > 10.5.0.2.857: Flags [F.], seq 33, ack 94, win 502, options [nop,nop,TS val 2860897437 ecr 2015254601], length 0
19:26:04.631687 IP 10.5.0.2.857 > 10.51.0.5.111: Flags [.], ack 34, win 502, options [nop,nop,TS val 2015254603 ecr 2860897437], length 0
19:26:04.631721 IP 10.51.0.5.111 > 10.5.0.2.857: UDP, length 28
19:26:14.151026 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [S], seq 838504682, win 64240, options [mss 1460,sackOK,TS val 2015264122 ecr 0,nop,wscale 7], length 0
19:26:14.153036 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [S.], seq 1203095224, ack 838504683, win 64308, options [mss 1410,sackOK,TS val 2860906959 ecr 2015264122,nop,wscale 7], length 0
19:26:14.153063 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [.], ack 1, win 502, options [nop,nop,TS val 2015264124 ecr 2860906959], length 0
19:26:14.153224 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [P.], seq 1:93, ack 1, win 502, options [nop,nop,TS val 2015264124 ecr 2860906959], length 92
19:26:14.154388 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [.], ack 93, win 502, options [nop,nop,TS val 2860906960 ecr 2015264124], length 0
19:26:14.154421 IP 10.51.0.5.111 > 10.5.0.2.900: Flags [P.], seq 1:33, ack 93, win 502, options [nop,nop,TS val 2860906960 ecr 2015264124], length 32
19:26:14.154430 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [.], ack 33, win 502, options [nop,nop,TS val 2015264125 ecr 2860906960], length 0
19:26:14.154450 IP 10.5.0.2.900 > 10.51.0.5.111: Flags [F.], seq 93, ack 33, win 502, options [nop,nop,TS val 2015264125 ecr 2860906960], length 0
19:26:14.154622 IP 10.5.0.2.900 > 10.51.0.5.111: UDP, length 88
19:26:14.155838 IP 10.51.0.5.111 > 10.5.0.2.900: UDP, length 28

I'm not sure what's wrong. Are the TCP sessions closing too soon, or is this normal? Could it be related to VXLAN being configured on both boxes? The NAS VM runs on top of a VXLAN (created via SDN).
 
Well... Turns out all I had to do was force it to use version 4 in the UI. I assumed 'default' would just identify/negotiate the appropriate version with the server.
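For anyone landing here later, a sketch of why this probably helps. As far as I can tell from reading the PVE NFS plugin (my assumption, not something confirmed in this thread), the online check depends on the configured options: with vers=4 it probes the NFS program with rpcinfo, otherwise it runs showmount, which needs mountd to be registered with rpcbind. That would match the "RPC: Program not registered" error above. The function name below is made up.

```shell
# Print the probe believed to match a given mount-options string.
# Assumption: options containing "vers=4" get an rpcinfo probe, anything
# else falls back to showmount. This mirrors my reading of the PVE NFS
# plugin and may not match every PVE version.
nfs_probe_for_options() {
    npo_server="$1"; npo_options="$2"
    case "$npo_options" in
        *vers=4*) echo "rpcinfo -T tcp $npo_server nfs 4" ;;
        *)        echo "showmount --no-headers --exports $npo_server" ;;
    esac
}

nfs_probe_for_options 10.51.0.5 ""          # the 'default' case from this thread
nfs_probe_for_options 10.51.0.5 "vers=4.2"  # version forced to 4.x
```

Running the printed commands by hand from the PVE shell should show which of the two probes the server actually answers. The CLI counterpart of the UI setting should be something like pvesm set isos-remote --options vers=4, though I only changed it through the UI.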