Connect and access NFS and iSCSI shares

listhor

Nov 14, 2023
I'm not able to configure Proxmox to access any iSCSI share (TrueNAS and Synology); it connects to the server but doesn't see the share. ESXi (same server, same IP, just booted into ESXi) connects to TrueNAS without any problem.

On top of that, it's the same with NFS shares on TrueNAS, yet it accesses NFS shares on Synology correctly. All NFS shares are set to use version 4 or 4.1.
How do I troubleshoot this?
 
I'm not able to configure Proxmox to access any iSCSI share (TrueNAS and Synology); it connects to the server but doesn't see the share. ESXi (same server, same IP, just booted into ESXi) connects to TrueNAS without any problem.

On top of that, it's the same with NFS shares on TrueNAS, yet it accesses NFS shares on Synology correctly. All NFS shares are set to use version 4 or 4.1.
How do I troubleshoot this?
Did you properly update the ACLs for each protocol to allow the new server/initiator/client to connect? This would be done on your respective NAS.
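On the iSCSI side, the initiator name that has to be allowed on the NAS can be read off the PVE node, and discovery can be tested with the standard open-iscsi tools. A rough sketch; the portal IP is a placeholder:
Code:
# the name this PVE node presents as its iSCSI initiator (add it to the target's allowed initiators)
cat /etc/iscsi/initiatorname.iscsi

# ask the portal which targets it offers to this initiator
iscsiadm -m discovery -t sendtargets -p <NAS-IP>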


 
Did you properly update the ACLs for each protocol to allow the new server/initiator/client to connect? This would be done on your respective NAS.
When it comes to NFS, the server uses the same IP address (whether booted into ESXi or PVE). I finally managed to get NFS working, but only using version 4.
For iSCSI the only access control was the initiator name. I removed that restriction and it's still the same:
(Screenshot attached: 2023-11-16 09:33:30)

EDIT:
And the TrueNAS log is full of the following:
Code:
Nov 16 09:35:41 freenas 1 2023-11-16T09:35:41.438893+01:00 xxx.com ctld 1049 - - child process 15177 terminated with exit status 1
Nov 16 09:35:42 freenas 1 2023-11-16T09:35:42.535343+01:00 xxx.com ctld 15178 - - 10.55.0.1: read: connection lost
Nov 16 09:35:42 freenas 1 2023-11-16T09:35:42.535636+01:00 xxx.com ctld 1049 - - child process 15178 terminated with exit status 1
Nov 16 09:35:44 freenas 1 2023-11-16T09:35:44.053828+01:00 xxx.com ctld 15179 - - 10.55.0.1: read: connection lost
Nov 16 09:35:44 freenas 1 2023-11-16T09:35:44.054162+01:00 xxx.com ctld 1049 - - child process 15179 terminated with exit status 1
But I think it is a known issue???
 
But I think it is a known issue???
Maybe? There is certainly lots of chatter about it on the FreeNAS forums. The messages could be an artifact of health-check probes, or a genuine network issue - impossible to say without proper troubleshooting. However, the FreeNAS community is better equipped to debug log messages from FreeNAS.


 
I managed (no_root_squash was missing) to connect Proxmox to the NFS share over the internal bridge (TrueNAS' boot-pool is virtualized). Seemingly it also works without any mapping - I just checked that.
But if I reboot TrueNAS or interrupt communication, PVE is not able to re-establish the storage (it displays error 500) - the mount directory seems to be broken (all its attributes are displayed as "?"). I need to disable the storage, unmount the share manually and re-enable the storage.
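For reference, the manual recovery amounts to something like the following - a sketch, with "truenas-nfs" standing in for the storage ID:
Code:
# stop PVE from touching the broken storage
pvesm set truenas-nfs --disable 1

# lazy/force-unmount the stale NFS mountpoint
umount -f -l /mnt/pve/truenas-nfs

# re-enable; PVE remounts it on the next activation
pvesm set truenas-nfs --disable 0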

Over the regular LAN I can connect only briefly, for a split second: all the storage information is displayed and then disappears - and again error 500. It's not a network issue, as I'm able to connect to the Synology share - which is strange, as there's no user mapping there at all. If I remove the mapping in TrueNAS, the connection lasts 5 seconds until the information is gone.
rpcinfo -p <IP> displays the same on both ends; showmount -e displays the correct exports.

What are the detailed conditions/requirements for PVE to connect to NFS and iSCSI shares? I can't find them in the docs...
 
connect Proxmox to the NFS share over the internal bridge (TrueNAS' boot-pool is virtualized).
Can you explain what this means in the context of your setup?
But if I reboot TrueNAS or interrupt communication, PVE is not able to re-establish the storage (it displays error 500) - the mount directory seems to be broken (all its attributes are displayed as "?"). I need to disable the storage, unmount the share manually and re-enable the storage.
Is TrueNAS an isolated external appliance?
Over the regular LAN I can connect only briefly, for a split second: all the storage information is displayed and then disappears - and again error 500.
Can you use standard Linux tools to mount your NFS share, i.e. "mount truenas:/export /mnt/test" - does this work and is it stable?
It's not a network issue, as I'm able to connect to the Synology share - which is strange, as there's no user mapping there at all. If I remove the mapping in TrueNAS, the connection lasts 5 seconds until the information is gone.
What does this mean? As far as PVE NFS access is concerned, the NFS mount is performed by the "root" user on PVE; there is no user impersonation or other users involved.

What are the detailed conditions/requirements for PVE to connect to NFS and iSCSI shares? I can't find them in the docs...
PVE uses standard Linux tools to connect to NFS and/or iSCSI. Technically, all you need is an industry-standard implementation of an NFS server and/or an iSCSI target. And, of course, a stable network.

Here is a sample of the NFS mount options for the most basic, default NFS storage defined in PVE:
Code:
bbnas:/mnt/data/testing on /mnt/pve/bbnas type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)
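For reference, an NFS storage like the one above is typically defined along these lines (a sketch; the values mirror the sample mount, and the content types are just an example):
Code:
# define an NFS storage; PVE mounts it under /mnt/pve/<storage-id>
pvesm add nfs bbnas --server 172.16.100.20 --export /mnt/data/testing --content images,iso

# optional extra mount options, e.g. pinning the NFS version
pvesm set bbnas --options vers=4.1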


 
Can you explain what this means in the context of your setup?
PVE --- vmbr0 (mtu 9000) ------------------------- net0-----------------------TrueNAS (boot-pool virtualized)
|___ vmbr1 (mtu 1500, trunk)------switch---vlan11 on lagg4095 (igb0:igb1 (passthrough)) ___|

Is TrueNAS an isolated external appliance?
I'm not sure what you mean by "isolated", but I think the "drawing" above explains it.

Can you use standard Linux tools to mount your NFS share, i.e. "mount truenas:/export /mnt/test" - does this work and is it stable?
Yes, I did that and it works hassle-free.

What does this mean? As far as PVE NFS access is concerned, the NFS mount is performed by the "root" user on PVE; there is no user impersonation or other users involved.
I meant the maproot or mapall settings on the server side. I've also read on this forum that PVE requires no_root_squash, but from my experience that doesn't seem to be the case...

PVE uses standard Linux tools to connect to NFS and/or iSCSI. Technically, all you need is an industry-standard implementation of an NFS server and/or an iSCSI target. And, of course, a stable network.
Good to hear. But the example given doesn't explain why PVE can't reconnect the storage (error 500 due to an already occupied/busy mount directory), or why the same thing happens over the LAN connection right after the connection is established.


Does a similar industry standard apply to iSCSI connections?
 
I'm not sure what you mean by "isolated", but I think the "drawing" above explains it.
The "drawing" is not as self-explanatory as it may seem to someone who is intimately involved with the config on a daily basis.
What it tells me is that you have a mix of MTU sizes, which, if not implemented properly, _will_ introduce unpredictable and random failures.
In fact, it's a good match for your symptoms - the initial negotiation works, then larger packets/checks fail.

Does a similar industry standard apply to iSCSI connections?
Yes.
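For reference, attaching an iSCSI target in PVE is equally standard. A sketch; the portal and target IQN are placeholders (the IQN is whatever discovery reports):
Code:
# attach an iSCSI target; its LUNs then appear as block devices under that storage
pvesm add iscsi truenas-iscsi --portal <NAS-IP> --target <target-IQN>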

My recommendation is to reduce your network complexity.
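One quick sanity check along those lines: confirm that jumbo-sized frames actually pass end to end on the path that is supposed to carry them. A sketch, run from the PVE host towards the TrueNAS address on the jumbo segment (the IP is a placeholder):
Code:
# 8972-byte payload + 28 bytes of IP/ICMP headers = a 9000-byte packet; -M do forbids fragmentation
ping -M do -s 8972 -c 3 <truenas-ip-on-vmbr0>

# a standard-MTU payload (1472 + 28 = 1500) should always go through
ping -M do -s 1472 -c 3 <truenas-ip-on-vmbr0>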


 
The "drawing" is not as self-explanatory as it may seem to a person who is intimately involved with the config on a daily basis.
What it tells me is that you have a mix of MTU sizes, that if not implemented properly _will_ introduce unpredictable and random failures.
In fact, its a good match to your symptoms - initial negotiation works, then larger packets/checks fail.
Jumbo frames are only within "internal" vmbr0 bridge (between pve and virtualized truenas). Regular, physical LAN works on MTU 1500. Exactly same setup worked flawlessly with esxi (vswitch connected to hypervisor and truenas).
As Synology NFS share works ok and previously Truenas has been working fine with esxi (plus manual mounting in pve works ok) - it looks like there's something not right in pve storage management layer (?)
 
Jumbo frames are used only within the "internal" vmbr0 bridge (between PVE and the virtualized TrueNAS).
I would need to see a comprehensive network diagram, including all IPs, subnets, networks and routes. Is it possible that the traffic is being routed differently than you think it is?
The regular, physical LAN runs at MTU 1500.
That's a good supporting argument for moving everything to the regular MTU and starting from there.
Exactly the same setup worked flawlessly with ESXi (a vSwitch connecting the hypervisor and TrueNAS).
That's an "apples and oranges" comparison. Although from 10,000 feet the concepts are similar, the internal implementations of the network layers are completely different.
manual mounting in PVE works OK
Collect a network trace of the full NFS negotiation for both the working and the non-working case. Compare them side by side - are there any differences?
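For example, something along these lines on the PVE node captures everything to and from the NAS into a file that can be opened in Wireshark (a sketch; the IP is a placeholder and the bridge/interface should be adjusted to your setup):
Code:
# capture all traffic to/from the TrueNAS address (NFS, mountd, portmapper) for offline analysis
tcpdump -i vmbr1 -s 0 -w /tmp/nfs-trace.pcap host <truenas-ip>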
it looks like there's something not right in the PVE storage management layer (?)
PVE is a set of packages that wraps an API/GUI/CLI around several open-source technologies (Linux, QEMU, Corosync, etc.). PVE does not re-implement anything RFC-protocol related. Hundreds of millions of PCs are running Linux/QEMU/Corosync right now without trouble.


 
So, the following is the general layout of my network:

(Network diagram image attached.)



Output of: cat /proc/mounts | grep nfs

PVE
Code:
172.16.1.10:/volume3/NFS/pvess /mnt/pve/mmds-nfs nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.8,local_lock=none,addr=172.16.1.10 0 0
10.55.1.2:/mnt/wszystko/PVE/pvess/nfs-pvess /mnt/test nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.55.1.2,mountvers=3,mountport=43296,mountproto=udp,local_lock=none,addr=10.55.1.2 0 0
172.16.1.62:/mnt/wszystko/PVE/pvess/nfs-pvess /mnt/test nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.1.62,mountvers=3,mountport=57230,mountproto=udp,local_lock=none,addr=172.16.1.62 0 0
10.55.1.2:/mnt/wszystko/PVE/pvess/nfs-pvess /mnt/test nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.55.1.2,mountvers=3,mountport=43296,mountproto=udp,local_lock=none,addr=10.55.1.2 0 0
10.55.1.2:/mnt/wszystko/PVE/pvess/nfs-pvess /mnt/pve/truenas-nfs nfs4 rw,relatime,vers=4.2,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.55.0.1,local_lock=none,addr=10.55.1.2 0 0
PVE2
Code:
172.16.1.10:/volume3/NFS/pvett /mnt/pve/mmds_nfs nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.11,local_lock=none,addr=172.16.1.10 0 0
172.16.1.62:/mnt/wszystko/PVE/pvett/nfs-pvett /mnt/pve/truenas_nfs nfs rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.1.62,mountvers=3,mountport=822,mountproto=udp,local_lock=none,addr=172.16.1.62 0 0
And same on Ubuntu:
Code:
10.55.1.2:/mnt/wszystko/Pliki/e-book /mnt/nfs/truenas/ebooki nfs rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.55.1.2,mountvers=3,mountport=822,mountproto=udp,local_lock=all,addr=10.55.1.2 0 0
10.55.1.2:/mnt/wszystko/Multimedia /mnt/nfs/truenas/media nfs4 rw,noatime,vers=4.2,rsize=131072,wsize=131072,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.55.1.3,local_lock=none,addr=10.55.1.2 0 0
172.16.1.10:/volume3/NFS/Subiekt /mnt/nfs/mmds/Subiekt nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.20,local_lock=none,addr=172.16.1.10 0 0
172.16.1.10:/volume3/NFS/mailcow /mnt/nfs/mmds/mailcow nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.20,local_lock=none,addr=172.16.1.10 0 0

The second PVE shows exactly the same symptoms as the main PVE (when it comes to the LAN / hardware network connection).
The internal TrueNAS NFS share (over vmbr0, MTU 9000) in PVE is fine until the connection is broken; once it is healthy again, the storage can't be restored without manual intervention.
The Ubuntu server has rock-solid NFS mounts of both the TrueNAS and Synology shares. That's why I think the shares and the network are OK.
 
So, the following is the general layout of my network:
Nothing jumps out as immediately suspect. Except, of course, that this is quite complex for volunteer forum troubleshooting.

My advice - start getting network traces, make sure you get both sides of the communication, and compare and contrast them. Try to reduce complexity and add it back gradually.

Good luck.

PS: you can also try to switch the NFS in PVE from being handled by PVE to a direct mount, i.e. like on your Ubuntu box, and see if that makes a difference.
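For example, something along these lines takes the mounting out of PVE's hands and only gives PVE the resulting directory (a sketch; the mount point and storage ID are placeholders, and the export path is the one from the mount listing above):
Code:
# /etc/fstab - let the host mount the share itself, like the Ubuntu box does
10.55.1.2:/mnt/wszystko/PVE/pvess/nfs-pvess  /mnt/truenas  nfs4  vers=4.2,hard  0  0

# then hand PVE the directory instead of an NFS storage definition
pvesm add dir truenas-dir --path /mnt/truenas --content images,iso --is_mountpoint yes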


 
