Recurrent network storage unavailability/poor performance - TrueNAS

danb35

Renowned Member
Oct 31, 2015
I'm having a recurring problem where NFS mounts from my TrueNAS server go "offline" in my PVE cluster, and/or remain online but show very poor performance. I'm trying to track down why it's happening and what I can do to address it.

My TrueNAS server is running TrueNAS CORE 12.0-U5.1. It has 2x Xeon E5-2670s, 128 GB of RAM, and two storage pools. The first storage pool consists of 4x 6-disk RAIDZ2 vdevs (24 disks total) of varying sizes, and is a little over half full. This pool contains my jails, a few SMB shares, and a couple of NFS exports. The second pool consists of 4x 2TB disks in mirrored pairs. It has one NFS export and one iSCSI target. Other client systems, primarily via SMB, don't seem to have any performance issues with the server.

My PVE cluster is running the latest update of PVE 7. It consists of three nodes of a Dell PowerEdge C6220, each with 2x Xeon E5-2680v2 and ~80 GB of RAM. They're connected to each other, and to the TrueNAS box, via 10 GbE. NFS exports from both pools are mounted to the cluster as storage--the first for ISOs, container templates, and backups (the latter not being used much since I started using PBS), and the second pool for a few low-activity VM disk images (most virtual disks are stored on a Ceph pool, about which I have no complaints).

Frequently, though not constantly, the cluster reports both storages to be unavailable. ls /mnt/pve hangs, and any tasks that involve either of those mounts fail. But there's nothing obviously wrong on the TrueNAS box--there isn't a great deal of I/O latency, there's plenty of CPU capacity, ARC hit ratio is fine. A little stumped about how to track this down--any ideas?
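One way to narrow this down is to record exactly when each mount stops responding, so the timestamps can be compared against the TrueNAS and switch logs. Below is a minimal Python sketch of such a probe, not a tested monitoring tool. It assumes a Linux host with the `stat` command available; the function name `probe_mount`, the 5-second timeout, and the two storage directory names under /mnt/pve are hypothetical and need to be replaced with the real ones.

```python
import subprocess
import time


def probe_mount(path, timeout_s=5.0):
    """Run one `stat` on path in a child process.

    Returns (ok, seconds): ok is True only if stat exited 0 within timeout_s.
    """
    start = time.monotonic()
    # The check runs in a child process because a stalled NFS mount can block
    # the caller indefinitely; this script itself never touches the path.
    proc = subprocess.Popen(
        ["stat", "--", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Signal the child but do not wait for it: a process blocked in
        # uninterruptible I/O may not exit until the server answers again.
        proc.kill()
        return False, time.monotonic() - start
    return rc == 0, time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical storage names; substitute the directories under /mnt/pve.
    for path in ["/mnt/pve/truenas-iso", "/mnt/pve/truenas-vm"]:
        ok, secs = probe_mount(path)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {path} ok={ok} elapsed={secs:.2f}s")
```

Run from cron or a systemd timer on each node, the output gives a per-node timeline of stalls. If all three nodes stall at the same moment, that points toward the server or the network; a stall on a single node points toward that node.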
 
There is a TrueNAS CORE 12.0-U6 update that fixes the bug "NFSv4 mount does not recover after failover".
 
Hmmm. I know it's happened with previous versions as well, but it's worth a try. Updated the TrueNAS server to 12.0-U6, and rebooted each of the cluster members. Let's see what happens.
 
danb35 said: "... show very poor performance."
I can confirm that since a few versions back the Proxmox VMs are horribly slow. Storage is not even half full.

Proxmox 7.0.13
TrueNAS 12.0-U6

I don't see a hang on /mnt/pve, but I did get "storage not online". It wasn't persistent; it happened once while I was trying to start a VM. I never had that before.

Another culprit could be my UniFi Switch equipment. Do you by chance also have Ubiquiti switches?
 
I should have updated this thread earlier, but was reluctant to call it "solved" without a good bit of experience. Since updating TrueNAS to -U6, I haven't seen the NAS marked "offline" on any of my PVE hosts--I don't watch them constantly, of course, but I haven't seen it. Performance has been acceptable since then, but the only things I use NFS on the TrueNAS box for are ISOs, a backup of my PBS VM (I guess it makes sense that I can't back it up to itself), and a very few low-utilization VM disks. I use iSCSI to provide the storage disk for the PBS VM, and that works just fine. Other than that, my VMs live on Ceph. So as far as I can see, the problem appears solved--even if I'm a little uneasy calling it that, as it had been intermittently present with earlier TrueNAS/FreeNAS releases as well.

I have a small Ubiquiti switch, but it isn't in the path from the TrueNAS box to the PVE hosts.
 
