Very disappointing ZFS... it pauses when copying big files, very bad performance.

fab26

Member
Aug 17, 2020
Hi,
I'm trying to set up a Proxmox node with a ZFS pool and I'm getting very bad performance. I'm a bit new to ZFS and I think I need some help!

Symptom: when I try to copy a few GB to the server, it starts at full speed, then pauses for 5-10 seconds, then runs again for 1-2 seconds, then pauses again, and so on.
The average speed is very low. I mean very, very low.

This is a raidz1 setup: 4 x 8TB SATA drives, a 128GB SSD for ZIL & cache, and another 128GB SSD for the OS & swap.
The node is a Supermicro 2U server: 2 x Xeon E5-2670 (2.60 GHz, 32 threads), 96GB RAM, Adaptec ASR-71605 in HBA mode.
Proxmox 6.2-4, Linux 5.4.34.

I was thinking of a source issue, but I tried with different clients and got the same behaviour: an rsync from a Linux box, a simple copy with the Finder on an ordinary MacBook Pro, or a more serious test from my dev desktop.

I have an openMediaVault VM running on top of Proxmox (2 cores, 16GB RAM) which handles the SMB share, but I don't think this is related to the issue.
Before installing OMV I tried setting up a Samba share directly on the Proxmox node (without any VM) and the performance was no better.

I was also thinking of a network issue (switch, cables?), so I took the server off the main network. It now runs isolated on its own private gigabit switch, shared only with one workstation which I use to test copying from/to the server. Same sluggishness, awful performance...


I'd love to hear your advice!

Thanks,
Fab


I tried zpool iostat -v 1.
Sometimes I get quite decent values: about 200MB/s for the RAIDZ1 pool, roughly 50MB/s per disk:

Code:
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank0       13.9T  15.2T      0    677      0   207M
  raidz1    13.9T  15.2T      0    597      0   197M
    sdc         -      -      0    152      0  49.4M
    sde         -      -      0    138      0  49.4M
    sdf         -      -      0    151      0  49.4M
    sdg         -      -      0    154      0  49.2M
logs            -      -      -      -      -      -
  sdb1      1.46G  6.04G      0     79      0  9.98M
cache           -      -      -      -      -      -
  sdb2      32.7G  78.5G      0     80      0  10.1M


Then it just drops to zero:


Code:
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank0       13.9T  15.2T      0     74      0  3.22M
  raidz1    13.9T  15.2T      0     50      0   232K
    sdc         -      -      0     14      0  67.9K
    sde         -      -      0     12      0  59.9K
    sdf         -      -      0     14      0  67.9K
    sdg         -      -      0      5      0  24.0K
logs            -      -      -      -      -      -
  sdb1      1.82M  7.50G      0     23      0  3.00M
cache           -      -      -      -      -      -
  sdb2      32.9G  78.3G      0     23      0  3.00M
----------  -----  -----  -----  -----  -----  -----
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank0       13.9T  15.2T      0     23      0  3.00M
  raidz1    13.9T  15.2T      0      0      0      0
    sdc         -      -      0      0      0      0
    sde         -      -      0      0      0      0
    sdf         -      -      0      0      0      0
    sdg         -      -      0      0      0      0
logs            -      -      -      -      -      -
  sdb1      1.82M  7.50G      0     23      0  3.00M
cache           -      -      -      -      -      -
  sdb2      32.9G  78.3G      0     22      0  2.87M
----------  -----  -----  -----  -----  -----  -----
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank0       13.9T  15.2T      0     25      0  3.24M
  raidz1    13.9T  15.2T      0      0      0      0
    sdc         -      -      0      0      0      0
    sde         -      -      0      0      0      0
    sdf         -      -      0      0      0      0
    sdg         -      -      0      0      0      0
logs            -      -      -      -      -      -
  sdb1      1.82M  7.50G      0     25      0  3.24M
cache           -      -      -      -      -      -
  sdb2      32.9G  78.3G      0     25      0  3.24M
----------  -----  -----  -----  -----  -----  -----
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank0       13.9T  15.2T      0    396      0  49.5M
  raidz1    13.9T  15.2T      0      0      0      0
    sdc         -      -      0      0      0      0
    sde         -      -      0      0      0      0
    sdf         -      -      0      0      0      0
    sdg         -      -      0      0      0      0
logs            -      -      -      -      -      -
  sdb1      1.82M  7.50G      0    396      0  49.5M
cache           -      -      -      -      -      -
  sdb2      32.9G  78.3G      0    271      0  33.7M
----------  -----  -----  -----  -----  -----  -----
 
1.) Did you increase the volblocksize to 32K or higher? If you are using the default 8K you will be wasting 33% of your capacity, because everything you write will be 150% of the size it really needs to be due to padding overhead. (The command sketch after this list shows how to check this and the other settings.)
2.) What HDD models are these? Did you verify that they use CMR and not SMR?
3.) SLOG and L2ARC are most of the time not useful and can even slow down your system. L2ARC will only be used if your RAM is full, because hitting L2ARC is slower than just using the fast ARC in RAM, so it's only useful if you run out of RAM and can't add more. And the ZIL will only be used for sync writes. Nearly all of your writes are probably async writes (like SMB), so the SLOG isn't used at all.
4.) Did you change your ARC size, or is it using the default 48GB of RAM?
5.) Did you try NFS? SMB is quite slow and, as far as I know, won't multithread, so your CPU can be the bottleneck too if you have a lot of slower cores but only one core can be used.
6.) Did you set the "atime=off" ZFS option? If not, every single read will cause a write to update the access time.
7.) Raidz1 isn't fast. It sacrifices disk and CPU performance for more capacity. If you want performance you should try a striped mirror.
8.) 4 drives isn't optimal for raidz1. 3 or 5 drives would be better.
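
Rough commands to check most of these, run on the Proxmox node itself. The zvol name in the first command is just an example, adjust it to your setup:

Code:
# volblocksize of a zvol (example dataset name, use your own)
zfs get volblocksize tank0/vm-100-disk-0

# atime and sync settings on the pool
zfs get atime,sync tank0

# current ARC size and limits, values are in bytes
grep -E '^(size|c_min|c_max) ' /proc/spl/kstat/zfs/arcstats

# drive models, to look up whether they are CMR or SMR
lsblk -d -o NAME,MODEL,SIZE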
 
Thank you so much for this very precise answer!

I went through each of your points:

1/ volblocksize=64k. Looks OK.
2/ Aaaah, the infamous CMR/SMR thing... I did not think about that. But those disks are Seagate IronWolf, so they should be CMR as far as I know.
4/ ARC size: c_max = 48GB, c_min = 3GB, current size = 18GB. That looks fine.
5/ NFS is not really an option; this server must be reachable by standard Macs and PCs.
6/ atime=off. Other options = default.
7 and 8/ RAIDZ1 with 3, 4 or 5 drives: well, I was not looking for ultimate performance, just decent performance. I may try adding a 5th drive. Do you think it would make a really huge difference if I made a RAID10 array with 8 disks?

And finally:
3/ I tried removing the SLOG and L2ARC... Bingo! Read and write at 115MB/s, gigabit link saturated. (Rough commands below.)
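
For the record, removing them was roughly this; the device names are from my setup, so double-check yours before running anything:

Code:
# detach the SLOG (log) and L2ARC (cache) partitions from the pool
zpool remove tank0 sdb1
zpool remove tank0 sdb2

# confirm they are gone
zpool status tank0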

You're my hero!
 
7 and 8/ RAIDZ1 with 3, 4 or 5 drives: well, I was not looking for ultimate performance, just decent performance. I may try adding a 5th drive. Do you think it would make a really huge difference if I made a RAID10 array with 8 disks?
4 vs 5 drives in a raidz1 shouldn't make a super big difference, but performance per drive should be a little bit better.
RAID10 should be better for workloads with a lot of small writes, where the HDDs can't handle all the IOPS and become the bottleneck, like when using the pool as storage for your VMs. But if you need that kind of performance, SSDs are always the better choice.
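
If you do try the striped mirror route, creating the pool looks roughly like this. It's only a sketch: the pool name and disk names are placeholders, it destroys whatever is on those disks, and in practice you'd use /dev/disk/by-id paths rather than sdX names:

Code:
# 8-disk "RAID10": four mirror vdevs, writes get striped across all of them
zpool create tank1 \
  mirror sda sdb \
  mirror sdc sdd \
  mirror sde sdf \
  mirror sdg sdh

zfs set atime=off tank1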
 
