Proxmox NVMe ZFS slow performance

wire2hire

Hey,

I installed Proxmox on a Mac Pro 2019 (7.1) with the root filesystem on NVMe. The NVMe is rated at about 3000 MB/s read/write, but I only get around 250 MB/s, and pveperf reports roughly 50 fsyncs/second. The pool is a ZFS RAID0 with atime=off and ashift=12, and the NVMe uses 4096-byte sectors. Why is it so incredibly slow, and what can I do to tune it?

Disk: AP8192N
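
For reference, the settings above can be double-checked like this (a minimal sketch, assuming the pool is named rpool as in the pveperf output below and the drive shows up as /dev/nvme0n1):

Code:
# confirm ashift on the pool
zpool get ashift rpool

# confirm atime, compression and recordsize on the pool root
zfs get atime,compression,recordsize rpool

# logical/physical sector size the NVMe reports
lsblk -o NAME,LOG-SEC,PHY-SEC /dev/nvme0n1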
 
Please share exactly what commands you ran and their outputs. I can't find a lot of details about this disk.
 
Code:
pveperf
CPU BOGOMIPS:      280000.00
REGEX/SECOND:      4065158
HD SIZE:           6750.17 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     45.98


Code:
fio --name=rand_read_write_qd1 --directory=/rpool --rw=randrw --bs=4k --ioengine=libaio --iodepth=1 --numjobs=1 --size=1G --time_based --runtime=60 --group_reporting


Code:
rand_read_write_qd1: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [m(1)][100.0%][r=226MiB/s,w=227MiB/s][r=57.8k,w=58.1k IOPS][eta 00m:00s]
rand_read_write_qd1: (groupid=0, jobs=1): err= 0: pid=27495: Thu Jan  8 12:20:08 2026
  read: IOPS=58.1k, BW=227MiB/s (238MB/s)(13.3GiB/60001msec)
    slat (nsec): min=1638, max=284893, avg=6621.05, stdev=7762.34
    clat (nsec): min=306, max=42563, avg=433.53, stdev=304.55
     lat (nsec): min=1998, max=286011, avg=7054.58, stdev=7787.80
    clat percentiles (nsec):
     |  1.00th=[  334],  5.00th=[  350], 10.00th=[  374], 20.00th=[  406],
     | 30.00th=[  414], 40.00th=[  418], 50.00th=[  422], 60.00th=[  422],
     | 70.00th=[  430], 80.00th=[  438], 90.00th=[  482], 95.00th=[  498],
     | 99.00th=[  652], 99.50th=[  748], 99.90th=[ 1020], 99.95th=[ 7456],
     | 99.99th=[17536]
   bw (  KiB/s): min=144960, max=265768, per=100.00%, avg=232230.67, stdev=15200.83, samples=120
   iops        : min=36240, max=66442, avg=58057.73, stdev=3800.19, samples=120
  write: IOPS=58.0k, BW=227MiB/s (238MB/s)(13.3GiB/60001msec); 0 zone resets
    slat (usec): min=3, max=384, avg= 8.92, stdev= 8.45
    clat (nsec): min=337, max=83690, avg=475.73, stdev=320.04
     lat (usec): min=3, max=390, avg= 9.39, stdev= 8.48
    clat percentiles (nsec):
     |  1.00th=[  366],  5.00th=[  386], 10.00th=[  410], 20.00th=[  446],
     | 30.00th=[  454], 40.00th=[  458], 50.00th=[  462], 60.00th=[  466],
     | 70.00th=[  474], 80.00th=[  486], 90.00th=[  524], 95.00th=[  540],
     | 99.00th=[  708], 99.50th=[  812], 99.90th=[ 1160], 99.95th=[ 8096],
     | 99.99th=[17536]
   bw (  KiB/s): min=145256, max=267368, per=100.00%, avg=232097.00, stdev=15252.50, samples=120
   iops        : min=36314, max=66842, avg=58024.25, stdev=3813.13, samples=120
  lat (nsec)   : 500=89.43%, 750=9.92%, 1000=0.52%
  lat (usec)   : 2=0.05%, 4=0.01%, 10=0.04%, 20=0.03%, 50=0.01%
  lat (usec)   : 100=0.01%
  cpu          : usr=8.75%, sys=91.11%, ctx=1151, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=3483464,3481455,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=227MiB/s (238MB/s), 227MiB/s-227MiB/s (238MB/s-238MB/s), io=13.3GiB (14.3GB), run=60001-60001msec
  WRITE: bw=227MiB/s (238MB/s), 227MiB/s-227MiB/s (238MB/s-238MB/s), io=13.3GiB (14.3GB), run=60001-60001msec
 
I'm pretty sure the 3000 MB/s figure was not measured with 4k blocks. Check whether you reach it with a 1M block size and potentially more jobs and/or a higher queue depth.
50 FSYNCS/SECOND is very low though. I get about 5200 on a SATA datacenter SSD and about 950 on a consumer NVMe SSD.
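
For example, something along these lines should get much closer to the rated sequential throughput (just a sketch; adjust the path, size and depth to taste):

Code:
# sequential read/write with 1M blocks, more jobs and a deeper queue
fio --name=seq_1m --directory=/rpool --rw=rw --bs=1M \
    --ioengine=libaio --iodepth=16 --numjobs=4 --size=4G \
    --time_based --runtime=60 --group_reporting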
 
You are right, thanks. But I don't understand the fsync number; it's a good NVMe.

Code:
fio --name=rand_read_write_qd1 --directory=/rpool --rw=randrw --bs=4k --ioengine=libaio --iodepth=3 --numjobs=5 --size=1M --time_based --runtime=60 --group_reporting
rand_read_write_qd1: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=3
...
fio-3.39
Starting 5 processes
rand_read_write_qd1: Laying out IO file (1 file / 1MiB)
rand_read_write_qd1: Laying out IO file (1 file / 1MiB)
Jobs: 5 (f=5): [m(5)][100.0%][r=2123MiB/s,w=2119MiB/s][r=544k,w=542k IOPS][eta 00m:00s]
rand_read_write_qd1: (groupid=0, jobs=5): err= 0: pid=30771: Thu Jan  8 12:28:25 2026
  read: IOPS=543k, BW=2120MiB/s (2223MB/s)(124GiB/60001msec)
    slat (nsec): min=1420, max=690394, avg=2483.65, stdev=1153.02
    clat (nsec): min=1116, max=573606, avg=9587.68, stdev=3099.69
     lat (usec): min=3, max=735, avg=12.07, stdev= 3.34
    clat percentiles (nsec):
     |  1.00th=[ 5536],  5.00th=[ 5984], 10.00th=[ 6304], 20.00th=[ 7008],
     | 30.00th=[ 8384], 40.00th=[ 8896], 50.00th=[ 9280], 60.00th=[ 9664],
     | 70.00th=[10432], 80.00th=[11456], 90.00th=[12352], 95.00th=[13504],
     | 99.00th=[23680], 99.50th=[26752], 99.90th=[30336], 99.95th=[31616],
     | 99.99th=[39680]
   bw (  MiB/s): min= 1984, max= 2256, per=100.00%, avg=2120.07, stdev=10.99, samples=600
   iops        : min=508028, max=577720, avg=542738.30, stdev=2814.14, samples=600
  write: IOPS=543k, BW=2120MiB/s (2223MB/s)(124GiB/60001msec); 0 zone resets
    slat (usec): min=2, max=524, avg= 5.17, stdev= 1.76
    clat (nsec): min=1270, max=720802, avg=9717.34, stdev=3068.21
     lat (usec): min=6, max=726, avg=14.88, stdev= 3.60
    clat percentiles (nsec):
     |  1.00th=[ 5728],  5.00th=[ 6176], 10.00th=[ 6560], 20.00th=[ 7200],
     | 30.00th=[ 8512], 40.00th=[ 9024], 50.00th=[ 9408], 60.00th=[ 9792],
     | 70.00th=[10432], 80.00th=[11456], 90.00th=[12480], 95.00th=[13504],
     | 99.00th=[23936], 99.50th=[27008], 99.90th=[30336], 99.95th=[31872],
     | 99.99th=[38656]
   bw (  MiB/s): min= 1986, max= 2254, per=100.00%, avg=2120.28, stdev=10.89, samples=600
   iops        : min=508476, max=577104, avg=542792.57, stdev=2789.10, samples=600
  lat (usec)   : 2=0.01%, 10=64.86%, 20=33.50%, 50=1.64%, 100=0.01%
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=14.83%, sys=85.16%, ctx=1704, majf=0, minf=48
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32564304,32567555,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=3

Run status group 0 (all jobs):
   READ: bw=2120MiB/s (2223MB/s), 2120MiB/s-2120MiB/s (2223MB/s-2223MB/s), io=124GiB (133GB), run=60001-60001msec
  WRITE: bw=2120MiB/s (2223MB/s), 2120MiB/s-2120MiB/s (2223MB/s-2223MB/s), io=124GiB (133GB), run=60001-60001msec
 
The Apple Mac Pro spec page says up to 3000 MB/s sequential, but you are testing random 4k IOPS. If you want high sync-write performance (fsync/s figures like 10000), use a drive with power-loss protection (PLP). Consumer drives are typically somewhere between 50 (which is HDD territory) and 250, if I remember correctly.
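
If you want to see roughly what pveperf is measuring, run a sync-write test; --fsync=1 forces an fsync after every write, which is exactly what consumer drives without PLP handle poorly (a sketch only; paths and sizes are placeholders):

Code:
# 4k sync writes, one fsync per write -- roughly what FSYNCS/SECOND reflects
fio --name=sync_write --directory=/rpool --rw=write --bs=4k \
    --ioengine=libaio --iodepth=1 --numjobs=1 --size=1G \
    --fsync=1 --time_based --runtime=60 --group_reporting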
And ZFS does add overhead because of all the nice features it provides. Since you don't have data redundancy anyway, maybe try plain ext4 on LVM?
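
If you want a quick comparison, you could put ext4 on a spare disk or partition and point pveperf at it (a sketch only; /dev/nvme1n1 is a hypothetical spare device and this wipes it, so do not run it against anything holding data):

Code:
# hypothetical spare device /dev/nvme1n1 -- destroys its contents!
pvcreate /dev/nvme1n1
vgcreate testvg /dev/nvme1n1
lvcreate -L 50G -n testlv testvg
mkfs.ext4 /dev/testvg/testlv
mkdir -p /mnt/ext4test
mount /dev/testvg/testlv /mnt/ext4test
pveperf /mnt/ext4test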
Maybe your drive is just not that good with small random writes. What make and model is it exactly?
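
You can pull the exact model and firmware straight from the drive, e.g. (assuming it shows up as /dev/nvme0n1):

Code:
# list NVMe controllers with model, serial and firmware
nvme list

# full identify/SMART output, including the model string
smartctl -a /dev/nvme0n1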
 