SSD as secondary storage is slow

justarandomguy

I have a Dell R710 server running Proxmox 5.4. It is installed on two 2 TB SATA HDDs running in a ZFS mirror. I also installed two 480 GB SATA SSDs, which are in a separate ZFS mirrored pool. I'm having some performance issues with the SSDs, particularly when reading.
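For reference, both pools are plain two-disk mirrors; the layout can be checked with the usual ZFS tools (pool names are whatever was chosen at install time):

Code:
# vdev layout and health of all pools
zpool status
# capacity and usage per pool
zpool list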

When running fio on the disk directly, I'm getting the results I would expect:

(4K)
Code:
root@R710:~# fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4K --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/sdd
seq_read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.16
Starting 1 process
Jobs: 1 (f=1): [R(1)] [100.0% done] [40530KB/0KB/0KB /s] [10.2K/0/0 iops] [eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=18743: Tue Apr 21 14:33:27 2020
read : io=2345.5MB, bw=40028KB/s, iops=10007, runt= 60001msec

(1M)
Code:
root@R710:~# fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=1M --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/sdd
seq_read: (g=0): rw=read, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=1
fio-2.16
Starting 1 process
Jobs: 1 (f=1): [R(1)] [100.0% done] [340.4MB/0KB/0KB /s] [340/0/0 iops] [eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=2511: Tue Apr 21 14:35:35 2020
read : io=21760MB, bw=371352KB/s, iops=362, runt= 60003msec

I think the overhead is coming from ZFS. It might be hitting the much slower root pool (the HDD mirror) for something when it reads from the SSD pool. Is this likely? If so, how can I check/fix it?
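(My guess is that watching per-pool I/O while the VM is busy would show which pool is actually being hit, but I'm not sure that's the right way to check it:)

Code:
# per-pool / per-vdev I/O statistics, refreshed every second
zpool iostat -v 1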
 
Which storage controller do you use for the SSDs?

And please provide the model number of the SSDs.
 
The SSDs are consumer-grade Adata SU800s. The disk controller is a SAS2008 which is flashed to IT mode. I would think the hardware isn't the problem, based on the fio benchmarks posted above.
 
SSDs are consumer-grade

Never use consumer-grade SSDs with ZFS; it's really a waste of time, as they do not perform as needed with ZFS (or Ceph).

You will find countless reports here in the forum.
 
Never use consumer-grade SSDs with ZFS; it's really a waste of time, as they do not perform as needed with ZFS (or Ceph).

You will find countless reports here in the forum.
I understand that consumer-grade SSDs aren't ideal, but at the moment they're all I can afford. Regardless, when I'm maxing out the I/O in my VM, I don't see the I/O on the SSD get anywhere near what it should be able to handle, especially based on the fio benchmarks.
 
Also, from what I can tell, consumer-grade SSDs are particularly bad because of their poor synchronous write performance, whereas my main concern is the read speed of the system.
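To illustrate, a sync-heavy write test like the one below (against a scratch file on the SSD pool; the path is just an example) is where I'd expect a consumer SSD to fall over, but that's not the workload I care about:

Code:
# 4K sync/fsync write test against a scratch file on the SSD pool
# (example path - do not point this at a raw device that holds data)
fio --ioengine=sync --fsync=1 --rw=write --bs=4K \
    --numjobs=1 --size=4G --runtime=60 --time_based \
    --name sync_write --filename=/ssdpool/fio-testfile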
 
Thanks. I'm still a little confused - those benchmarks are for write speeds. Does ZFS do a lot of writing when reading?

No, it doesn't.

Another thing: your fio command uses sequential reads. An SSD is not much faster than an old hard disk at sequential reads; where it really shines is random reads and writes.
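The same test with random reads is just a matter of switching the --rw parameter, e.g.:

Code:
# same parameters as before, but random instead of sequential reads
fio --ioengine=libaio --direct=1 --sync=1 --rw=randread --bs=4K \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --name rand_read --filename=/dev/sdd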
 
No, it doesn't.

Another thing: your fio command uses sequential reads. An SSD is not much faster than an old hard disk at sequential reads; where it really shines is random reads and writes.

I'm confused. The fio command is reading directly from the disk, ignoring ZFS, and I'm getting the speeds I want. It's only when I run a read test from within a VM that the speed drops.
 
I'm confused. The fio command is reading directly from the disk, ignoring ZFS, and I'm getting the speeds I want. It's only when I run a read test from within a VM that the speed drops.

You have not provided the output of fio from inside your VM. In general, every filesystem is slower than the raw device, so benchmarking the real disk directly will always be faster than benchmarking a file inside a filesystem.
Can you provide the (PVE-side) configuration of your VM and the fio output from inside it?
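Something along these lines would be enough (VM ID 100 and /dev/vdb are just placeholders):

Code:
# on the PVE host: VM configuration and storage overview
qm config 100
pvesm status
zfs list -t volume

# inside the VM: repeat the read test against the disk that sits on the SSD pool
fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4K \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --name seq_read --filename=/dev/vdb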
 
Well, I was curious and attached another 8 GB disk to one of my running guests, which is located on a two-vdev mirror zpool built from four 2 TB HGST drives. And I must say… I cannot complain… ;)

Code:
root@cloud:~# fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4K --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/vdc
seq_read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.16
Starting 1 process
Jobs: 1 (f=1): [R(1)] [100.0% done] [35872KB/0KB/0KB /s] [8968/0/0 iops] [eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=17774: Sat May  2 16:13:54 2020
  read : io=2077.4MB, bw=35452KB/s, iops=8863, runt= 60001msec
    slat (usec): min=11, max=2241, avg=16.81, stdev= 8.57
    clat (usec): min=4, max=6482, avg=90.30, stdev=41.76
     lat (usec): min=64, max=6508, avg=107.11, stdev=43.55
    clat percentiles (usec):
     |  1.00th=[   65],  5.00th=[   71], 10.00th=[   73], 20.00th=[   74],
     | 30.00th=[   77], 40.00th=[   80], 50.00th=[   85], 60.00th=[   90],
     | 70.00th=[   94], 80.00th=[  100], 90.00th=[  112], 95.00th=[  127],
     | 99.00th=[  167], 99.50th=[  189], 99.90th=[  422], 99.95th=[  804],
     | 99.99th=[ 1784]
    lat (usec) : 10=0.01%, 20=0.01%, 50=0.01%, 100=79.59%, 250=20.20%
    lat (usec) : 500=0.11%, 750=0.03%, 1000=0.01%
    lat (msec) : 2=0.03%, 4=0.01%, 10=0.01%
  cpu          : usr=11.73%, sys=28.72%, ctx=531799, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=531790/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=2077.4MB, aggrb=35452KB/s, minb=35452KB/s, maxb=35452KB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
  vdc: ios=530854/0, merge=0/0, ticks=39496/0, in_queue=39320, util=65.67%

8.8k IOPS at 4K block size on spinning rust is not that bad, is it?
 
8.8k IOPS at 4K block size on spinning rust is not that bad, is it?

Yes, it's impossible :-D

The problem with benchmarking read performance on ZFS is that if you don't fill the disk with random data first, you are benchmarking how fast you can read blocks that were never physically written, so ZFS just returns zeros. That is a problem with every thin-provisioned device.
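If you want read numbers that mean something, the test disk has to be written end-to-end with (more or less incompressible) data first, for example like this from inside the VM. It destroys everything on the disk, so scratch/test disks only:

Code:
# fill the whole scratch disk with fio's pseudo-random write buffers
# WARNING: overwrites /dev/vdc completely - test disks only
fio --ioengine=libaio --direct=1 --rw=write --bs=1M --iodepth=4 \
    --name fill --filename=/dev/vdc
# then re-run the read benchmark against the now fully written device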
 
Yes, it's impossible :-D

The problem with benchmarking read performance on ZFS is that if you don't fill the disk with random data first, you are benchmarking how fast you can read blocks that were never physically written, so ZFS just returns zeros. That is a problem with every thin-provisioned device.

Ha ha - I know… I have been running ZFS for more than a decade now… However, I was just curious about using fio on a device, which I had never done before - I have always used fio on files… and for good reason, of course. And you should see the numbers if I disable direct reads…
 
