Incredibly poor ZFS performance on SSD

bleomycin

Renowned Member
Mar 20, 2011
I have 2x 240GB SSDs in raidz1 for rpool and 2x 480GB SSDs in raidz1 for additional VM storage. All drives are SanDisk Extreme IIs. They have been in service for about a year now and performance is just abysmal at this point. I know TRIM isn't supported in ZoL yet, but these drives aren't anywhere near full, and according to the AnandTech review they have excellent performance consistency and garbage collection (which is why I chose them at the time). Any help or tips would be greatly appreciated.
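(For reference, how full each pool actually is, and how fragmented it has become, can be checked with zpool list; the CAP and FRAG columns show this per pool:)
Code:
zpool list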

This test was performed with compression turned off:
Code:
root@proxmox:/SSD480/test# dd if=/dev/zero of=tempfile bs=1M count=4024 conv=fdatasync,notrunc
4024+0 records in
4024+0 records out
4219469824 bytes (4.2 GB) copied, 35.0058 s, 121 MB/s

rpool performance is actually OK, not amazing but still MUCH better than SSD480.
Code:
root@proxmox:/rpool/test# dd if=/dev/zero of=tempfile bs=1M count=4024 conv=fdatasync,notrunc
4024+0 records in
4024+0 records out
4219469824 bytes (4.2 GB) copied, 10.3724 s, 407 MB/s


Relevant System Specs:
Code:
Xeon e5-2680v3
128GB DDR4 ECC ram
2x240GB Sandisk Extreme II SSD (raidz1)
2x480GB Sandisk Extreme II SSD (raidz1)

Code:
zfs list
NAME                            USED  AVAIL  REFER  MOUNTPOINT
SSD480                          245G   185G   230G  /SSD480
SSD480/test                      96K   185G    96K  /SSD480/test
SSD480/zfsdisks                15.2G   185G    96K  /SSD480/zfsdisks
SSD480/zfsdisks/vm-116-disk-1  15.2G   185G  15.2G  -
rpool                           103G   112G    96K  /rpool
rpool/ROOT                     72.9G   112G    96K  /rpool/ROOT
rpool/ROOT/pve-1               72.9G   112G  72.6G  /
rpool/swap                     29.6G   142G  33.0M  -


Code:
pveversion -v
proxmox-ve: 4.2-48 (running kernel: 4.4.6-1-pve)
pve-manager: 4.2-2 (running version: 4.2-2/725d76f0)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.8-1-pve: 4.2.8-41
pve-kernel-4.2.2-1-pve: 4.2.2-16
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-72
pve-firmware: 1.1-8
libpve-common-perl: 4.0-59
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-14
pve-container: 1.0-62
pve-firewall: 2.0-25
pve-ha-manager: 1.0-28
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
 
Sorry, but dd is wrong on so many levels. First, it only tests sequential IO, and second, it writes zeroes. Zeroes are not (fully) stored at all by ZFS, so you are not actually testing your ZFS speed. SSDs have sequential throughput similar to hard disks, because SSDs are good at random IO, not sequential IO. Please test with something more sophisticated like fio. Also, multiple IO streams combined are often faster than a single-threaded IO stream, so multiple simultaneous accesses will be faster than one. This is a completely different story with spinning disks.
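For example, a 4k random-write run with a few parallel jobs could look roughly like this (the directory, size, and runtime are only placeholders to adjust for your pool; --direct=1 is left out because this ZoL version does not support O_DIRECT):
Code:
fio --name=randwrite --directory=/SSD480/test --rw=randwrite --bs=4k \
    --size=1G --numjobs=4 --iodepth=16 --ioengine=libaio \
    --runtime=60 --time_based --end_fsync=1 --group_reporting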

The mentioned SSDs are consumer SSDs, which are not very fast at all compared to enterprise SSDs. Please buy a good Intel MLC drive if you want REAL SSD performance.

On the other hand, your setup should still be fast in comparison to spinning disks, so do you actually "feel" the slowness?
 
Can you check this blog post and run the same benchmark? (It runs against the raw disk, not the ZFS pool, so it will destroy your data.)

http://www.sebastien-han.fr/blog/20...-if-your-ssd-is-suitable-as-a-journal-device/

Like Ceph, ZFS needs fast sync writes, and consumer SSDs are generally very (very) slow at those.
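The test in that post is essentially a small synchronous direct write to the raw device, along these lines (sdX is a placeholder; this writes straight to the disk and destroys whatever is on it):
Code:
dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync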


Also, you said that you use raidz1, are you sure? raidz1 is like RAID 5 and needs at least 3 disks. With 2 disks you can do a ZFS mirror (like RAID 1).

Performance is very different, as raidz1 has to compute parity.
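You can confirm the layout with zpool status (a mirror shows a mirror-0 vdev, raidz1 shows a raidz1-0 vdev). Creating a two-disk mirror would look roughly like this; the pool name and device paths are only examples:
Code:
zpool status SSD480
zpool create tank mirror /dev/disk/by-id/<first-ssd> /dev/disk/by-id/<second-ssd>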
 
Thanks guys, very helpful. That link to the journal device blog is great. I know dd is a terrible test method, but if anything it usually vastly overestimates performance in my experience. You're right, I misspoke: I'm running a mirror, not raidz1.

I went ahead and formatted both SSDs (after backing them up) to force a TRIM operation with:
Code:
mkfs.ext4 -F -E discard /dev/sda

I did this instead of a secure erase because they're colo'd a couple hours away and are in a frozen state. At least with my simple (and crappy, I know!) dd test, performance is right back where it should be. TRIM support is hopefully coming to ZoL this year according to the ZoL mailing list, so with any luck this won't be needed again. I'll keep the enterprise SSDs in mind, but this is a hobby server under very little load.
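(For what it's worth, a whole-device TRIM can also be issued directly with blkdiscard instead of reformatting; sdX is a placeholder, the device must not be in use, and this discards everything on it:)
Code:
blkdiscard /dev/sdX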

Performance had just degraded so badly that simple Samba file copies from NAS storage to the VM disks themselves were stalling out hard (the NAS is capable of saturating 10 Gbit/s).
 
