Slow speed on SSD ZFS pool

delebru

I'm running Proxmox VE 5.2 on a local server I set up a few months ago with 8x 2TB hard drives in a RAID 10-like ZFS pool (striped mirrors).

The server runs about 10 LXCs, which all seem super fast, and a single Windows 10 VM which has quite poor disk performance. I thought the poor performance was mostly due to the hard drives, so I decided to try a couple of SSDs.

For testing purposes, I created a new pool with a 256GB Samsung 860 Pro and a 512GB Intel 545s in "RAID-0" style, with ashift=12 and compression disabled (creation commands sketched below the screenshots). I shut down the Windows VM, cloned it to the new SSD pool, then ran a disk benchmark with HD Tune:

[screenshot: HD Tune benchmark from inside the Windows VM]

And this is how it looked on the Proxmox host:

[screenshot: disk activity on the Proxmox host during the benchmark]

Drive settings:
[screenshot: VM disk settings in Proxmox]
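
For reference, the test pool was created roughly like this (just a sketch; the pool name and device paths are placeholders):

zpool create -f -o ashift=12 sddpool /dev/disk/by-id/xxx /dev/disk/by-id/zzz   # two-disk stripe, "RAID-0" style
zfs set compression=off sddpool                                                # compression disabled for the test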


I was expecting much better results from the benchmark... 77MB/s average read speed felt like a punch in the face :p

I used the recommended VirtIO bus with cache disabled but maybe there's something else I neglected? Any ideas on what I could try to improve the disk performance?

Any tips will be greatly appreciated!!!
 
Well, I could be mistaken, but the Samsung 860 Pro has a 512B block size (both physical and logical); check with fdisk or a similar tool. If I'm correct, ashift=9 would be the more appropriate choice.
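
A quick way to check is something like this (a sketch; the device name is a placeholder):

lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sdX          # physical and logical sector sizes as reported to the kernel
fdisk -l /dev/sdX | grep -i "sector size"       # the same information from fdisk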
 
One more thing to note: if you mount the ZFS pool (dataset) as a folder and the VM disk is stored as a file (not as a zvol), I would recommend using raw (not qcow2) and setting the dataset's recordsize property to align with the VM guest (4k in the case of MS Windows). This property should be set before the disk image file is created.

Some more zfs pool (dataset) options to be tuned: atime=off, xattr=sa
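
For example (a sketch, assuming the dataset holding the VM images is called sddpool/images; adjust the name to your layout):

zfs set recordsize=4k sddpool/images     # align with the guest's 4k NTFS clusters; set this before creating the image file
zfs set atime=off sddpool/images         # don't update access times on every read
zfs set xattr=sa sddpool/images          # store extended attributes as system attributes instead of hidden files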
 
If the VM disk is stored on a ZFS storage (as a zvol), try changing the Block Size option of the Proxmox ZFS storage entry to 4k and recreating/cloning the VM disk. Some SSDs do not work well with the default 8k block size (volblocksize in terms of ZFS properties). Or it could just be a ZoL issue.
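
For example (a sketch; the pool and zvol names are placeholders):

zfs get volblocksize sddpool/vm-100-disk-1     # volblocksize is fixed at creation time, so check what the existing disk uses

# in /etc/pve/storage.cfg the zfspool entry carries the block size used for newly created disks:
# zfspool: sddpool
#     pool sddpool
#     content images,rootdir
#     blocksize 4k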
 
Firstly thanks everyone for the comments!!! Really appreciate it :)

Turning on writeback cache seems to give some improvement:
[screenshot: HD Tune benchmark with writeback cache enabled]

atime was already off; I changed xattr from on to sa, but it didn't have any impact on the benchmark.

Disk images were already in raw format.
Storages which present block devices (LVM, ZFS, Ceph) will require the raw disk image format, whereas files based storages (Ext4, NFS, CIFS, GlusterFS) will let you to choose either the raw disk image format or the QEMU image format.
@ https://pve.proxmox.com/wiki/Qemu/KVM_Virtual_Machines


Well, I could be mistaken, but the Samsung 860 Pro has a 512B block size (both physical and logical); check with fdisk or a similar tool. If I'm correct, ashift=9 would be the more appropriate choice.
As far as I know you wouldn't get any performance improvement from creating the pool with ashift=9 instead of 12, even if the drives have 512B sectors. Am I wrong here? Everyone seems to recommend going for ashift=12 so that you can replace a drive with one that has a larger sector size without recreating the pool, and because it supposedly has no impact on the pool's performance.


I will also do a trim before the next tests (see the sketch at the end of this post), but these are the read speeds while moving the disk images back to the main HDD pool, so I wouldn't expect much from a trim...

[screenshot: read speeds while moving the disk images back to the HDD pool]


Will come back with some results when I have them.
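
For reference, the trim I plan to run looks roughly like this (a sketch; the device paths are placeholders, and blkdiscard wipes the whole drive, so only run it on disks with nothing left to keep):

zpool destroy sddpool                          # drop the test pool first
blkdiscard /dev/disk/by-id/xxx                 # issue a full-device TRIM/discard to each SSD
blkdiscard /dev/disk/by-id/zzz
zpool create -f -o ashift=9 sddpool /dev/disk/by-id/xxx /dev/disk/by-id/zzz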
 
Trimmed the SSDs, recreated the pool with ashift=9, added the ZFS storage in the web UI with a 4k block size, restored the VM from a backup, and got even worse results:

with atime=off and xattr=sa
[screenshot: HD Tune benchmark with atime=off and xattr=sa]

with atime=on and xattr=on
[screenshot: HD Tune benchmark with atime=on and xattr=on]

But reading and writing on the SSD pool from the host seems normal:
[screenshot: read/write test on the SSD pool from the host]


I also tried creating an ext4 partition on one of the SSDs, mounting it on a folder and adding it to Proxmox as a directory storage to test whether the problem was with the ZFS pool, but the benchmark in the VM was just as slow.
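
That test was roughly the following (a sketch; the partition, mount point and storage name are placeholders):

mkfs.ext4 /dev/disk/by-id/xxx-part1                              # ext4 on one SSD partition
mkdir -p /mnt/ssdtest && mount /dev/disk/by-id/xxx-part1 /mnt/ssdtest
pvesm add dir ssdtest --path /mnt/ssdtest --content images       # add it to Proxmox as a directory storage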

Could it be something wrong with the virtio drivers? Is it possible the benchmark tool on the VM is not giving accurate results? Not sure what else I can try...
 
Since I need this Windows VM to have fast drives, I'm beginning to think the best option might be installing Windows on the host and running Proxmox inside a Hyper-V VM for the LXCs... I would really like to avoid doing this, so if there's anything else I can try please shoot, because I'm running out of ideas :(
 
I'm not sure what HD Tune is measuring. Is it just straight streaming transfer speeds, or is that random IO? Or, more importantly, what is your workload, and what kind of IO does it rely on?

Also, can you give us a screenshot of the Storage section in the Proxmox UI? I'm trying to figure out if you're using a ZFS dataset as a Directory storage type, or if you're using the ZFS storage type. Because you shouldn't be able to choose between qcow2 and raw with the ZFS storage type. And running a file-based virtual disk on top of ZFS results in some really abysmal performance, raw or qcow2.
 
I'm not sure what HD Tune is measuring. Is it just straight streaming transfer speeds, or is that random IO? Or, more importantly, what is your workload, and what kind of IO does it rely on?

Also, can you give us a screenshot of the Storage section in the Proxmox UI? I'm trying to figure out if you're using a ZFS dataset as a Directory storage type, or if you're using the ZFS storage type. Because you shouldn't be able to choose between qcow2 and raw with the ZFS storage type. And running a file-based virtual disk on top of ZFS results in some really abysmal performance, raw or qcow2.
The HD Tune graph and "Transfer Rate" results are of a sequential read speed test with the default 64K block sizes. Maybe there's a better way to benchmark the drive's performance?

That VM is running a build server for Unity 3D projects. Each build has to process thousands of images, which makes it quite drive intensive. It is currently taking almost 4 hours for a build that is done in 2.5 hours on a non-virtualized machine with similar specs.

And yes, you are right: I can't select qcow2. After creating the SSD pool with:
zpool create -f -o ashift=9 sddpool /dev/disk/by-id/xxx /dev/disk/by-id/zzz
I added the pool to Proxmox as follows:
[screenshot: Proxmox ZFS storage configuration for sddpool]
 
I'm wondering if you're running into issues of the block sizes not lining up. But now that you've rebuilt your zpool, and set it to use 4k blocks, maybe you'll have a better time running the Unity builds. If they're not better, maybe go the other direction. Make Proxmox use 32k block sizes, and format NTFS to use 32k or 64k block sizes. Works with SQL Server.
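
A rough sketch of that approach, assuming a zfspool storage named sddpool and a data volume D: inside the guest (both are placeholders):

pvesm set sddpool --blocksize 32k      # newly created VM disks on this storage get volblocksize=32k
# recreate or move the VM's data disk so it picks up the new block size, then inside the guest:
# format D: /FS:NTFS /A:64K /Q         (reformat the data volume with a 64k allocation unit; wipes that volume)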
 
I'm wondering if you're running into issues of the block sizes not lining up. But now that you've rebuilt your zpool, and set it to use 4k blocks, maybe you'll have a better time running the Unity builds. If they're not better, maybe go the other direction. Make Proxmox use 32k block sizes, and format NTFS to use 32k or 64k block sizes. Works with SQL Server.
Thanks for the comments! I have a build running now, if it's still slow I'll give that a try.
 
As far as I know you wouldn't get any performance improvement from creating the pool with ashift=9 instead of 12, even if the drives have 512B sectors. Am I wrong here?

With compression, there is a difference.

But reading and writing on the SSD pool from the host seems normal:
[quoted screenshot: read/write test on the SSD pool from the host]

PLEASE do not use /dev/zero "tests" with ZFS; use fio to measure real performance. SSDs are also not that fast at sequential I/O, at least not as fast as you would expect. SSDs are fast at random I/O in comparison to spindles.
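
For example, a 4k random-read run against a file on the pool could look like this (a sketch; the path, size and runtime are placeholders):

fio --name=randread --filename=/sddpool/fio.test --size=4G \
    --rw=randread --bs=4k --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting
# use a test file larger than RAM (or cap the ARC) so the numbers aren't just cache hits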

Make Proxmox use 32k block sizes, and format NTFS to use 32k or 64k block sizes.
That is also a way to go... you will then have improved write speeds if you use compression, because compressing bigger chunks is always faster and more efficient than compressing smaller chunks. Using compression with ashift=12 is challenging: an 8 KB block may compress to e.g. 6 KB, but it still occupies two 4K sectors on ZFS, so you waste space. With ashift=9 it only uses twelve 512B sectors, leaving an additional 2 KB (four sectors) available for other data.
 
Thanks LnxBil for that info! I was always a bit skeptical about creating the pools with 4k sectors just because...

I tried running the VM on a 32k block size dataset as recommended and it made the builds faster by about 20%.

I also tried installing Windows directly on the SSDs, without Proxmox, to check the performance difference, and it was only 10% faster than the builds with the 32k block size, so I'm quite happy with that. I guess I got over-worried about the performance when I saw those benchmark results in the VM.
 
