ZFS optimal block size for RAID 0 with 3 disks?

harmonyp

Member
Nov 26, 2020
196
4
23
47
I have been informed that for 3 NVMe drives in RAID0 with ashift=12, a 12k block size would be optimal. However, since you can only use either 8k or 16k, which is the better option? Also, what would the difference be between them?
 
ashift=12 is a very good default; ashift=13 and up showed no real value in most benchmarks.
Beyond that, you would have to research what kind of NAND is used in your SSDs to find out what value could make sense at all, if not going with ashift=12. Most manufacturers don't list this technical information anywhere, so you'd better have good contacts or lots of patience.

Better to save yourself the hassle and stay with ashift=12, as the gain is really nonexistent.
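
For reference, ashift is fixed per vdev at pool creation time, so it has to be chosen up front. A minimal sketch of what creating a 3-disk stripe with ashift=12 could look like (the pool name and device paths are just placeholders, not your actual setup):

Code:
# ashift is a vdev property and cannot be changed after creation
# pool name and device paths below are only examples
zpool create -o ashift=12 tank /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
# confirm the value the pool is actually using
zpool get ashift tank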

since you can only use either 8k or 16k, which is the better option?
Regarding ZFS recordsize / volblocksize, "it depends" - usually PVE's defaults of 128k (recordsize, for LXC) and 8k (volblocksize, for QEMU) work well, but depending on your use case, your mileage may vary. You can benchmark the same VM with different block sizes configured to test what works best for you - but again: better to save yourself the hassle if you do not REALLY need those few extra IOPS (if you'd even get any more - maybe 8k already works best for you).
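
If you do want to test it, keep in mind that volblocksize is fixed when a zvol is created, so you have to recreate the disk (or restore the VM onto storage configured with a different blocksize) for each run. A rough sketch, with pool and zvol names as placeholders:

Code:
# volblocksize cannot be changed on an existing zvol - names and sizes are only examples
zfs create -V 32G -o volblocksize=8k  tank/test-8k
zfs create -V 32G -o volblocksize=16k tank/test-16k
# in PVE the default for new disks can be set per storage, e.g.:
# pvesm set <zfs-storage-id> --blocksize 16k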

3 NVMe RAID0
Now I got a little question for you: Why on earth would you want a RAID-0? o_O
 
The drives are SAMSUNG MZQLB1T9HAJR-00007

I do not use LXC.

My current setup (RAIDZ1) is using way too much CPU (the z_wr_iss threads), so I want to try RAID-0 purely for performance, with PBS backups every few hours. Downtime is not too big of a deal for me.
 
The drives are SAMSUNG MZQLB1T9HAJR-00007
Those seem to be pretty big boys.
Older Samsung drives were known to use 8K NAND pages internally, but I don't think that information is up to date.

If you really really want to, you can compare ashift=12 vs. ashift=13 with those drives and see if you gain anything with ashift=13.
But again, I don't see the point, especially if you are using enterprise NVMe drives anyway.
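
If you want to check what the drives themselves report before recreating the pool: NVMe drives only expose their logical sector formats, not the internal NAND page size, but it is still worth making sure the namespace is formatted with 4K sectors. A sketch using nvme-cli (the device path and LBA format index are assumptions, check your own output first):

Code:
# shows the supported LBA formats and which one is in use
# (logical sector size only - the NAND page size is not reported)
nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"
# switching to a 4K format wipes the namespace - the lbaf index differs per drive
# nvme format /dev/nvme0n1 --lbaf=1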
 
I will do a 6-disk striped mirror benchmark this weekend comparing 8k vs. 16k volblocksize. It should be the same for a 3-disk RAID0, just with half the read performance and half the write amplification.
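
Something like fio directly against test zvols with matching block sizes should make the 8k vs. 16k comparison reproducible; a sketch, with the device path, size and runtime as example values only:

Code:
# random write test against a test zvol - all values are just examples
fio --name=volblock-test --filename=/dev/zvol/tank/test-8k \
    --rw=randwrite --bs=8k --ioengine=libaio --iodepth=32 \
    --direct=1 --runtime=60 --time_based --group_reporting
# repeat with --bs=16k against the 16k zvol and compare IOPS / bandwidth / CPU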
 