RAIDz1 block size to maximize usable space?

iamspartacus

I'm trying to store a 14TB vdisk (VMware conversion) on a 9 x 2.4TB RAIDz1 pool. However, whenever I try to import the disk, it gets to about 90% and then errors out saying I'm out of space. I've been reading that this is a ZFS padding issue.

So my question is, how can I configure this pool to be able to fit this vdisk? Is it just a matter of changing the block size or is there more config required?
 
@Dunuin Can you help me understand what the best blocksize is to maximize space here? This doesn't need to be a high-performance pool; it's just storing file backups that never transfer at more than 1Gbps over the network.
 
With a 9-disk raidz1:
4K/8K volblocksize: 50% of raw capacity lost
16K volblocksize: 33% of raw capacity lost
32K volblocksize: 20% of raw capacity lost
64K+ volblocksize: 11% of raw capacity lost

So, if you only want to lose one of your nine disks to parity, you will have to set the "Block size" of your ZFS storage to 64K (when using ashift=12) BEFORE creating your first VM, as the volblocksize can't be changed later without destroying and recreating the virtual disks (see the sketch below).
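For illustration only, this is roughly how that could look in Proxmox VE; the storage ID "tank-z1" is a placeholder, and the value can also be set when creating the storage:

Code:
# Hedged sketch - set the zvol block size used for NEW virtual disks on a
# ZFS storage (storage ID "tank-z1" is a placeholder). Existing zvols keep
# the volblocksize they were created with.
pvesm set tank-z1 --blocksize 64k

# Equivalent entry in /etc/pve/storage.cfg:
#   zfspool: tank-z1
#           pool tank-z1
#           blocksize 64k
#           content images,rootdir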
 

Thank you!
 
Can't get my head around all of that data! According to the spreadsheet, a RAIDZ1 of 9 disks @ 16k blocksize will also only have a (total?) loss of 11%, but Dunuin suggested this to be 33%. Again according to the spreadsheet, @ 32k blocksize we also get a loss of 11%, but according to Dunuin this should be 20%.
 
No, the left column isn't the blocksize in kilobytes, it is in sectors. So a "16" in the first column means 16 sectors, and therefore 16 * 4K = 64K volblocksize if your pool uses ashift=12, which means 4K sectors. Ashift=13 would mean 8K sectors and therefore 16 * 8K = 128K volblocksize, and so on.
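To make the conversion concrete, here is the same arithmetic as a quick shell check (plain math only, no ZFS involved):

Code:
# sector size = 2^ashift; the spreadsheet's left column is in sectors
echo "$((16 * 4096 / 1024))K"   # ashift=12 (4K sectors): 16 sectors -> 64K
echo "$((16 * 8192 / 1024))K"   # ashift=13 (8K sectors): 16 sectors -> 128K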
 
https://github.com/jameskimmel/ZFS/blob/main/The problem with RAIDZ.md#efficiency-tables

With a volblocksize of 16k you are close to the storage efficiency of a mirror. So RAIDZ1 will only get you reasonable storage efficiency if you can go with a volblocksize of 64k.
But then again, if your workloads are smaller than 64k, you will get read/write amplification and fragmentation.

And since you are pretty close to the disks being full, that fragmentation would totally kill performance.
That is why I think you won't be happy with the current hardware.

https://www.truenas.com/community/threads/the-path-to-success-for-block-storage.81165/
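If you want to see where you stand, the volblocksize of an existing virtual disk and the pool's fill level and fragmentation can be checked like this (dataset and pool names are just examples):

Code:
# volblocksize of an existing zvol (dataset name is an example)
zfs get volblocksize tank/vm-100-disk-0

# pool fill level and fragmentation
zpool list -o name,size,alloc,free,frag,cap tank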
 
I just don't really have a choice. I'm being forced off VMware due to licensing increases and need to move over to Proxmox. The move also means moving my 9 disks from a hardware RAID controller to passthrough, so it's either ZFS RAIDz1 or mdadm, which isn't officially supported.
 
Why not just use your hardware RAID + LVM-Thin? You won't be able to do storage replication and won't have bit rot protection, but it's also a lighter solution without the challenges of ZFS padding losses and the write amplification of RAIDz.
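For reference, if a hardware RAID volume were available, a rough sketch would look like this (device, VG, thin pool, and storage names are all placeholders):

Code:
# Hedged sketch - put LVM-Thin on a hardware RAID logical volume
# (/dev/sdb and all names below are placeholders).
pvcreate /dev/sdb
vgcreate vg_backup /dev/sdb
lvcreate --type thin-pool -l 95%FREE -n data vg_backup
pvesm add lvmthin backup-thin --vgname vg_backup --thinpool data --content images,rootdir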
 

It's not an option, unfortunately. I would have done that in a heartbeat, but I'm forced to swap hosts as well, and the newer hosts are former VMware vSAN nodes that didn't come with or need hardware RAID controllers. So... I'm here now. I don't have any preference as to which FS I use; I just need my 9 x 2.4TB drives to be able to support a 14-16TB vdisk in any way, shape, or form.
 
It's not an option, unfortunately. I would have done that in a heartbeat, but I'm forced to swap hosts as well, and the newer hosts are former VMware vSAN nodes that didn't come with or need hardware RAID controllers.
If you already have the drives visible in the Proxmox installer (or your current installation), you can use LVM (with multiple physical volumes) and you don't need additional hardware RAID controllers.
I just need my 9 x 2.4TB drives to be able to support a 14-16TB vdisk in any way, shape, or form.
You could do that with LVM, LVM-Thin, or a ZFS stripe (like a RAID0). None of those have redundancy, but they will give you the most space.
Either way, you'll need to invest a little time to get to know the technology.
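Purely as an illustration of the no-redundancy route, a ZFS stripe across all nine disks could look like this (device names are placeholders; use /dev/disk/by-id paths in practice):

Code:
# Hedged sketch - striped pool (RAID0-like, NO redundancy) over nine disks.
zpool create -o ashift=12 tank \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
pvesm add zfspool tank-stripe --pool tank --content images,rootdir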
 
Ahh, I see where you are coming from.
But you do have a choice, in fact thousands of choices :)

Please tell me that you don't want to migrate a 14TB VM by destroying your VMware hypervisor and installing Proxmox on it?!

You haven't talked about your workload, so we can only guess your performance needs.
Here is what I would do: create a pool of six 20TB HDDs as two 3-way mirror vdevs. Add 3 SSDs as a 3-way mirrored special vdev for metadata. Add a SLOG drive to speed up sync writes.
That way you get 40TB of usable storage with OK performance for VMs.
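Something along these lines, as a sketch only (all device names are placeholders):

Code:
# Sketch only - two 3-way mirror vdevs of 20TB HDDs, a 3-way mirrored
# special vdev on SSDs for metadata, and a SLOG device. All device
# names are placeholders.
zpool create -o ashift=12 tank \
    mirror /dev/sda /dev/sdb /dev/sdc \
    mirror /dev/sdd /dev/sde /dev/sdf \
    special mirror /dev/sdg /dev/sdh /dev/sdi \
    log /dev/nvme0n1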
 

I appreciate your reply, but I'm not going to get into different hardware possibilities because, honestly, it's moot at this point. The situation (at work, this is not a home server) is that I have the following hardware and need to migrate a 14TB VMware vdisk onto it in the next few weeks. I did mention my workload in this post: this is a secondary file backup server that data gets replicated to over a 1Gbps connection.

Dell R740
No Hardware RAID controller
9 x 2.4TB 10K 2.5 HDDs
4 x 1.92TB SATA SSDs

* Please note, none of what is happening here is my choice, but it's the situation I'm in.
 
Just use RAIDz1, ashift=12, volblocksize=64k, which will have around 11.2% overhead, resulting in ~17.2TiB usable (9 x 2.4TB = 21.6TB, minus 2.4TB for parity; please keep the TB to TiB conversion in mind). How much the VM will use on disk depends on how compressible the data is. Being backups, the data is probably already compressed, so I would not expect much gain here. I think this will work OK.
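Roughly like this, as a sketch (device names and storage ID are placeholders; prefer /dev/disk/by-id paths for real pools, and the blocksize can also be set in storage.cfg):

Code:
# Sketch only - 9-disk RAIDz1 with 4K sectors, plus a Proxmox ZFS storage
# whose new zvols use a 64K volblocksize. Names are placeholders.
zpool create -o ashift=12 tank raidz1 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
pvesm add zfspool tank-z1 --pool tank --blocksize 64k --content images

# Rough numbers: 9 x 2.4TB = 21.6TB raw; ~11% lost to parity + padding
# leaves roughly 17-17.5TiB before metadata and reservations.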
 

Thanks, that's exactly what I did last night, and I have the import running now. Right now it's 60% done, with 8.3TiB transferred of 14.1TiB.

Current state of the datastore is...

Code:
# zfs list
NAME                                 USED  AVAIL  REFER  MOUNTPOINT
chs-vm01-datastore01                8.20T  9.12T   171K  /chs-vm01-datastore01
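In case it helps anyone doing the same migration, the import itself can be done with qm importdisk; the VMID, source path, and storage ID below are placeholders (the storage ID may differ from the pool name):

Code:
# Hedged example - import the VMware vdisk as a raw zvol on the ZFS storage.
# 100 = VMID; source path and storage ID are placeholders.
qm importdisk 100 /mnt/migration/backup-server-flat.vmdk chs-vm01-datastore01 --format raw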
 
Ahh, we are talking about a backup server, and reusing existing hardware. Then I would probably not bother with ZFS. I would use the RAID controller, install Proxmox Backup Server on ext4, and call it a day. Compression, incremental backups, dedup, all that good stuff is already there.
 

As I've said in multiple posts, a hardware RAID controller is not available / not an option. And I have a 14TB VMware vdisk of data that I need to maintain.
 
