I reverse engineered that formula once, but can't find the result. We can try that again:
Code:
B4 Raidz1 formula:
=((CEILING($A4+$A$3*FLOOR(($A4+B$3-$A$3-1)/(B$3-$A$3)),2))/$A4-1)/((CEILING($A4+$A$3*FLOOR(($A4+B$3-$A$3-1)/(B$3-$A$3)),2))/$A4)
B4 Raidz2 formula:
=((CEILING($A4+$A$3*FLOOR(($A4+B$3-$A$3-1)/(B$3-$A$3)),3))/$A4-1)/(((CEILING($A4+$A$3*FLOOR(($A4+B$3-$A$3-1)/(B$3-$A$3)),3))/$A4))
B4 Raidz3 formula:
=((CEILING($A4+$A$3*FLOOR(($A4+B$3-$A$3-1)/(B$3-$A$3)),4))/$A4-1)/((CEILING($A4+$A$3*FLOOR(($A4+B$3-$A$3-1)/(B$3-$A$3)),4))/$A4)
So the only difference between the formulas is that raidz1 will round up to a multiple of 2, raidz2 to a multiple of 3 and raidz3 to a multiple of 4 (so always ParityDisks + 1). Let's call this number CeilFactor.
"$A$3" is the number of parity disks of the vdev. Let us call it
ParityDisks
"$A4" is then number of sectors, or in other words "volblocksize / 2^ashift". Let us call it
Sectors.
"B$3" is the total number of disks of the vdev. Let us call it
TotalDisks
Let's have a look at the B4 Raidz1 formula and make it a bit more readable:
Code:
(
(
CEILING( Sectors + ParityDisks *
FLOOR( ( Sectors + TotalDisks - ParityDisks - 1) / ( TotalDisks - ParityDisks ) )
, CeilFactor)
) / Sectors - 1
)
/
(
(
CEILING ( Sectors + ParityDisks *
FLOOR( ( Sectors + TotalDisks - ParityDisks - 1) / ( TotalDisks - ParityDisks ) )
, CeilFactor)
) / Sectors
)
To make it even more readable we could shorten it by replacing "TotalDisks - ParityDisks" with DataDisks:
Code:
(
(
CEILING( Sectors + ParityDisks *
FLOOR( ( Sectors + DataDisks - 1) / DataDisks )
, CeilFactor)
) / Sectors - 1
)
/
(
(
CEILING ( Sectors + ParityDisks *
FLOOR( ( Sectors + DataDisks - 1) / DataDisks )
, CeilFactor)
) / Sectors
)
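As a side note, here is that formula as a minimal Python sketch (the function and variable names are my own, not from the spreadsheet). Excel's CEILING(x, n) rounds x up to a multiple of n, and the FLOOR is just an integer floor division:
Code:
import math

def raidz_overhead(sectors, data_disks, parity_disks):
    """Combined parity+padding overhead fraction of a raidz vdev."""
    # sectors      = volblocksize / 2^ashift
    # data_disks   = total disks - parity disks
    # parity_disks = 1, 2 or 3 for raidz1/2/3
    ceil_factor = parity_disks + 1  # 2 for raidz1, 3 for raidz2, 4 for raidz3
    # parity sectors are needed once per started row of data_disks data sectors:
    rows = (sectors + data_disks - 1) // data_disks
    # data + parity sectors, rounded up to a multiple of ceil_factor (the padding):
    allocated = math.ceil((sectors + parity_disks * rows) / ceil_factor) * ceil_factor
    return (allocated / sectors - 1) / (allocated / sectors)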
With that you can calculate the parity+padding overhead for any number of disks and any number of sectors (in other words, any volblocksize) of a raidz1.
Let's for example use Sectors = 4 and DataDisks = 8. ParityDisks is always 1 and CeilFactor always 2 for a raidz1:
Code:
(
(
CEILING( 4 + 1 *
FLOOR( ( 4 + 8 - 1) / 8 )
, 2)
) / 4 - 1
)
/
(
(
CEILING ( 4 + 1 *
FLOOR( ( 4 + 8 - 1) / 8 )
, 2)
) / 4
)
becomes...
(
(
CEILING( 4 + 1 *
FLOOR( 1.375 )
, 2)
) / 4 - 1
)
/
(
(
CEILING ( 4 + 1 *
FLOOR( 1.375 )
, 2)
) / 4
)
becomes...
(
CEILING( 4 + 1 * 1 , 2) / 4 - 1
)
/
(
CEILING ( 4 + 1 * 1 , 2) / 4
)
becomes...
(
CEILING( 5 , 2) / 4 - 1
)
/
(
CEILING ( 5 , 2) / 4
)
becomes...
( 6 / 4 - 1 ) / ( 6 / 4 )
becomes...
0.5 / 1.5
results in...
0.3333333
And if you look at the table for raidz1 and 9 total disks and 4 sectors you will also see 33% combined parity+padding loss.
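Plugging the same numbers into the Python sketch from above gives the same result:
Code:
raidz_overhead(sectors=4, data_disks=8, parity_disks=1)  # -> 0.3333...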
Parity loss is always "ParityDisks / TotalDisks" or "ParityDisks / (DataDisks + ParityDisks)".
So in our example above that would be:
Code:
ParityDisks / (DataDisks + ParityDisks)
becomes...
1 / (8 + 1)
results in...
0.111111
If you now want to find out what just the padding loss is, you subtract the parity loss from the combined parity+padding loss:
0.3333333 - 0.111111 = 0.222222
So there is 11% parity loss and 22% padding loss, adding up to 33% total capacity loss.
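Written out with the Python sketch from above (again, the names are mine):
Code:
combined = raidz_overhead(sectors=4, data_disks=8, parity_disks=1)  # 0.333333
parity_loss = 1 / (8 + 1)                                           # 0.111111
padding_loss = combined - parity_loss                               # 0.222222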
And padding loss is indirect. Your pool won't show it, as the pool doesn't become smaller; instead the zvols become bigger. The result is the same, you can store less on your pool, but most people just don't get that padding overhead, as ZFS won't show it anywhere when reporting the total or available pool size.
If you want to know how much bigger your zvols will get, you can calculate this:
Code:
ZvolSize = (1 - ParityLoss) / (1 - ParityLoss - PaddingLoss)
In our example this would result in:
Code:
ZvolSize = (1 - 0.111111) / (1 - 0.111111 - 0.222222)
Results in...
ZvolSize = 1.333333
So all zvols will be 133% in size, meaning that storing 1TB of data on a zvol will cause the zvol to consume 1.33TB of the pool's capacity, as for every 1TB of data blocks another 333GB of empty padding blocks have to be stored.
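Or, continuing the Python sketch:
Code:
zvol_factor = (1 - parity_loss) / (1 - parity_loss - padding_loss)  # 1.333333
# every 1 TB of data written to a zvol consumes 1.33 TB of pool capacity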
Now let us say the 9 disks from our example are each 2TB in size.
So we got a total raw capacity (which zpool list will report) of 18TB (because 9 * 2TB).
zfs list will report a capacity of 16TB, as it already subtracted the capacity used to store parity data (so 9 * 2TB raw storage - 2TB parity data).
For datasets that would be true, as datasets have no padding overhead. We could indeed store 16TB of files in datasets on that pool.
But this isn't true for zvols, as our zvols would be 133% in size. We can only store 12TB of zvols, because 12TB of zvols would consume 16TB of space. This is why I told you that padding overhead is indirect.
And then keep in mind that a ZFS pool should always have 20% of free space to operate optimally. So you actually have to subtract an additional 20% when calculating your real available capacity. So for 100% datasets this would be 12.8TB of real usable capacity. For 100% zvols it would be just 9.6TB of real usable capacity. For 50% datasets + 50% zvols it would be 11.2TB of real usable capacity and so on.
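Putting the whole example together as a rough Python sketch (the 0.8 is the 20% free space rule of thumb):
Code:
raw = 9 * 2.0                                        # 18 TB raw (zpool list)
after_parity = raw * 8 / 9                           # 16 TB (zfs list)
datasets_usable = after_parity * 0.8                 # 12.8 TB if 100% datasets
zvols_usable = after_parity / zvol_factor * 0.8      #  9.6 TB if 100% zvols
mixed_usable = (datasets_usable + zvols_usable) / 2  # 11.2 TB at 50%/50%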