Where is my zpool storage???

proxwolfe

Hi,

I have a ZFS pool on my PVE host made from five 3TB drives in a raidz1 configuration. That should give me roughly 15TB - 3TB = 12TB of capacity.

When I do "zpool list" it says the size is 13.6T, which is close enough (for what I am talking about here).

My issue is as follows:

I have attached two virtual disks to a VM and placed them on the zpool. The disks have a size of 6.5T and 50G, respectively. So I should still have in excess of 5T left, right?

Well, the PVE GUI says it is 99.18% full (11.75 of 11.84TB). The shell ("zpool list") says "Alloc 12.9T, Free 899G".

So where is the rest of my storage???

(When I change directory to my zpool and do "ls" it shows literally nothing - not even the two virtual disks.)

Thanks!
 
Because of padding overhead, everything written to a zvol will consume 160% of its size. So when writing 7.5TB of data/metadata to zvols there will also be an additional 4.5TB of padding blocks, resulting in 12TB of consumed space. In other words, you only have 7.5TB of usable space. And keep in mind that a pool becomes very slow once it is filled more than 90%; for best performance you might even want to stay below 80%. So only 6TB of usable capacity if you care about performance or fragmentation.

So it's more like this with 5x 3TB, ashift=12 and volblocksize=8K:
5x 3TB = 15 TB raw capacity
-20% parity overhead
-30% padding overhead (when using zvols only)
--------------------------------
7.5 TB usable capacity
-20% that should be kept free
--------------------------------
6 TB real usable capacity
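
As a side note, ZFS itself can show you where that space went. A quick sketch, assuming the pool is called "tank" and the zvol name is just an example:

# per-dataset breakdown of used space (refreservation, children, snapshots, ...)
zfs list -o space -r tank

# compare logical data vs. physically allocated space for one zvol
zfs get volblocksize,volsize,logicalused,used tank/vm-100-disk-0

And the zvols will never show up with "ls" in the pool's mountpoint, because they are block devices exposed under /dev/zvol/, not files.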

To not lose those 30% of raw capacity to padding overhead, you would need to increase the volblocksize to at least 32K when using ashift=12.
This can only be done at the creation of a zvol. So back up your VMs/LXCs, go to "Datacenter -> Storage -> YourZFSStorage -> Edit -> Block size:" and set it to at least "32K". Then restore your VMs/LXCs from the backups, overwriting the existing VMs/LXCs. This will create new zvols with a 32K volblocksize.
But keep in mind that everything doing IO smaller than 32K will then cause massive overhead. So it's not really an option if, for example, you want to run a MySQL or PostgreSQL DB.
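
The same thing can also be checked and configured from the shell; a rough sketch, where the storage and zvol names are just examples:

# volblocksize of an existing zvol (fixed at creation time)
zfs get volblocksize rpool/data/vm-100-disk-0

# the block size used for newly created zvols is a property of the PVE storage,
# set by that GUI field, e.g. in /etc/pve/storage.cfg:
#   zfspool: local-zfs
#           pool rpool/data
#           blocksize 32k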
 
Thanks, Dunuin, for the detailed explanation. I am still at the beginning of my zfs voyage.

I am about to set up a new PBS machine (to replace my old one). The new machine (like the old one) only has space for five 3.5" HDDs, and with raidz2 this would have resulted in massive overhead combined with potentially abysmal resilvering performance in case of an HDD failure. I have also just read a piece that convinced me to move away from raidz2 and opt for mirrored vdevs. So I am planning to go with a pool consisting of two vdevs (each a mirror of two HDDs) plus a special device of two mirrored SSDs.

But I am wondering whether the padding overhead is an issue independent of the pool composition, i.e. whether it would also happen with mirrored vdevs (you are saying above that it has to do with the properties of the zvols).

If so, I need to choose the right zvol block size in combination with the right ashift value and I only have a tenuous grasp of what that even means. I am pretty sure the choice is dependent on the type of data I want to store.

Now given that this is my (new) PBS machine (and I am assuming here that the data PBS stores have pretty much the same format all the time - "chunks"), is there maybe a rule of thumb that should work for this purpose?

And do I need to make any immutable choices already at the creation of the zpool or only later on when I create zvols in it?

Thank you!
 
But I am wondering whether the padding overhead is an issue independent of the pool composition, i.e. whether it would also happen with mirrored vdevs (you are saying above that it has to do with the properties of the zvols).
Padding overhead only affects zvols on raidz1/2/3 (maybe draid too, not sure about that). But a PBS usually uses datasets, so even with a raidz1/2/3 this wouldn't be a problem. However, PBS needs IOPS performance, and no matter how many disks your raidz1/2/3 consists of, it will always be about as slow as a single disk. For IOPS performance you need striped vdevs, so multiple small striped mirrors or raidz1/2/3s.

With only 5x 3.5" slots I would either use an SSD-only 3/4/5-disk raidz1/2 or, alternatively, a 4x HDD striped mirror + 2x SSD special device mirror (provided those slots don't need to stay hot-swappable, so you can fit the SSDs with a 1x 3.5" to 2x 2.5" adapter).
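
Just to illustrate that second layout, such a pool could be created roughly like this (pool name and device paths are made up, adapt them to your disks):

# two striped HDD mirrors plus a mirrored SSD special device, assuming 4K-sector drives
zpool create -o ashift=12 tank \
  mirror /dev/disk/by-id/ata-HDD1 /dev/disk/by-id/ata-HDD2 \
  mirror /dev/disk/by-id/ata-HDD3 /dev/disk/by-id/ata-HDD4 \
  special mirror /dev/disk/by-id/ata-SSD1 /dev/disk/by-id/ata-SSD2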

I personally use a 1M recordsize for the dataset that stores the PBS datastore. That minimizes the metadata overhead, the HDDs are hit by a bit fewer IOPS, and PBS chunk files are usually in the range of 1-2MB anyway.
And I would recommend enabling relatime for that dataset too, so that not every read causes a write (not that bad anyway when using a special device, as the atime updates would only hit the SSDs).
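
On the CLI that would look something like this (the dataset name "tank/pbs" is just an example):

# create the datastore dataset with a 1M recordsize and relatime enabled
zfs create -o recordsize=1M -o relatime=on tank/pbs

# or adjust an existing dataset (recordsize only applies to newly written files)
zfs set recordsize=1M tank/pbs
zfs set relatime=on tank/pbs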
 
Thank you.

So if I understand correctly, I don't have to think about zvol block size, right?

And ashift? Is that something I need to consider when setting up the pool or anything else PBS might need to store data?
 
Thank you.

So if I understand correctly, I don't have to think about zvol block size, right?
Yes. You would only need to care about it if your PBS was running as a VM on a PVE host (because then its virtual disks would be zvols).
And ashift? Is that something I need to consider when setting up the pool or anything else PBS might need to store data?
You always have to select the correct ashift that fits your physical disks. But ashift=12 should be fine in most cases.
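
To see what your disks report (some drives lie about their physical sector size), something like this works; the pool name is just an example:

# logical and physical sector sizes of all block devices
lsblk -o NAME,SIZE,PHY-SEC,LOG-SEC

# ashift is fixed per vdev at creation time; check an existing pool with
zpool get ashift tank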
 
Small addition (most people already know):

two-way mirrored vdevs are not as safe as raidz2:
If two disks fail, raidz2 will always survive (any two disks can fail), while with striped two-way mirrors the pool is lost if both failed disks belong to the same mirror vdev.
 
