ZFS volblocksize per VM disk instead of pool

Hello everyone,

I took a deep dive into the ZFS blocksize topic and found some useful information like:
General rules of thumb:


  • 1MiB for general-purpose file sharing/storage
  • 1MiB for BitTorrent download folders—this minimizes the impact of fragmentation!
  • 64KiB for KVM virtual machines using Qcow2 file-based storage
  • 16KiB for MySQL InnoDB
  • 8KiB for PostgreSQL
Source: https://klarasystems.com/articles/tuning-recordsize-in-openzfs/
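
For what it's worth, recordsize is a per-dataset property and can be changed at any time (it only affects blocks written afterwards). A minimal sketch of applying those rules of thumb, with made-up pool/dataset names:

    zfs set recordsize=1M  tank/fileshare       # general-purpose file sharing
    zfs set recordsize=16K tank/mysql-innodb    # MySQL InnoDB data directory
    zfs set recordsize=8K  tank/postgres        # PostgreSQL data directory
    zfs get recordsize tank/postgres            # verify the setting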

Not sure if this is still valid, but wouldn't it be better in general to define the volblocksize per VM disk within the VM settings rather than within the cluster storage settings?
 
Are you sure you are not confusing recordsize (for a filesystem on ZFS) with volblocksize (for a virtual disk [with another filesystem inside the VM] on ZFS)?
 
Not so sure as the page mentions:

A (very) brief note on volblocksize


As discussed earlier, volblocksize is to zvols what recordsize is to datasets. A zvol is a ZFS block-level device which can be directly formatted with another file system (eg ext4, ntfs, exfat, and so forth).


A zvol can also be used as direct storage for applications which make use of “raw” unformatted drives. In general, datasets can be thought of as “ZFS file systems” and zvols can be thought of as “ZFS virtual disk devices.”


For the most part, tuning advice for volblocksize matches tuning advice for recordsize—match the block size to the typical random I/O operation size expected. However, volblocksize is completely fixed rather than dynamic. Typically, you still want to tune for the applications being hosted—so, 16KiB for a MySQL InnoDB store, or 8KiB for a PostgreSQL store, not some larger value.

So from this point of view the volblocksize should also match the "thing" you run on it (database, file storage, and so on).
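
One practical difference: volblocksize can only be set when the zvol is created, so the choice has to be made up front. A minimal sketch, with a made-up zvol name:

    zfs create -V 32G -o volblocksize=16K tank/vm-mysql-disk   # fixed at creation time
    zfs get volblocksize tank/vm-mysql-disk                    # cannot be changed afterwards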
 
Indeed, but you usually don't run ZFS on top of ZFS. I do think this point is valid and you are smart to select a volblocksize that matches the workload inside the VM.
But as people with raidz1/2/3 found out: it is also a trade-off with padding, wasted space, IOPS per drive, etc., which is independent of the VM and only depends on the storage.

Eventually you want to match the actual storage (mirror, raidz, SSDs, HDDs, special/cache devices, etc.) with the workload inside the VM. This means you want different storages for different types of VMs. And therefore you don't really need a volblocksize per VM; in an ideal world you need a whole separate storage type per VM type.
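
To put a rough number on the raidz padding point (assuming ashift=12, i.e. 4K sectors, on a 3-disk raidz1): an 8K volblock needs 2 data sectors plus 1 parity sector = 3 sectors, and raidz rounds each allocation up to a multiple of parity+1 = 2 sectors, so 4 sectors (16K) get allocated for 8K of data. That is 50% of the allocation lost to parity and padding instead of the ~33% you would expect from the pool layout - which is exactly why this trade-off depends on the storage, not on the VM.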
 
You can add the same ZFS pool multiple times, with different volblocksize (and name and other options), to the Proxmox Storages. Maybe that makes life a little easier for you?
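
Something along these lines should work (storage names and pool are made up; blocksize is the zfspool storage option that sets the volblocksize for newly created disks):

    pvesm add zfspool tank-db   --pool tank/vmdata --blocksize 8k  --content images
    pvesm add zfspool tank-bulk --pool tank/vmdata --blocksize 64k --content images

Both entries point at the same pool; disks created on tank-db get an 8k volblocksize, disks on tank-bulk get 64k.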
 
So this also means that if a VM uses different disks on different ZFS pools, I may also use different volblocksizes - am I right?

E.g. an Ubuntu VM: root disk with a Postgres DB = 8k volblocksize, plus a 2nd disk used for SMB storage on a different PVE ZFS pool with 1M volblocksize.
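
Each virtual disk ends up as its own zvol (PVE names them vm-<vmid>-disk-<n> under the pool configured for that storage), so the result can be checked per disk, e.g. with made-up pool names and VMID:

    zfs get volblocksize ssdpool/vm-101-disk-0   # root disk with Postgres, expect 8k
    zfs get volblocksize hddpool/vm-101-disk-1   # SMB data disk, expect 1M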
 
I run different things on e.g. the same SSD.

1x Postgres DB
1x Ubuntu Webserver

On a 2nd PVE I run a Windows server.
C: for OS and integrated Windows SQL database
D: music
E: backup from users
F: media


C = 8k
D - F = 32k or higher

Would that make sense?
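
A hypothetical way to wire that up, assuming the same pool was added twice as storages tank-8k and tank-32k (VMID, storage names and sizes are made up):

    qm set 200 --scsi0 tank-8k:64     # C: OS + SQL database, 64 GiB on the 8k storage
    qm set 200 --scsi1 tank-32k:250   # D: music
    qm set 200 --scsi2 tank-32k:250   # E: user backups
    qm set 200 --scsi3 tank-32k:500   # F: media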
 
Running two VMs on a single drive does not make sense for an enterprise hypervisor like PVE, but it might work fine for you.
 
No, in general I run a bit more on my servers.
That was just meant as an example, if I understood the question right.
Oh sorry. I would expect the different services to be in separate containers/VMs, and I would expect different types of storage optimized for each type of service.
I'm not sure what you are asking or whether I'm qualified to answer it, given my very limited experience with various storage types.
 
In general yes: I run different services in different VMs, but sometimes also on the same storage, like my Postgres DB VM and my webserver VM.
Because of that, my guess was that it might be useful to use a different blocksize per VM instead of one general setting per storage.