Hello!
Is there a way to pass the block size of the backing block device through to the VM? If not, could this be escalated to a feature request?
In my case, I use a volblocksize of 16k on ZFS-based storage, because the pool is a 6-wide raidz2 (4 data/2 parity) with ashift=12, so a full stripe holds 4 × 4k = 16k of data.
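For reference, the stripe arithmetic above as a tiny Python sketch. The function name is made up for illustration, and it only models the simple "one sector per data disk" reasoning from this post, not ZFS's actual raidz allocation and padding rules.

```python
def raidz_data_stripe_bytes(width: int, parity: int, ashift: int) -> int:
    """Data bytes in one full stripe: one 2**ashift sector per data disk."""
    data_disks = width - parity
    if data_disks <= 0:
        raise ValueError("width must be larger than parity")
    return data_disks * (1 << ashift)


# 6-wide raidz2 with ashift=12: 4 data disks * 4096 bytes
print(raidz_data_stripe_bytes(width=6, parity=2, ashift=12))  # 16384
```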
Even with the default volblocksize=8k, the VM only sees a 4k discard granularity; everything else (sector size, minimum I/O size) is 512 bytes. That potentially leads to a lot of ZFS overhead, since a write smaller than volblocksize forces a read-modify-write of the whole block.
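To show what I mean by "what the VM sees": inside a Linux guest, these limits are exposed under /sys/block/<device>/queue/. Below is a minimal sketch that reads them. The function name and the `sys_block` parameter are my own, and attributes that cannot be read are simply reported as None.

```python
from pathlib import Path

# Queue attributes (in bytes) that the Linux block layer exposes per device.
QUEUE_ATTRS = (
    "logical_block_size",
    "physical_block_size",
    "minimum_io_size",
    "optimal_io_size",
    "discard_granularity",
)


def read_queue_limits(device: str, sys_block: str = "/sys/block") -> dict:
    """Return the queue limits of `device` as integers (None if unreadable)."""
    queue = Path(sys_block) / device / "queue"
    limits = {}
    for attr in QUEUE_ATTRS:
        try:
            limits[attr] = int((queue / attr).read_text().strip())
        except (OSError, ValueError):
            limits[attr] = None
    return limits
```

Calling `read_queue_limits("sda")` in the guest (the device name is just an example) returns a dict with one entry per attribute; `lsblk -t` shows much the same information.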
I know that QEMU supports a parameter to set the block size of a virtual hard disk, because I've used it with libvirt: `<blockio physical_block_size='8192'/>`. With that parameter set, I remember the guests correctly identifying 8k blocks as the smallest I/O unit and, e.g., parametrizing ext4 mount options accordingly (I moved from libvirt to Proxmox some time ago, so I can't confirm this on a running VM).
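As a sketch of the libvirt side, this generates the `<blockio>` element for a given physical block size. The helper name and the 512-byte logical default are my assumptions. As far as I can tell, libvirt maps this onto the `physical_block_size` and `logical_block_size` properties of the QEMU disk device, but I haven't verified how Proxmox could expose them.

```python
import xml.etree.ElementTree as ET


def blockio_element(physical: int, logical: int = 512) -> ET.Element:
    """Build a libvirt <blockio> element for the given block sizes in bytes."""
    for size in (physical, logical):
        # Block sizes have to be powers of two and at least 512 bytes.
        if size < 512 or size & (size - 1):
            raise ValueError(f"invalid block size: {size}")
    if physical < logical:
        raise ValueError("physical block size must not be smaller than logical")
    return ET.Element(
        "blockio",
        {
            "logical_block_size": str(logical),
            "physical_block_size": str(physical),
        },
    )


# Serialize the element for pasting into a <disk> definition.
print(ET.tostring(blockio_element(8192), encoding="unicode"))
```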
On a (un)related note: for thin-provisioned ZVOLs, I'd strongly recommend setting `error_policy='stop'` (again in libvirt-speak; I don't know the exact QEMU name for it), or at least giving the administrator an option to choose it. QEMU's default is to pass the ZVOL's no-space-available error through to the guest's kernel, which leads to kernel errors or filesystem corruption. "stop" suspends the guest instead, giving the administrator a chance to increase the quota/... and allow the guest to finish the write successfully.
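For completeness, a sketch of how that looks in libvirt: `error_policy` is an attribute of the disk's `<driver>` element. The helper name and the `raw` format default are assumptions for illustration. On the QEMU side the counterpart appears to be the `werror`/`rerror` drive options, but treat that as my guess rather than something I've tested on Proxmox.

```python
import xml.etree.ElementTree as ET

# Values libvirt documents for the error_policy attribute.
VALID_POLICIES = ("stop", "report", "ignore", "enospace")


def driver_element(error_policy: str = "stop", disk_format: str = "raw") -> ET.Element:
    """Build a libvirt <driver> element carrying the given error policy."""
    if error_policy not in VALID_POLICIES:
        raise ValueError(f"unknown error_policy: {error_policy}")
    return ET.Element(
        "driver",
        {"name": "qemu", "type": disk_format, "error_policy": error_policy},
    )


# Serialize the element for pasting into a <disk> definition.
print(ET.tostring(driver_element(), encoding="unicode"))
```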
Thanks!