volblocksize of zvol and LBA of virtio drive?

Dunuin

Hi,

The zvols for my VMs live on a raidz1 pool and are set to a volblocksize of 32K, so ZFS uses 32KB blocks to store them.
My VMs are using VirtIO SCSI, and these virtual drives are reported with a 512B LBA size.

How does the virtualization handle this so that data from the guest gets stored on the pool?
Will virtio waste capacity because it writes a 32KB block on the pool for every 512B block the virtual drive writes, or does it do some kind of conversion so that 64 of the 512B blocks are combined into one 32KB block that ZFS can store on the zvol?
If it does such a conversion, is it fine to just use the guest OS defaults, or should I change the block size of the guests' filesystems so it matches the 32KB?
What is good practice to avoid overhead, bad padding and write amplification?

Right now my setup looks like this:
SSDs (logical/physical sector size: 512B/4K) <-- ZFS pool (ashift: 12 so 4K) <-- zvol (volblocksize: 32K) <-- virtio SCSI virtual drive (LBA: 512B) <-- ext4 partitions (block size 4K)
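For reference, here is how each host-side layer can be checked (a quick sketch; "rpool" and the zvol path are just example names matching the usual Proxmox layout, adjust to your pool and VM disk):

# logical/physical sector size the SSDs report to the kernel
lsblk -o NAME,LOG-SEC,PHY-SEC
# ashift of the pool (12 means 2^12 = 4K)
zpool get ashift rpool
# volblocksize of the zvol backing the VM disk
zfs get volblocksize rpool/data/vm-100-disk-0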

I would think that so many layers with different block sizes would cause a lot of overhead due to padding.
 
Usually, AFAIU, the problem arises when you use a smaller block size higher up in the stack, as that can cause write amplification.

For example, in the stack you describe you are effectively writing 4K blocks onto the 32K blocks of the zvol. So for a 4K write, ZFS needs to read 32K, change what is needed, and then write the 32K block down again.
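To put rough numbers on it (a back-of-the-envelope estimate, ignoring compression and caching): a random 4K write into a 32K volblocksize zvol means reading 32K, modifying 4K and writing 32K back, so roughly 32/4 = 8x write amplification for that write, plus an extra 32K read if the block isn't already in the ARC. Larger sequential writes suffer much less, because consecutive 4K writes to the same 32K block get coalesced within a transaction group before hitting the disks.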

The 512B LBA that the disk reports shouldn't be much of a problem, because the guest currently writes in 4K blocks (a multiple of 512B).
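Inside the guest you can verify both the reported sector sizes and the filesystem block size, e.g. (assuming the virtual disk shows up as /dev/sda with an ext4 partition /dev/sda1):

cat /sys/block/sda/queue/logical_block_size    # 512 for the virtio-scsi disk
cat /sys/block/sda/queue/physical_block_size
tune2fs -l /dev/sda1 | grep 'Block size'       # 4096 for a default ext4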

Though someone with more insight might correct me here :)
 
I have virtual disks (VirtIO SCSI) on zvols with a block size of 4K, but gdisk reports them with a sector size (logical/physical) of 512/512 bytes.
I once saw that it is possible to have QEMU report the virtual SCSI disks as 4K using command-line options like physical_block_size.
Is there a way Proxmox can do this (automatically), so that the backing store and the virtual disks have the same block/sector size?
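For reference, the QEMU knobs this refers to are the logical_block_size and physical_block_size properties of the scsi-hd device, roughly like this on the raw QEMU command line (just a sketch; the drive id drive-scsi0 mirrors the naming Proxmox normally generates):

-device scsi-hd,drive=drive-scsi0,logical_block_size=4096,physical_block_size=4096

In theory this could be passed in via the args: line of /etc/pve/qemu-server/<VMID>.conf, but since Proxmox already generates its own -device entry for each disk it is easy to end up with conflicting definitions, so treat that as experimental.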
 
I read about changing the block size of ext4 or XFS so it's not smaller than my 32K volblocksize, but it looks like Linux can't handle filesystems with a block size greater than the page size. My page size is 4K, so it won't let me mount filesystems with a block size larger than 4K.
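That matches the usual limitation: on most current kernels, a filesystem whose block size exceeds the memory page size can be created but not mounted. A quick way to see it (a sketch; /dev/sdX1 is a placeholder):

getconf PAGE_SIZE                      # 4096 on typical x86_64 kernels
mkfs.xfs -f -b size=32768 /dev/sdX1    # creating it works...
mount /dev/sdX1 /mnt                   # ...but mounting on a 4K-page kernel is refused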
 
Is there a way Proxmox can do this (automatically), so that the backing store and the virtual disks have the same block/sector size?
I did check our codebase, and we do not support adding this as a parameter for a VM disk.

Feel free to create an enhancement request at our bugtracker: https://bugzilla.proxmox.com
 
I'm in a similar situation. I created a separate enhancement ticket 3282.

As a side note, 4Kn seems to be broken here and there. I'm getting all kinds of sector access issues, so I'm reverting back to 512e.
 