Hi,
I used this setup...
pool (ashift=12) -> zvol (volblocksize=8k, raw) -> ext4 (blocksize=512, no clustersize)
...and got a write amplification of about 10x from virtual drive writes in the guest to physical drive writes on the host.
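For reference, this is how the sizes of the individual layers can be checked (the zvol dataset name below is just an example, not my real one):

zpool get ashift MyPool                      # pool ashift property (may show 0 if it was auto-detected)
zfs get volblocksize MyPool/vm-100-disk-0    # zvol block size (dataset name is an example)
lsblk -o NAME,LOG-SEC,PHY-SEC                # inside the guest: logical/physical sector sizes the VM sees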
I did some research and found this chart showing parity/padding overhead in percent for different numbers of drives in a raidz1 and different volblocksizes.
The chart shows that a raidz1 of 5 drives (at ashift=12) with the Proxmox default 8k volblocksize (like my setup now) causes an overhead of +100%. So in this case raidz1 isn't saving any space compared to just using a ZFS mirror.
At a volblocksize of 32K the overhead for 5 drives (at ashift=12) is minimal, at just 25%.
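If I understand the raidz allocation rules correctly, the chart numbers work out like this (my own rough calculation, assuming ashift=12, i.e. 4K sectors, and a 5-disk raidz1):

8k volblocksize:  2 data sectors + 1 parity sector = 3 sectors, padded up to a multiple of (parity + 1) = 2, so 4 sectors = 16k allocated for 8k of data -> +100% overhead
32k volblocksize: 8 data sectors + 2 parity sectors (one per row of 4 data sectors) = 10 sectors, already a multiple of 2, so 40k allocated for 32k of data -> +25% overhead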
So I should recreate the zvols with volblocksize of 32k instead of 8k.
I did this by:
1.) Proxmox GUI -> Datacenter -> Storage -> MyPool -> changed "Block Size" from "8k" to "32k"
2.) This only applies to newly created zvols, so I backed up my VM and added a new virtual hard drive of the same size to it
3.) I changed the boot order to CD first, mounted a Debian live ISO and booted that live system
4.) I used "sudo lsblk" to find out which is the old (sda) and which is the new (sdb) virtual hard drive
5.) I used "sudo dd conv=sparse if=/dev/sda of=/dev/sdb bs=32K status=progress" to copy the whole contents (including partition tables, bootloader, ...) from the old virtual hard disk (8k volblocksize) to the new one (32k volblocksize)
6.) I detached the old virtual hard drive from the VM (because both drives now have the same UUIDs) and set the boot order to boot first from the new virtual hard disk (the commands I used to double-check the result are right below)
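On the host I verified the new zvol like this (the dataset names are just examples, yours will differ):

zfs get volblocksize,used,referenced MyPool/vm-100-disk-1   # new zvol, should report 32K
zfs get volblocksize,used,referenced MyPool/vm-100-disk-0   # old zvol, still 8K, kept as a fallback for now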
Everything looks like it works fine, and according to "zfs list" the new zvol only uses about 1/3 of the old size under "USED" and about 1/2 under "REFER". Write amplification also looks a little (about 10%) lower.
Do I need to change the blocksize and/or clustersize of the ext4 partitions inside the VMs too? Or is it no problem that the VM writes 512B blocks while the zvol uses 32K and the physical hard disks 4K, because KVM does some virtualization magic and converts it somehow?
What would be the right blocksize/clustersize for a virtual ext4 partition?
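Just for context, this is how I would check the current values and (on a freshly created filesystem) set them inside the guest. Device names are just examples, and as far as I know the ext4 block size can't exceed the 4K page size on x86 anyway:

sudo tune2fs -l /dev/sda1 | grep -E "Block size|Cluster size"   # current ext4 block/cluster size
sudo mkfs.ext4 -b 4096 -O bigalloc -C 32768 /dev/sdb1           # example only: 4K blocks with 32K bigalloc clusters (wipes the partition)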
Also, the chart doesn't mention native encryption or lz4 compression, both of which I use. Should I use an even higher volblocksize like 64k instead of 32k so that encryption/compression work better, or is 32k fine there too?
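In case it helps, this is how I check what compression actually achieves on the current zvol (dataset name is again just an example):

zfs get compression,compressratio,encryption MyPool/vm-100-disk-0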