Questions regarding blocksize and write amplification

Eizock

New Member
Nov 28, 2020
Hello,

I have been running two different ZFS pool setups on Proxmox, with VMs on them:
  1. RAID-Z2 (6 disks, ashift=12, volblocksize=64k) <-- virtual disk (reports a blocksize of 512 in the VM) <-- NTFS/ext4 (blocksize 4k)
  2. Mirror (6 disks, ashift=12, volblocksize=8k (default)) <-- virtual disk (reports a blocksize of 512 in the VM) <-- NTFS/ext4 (blocksize 4k)
After searching the forums, I only found some inconclusive information:
  1. ZFS size / allocated difference?
  2. Improve write amplification?
  3. ashift, volblocksize, clustersize, blocksize
Searching the internet, I then found some more helpful write-ups:
  1. ZFS RAIDZ stripe width, or: How I Learned to Stop Worrying and Love RAIDZ
  2. RAID-Z parity cost
  3. Please help me understand ZFS space usage
I was able to answer most of my questions, but a few are remaining:
  1. Why does the virtual disk always report a blocksize of 512 bytes?
  2. Does this not matter because of some trickery in VirtIO/KVM?
  3. Why is the virtual drive not using the blocksize set on the storage it's on?
  4. Why is there no setting on a per VM/per virtual drive basis for blocksize?
  5. Why has an 8k volblocksize been chosen as the default for Proxmox (potentially risking +200% space usage on ZFS, according to RAID-Z parity cost)?
  6. Is there a good guideline anywhere for tuning the volblocksize of ZFS on Proxmox?
  7. Can I change the blocksize of an ext4 filesystem to 64k, or is that considered unstable?
  8. Is the parity/padding/space waste on mirrors nonexistent?
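To make question 5 concrete, here is a minimal Python sketch of the RAID-Z allocation rule as I understand it from the RAID-Z parity cost article: each block gets parity sectors per stripe row, and the total is padded up to a multiple of (parity + 1) sectors. The function name and parameters are my own, compression is ignored, and this is an assumption about how the allocation works, not a statement of what Proxmox or ZFS actually do internally.

```python
def raidz_allocated_sectors(volblocksize, ashift, ndisks, nparity):
    """Sectors allocated for one block on a RAID-Z vdev (hypothetical helper).

    Assumes no compression and the rule: data sectors + parity per stripe
    row, rounded up to a multiple of (nparity + 1).
    """
    sector = 1 << ashift                      # ashift=12 -> 4096-byte sectors
    data = -(-volblocksize // sector)         # ceil: data sectors per block
    data_per_row = ndisks - nparity           # data columns in one stripe row
    rows = -(-data // data_per_row)           # ceil: stripe rows needed
    total = data + rows * nparity             # data plus parity sectors
    pad_unit = nparity + 1
    return -(-total // pad_unit) * pad_unit   # pad to multiple of nparity + 1


def space_ratio(volblocksize, ashift, ndisks, nparity):
    """Allocated space divided by the logical data size of one block."""
    sector = 1 << ashift
    data = -(-volblocksize // sector)
    return raidz_allocated_sectors(volblocksize, ashift, ndisks, nparity) / data


# 6-disk RAID-Z2 with ashift=12, as in setup 1
ratio_8k = space_ratio(8 * 1024, 12, 6, 2)    # default volblocksize
ratio_64k = space_ratio(64 * 1024, 12, 6, 2)  # the value I am using
print(f"8k: {ratio_8k:.2f}x, 64k: {ratio_64k:.2f}x")
```

If this rule is right, an 8k block on my 6-disk RAID-Z2 takes 6 sectors for 2 sectors of data (3x, which is the +200% from question 5), while a 64k block takes 24 sectors for 16 sectors of data (1.5x, the nominal RAID-Z2 cost for 6 disks).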
Some more observations (with more questions :D):
  1. On setup 1, I have a Windows VM that reports 5.41T used, which should be around 4.92 TiB, but ZFS reports 5.39 TiB. That is almost half a TiB more than it should be. Discard is enabled. Is it because there are 3 different blocksizes involved?
  2. On setup 2, there is a Windows VM where, when I write ~5 GiB sequentially, ZFS reports writes of around 15 GiB, so a write amplification of 3, even though I could not find any reports of write amplification problems on mirrors. Is it again because of the 3 different blocksizes?
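For reference, this is the arithmetic behind the two observations as a small Python sketch. It assumes that the 5.41T shown in the VM is decimal terabytes (10^12 bytes); the variable names are just my own labels for the numbers above.

```python
TB = 10**12    # decimal terabyte (assumed unit of the guest's 5.41T)
TIB = 2**40    # binary tebibyte, the unit ZFS reports

# Observation 1: guest usage vs. what ZFS reports for the zvol
guest_used_tib = 5.41 * TB / TIB
zfs_used_tib = 5.39
extra_tib = zfs_used_tib - guest_used_tib

# Observation 2: bytes written by ZFS per byte written in the guest
guest_written_gib = 5
zfs_written_gib = 15
write_amplification = zfs_written_gib / guest_written_gib

print(f"guest: {guest_used_tib:.2f} TiB, extra on ZFS: {extra_tib:.2f} TiB")
print(f"write amplification: {write_amplification:.1f}x")
```

So the gap on setup 1 is roughly 0.47 TiB, and the factor on setup 2 is 3.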
Maybe I am being an idiot and missing something obvious, but any hints/ideas/links/pointers are welcome.
Cheers
 