[SOLVED] XFS for the VM on ZFS filesystem?

r0PVEMox

Member
Jun 21, 2020
13
1
8
35
Would there be any complications from using the XFS filesystem inside the VM while the Proxmox datacentre storage actually uses a ZFS filesystem? I know it's not best practice to use ZFS on ZFS, but I wonder whether that's true for XFS on ZFS as well?
 
XFS or ext4 should work fine. Neither is a copy-on-write (CoW) filesystem. CoW on top of CoW should be avoided: ZFS on top of ZFS, qcow2 on top of ZFS, btrfs on top of ZFS, and so on.
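As a toy illustration of that rule (my own encoding of it in Python; the names are just labels, not anything Proxmox checks for you):

    # Hypothetical helper encoding the rule above: avoid stacking CoW on CoW.
    COW_LAYERS = {"zfs", "btrfs", "qcow2"}       # CoW filesystems / image formats
    NON_COW_LAYERS = {"xfs", "ext4", "raw"}

    def layering_ok(guest_fs: str, host_storage: str) -> bool:
        """True if the combination avoids CoW on top of CoW."""
        return not (guest_fs.lower() in COW_LAYERS and host_storage.lower() in COW_LAYERS)

    print(layering_ok("xfs", "zfs"))    # True  -> fine
    print(layering_ok("qcow2", "zfs"))  # False -> avoid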
 
Besides performance, write amplification should be far worse, because both layers add heavy overhead and those overheads multiply rather than add up. That means a lot of additional SSD wear if SSDs are used.
In the case of ZFS on top of ZFS you also lose capacity, because both pools should keep around 20% of their capacity unused; CoW always needs some free space to operate.
Take a ZFS mirror: 50% of the raw capacity is lost to redundancy, and 20% of the remainder should be kept free, so only 40% of the raw capacity is usable. If you then run another ZFS pool on top of that, you again need to keep 20% free, leaving only 32% of the raw capacity usable.
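To put numbers on the compounding, a small sketch of that arithmetic (my own illustration following the post; the 3x per-layer write-amplification factors are purely illustrative, not measured):

    def usable_fraction(raw: float, mirror: bool, nested_pools: int, keep_free: float = 0.20) -> float:
        """Usable share of raw capacity after mirroring and per-pool free-space reserve."""
        usable = raw * (0.5 if mirror else 1.0)
        for _ in range(nested_pools):
            usable *= (1.0 - keep_free)          # each CoW pool keeps ~20% free
        return usable

    print(usable_fraction(1.0, mirror=True, nested_pools=1))  # 0.40 -> ZFS mirror alone
    print(usable_fraction(1.0, mirror=True, nested_pools=2))  # 0.32 -> ZFS on top of ZFS

    # Write amplification multiplies across layers rather than adding:
    wa_guest, wa_host = 3.0, 3.0                 # illustrative per-layer factors
    print(wa_guest * wa_host)                    # 9.0x total instead of 6.0x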
 

And are qcow2 images on NFS running on ZFS bad as well?
 
The advantage of copy-on-write (CoW) is resilience to filesystem corruption on abrupt power loss. When a file is modified, a new updated copy is written, then the pointers are updated and the old blocks are returned to the free pool. This approach ensures a file is never left partly updated. The alternative approach is to use journaling to try to correct partial updates at a later time if required.
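The same "write a new copy, then flip the pointer" idea can be sketched at application level. This is only an illustration of the principle (assuming Python and POSIX semantics), not how ZFS implements it internally:

    import os
    import tempfile

    def cow_style_update(path: str, new_data: bytes) -> None:
        """Write a full new copy, then atomically switch the directory entry to it --
        the 'never update in place' idea that CoW filesystems apply to every block."""
        dir_name = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=dir_name)
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(new_data)          # new copy, old file untouched
                tmp.flush()
                os.fsync(tmp.fileno())       # make sure the new copy is on disk
            os.replace(tmp_path, path)       # atomic pointer switch
            dir_fd = os.open(dir_name, os.O_DIRECTORY)
            try:
                os.fsync(dir_fd)             # persist the rename itself
            finally:
                os.close(dir_fd)
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise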

Yes, I know hardware should never suffer an abrupt power loss, and in a data centre this should be a very rare event. However, when used in standalone appliances in the field, power failures happen for all sorts of reasons. An appliance configured to minimise the probability of data corruption on power loss will maximise real-world device reliability, particularly in locations without continuous on-site IT staff or controlled physical access.

I appreciate:
  • Performance overhead occurs twice
  • RAM buffer requirements are doubled
  • Free space requirements occur twice (so 80% of 80% = 64% is actually usable), but that probably applies to all virtualised systems unless the host uses a dynamic filesystem for the VM. RAID adds to the loss in usable capacity, but that is independent of the chosen filesystems.
  • All of the above matter for applications where the filesystem limits performance, or where available RAM or disk space is significantly limited.
  • Write amplification is increased (although I suspect the write block size is padded up to the device block size only once), resulting in increased SSD wear.

So that leaves the risk of filesystem corruption on abrupt power loss. Is there any theoretical or measurable difference in filesystem resilience to corruption on power failure between
  • ZFS client on ZFS hypervisor (CoW on CoW)
  • ext4 client on ZFS hypervisor (journaling, metadata checksums & delayed allocation on ZFS/CoW)
  • XFS client on ZFS hypervisor (metadata logging & delayed allocation on ZFS/CoW)
I suspect any difference would depend on how the hypervisor buffers the guest's writes.
As for performance loss: in a CoW-on-CoW setup the guest filesystem will rarely modify files in place, so the hypervisor will rarely hit the CoW overhead of modifying files.
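On the buffering point: whether guest data survives a power cut depends largely on whether guest flush requests (fsync/write barriers) are honoured all the way down to the host pool. A minimal sketch, assuming a Linux guest, Python, and a hypervisor cache mode that passes flushes through:

    import os

    def durable_write(path: str, data: bytes) -> None:
        """Write and then fsync, so the guest kernel issues a flush request that a
        correctly configured hypervisor forwards to the host storage (e.g. ZFS)."""
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
        try:
            os.write(fd, data)
            os.fsync(fd)   # guest page cache -> virtual disk -> host pool
        finally:
            os.close(fd)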
 
Okay, wasted space ... all the other arguments are performance ones for me. So nothing that important if you need ZFS and e.g. qcow2 (tree-like snapshot structures).
 
I suspect any difference would depend on how the hypervisor buffers the guest's writes.
As for performance loss: in a CoW-on-CoW setup the guest filesystem will rarely modify files in place, so the hypervisor will rarely hit the CoW overhead of modifying files.

Not entirely true. Even in the hypothetical case where you never rewrite any block, you will get worse read performance (if your qcow2 file cannot fit in the backend storage cache), because you will need to read portions of the qcow2 file and its metadata, then other metadata, and only then the possibly uncached data from the qcow2 vDisk.

Even worse if your qcow2 file is large (hundreds of GiB) or when you use any DB engine, where most of the time you will have random block reads.
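To illustrate where the extra metadata reads come from: qcow2 maps guest offsets to file offsets through a two-level table (L1 and L2), so a random read may first need an L2 table lookup before it can touch the data cluster. A rough sketch of the index arithmetic, assuming the default 64 KiB cluster size (not the actual QEMU code):

    CLUSTER_SIZE = 64 * 1024                 # qcow2 default cluster size
    L2_ENTRIES = CLUSTER_SIZE // 8           # 8-byte entries per L2 table

    def qcow2_indices(guest_offset: int):
        """Which L1 entry, L2 entry and intra-cluster offset a guest read touches."""
        cluster_index = guest_offset // CLUSTER_SIZE
        l1_index = cluster_index // L2_ENTRIES
        l2_index = cluster_index % L2_ENTRIES
        offset_in_cluster = guest_offset % CLUSTER_SIZE
        return l1_index, l2_index, offset_in_cluster

    # A random 4 KiB read ~200 GiB into the virtual disk:
    print(qcow2_indices(200 * 1024**3 + 4096))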

Good luck/Bafta!
 
