[SOLVED] Thin Provisioning and Fragmentation

cshill

Member
May 8, 2024
Hi Everyone,

I've been reading several forum posts about thin provisioning and disk fragmentation, and I have several questions below.
1. What is the BEST METHOD for preventing fragmentation while keeping thin provisioning? Is there a specific file system and VM disk format that works best?

2. I originally wanted to work with only the ZFS file system and the raw format for the best speed and snapshot capability. I believe this is the ideal setup for Proxmox, but I'm open to suggestions. With this layout, I believe I just turn on the Discard option, and more recent PVE versions support it natively in Windows and Linux VMs, so blocks freed inside the VM are released on the host. This sounds like an effective way to reclaim unused space, but how does fragmentation come into it? Is there anything that helps prevent fragmentation?

3. As of now, my boss specifically wants the EXT4 file system. I suggested QCOW2 so that we can at least have snapshots for the developers. Originally these VMs were all thin provisioned and worked fine, but my boss wants thick provisioning. I have switched to thick, but now I have no snapshot option. With a DIR (directory) storage, how do I continue to use thin provisioning and prevent fragmentation?
 
1. What is the BEST METHOD for preventing fragmentation while keeping thin provisioning? Is there a specific file system and VM disk format that works best?
No, fragmentation is just how things get over time. If you don't want it, don't use thin provisioning.
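
To illustrate why thin provisioning tends to fragment, here is a minimal toy sketch in Python. It assumes a storage layer that hands out the next free physical block on first write, which is a simplification and not how any particular backend is implemented. The names `allocate_on_first_write` and `count_fragments` are made up for this example.

```python
def allocate_on_first_write(write_order):
    """Map each (disk, logical block) to the next free physical block.

    write_order is a list of (disk_name, logical_block) tuples in the
    order the writes arrive at the storage layer.
    """
    next_free = 0
    mapping = {}
    for disk, logical in write_order:
        blocks = mapping.setdefault(disk, {})
        if logical not in blocks:  # allocate only on the first write
            blocks[logical] = next_free
            next_free += 1
    return mapping


def count_fragments(blocks):
    """Count physically contiguous runs when reading logical blocks in order."""
    fragments = 0
    previous = None
    for logical in sorted(blocks):
        physical = blocks[logical]
        if previous is None or physical != previous + 1:
            fragments += 1
        previous = physical
    return fragments


# Thin: two VMs grow at the same time, so their first writes interleave.
thin_order = [(disk, block) for block in range(4) for disk in ("A", "B")]
# Thick: each disk is fully allocated up front, one after the other.
thick_order = [(disk, block) for disk in ("A", "B") for block in range(4)]

thin = allocate_on_first_write(thin_order)
thick = allocate_on_first_write(thick_order)

print("thin  A fragments:", count_fragments(thin["A"]))   # 4
print("thick A fragments:", count_fragments(thick["A"]))  # 1
```

This only models placement at the storage layer. The filesystem inside the guest can still fragment its own files regardless of how the disk was provisioned.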

I originally wanted to work with only the ZFS file system and the raw format for the best speed and snapshot capability. I believe this is the ideal setup for Proxmox, but I'm open to suggestions. With this layout, I believe I just turn on the Discard option, and more recent PVE versions support it natively in Windows and Linux VMs, so blocks freed inside the VM are released on the host. This sounds like an effective way to reclaim unused space, but how does fragmentation come into it? Is there anything that helps prevent fragmentation?
No, fragmentation cannot be prevented. It's a byproduct of thin provisioning, and it's also common on any filesystem when files change size or the filesystem is fairly full. With ZFS, if you have enough free space, you can rewrite your dataset via send/receive, which will defragment your data to some extent (depending on the circumstances).

You can just live with it and/or use SSDs, which have no seek penalty and so are barely affected by fragmentation.
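
As a rough sketch of the send/receive rewrite, the Python snippet below only prints the commands and does not run anything. The dataset names `tank/vmdata` and `tank/vmdata-rewritten`, the snapshot name, and the helper `plan_rewrite` are placeholders I made up. Only the `zfs snapshot`, `zfs send`, and `zfs receive` subcommands themselves are real.

```python
import shlex


def plan_rewrite(source, target, snapshot="rewrite"):
    """Return the shell commands for a send/receive rewrite without running them."""
    snap = f"{source}@{snapshot}"
    return [
        # Freeze a consistent point-in-time view of the source dataset.
        f"zfs snapshot {shlex.quote(snap)}",
        # Stream the snapshot into a new dataset, which writes the data out again.
        f"zfs send {shlex.quote(snap)} | zfs receive {shlex.quote(target)}",
    ]


# Hypothetical dataset names, for illustration only.
for command in plan_rewrite("tank/vmdata", "tank/vmdata-rewritten"):
    print(command)
```

The target needs enough free space for a full second copy, and how much contiguity you gain depends on how fragmented the pool's free space already is.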

As of now, my boss specifically wants the EXT4 file system. I suggested QCOW2 so that we can at least have snapshots for the developers. Originally these VMs were all thin provisioned and worked fine, but my boss wants thick provisioning. I have switched to thick, but now I have no snapshot option. With a DIR (directory) storage, how do I continue to use thin provisioning and prevent fragmentation?
There is no way to have snapshots on ext4 without using QCOW2. As a compromise, you can use ZFS with thick provisioning (by not enabling the thin provisioning checkbox). Your boss should not dictate requirements if he doesn't understand how the technology works.
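
To make the thin/thick difference on a directory storage concrete, here is a small Python sketch that compares a sparse file with a fully written one. It assumes a Linux filesystem that supports sparse files and reports `st_blocks` in 512-byte units. The helper names are made up, and writing every block is only a stand-in for a preallocated image, not what Proxmox does internally.

```python
import os
import tempfile


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks is in 512-byte units on Linux)."""
    return os.stat(path).st_blocks * 512


def make_thin(path, size):
    """Create a sparse file: the full apparent size, but no blocks written yet."""
    with open(path, "wb") as f:
        f.truncate(size)


def make_thick(path, size, chunk=1024 * 1024):
    """Write every block so the whole file is backed by real storage."""
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))  # random data, so it cannot be compressed away
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


size = 8 * 1024 * 1024  # 8 MiB keeps the demo quick
with tempfile.TemporaryDirectory() as directory:
    thin_path = os.path.join(directory, "thin.raw")
    thick_path = os.path.join(directory, "thick.raw")
    make_thin(thin_path, size)
    make_thick(thick_path, size)
    for name, path in (("thin", thin_path), ("thick", thick_path)):
        print(name, "apparent:", os.path.getsize(path),
              "allocated:", allocated_bytes(path))
```

Both files report the same apparent size. The thin one only gains allocated blocks as data is written, and that allocation on demand is where the fragmentation discussed above comes from.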
 
Hi Bill,

I appreciate your response. I had a whole game plan set to use several SSDs in a ZFS cluster and then another cluster utilizing large HDDs for data storage. My boss didn't like the idea of using a file system he wasn't familiar with; ZFS needs extra RAM, and it provides RAID without a RAID controller. The lack of a controller and the RAM overhead killed that proposal.

I think this is just how it's going to work out, unfortunately: an EXT4, thick-provisioned system with no snapshots.
 
ZFS cluster and then another cluster utilizing large HDDs for data storage.
What do you mean by cluster? Neither ZFS nor EXT4 is cluster-aware or cluster-capable.

My boss didn't like the idea of using a file system he wasn't familiar with; ZFS needs extra RAM, and it provides RAID without a RAID controller. The lack of a controller and the RAM overhead killed that proposal.
Almost all storage systems (SANs) nowadays do NOT use hardware RAID. They use their own software RAID implementation and cache heavily in RAM, in effect their own incarnation of ZFS. People fear what they don't understand, and I get that. ZFS has a steep learning curve.
 