It just occurred to me that the Reason why I'm getting such a MASSIVE Overhead on my Backup Server (which has 2 x RAIDZ-2 vdevs with 6 Disks each) is that the Default ZVOL volblocksize on Proxmox VE is (or probably "was" for a long Time, and many of my Virtual Disks are quite old) 8k, which results in a Space Utilization of 2.0x.
A RAIDZ2 on the other Hand seems to work very well with a ZVOL volblocksize of 16k, without any additional Overhead (Space Utilization of 1.0x).
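In Case somebody wants to check their own Setup, this is roughly how I verified the Overhead (the Dataset Name is just a Placeholder, adjust it to your own ZVOL):
Code:
# Show the volblocksize of a ZVOL (placeholder name)
zfs get volblocksize rpool/data/vm-100-disk-0

# Compare allocated vs. logical space: used / logicalused around 2.0x
# is the Padding Overhead I was seeing with 8k
zfs list -o name,volsize,used,logicalused,refreservation rpool/data/vm-100-disk-0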
Since at the very least I'd need to either set the new Value for volblocksize and do a zfs send | zfs receive (or possibly even create a new ZVOL with the correct Value, then dd it over to the new ZVOL), I was wondering if there was also something to be done inside the Guest VM, particularly with regards to mkfs.ext4. Of course this will require creating the new ZVOL with the correct volblocksize first, then formatting it with mkfs.ext4 correctly.
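Something like this is what I had in mind for the Conversion (Names and Sizes are only Placeholders for Illustration; as far as I know volblocksize cannot be changed on an existing ZVOL, so a new one is needed either Way):
Code:
# Create a new sparse ZVOL with the desired volblocksize (placeholder name/size)
zfs create -s -V 30G -o volblocksize=16k rpool/data/vm-100-disk-1

# Block-copy the old ZVOL onto the new one while the VM is powered off
dd if=/dev/zvol/rpool/data/vm-100-disk-0 of=/dev/zvol/rpool/data/vm-100-disk-1 bs=1M conv=sparse status=progress

# Then point the VM Config to the new Disk and destroy the old ZVOL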
I had a similar Issue/Experience at Work on a Windows 11 Workstation with HyperV Manager, where I had to do a full mkfs.ext4 -G 4096 /dev/targetpartition (HyperV uses HUGE Blocks: the Default is 32MB, the Minimum is 1MB, but it must be set manually when creating the Disk from the Powershell Terminal) and move all Data from one Virtual Disk to the other using rsync (plus restorecon -rv / for Fedora with SELinux). That seems to both reduce Fragmentation and make sure that consecutive Writes will "fall" onto the same Host "Block".
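For Reference, the Data Move itself was basically just the following (Mountpoints are Placeholders):
Code:
# Copy everything preserving Permissions, ACLs, xattrs and Hardlinks (placeholder mountpoints)
rsync -aAXH /mnt/old-disk/ /mnt/new-disk/

# Relabel for SELinux (Fedora) after switching to the new Disk
restorecon -rv /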
I'm not fully sure on the Math of the -G 4096 Choice though, since 4096 (Number of Groups) x 4096 (Blocksize) = 16777216 B = 16MB. I previously just used the default Install Parameters, but that caused the Virtual Disk to be 2.0-2.5x bigger on the Windows Host than from within the VM! When specifying mkfs.ext4 -G 4096 /dev/targetpartition I could however indeed observe that the Overhead went down from 2.0-2.5x to basically none.

Kinda similar Situation here. I'm just thinking that maybe I would need to also tell EXT4 which Blocks to "expect" and write to consecutively, in order to reduce Fragmentation as well as Overhead/Write Amplification on the Proxmox Host SSD.
The Default EXT4 Settings are defined in /etc/mke2fs.conf:
Code:
[defaults]
    base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
    default_mntopts = acl,user_xattr
    enable_periodic_fsck = 0
    blocksize = 4096
    inode_size = 256
    inode_ratio = 16384

[fs_types]
    ext3 = {
        features = has_journal
    }
    ext4 = {
        features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
    }
    small = {
        blocksize = 1024
        inode_ratio = 4096
    }
    floppy = {
        blocksize = 1024
        inode_ratio = 8192
    }
    big = {
        inode_ratio = 32768
    }
    huge = {
        inode_ratio = 65536
    }
    news = {
        inode_ratio = 4096
    }
    largefile = {
        inode_ratio = 1048576
        blocksize = -1
    }
    largefile4 = {
        inode_ratio = 4194304
        blocksize = -1
    }
    hurd = {
        blocksize = 4096
        inode_size = 128
        warn_y2038_dates = 0
    }
If I now set volblocksize = 16k for all my Virtual Disks and do the Conversion, which Settings would you recommend for inside the Guest EXT4 Filesystem in order to:
- Reduce Write Amplification
- Improve Performance
- Reduce Host Fragmentation
- Reduce Guest Fragmentation
- Reduce Space Overhead
- Other ?
Surely there are some "Conflicting Goals" listed above, but what would be a "Reasonable" Configuration?
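Just to illustrate the Direction I was thinking of, something like this (the stride / stripe_width Values are only my Guess for matching 4k EXT4 Blocks to a 16k volblocksize, so please correct me if that Math is off):
Code:
# 4k Blocks; stride = 16k / 4k = 4 Blocks, stripe_width = 4 so one "Stripe" matches one volblock
# Placeholder Device Name, run from inside the Guest VM
mkfs.ext4 -b 4096 -E stride=4,stripe_width=4 /dev/vda1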
For Reference, this is what I currently have on one of my old VMs:
Code:
dumpe2fs 1.47.0 (5-Feb-2023)
Filesystem volume name: <none>
Last mounted on: /
Filesystem UUID: e5d187b4-49d9-419e-a521-200948759e55
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1966080
Block count: 7864320
Reserved block count: 393216
Free blocks: 1230720
Free inodes: 1884064
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 1024
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Tue Feb 12 19:55:41 2019
Last mount time: Sun Nov 24 09:53:15 2024
Last write time: Sun Nov 24 09:53:14 2024
Mount count: 92
Maximum mount count: -1
Last checked: Tue Feb 12 19:55:41 2019
Check interval: 0 (<none>)
Lifetime writes: 992 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
First orphan inode: 1835082
Default directory hash: half_md4
Directory Hash Seed: ebe4e82c-fd35-4929-adc8-fb06e13c0d3e
Journal backup: inode blocks
Checksum type: crc32c
Checksum: 0xe2431bf9
Journal features: journal_incompat_revoke journal_64bit journal_checksum_v3
Total journal size: 128M
Total journal blocks: 32768
Max transaction length: 32768
Fast commit length: 0
Journal sequence: 0x007eaef9
Journal start: 28283
Journal checksum type: crc32c
Journal checksum: 0xf4969f2a