I have configured Proxmox from scratch multiple times, and storage is my weak spot. I always try
to set the best possible options for the use case at hand, yet I still find myself unsure which
those best options are when it comes to storage.
Once more, the disks are 512-byte native (10k SAS3/12G drives), so I created the HH3VM pool with ashift=9.
I should mention that this pool is dedicated to VM storage.
Afterwards I enabled Thin provision from the GUI. The block size defaults to 8K, but my personal notes
(compiled from extended forum discussions, YouTube tutorials, etc.) say it would be better to set the block size
to 16K (supposedly it matches the NTFS cluster size of Windows). On top of that, I have written down that this option cannot be changed
afterwards (or rather, it can be changed, but the change makes no difference to already created VMs).
-Does this mean I had to create the RAID10 ZFS storage from the CLI, so that I could pass the extra block size option to the command? (My rough understanding of the CLI route is sketched below.)
-Do I even need to change it, given that the guests will be 5-6 Windows Server VMs?
-Any other option I should consider?
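For reference, this is roughly how I understood the CLI route would look. The device names are placeholders for my actual disks, and I am not sure this is the recommended way, so please correct me:
Code:
# Create the RAID10 pool (two mirrored pairs) with ashift=9 for 512n disks
zpool create -o ashift=9 HH3VM mirror /dev/sdX /dev/sdY mirror /dev/sdZ /dev/sdW

# Register it as a Proxmox storage with thin provisioning and a 16K block size
pvesm add zfspool HH3VM --pool HH3VM --content images,rootdir --sparse 1 --blocksize 16k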
Below follow all the options of the (still empty) pool I have created.
zpool get all HH3VM
Code:
HH3VM size 4.34T -
HH3VM capacity 0% -
HH3VM altroot - default
HH3VM health ONLINE -
HH3VM guid 10.......08....79... -
HH3VM version - default
HH3VM bootfs - default
HH3VM delegation on default
HH3VM autoreplace off default
HH3VM cachefile - default
HH3VM failmode wait default
HH3VM listsnapshots off default
HH3VM autoexpand off default
HH3VM dedupratio 1.00x -
HH3VM free 4.34T -
HH3VM allocated 220K -
HH3VM readonly off -
HH3VM ashift 9 local
HH3VM comment - default
HH3VM expandsize - -
HH3VM freeing 0 -
HH3VM fragmentation 0% -
HH3VM leaked 0 -
HH3VM multihost off default
HH3VM checkpoint - -
HH3VM load_guid 9992307875857252571 -
HH3VM autotrim off default
HH3VM compatibility off default
HH3VM feature@async_destroy enabled local
HH3VM feature@empty_bpobj enabled local
HH3VM feature@lz4_compress active local
HH3VM feature@multi_vdev_crash_dump enabled local
HH3VM feature@spacemap_histogram active local
HH3VM feature@enabled_txg active local
HH3VM feature@hole_birth active local
HH3VM feature@extensible_dataset active local
HH3VM feature@embedded_data active local
HH3VM feature@bookmarks enabled local
HH3VM feature@filesystem_limits enabled local
HH3VM feature@large_blocks enabled local
HH3VM feature@large_dnode enabled local
HH3VM feature@sha512 enabled local
HH3VM feature@skein enabled local
HH3VM feature@edonr enabled local
HH3VM feature@userobj_accounting active local
HH3VM feature@encryption enabled local
HH3VM feature@project_quota active local
HH3VM feature@device_removal enabled local
HH3VM feature@obsolete_counts enabled local
HH3VM feature@zpool_checkpoint enabled local
HH3VM feature@spacemap_v2 active local
HH3VM feature@allocation_classes enabled local
HH3VM feature@resilver_defer enabled local
HH3VM feature@bookmark_v2 enabled local
HH3VM feature@redaction_bookmarks enabled local
HH3VM feature@redacted_datasets enabled local
HH3VM feature@bookmark_written enabled local
HH3VM feature@log_spacemap active local
HH3VM feature@livelist enabled local
HH3VM feature@device_rebuild enabled local
HH3VM feature@zstd_compress enabled local
HH3VM feature@draid enabled local
-Why are listsnapshots and autoexpand off? Isn't it useful to know where these snapshots are and how many there are?
-As for autoexpand, I don't see how having it off helps anywhere. Is this option used for something else?
-Is autotrim off because no SSDs were detected? If that is the case, why is this option
also off for the rpool, which is based on mirrored SSDs?
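From what I have read, autotrim is simply off by default regardless of the disk type, so it would have to be enabled by hand on the SSD pool. I assume something like this (please correct me if not):
Code:
# Enable automatic TRIM on the SSD-backed root pool
zpool set autotrim=on rpool

# Or run a one-off manual TRIM instead of the automatic one
zpool trim rpool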
.....and we move on to the Datasets/zvols (zvols here since, as I already mentioned, this storage is only for VMs).
Since every zpool creation also creates a corresponding root dataset, its properties are
set automatically and are not always the correct ones. Below are mine:
zfs get all HH3VM
Code:
HH3VM type filesystem -
HH3VM creation Thu Mar 17 13:04 2022 -
HH3VM used 225K -
HH3VM available 4.22T -
HH3VM referenced 24K -
HH3VM compressratio 1.00x -
HH3VM mounted yes -
HH3VM quota none default
HH3VM reservation none default
HH3VM recordsize 128K default
HH3VM mountpoint /HH3VM default
HH3VM sharenfs off default
HH3VM checksum on default
HH3VM compression on local
HH3VM atime on default
HH3VM devices on default
HH3VM exec on default
HH3VM setuid on default
HH3VM readonly off default
HH3VM zoned off default
HH3VM snapdir hidden default
HH3VM aclmode discard default
HH3VM aclinherit restricted default
HH3VM createtxg 1 -
HH3VM canmount on default
HH3VM xattr on default
HH3VM copies 1 default
HH3VM version 5 -
HH3VM utf8only off -
HH3VM normalization none -
HH3VM casesensitivity sensitive -
HH3VM vscan off default
HH3VM nbmand off default
HH3VM sharesmb off default
HH3VM refquota none default
HH3VM refreservation none default
HH3VM guid 5...01...8792037... -
HH3VM primarycache all default
HH3VM secondarycache all default
HH3VM usedbysnapshots 0B -
HH3VM usedbydataset 24K -
HH3VM usedbychildren 201K -
HH3VM usedbyrefreservation 0B -
HH3VM logbias latency default
HH3VM objsetid 54 -
HH3VM dedup off default
HH3VM mlslabel none default
HH3VM sync standard default
HH3VM dnodesize legacy default
HH3VM refcompressratio 1.00x -
HH3VM written 24K -
HH3VM logicalused 79K -
HH3VM logicalreferenced 12K -
HH3VM volmode default default
HH3VM filesystem_limit none default
HH3VM snapshot_limit none default
HH3VM filesystem_count none default
HH3VM snapshot_count none default
HH3VM snapdev hidden default
HH3VM acltype off default
HH3VM context none default
HH3VM fscontext none default
HH3VM defcontext none default
HH3VM rootcontext none default
HH3VM relatime off default
HH3VM redundant_metadata all default
HH3VM overlay on default
HH3VM encryption off default
HH3VM keylocation none default
HH3VM keyformat none default
HH3VM pbkdf2iters 0 default
HH3VM special_small_blocks 0 default
It confuses me that HH3VM (the root dataset of the HH3VM zpool) has all these file-system attributes, since I expected block-level storage here.
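If I understand it correctly, the root of a pool is always an ordinary filesystem dataset (the output above even reports type=filesystem), and the actual block-level zvols only appear underneath it once VM disks are created. I assume this would show the distinction:
Code:
# The pool root is a filesystem dataset...
zfs get type HH3VM

# ...while the per-VM disks (once created) show up as volumes under it
zfs list -t volume -r HH3VM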
Other options here that might make a difference would be:
zfs set atime=off       Disables updating the access time attribute on every file that is accessed; this can double IOPS.
zfs set relatime=on     On the other hand, if some apps need access times in order to work and you have them disabled,
                        the app will malfunction. In such a case, leave atime on together with relatime.
                        I don't know what to choose here. Apps will use the atime handling of the OS inside the VM, not the
                        underlying storage option.
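My inclination would therefore be to turn atime off at the pool root, so that new datasets inherit it (and if this pool only ever holds zvols, atime should hardly matter anyway, as far as I can tell). A sketch of what I mean:
Code:
# Disable access-time updates pool-wide; children inherit this setting
zfs set atime=off HH3VM

# Alternatively, keep atime but only update it when strictly needed
zfs set atime=on HH3VM
zfs set relatime=on HH3VM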
zfs set xattr=sa        According to this statement, which seems logical to me: "zvols don't have an xattr property,
                        as there are no xattrs that could be stored". Why, then, is there a value for this option here,
                        and why is it set to on?
                        Also, by definition, xattr=sa stores the Linux extended attributes in the inodes, so the file system
                        stops writing them as tiny separate files and writes them directly to the inodes.
                        Does this only make a difference for Linux VMs? As I mentioned above, only Windows VMs will be used.
zfs set recordsize=16K  The recordsize value should be determined by the type of data
                        on the file system: 16K for VM images and databases (or an exact match to the workload's block size),
                        or 1M for collections of 5-9MB JPG files, GB+ movies, etc.
                        If you are unsure, the default of 128K is good enough for all-round
                        mixes of file sizes.
                        Should I change the default 128K to 16K before I start creating VMs? (Again, Windows VMs.)
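One thing I think I have understood from the docs: recordsize only applies to filesystem datasets, while the VM disks themselves are zvols and use volblocksize, which is exactly what the Proxmox "Block size" field sets and which is fixed at creation time. So roughly:
Code:
# recordsize affects filesystem datasets only (and can be changed anytime,
# but only applies to data written afterwards)
zfs set recordsize=16K HH3VM

# VM disks are zvols and take their block size at creation time instead
# (hypothetical zvol name; Proxmox normally creates these automatically)
zfs create -V 32G -o volblocksize=16k HH3VM/vm-100-disk-0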
acltype=posixacl        (default acltype=off) I don't even know about this one. Anyone with further info?
primarycache=all | metadata | none      Controls what is cached in the primary cache (ARC).
secondarycache=all | metadata | none    Controls what is cached in the secondary cache (L2ARC).
                                        If the property is set to all, both user data and metadata are cached.
                                        If it is set to none, neither user data nor metadata is cached.
                                        If it is set to metadata, only metadata is cached.
                                        The default value is all.
                                        The primarycache option does have an impact on performance,
                                        but not for every workload. In some tests we see no difference at all,
                                        while in others it provides more than a 200% boost.
                                        With all this information, you might be lost as to whether it is good
                                        to tune primarycache and which option is better for you.
                                        Here, as a rule of thumb: set primarycache=metadata for all VMs and LXCs,
                                        and only for very specific workloads set primarycache=all.
                                        What about secondarycache?
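My understanding is that secondarycache only matters when the pool has an L2ARC cache device, which mine does not. The commands themselves would presumably be:
Code:
# Cache only metadata in ARC for this pool's datasets/zvols
zfs set primarycache=metadata HH3VM

# secondarycache is only relevant when an L2ARC cache device exists
zfs set secondarycache=none HH3VM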
zfs set compression=lz4 We are OK with that, since it is the default value.
What do you think?
Thank you