Feature Request: Creating & Managing Datasets

Since it is still related to the topic of this thread: after reading all of https://www.freebsd.org/cgi/man.cgi?query=zfs&sektion=8&manpath=FreeBSD+7.0-RELEASE on datasets, and with my head melting from the terminology/options/parameters/etc., please enlighten me on this use case for a dataset.

man zfsprops and man zfs-create might be more appropriate, since those match your local system.

Assuming I have created a zpool (zvolume) of 2 TB named HH, and inside it I need to create a dataset for Proxmox named bckup with a capacity of 500 G, to have space for VM/LXC backups.
So by running
zfs create -o mountpoint=/...... HH/bckup (I typed ...... on purpose, because if I don't specify a mountpoint it will automatically be mounted at HH/bckup, right? Do I have the option to mount it at whatever path I choose, even /mnt/backups, which links to another zpool - the main one?)

if you leave the mountpoint out, it will depend on the parent dataset(s) (properties are inherited if not set explicitly). the default mountpoint is /$POOL/$DATASET, so in your case /HH/bckup. but if, for example, you set the mountpoint of HH/bckup to /mnt/backups, and then add another dataset HH/bckup/foobar without specifying a mountpoint, it will be mounted under /mnt/backups/foobar.
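for example, a quick sketch with your names (the /mnt/backups path is just the hypothetical example from your question):
zfs create -o mountpoint=/mnt/backups HH/bckup
zfs create HH/bckup/foobar        # no mountpoint given, inherits -> mounted at /mnt/backups/foobar
zfs get -r mountpoint HH/bckup    # shows which values are set locally and which are inherited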

zfs set quota=500G HH/bckup (but how do I force that dataset not to exceed 85% of that space? I can't find the command for that)
zfs set compression=on HH/bckup (or, better, from scratch set compression=lz4, which is the default value)

why would you want to set a quota, and then enforce another limit below that quota? that is what the quota is for ;)

....and the last questions follow:
- Do I need to (after dataset creation) create separate folders inside for daily/weekly/monthly backups, or will it be done automatically by Proxmox when I present that bckup space to it? I mean, there has got to be a way for things inside backups to be organized, and my assumption is that Proxmox will create default folders for that. Right? Do we know how these paths would be named?

PVE will create directories below the storage mountpoint like 'dump', 'images', .. there are no separate directories for daily/weekly/monthly backups, pruning works just by thinning out ALL backups.
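for reference, a directory storage typically ends up looking something like this (the /mnt/backups mountpoint is just the example from above; the exact set of directories depends on which content types the storage is configured for):
/mnt/backups/dump/             # vzdump backup archives
/mnt/backups/images/           # VM disk images
/mnt/backups/template/iso/     # ISO images
/mnt/backups/template/cache/   # container templates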

- Even though it explains a way to run all the above options as one command line, I didn't find an example of doing that; instead it creates the dataset and then begins issuing the set subcommand. Should I use -o for each option, like
zfs create -o mountpoint=.... -o quota=.... -o compression=..... HH/bckup

you can do that, the result will be the same.
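e.g., filling in the values you mentioned (the mountpoint path is hypothetical), it would look like this and be equivalent to a create followed by separate zfs set calls:
zfs create -o mountpoint=/mnt/backups -o quota=500G -o compression=lz4 HH/bckup
zfs get mountpoint,quota,compression HH/bckup    # verify the properties afterwards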

- Is the default recordsize during dataset creation 128k? (Is there a better value for a backup dataset on disks with 512 b logical and 4096 b physical sector size?)

128k should be fine for most cases (it's just an upper limit, smaller files won't waste that much space!). you can enable larger dnodes if your workload benefits from it (a dataset solely used for backups might, since it has less metadata overhead when mostly storing large files), but then your pool might be incompatible with older ZFS releases.
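a sketch of what checking/tuning that could look like (both knobs are optional and only worth touching if your backup workload actually benefits):
zfs get recordsize,dnodesize HH/bckup
zfs set recordsize=1M HH/bckup      # larger records for big sequential backup files (needs the large_blocks pool feature)
zfs set dnodesize=auto HH/bckup     # larger dnodes (needs the large_dnode pool feature; may break compatibility with older ZFS)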
 
if you leave the mountpoint out, it will depend on the parent dataset(s)
That was my first misconception. I kinda thought that HH is a zvol, while it isn't - it's the top-level parent. Everything else about mountpoints is a bit clearer now, except the fact that ZFS creates the intermediate paths: if I had a custom mountpoint of /mnt/bckup, it would create that intermediate /mnt, right? As a folder or as an extra dataset? I am asking this because I still haven't settled on whether what you create inside a dataset is other datasets as well, or could be plain folders. In my case, would that mnt be a folder or a dataset?

why would you want to set a quota, and then enforce another limit below that quota? that is what the quota is for ;)
Because the dataset which is going to be created needs a specific amount of space, whether that is in KB, GB, or TB. If I don't set a quota, my observation from other datasets I have seen is that the default space is 7 GB??? If not, what is the default value? Or does it automatically expand by default, like being thin-provisioned?
Another reason I want to set a quota is that the remaining space (2 TB - 500 GB) will be given out in portions to VMs as additional space. So there would be a conflict if each individual space on the same disk expanded indefinitely; or how am I supposed to know which one of the (let's assume a number here) 4 or 5 storage spaces would have precedence over the others?
The last reason would be that if the space is filled to more than 92%, ZFS will probably stop working correctly, or at all. I don't think what you've said here is correct, since quota is for reservation, not for preventing the dataset from reaching 85% (which, by the way, is a custom percentage of mine that I want to set). So is there a way to stop it before a certain amount of data is reached?


Oh, I forgot one last thing about the mounting method after the creation of the dataset. Is it still the case of the "disappearing datasets" where I need to .... OK, I've gathered some info about that one, it follows below (I know the reason it's happening):
How to set the overlay option to on, which reverts the mount behavior to the standard Linux one, allowing mounting in non-empty directories:
zfs set overlay=on dataset_name
zfs get overlay dataset_name
Set a proper ZFS cache:
zpool set cachefile=/etc/zfs/zpool.cache r720xd1
update-initramfs -k all -u
You can set the "mkdir" or "is_mountpoint" options for the directory storage, see "man pvesm" for details.
Not good practice: edit /lib/systemd/system/zfs-mount.service and change the line
ExecStart=/sbin/zfs mount -a
to
ExecStart=/sbin/zfs mount -O -a
(might be overwritten after Proxmox updates).
Proper way: edit /etc/pve/storage.cfg. Example:
dir: somename
	path /tank/dirstore
	mkdir 0
	is_mountpoint 1
(I recommend trying the `mkdir 0` and `is_mountpoint 1` options on directory storages on ZFS. The former prevents the creation of the path leading up to the storage, and the latter prevents the activation/use/creation of data *on* the storage if it is not mounted.)
Info: maybe each time you access that storage via the GUI, the 1 in "is_mountpoint 1" disappears. It'll still interpret that as true, but this still needs to be changed, since the mkdir option is true by default and could thus be lost unintentionally.

Which of the above do I need to do (if any, of course)?


New edit: No, I was right in the first place; HH is the zvol upon which the bckup dataset will be created. So bckup doesn't have a parent dataset to inherit attributes from.

zfs set compression=on HH/bckup (or, better, from scratch set compression=lz4, which is the default value). Do I have to set it to lz4 in order to be able to see the algorithm with, for example, the zfs get all command? Otherwise, if I set it to on, will it just show compression=on and not the algorithm it is based on?
 
That was my first misconception. I kinda thought that HH is a zvol, while it isn't - it's the top-level parent. Everything else about mountpoints is a bit clearer now, except the fact that ZFS creates the intermediate paths: if I had a custom mountpoint of /mnt/bckup, it would create that intermediate /mnt, right? As a folder or as an extra dataset? I am asking this because I still haven't settled on whether what you create inside a dataset is other datasets as well, or could be plain folders. In my case, would that mnt be a folder or a dataset?

a dataset can have children on the ZFS level (other datasets, possibly not mounted at all or not mounted into the same hierarchy). a mounted dataset contains files and directories. so you can have both: two datasets pool/foo and pool/foo/bar (mounted by default on /pool/foo and /pool/foo/bar), as well as a dataset pool/foo mounted on /pool/foo that contains a sub-directory called 'bar', without that 'bar' being a separate dataset.
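a quick sketch of that difference (pool/dataset names are just the examples from above):
zfs create pool/foo          # dataset, mounted at /pool/foo by default
zfs create pool/foo/bar      # child dataset, mounted at /pool/foo/bar
mkdir /pool/foo/baz          # plain directory inside the pool/foo filesystem, not a dataset
zfs list -r pool/foo         # shows pool/foo and pool/foo/bar, but not 'baz'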

Because the dataset which is going to be created needs a specific amount of space, whether that is in KB, GB, or TB. If I don't set a quota, my observation from other datasets I have seen is that the default space is 7 GB??? If not, what is the default value? Or does it automatically expand by default, like being thin-provisioned?
Another reason I want to set a quota is that the remaining space (2 TB - 500 GB) will be given out in portions to VMs as additional space. So there would be a conflict if each individual space on the same disk expanded indefinitely; or how am I supposed to know which one of the (let's assume a number here) 4 or 5 storage spaces would have precedence over the others?
The last reason would be that if the space is filled to more than 92%, ZFS will probably stop working correctly, or at all. I don't think what you've said here is correct, since quota is for reservation, not for preventing the dataset from reaching 85% (which, by the way, is a custom percentage of mine that I want to set). So is there a way to stop it before a certain amount of data is reached?

yes, a quota. but a quota is not expressed in percent, but in amount of space taken. if you want a dataset to use at most 500G, set a corresponding quota. there is no way to say "this dataset has a quota of 500G, but make it stop accepting writes once it reaches XX% of that" as that does not make any sense.
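if you really want an 85% threshold, the simplest approach is probably to do the math yourself and set that number as the quota, e.g. (untested sketch, values taken from your example):
zfs set quota=425G HH/bckup        # 85% of the 500G you actually want to allow
zfs get quota,used,available HH/bckup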

Oh, I forgot one last thing about the mounting method after the creation of the dataset. Is it still the case of the "disappearing datasets" where I need to .... OK, I've gathered some info about that one, it follows below (I know the reason it's happening):
How to set the overlay option to on, which reverts the mount behavior to the standard Linux one, allowing mounting in non-empty directories:
zfs set overlay=on dataset_name
zfs get overlay dataset_name
Set a proper ZFS cache:
zpool set cachefile=/etc/zfs/zpool.cache r720xd1
update-initramfs -k all -u

I have a bit of trouble parsing your question above. I'd never set overlay on a dataset, it's not needed for PVE. you do want to set "is_mountpoint 1" on any directory storage that is defined on a mountpoint, so that storage activation knows to wait for something being mounted there.

You can set the "mkdir" or "is_mountpoint" options for the directory storage, see "man pvesm" for details.
Not good practice: edit /lib/systemd/system/zfs-mount.service and change the line
ExecStart=/sbin/zfs mount -a
to
ExecStart=/sbin/zfs mount -O -a
(might be overwritten after Proxmox updates).
Proper way: edit /etc/pve/storage.cfg. Example:
dir: somename
	path /tank/dirstore
	mkdir 0
	is_mountpoint 1
(I recommend trying the `mkdir 0` and `is_mountpoint 1` options on directory storages on ZFS. The former prevents the creation of the path leading up to the storage, and the latter prevents the activation/use/creation of data *on* the storage if it is not mounted.)
Info: maybe each time you access that storage via the GUI, the 1 in "is_mountpoint 1" disappears. It'll still interpret that as true, but this still needs to be changed, since the mkdir option is true by default and could thus be lost unintentionally.

Which of the above do I need to do (if any, of course)?

on current PVE installations, just setting is_mountpoint should be enough.

New edit: No, I was right in the first place; HH is the zvol upon which the bckup dataset will be created. So bckup doesn't have a parent dataset to inherit attributes from.

no, HH cannot be a zvol since zvols cannot have children.
all datasets except the top-level one (the one named like the zpool, which you should not use to store data) have exactly one direct parent.
zfs set compression=on HH/bckup (or, better, from scratch set compression=lz4, which is the default value). Do I have to set it to lz4 in order to be able to see the algorithm with, for example, the zfs get all command? Otherwise, if I set it to on, will it just show compression=on and not the algorithm it is based on?
yes, on will choose whatever algorithm is the default (which depends on ZFS version and enabled features).
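to illustrate (dataset name from your example):
zfs set compression=lz4 HH/bckup
zfs get compression HH/bckup    # prints lz4 explicitly
zfs set compression=on HH/bckup
zfs get compression HH/bckup    # prints just 'on'; the actual algorithm is whatever the current default is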

I'd suggest playing around with a virtual PVE instance to get to know ZFS and all the features it offers better before heading off into production. mistakes are easier to recover from if you are in a VM that you can just roll back to a previous state via snapshots, or restore from a backup in a few minutes.
 
a quota is not expressed in percent, but in amount of space taken
Exactly, that's why I asked if there was an attribute to set that kind of limitation as well.

if you want a dataset to use at most 500G, set a corresponding quota
I am not being in any way offensive with my follow-up answer (I only mention it because of the way it is going to be read, and I have no other way to express it with my English), but where exactly did I say that I care about that? I already know how to set the maximum amount of space taken (after all, it is just as simple as setting the quota option with a number afterwards).

there is no way to say "this dataset has a quota of 500G, but make it stop accepting writes once it reaches XX% of that"
That should have been the answer from the start, to save us some time rephrasing my question in other ways (still not being offensive here :)).

as that does not make any sense.
Well, it makes sense to those ZFS experts who initially said that if the storage reaches 85% of its limits it becomes slow, and if it reaches 92% it becomes unstable. Not my words. I am just trying to avoid future problems here, according to what I am reading not from plain users but from pros of the ZFS architecture. After that I was going to make a custom guide according to my needs, and if my eye caught a similar question, help to the degree I can. After all, isn't that the purpose of forums in general, helping and not reinventing the wheel? Imagine how many intermediate replies would have been avoided if more experienced users just answered "oh, here is the link, read it", and afterwards the users that originally made the post came back asking the same question, since to understand the man pages sometimes, if not all the time, you need a manual for the manual itself.

a dataset can have children on the ZFS level (other datasets, possibly not mounted at all or not mounted into the same hierarchy). a mounted dataset contains files and directories. so you can have both: two datasets pool/foo and pool/foo/bar (mounted by default on /pool/foo and /pool/foo/bar), as well as a dataset pool/foo mounted on /pool/foo that contains a sub-directory called 'bar', without that 'bar' being a separate dataset.
OK, even though I knew half of the stuff here, the way you expressed it will go straight into my custom guides.

a dataset pool/foo mounted on /pool/foo that contains a sub-directory called 'bar', without that 'bar' being a separate dataset.
Can this bar, though, be used in the GUI / Storage / Add Storage to create a Directory storage, typing /pool/foo/bar as the mount point if you want to store something inside there, or will Proxmox consider only the path pool/foo valid?

on current PVE installations, just setting is_mountpoint should be enough.
Without the 1 at the end, because it assumes it? And without the mkdir 0 line above it, you say?

no, HH cannot be a zvol since zvols cannot have children
My initial post started with: Assuming I have created a zpool (zvolume) of 2 TB named HH, and inside it I need to create a dataset for Proxmox named bckup with a capacity of 500 G, to have space for VM/LXC backups.
So how can HH not be a zvol? I created it from the GUI -> PVE node -> Disks, giving it a name.
Now, if Proxmox automatically creates a dataset with the same name along with a zvol, that is something I have never come across.

By the way
Can an existing dataset grow just by setting the quota option to a bigger number?
 
I'd suggest playing around with a virtual PVE instance to get to know ZFS and all the features it offers better before heading off into production. mistakes are easier to recover from if you are in a VM that you can just roll back to a previous state via snapshots, or restore from a backup in a few minutes.
I have been doing that for 2 years now, and on several machines, not just one. When there is no guideline for doing essential things like
-installation
-setting up disks
-creating any additional zvols and datasets (some examples with use cases)
-setting up backups
-setting up networking with bonds / virtual bridges etc.
-setting up replication / Proxmox Backup Server / HA or whatever a production-level environment would need
....then not only 2 but even 10 years would leave someone at the same level as they were back then.
I am not saying that I don't get anything from Proxmox; on the contrary, I have set up and tried many of its features. But the main problem is that I don't get straight answers to my questions (call them YouTube videos, communities, etc.), so I have gathered over 18 GB of videos and custom guides on how to set things up, which I update frequently, but I don't have them in order, and that messes with the general idea of the steps to set it up correctly.
 
Well, it makes sense to those ZFS experts who initially said that if the storage reaches 85% of its limits it becomes slow, and if it reaches 92% it becomes unstable. Not my words. I am just trying to avoid future problems here, according to what I am reading not from plain users but from pros of the ZFS architecture. After that I was going to make a custom guide according to my needs, and if my eye caught a similar question, help to the degree I can. After all, isn't that the purpose of forums in general, helping and not reinventing the wheel? Imagine how many intermediate replies would have been avoided if more experienced users just answered "oh, here is the link, read it", and afterwards the users that originally made the post came back asking the same question, since to understand the man pages sometimes, if not all the time, you need a manual for the manual itself.

you got pool and dataset levels confused. getting a pool full above a certain threshold will absolutely trash performance (just like approaching 100% actual memory usage will slow down allocations, because the effort to find big enough chunks gets higher or data needs to be fragmented if there are none). getting a dataset 100% full is not a problem (as long as the pool itself still has available space) - except of course that you can't write more data to it anymore once it's full.
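a quick way to keep an eye on the pool-level fill level (which is what that 85%/92% advice is about), versus the per-dataset usage:
zpool list -o name,size,allocated,free,capacity HH    # pool-wide fill level in percent
zfs list -o name,used,available,referenced,quota -r HH    # per-dataset usage against their quotas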

a dataset pool/foo mounted on /pool/foo that contains a sub-directory called 'bar', without that 'bar' being a separate dataset.

Can this bar, though, be used in the GUI / Storage / Add Storage to create a Directory storage, typing /pool/foo/bar as the mount point if you want to store something inside there, or will Proxmox consider only the path pool/foo valid?
not on the GUI, but on the CLI it is possible (set path to /pool/foo/bar, and is_mountpoint to /pool/foo, and PVE will wait for something being mounted on /pool/foo before considering the /pool/foo/bar storage activatable)
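a minimal /etc/pve/storage.cfg sketch of that setup (the storage name and content type are hypothetical):
dir: foobar-store
	path /pool/foo/bar
	is_mountpoint /pool/foo
	mkdir 0
	content backup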
Without the 1 at the end, because it assumes it? And without the mkdir 0 line above it, you say?
yes (or see above if your storage path and mountpoint are not the same)

My initial post started with: Assuming I have created a zpool (zvolume) of 2 TB named HH, and inside it I need to create a dataset for Proxmox named bckup with a capacity of 500 G, to have space for VM/LXC backups.
So how can HH not be a zvol? I created it from the GUI -> PVE node -> Disks, giving it a name.
Now, if Proxmox automatically creates a dataset with the same name along with a zvol, that is something I have never come across.

okay, now I see where the confusion comes from :) a zpool is not the same as a (ZFS) volume/zvol!

in ZFS the terminology is as follows:
- vdev: a block device or raw image file used to store ZFS pool data (in LVM, this would roughly be a PV)
- zpool: a collection of vdevs + redundancy scheme and/or allocation class (in LVM, this would roughly be a VG)
- dataset: either a filesystem or zvol stored on a pool (this does not really exist in LVM, as LVM doesn't manage hierarchies of file systems but just volumes)
- zvol: a special kind of leaf dataset; it can't have children and doesn't use the filesystem part of ZFS to store data, but is exposed as a block device instead (in LVM, this would roughly be an LV)

so vdevs and pools are how the data is stored/organized physically on disk(s) (where, and distributed how?). datasets including zvols are how the data is logically stored (hierarchy of datasets containing either filesystems or logical block device data). one is managed with the 'zpool' command, the other with the 'zfs' command.

in PVE we use regular filesystem datasets for containers, and zvols for VMs. if you use the installer to install a system with / on ZFS, we also use regular filesystem datasets to store the system data, including setting up a directory storage on top of ZFS.

when you create a zpool, you automatically also create the root dataset for that pool (with the same name). that dataset can never be a zvol.
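a condensed sketch of the two levels (disk names purely hypothetical):
zpool create HH mirror /dev/sda /dev/sdb    # pool level: physical layout, managed with 'zpool'
zpool status HH                             # shows the vdevs making up the pool
zfs create HH/bckup                         # dataset level: a filesystem dataset
zfs create -V 50G HH/vm-100-disk-0          # a zvol, exposed as /dev/zvol/HH/vm-100-disk-0
zfs list -t all -r HH                       # lists the root dataset and everything below it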

By the way
Can an existing dataset grow just by setting the quota option to a bigger number?

for filesystem datasets, yes, you can arbitrarily set the quota lower or higher or even unset it to lift the restriction altogether. note that there are different kinds of quotas (like quota vs refquota, but also user/groupquota vs userobj/groupobjquota - please see the manpage for details!). PVE sets up refquotas for container volumes (as we want to use the quota for the size of the filesystem as exposed to the container, and don't want snapshots to count against the quota like they would with the regular 'quota' property).
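roughly, the difference in commands (names from your example, numbers arbitrary):
zfs set quota=500G HH/bckup       # limits the dataset including its snapshots and children
zfs set refquota=500G HH/bckup    # limits only the dataset's own referenced data
zfs get quota,refquota HH/bckup
zfs set quota=none HH/bckup       # unset to lift the restriction again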

for zvols it's a bit more complicated - they always need a (vol)size, so that determines how big the exposed block device is. a corresponding refreservation is set by default to ensure all that space is also available. if you set the 'thin provisioning' flag, no such reservation will be made, and you (or the VM) can run out of space on the zvol, leading to I/O errors. you can change the volsize (which will by default also change the refreservation), but you also have to ensure that the filesystem/partition table stored on the zvol is adapted accordingly, especially when shrinking the zvol.
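for a zvol, the relevant properties would look something like this (the vm-100-disk-0 name is just a typical PVE-style example):
zfs get volsize,refreservation HH/vm-100-disk-0
zfs set volsize=60G HH/vm-100-disk-0    # grows the block device; the refreservation follows by default
# the partition table / filesystem inside the zvol still has to be grown separately (see below)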
 
getting a dataset 100% full is not a problem (as long as the pool itself still has available space) - except of course that you can't write more data to it anymore once it's full.
After this, it's obvious that I somehow mixed up the rules for pools and applied them to datasets myself.

okay, now I see where the confusion comes from :) a zpool is not the same as a (ZFS) volume/zvol!

in ZFS the terminology is as follows:
- vdev: a block device or raw image file used to store ZFS pool data (in LVM, this would roughly be a PV)
- zpool: a collection of vdevs + redundancy scheme and/or allocation class (in LVM, this would roughly be a VG)
- dataset: either a filesystem or zvol stored on a pool (this does not really exist in LVM, as LVM doesn't manage hierarchies of file systems but just volumes)
- zvol: a special kind of leaf dataset; it can't have children and doesn't use the filesystem part of ZFS to store data, but is exposed as a block device instead (in LVM, this would roughly be an LV)

so vdevs and pools are how the data is stored/organized physically on disk(s) (where, and distributed how?). datasets including zvols are how the data is logically stored (hierarchy of datasets containing either filesystems or logical block device data). one is managed with the 'zpool' command, the other with the 'zfs' command.

in PVE we use regular filesystem datasets for containers, and zvols for VMs. if you use the installer to install a system with / on ZFS, we also use regular filesystem datasets to store the system data, including setting up a directory storage on top of ZFS.

when you create a zpool, you automatically also create the root dataset for that pool (with the same name). that dataset can never be a zvol.
Whoa, a lot of great info here. I'll focus only on this point:
-Vdev: so the GUI doesn't give you the ability to mess with vdevs, but after initializing disks and choosing a raidz-type level, the product of that creation is directly a zpool and not a vdev, right? I can't remember any command that lets you create vdevs, unless vdev is only a logical term for understanding purposes and not an actual thing.

for zvols it's a bit more complicated - they always need a (vol)size, so that determines how big the exposed block device is. a corresponding refreservation is set by default to ensure all that space is also available. if you set the 'thin provisioning' flag, no such reservation will be made, and you (or the VM) can run out of space on the zvol, leading to I/O errors. you can change the volsize (which will by default also change the refreservation), but you also have to ensure that the filesystem/partition table stored on the zvol is adapted accordingly, especially when shrinking the zvol.
So, if I got that right: if a VM, during creation or better yet afterwards, is given an extra disk by adding new storage from its hardware properties (actually, would that be a vdev or a zvol? I tend to believe it would be a zvol by your definition above) of, let's say, 50G, and afterwards you go and edit that storage again to increase the size, that extra size would be unpartitioned and you would need to extend the previous partition to cover the new space as well. If yes, isn't this something that could be done by the VM itself? Since, for instance, Windows in Disk Management can see unpartitioned space and can merge partitions etc.
 
-Vdev: so the GUI doesn't give you the ability to mess with vdevs, but after initializing disks and choosing a raidz-type level, the product of that creation is directly a zpool and not a vdev, right? I can't remember any command that lets you create vdevs, unless vdev is only a logical term for understanding purposes and not an actual thing.
you don't create vdevs with any ZFS commands - ZFS consumes them ;) a vdev is usually a disk or a partition on a disk (i.e., what you see when you do zpool status), but it's possible to use raw image files as well for development/test purposes.
So, if I got that right: if a VM, during creation or better yet afterwards, is given an extra disk by adding new storage from its hardware properties (actually, would that be a vdev or a zvol? I tend to believe it would be a zvol by your definition above) of, let's say, 50G, and afterwards you go and edit that storage again to increase the size, that extra size would be unpartitioned and you would need to extend the previous partition to cover the new space as well. If yes, isn't this something that could be done by the VM itself? Since, for instance, Windows in Disk Management can see unpartitioned space and can merge partitions etc.
yes, if you create a volume and give it to a VM as a disk (e.g., via the GUI), that would be a zvol. if you then later on resize it using the GUI (or qm resize ... or zfs set volsize=... ...), then the guest OS inside the VM sees that the virtual disk got bigger (potentially only after a reboot, depending on guest OS + drivers and virtual HW configuration), but to actually use that additional space you need to adjust the partition table/LVM/filesystem/... . how that works depends on the guest OS and how the disk was set up, you can find some pointers on https://pve.proxmox.com/wiki/Resize_disks or in the docs of your OS and its storage tools.

edit: this latter part is not specific to ZFS btw, the same applies no matter on which storage your VM's virtual disks are located.. PVE can only resize the storage part (zvol, LVM volume, raw/qcow2 image file, ...), the logical contents (partitions/.../file systems) of the virtual disk are up to the guest. this is also the reason why we don't expose shrinking in our API/GUI, because shrinking safely requires knowing what is stored on the disk and shrinking the contents first.
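as a concrete (hypothetical) example of that flow for a Linux guest with VM ID 100 and an ext4 filesystem on /dev/sda1:
qm resize 100 scsi0 +10G    # host side: grow the volume backing scsi0
# inside the guest, grow the partition and filesystem into the new space:
growpart /dev/sda 1
resize2fs /dev/sda1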
 