Best to store VMs/CTs in zpool or in separate datasets?

May 21, 2020
I have a single NVMe drive for Proxmox to boot from. I'm going to add 2x SATA enterprise SSDs to my host, create a ZFS mirror, then move my VMs/CTs to that mirror.

Is there any benefit from having separate datasets for VMs and CTs (like below), or should I put everything on the pool itself?

Code:
zpool create -f -o ashift=12 intel_mirror mirror /dev/disk/by-id/xxxxxx /dev/disk/by-id/yyyyyy
zfs set compression=lz4 intel_mirror
zfs set relatime=on intel_mirror
zfs create intel_mirror/cts0
zfs create intel_mirror/vms0
zpool status
pvesm add zfspool cts0 -pool intel_mirror/cts0
pvesm add zfspool vms0 -pool intel_mirror/vms0

Note that backups, ISOs, and container templates are stored on a physically separate NAS.
 
I don't see any benefit, unless you want different options for LXCs and VMs (ZFS options of child datasets/zvols are inherited from the parent dataset) or want to be able to do recursive snapshots for just the LXCs or just the VMs.
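
As a quick illustration (untested; the dataset names are the ones from the post above, the snapshot name and the zstd choice are just examples, and zstd needs OpenZFS 2.0+), a per-dataset layout lets you do things like:
Code:
# override an option for containers only; child datasets/zvols inherit it
zfs set compression=zstd intel_mirror/cts0
# recursively snapshot only the containers, leaving the VM zvols untouched
zfs snapshot -r intel_mirror/cts0@before-upgrade
zfs list -t snapshot -r intel_mirror/cts0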

And I personally wouldn't use that NVMe as the system disk, as it has no redundancy. PVE itself only needs about 16-32GB of disk space, performance isn't important (even slow HDDs would be fine), and it's no problem to use the same pool for the guests' virtual disks and the root filesystem. I would just install PVE as a ZFS mirror on those two new SSDs. That way your root disk also has redundancy and there is no data loss or downtime when an SSD fails.
 
The commands in the original post are exactly what I've been searching around trying to find for the past 3 days. Thank you so much. :)

I'm a brand new PVE/ZFS user, and wanted to be able to control the volblocksize of my VM disk images, but I could only find tutorials for bare metal ZFS, without the PVE-specific command.
 
The only way to persistently change the volblocksize is to set the storage-wide "Block size" at "Datacenter -> Storage -> YourZFSStorage -> Edit". No matter what volblocksize you create the zvol with, when restoring a VM from a backup or migrating a VM, PVE will destroy the zvol and create a new one with the volblocksize set at "Datacenter -> Storage -> YourZFSStorage -> Edit -> Block size". If you want zvols with different volblocksizes, you need to create different ZFS storages. But you don't need different pools for that; a dataset can be used as a ZFS storage.
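
For example, something along these lines should also work from the CLI (untested; the dataset and storage names are made up), since the zfspool storage type accepts a blocksize option:
Code:
zfs create intel_mirror/vms16k
zfs create intel_mirror/vms64k
pvesm add zfspool vms16k -pool intel_mirror/vms16k -blocksize 16k
pvesm add zfspool vms64k -pool intel_mirror/vms64k -blocksize 64k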
 
Indeed.
The instructions lmm5247 provided did exactly that. It looks like (and I don't 100 percent understand this yet) they create a new dataset as a child of the pool we want to put the VM storage on, and then the pvesm command tells Proxmox about the new ZFS dataset and exposes it inside the web UI as if it were a pool, which means its volblocksize (Block size) can be adjusted there.

I tested this by creating a new dataset with this method, setting the volblocksize to 64k, and then creating a new VM disk there. It has a volblocksize of 64k, as I want it to. :)
 
Yup. The pool's root is just like a dataset. PVE won't care if you point a storage of type "zfspool" at a pool's root or at any dataset; it behaves the same.
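
For example (untested; "intel-root" is a made-up storage ID, the rest comes from the original post), both of these are valid zfspool storages and behave the same:
Code:
pvesm add zfspool intel-root -pool intel_mirror
pvesm add zfspool vms0 -pool intel_mirror/vms0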
 
I'm glad to understand how it works, but it sure seems like it'd be easier if we could adjust the volblocksize for virtual SCSI disks on new VMs from within the interface. This should be possible if they're just zvols, right?

The fact that we can't do that yet tells me it must be a non-trivial change, as it's apparently been a feature request for a while. Fingers crossed ... someday. :)
 
I can manually create zvols with different volblocksizes on the same zfspool storage and they will work absolutely fine. The only problem is that PVE will replace that volblocksize when doing a backup restore or migration. So I don't see why PVE shouldn't be able to do what I am doing manually with the CLI. It shouldn't be hard to implement if they really wanted to.
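
For example, doing it manually looks roughly like this (untested; the VM ID, disk name, and size are made up, and "vms0" is the zfspool storage from the original post that points at intel_mirror/vms0):
Code:
# create a sparse 32G zvol with a 64k volblocksize under the storage's dataset
zfs create -s -V 32G -o volblocksize=64k intel_mirror/vms0/vm-100-disk-1
zfs get volblocksize intel_mirror/vms0/vm-100-disk-1
# attach the existing zvol to the VM
qm set 100 -scsi1 vms0:vm-100-disk-1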
 
Yeah.

That's why I set up the datasets. I can force preservation of the volblocksize on the zvols in each dataset (so I have one dataset for 64k-volblocksize VMs, etc.).
 
Hello!

So, quick followup question.

I created a new dataset and got it added to PVE using the above commands. Then I set it to use 64k Block Size within PVE's Datacenter Storage settings.

[Screenshot: the storage's "Block Size" field set to 64k under Datacenter -> Storage]

So that seems correct.
But then I see this when I get all properties via the zfs command.
Code:
~# zfs get all vmStore/vmDisks64k | grep size
vmStore/vmDisks64k  recordsize            128K                   default

What's going on? Did I do something wrong?
Or is it that I set volblocksize, which is what is used by VMs, and that command doesn't list volblocksize?
(I think that's what's going on, but now I'm paranoid.)

I've managed to confuse myself again. :(
 

The base dataset has no volblocksize set:
Bash:
zfs get volblocksize rpool/data
NAME        PROPERTY      VALUE     SOURCE
rpool/data  volblocksize  -         -
but the actual VM disk (the zvol) has:
Bash:
zfs get volblocksize rpool/data/vm-100-disk-0
NAME                      PROPERTY      VALUE     SOURCE
rpool/data/vm-100-disk-0  volblocksize  8K        default
 

Thanks!

That makes sense; I'm glad I hadn't accidentally broken something.

Is there any benefit at all to setting the volblocksize at the base level? I'm guessing no, but I'm enough of a newbie to be curious. :)
 
It doesn't make sense to set the volblocksize anywhere. Even manually creating zvols with a specific volblocksize won't make much sense, as PVE doesn't care what volblocksize was used. As soon as you restore a guest from a backup, PVE will wipe those existing zvols and create new ones with a volblocksize matching the value of the "Block Size" field of your "ZFSPool"-type storage.
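
For reference, the CLI equivalent of that text field should be something like this (untested; "vms64k" is a made-up storage ID):
Code:
pvesm set vms64k -blocksize 64k
# restores/migrations onto this storage then create zvols with volblocksize=64k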
 
Awesome. Thank you; that makes perfect sense.

I'm surprised this isn't on the wiki somewhere.
I mean, blocksize is mentioned here, but just for the command line and without an explanation of why you'd want to alter it for various VMs: https://pve.proxmox.com/wiki/Storage:_ZFS

The instructions for adding the storage to PVE in the GUI are here: https://pve.proxmox.com/wiki/ZFS:_Tips_and_Tricks

So you'd need to know enough to figure out how to hit both those wiki pages, and then put the info together, which is tricky at n00b level zero. ;)

I only understood it and figured out how to do it once I stumbled on this thread via random googling and forum searching. Before I got here I had a ton of resources for ZFS in general, but was missing that last bit for ZFS-in-Proxmox-for-VMs.
 
I started reading this and thought, "wow, yes, this would be an excellent feature," and then scrolled down and saw my own enthusiastic reply to your original post. So, I'm having a great brain day. ;) But seriously, I would love to see that feature; it would make things so much less confusing.

As it is, I've gone the route of creating multiple datasets on the same pool and adding each dataset as its own ZFS storage, as you discussed in your thread. It's not exactly intuitive when you're still learning ZFS, but it works.

One thing I'm still not clear on: what is the equivalent process for container storage? Say I wanted to create a MariaDB instance in a container. I know that the actual database storage should use 16k blocks (to match InnoDB's 16k page size). But I'm not sure of the best way to do that with Proxmox and ZFS. I know I can set the recordsize ZFS property for a dataset, but I'm not sure if that's best practice, or if setting recordsize=16k is actually enough, or if I need to do something else.

I'm much newer to containers. Maybe there's actually a way to do this in the Proxmox GUI? I know recordsize itself is a max value, so it makes sense that a container storage's recordsize could be set lower than the max?
 
Recordsize is inherited from the parent dataset/pool root. So you could do the same as for VMs (different datasets as different ZFSpool storages in PVE), but instead of setting the "Block Size" for the ZFSpool storage, you set the recordsize for the dataset on the CLI with "zfs set recordsize=16K YourPool/YourDataset". Datasets that PVE creates for LXCs should then inherit that, as PVE doesn't create those datasets with a custom recordsize.
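
A minimal sketch of that for a database container, with made-up dataset/storage names (16k matches InnoDB's default page size):
Code:
zfs create -o recordsize=16k intel_mirror/ct-db
pvesm add zfspool ct-db16k -pool intel_mirror/ct-db -content rootdir
# datasets PVE creates here (subvol-<vmid>-disk-N) should inherit recordsize=16k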
 
Excellent. Thanks!
 
I'm coming back to this thread again for reference, and just noticed that the commands in the original post deliberately enable relatime. I've been disabling atime and relatime where I can to avoid unnecessary writes. I'm curious why you chose to leave relatime on.
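
In case it helps anyone else comparing settings, these (untested) commands, using the pool name from the original post, show the current values and how to turn access-time updates off entirely; relatime only has an effect while atime=on:
Code:
zfs get atime,relatime intel_mirror
zfs set atime=off intel_mirror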
 
