Difference between two ZFS views of free space

Jan 17, 2024
Hello everyone,

We recently completed a migration from ESXi to Proxmox. The migration went smoothly, and everything worked perfectly.

However, after one week, we noticed two different storage views in Proxmox, and we're unsure which one is correct. One view shows only 900 GB free, while the other shows 9.29 TB free.

In the shell, we found the following:

Code:
root@lkppve0001:~# zfs list -r rpool-data
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool-data/vm-118-disk-0   101G   779G  37.1G  -
rpool-data/vm-118-disk-1  3.82T  2.74T  1.78T  -
rpool-data/vm-118-disk-2  2.23T  2.20T   758G  -

What do these values mean? When I check the disk sizes inside the VM, they don't seem to match this output.

Thank you all in advance.
 

in ZFS you basically have two views:
- the pool view, which looks at the actual usage on the disks/partitions which make up the zpool
- the dataset view, which looks at logical usage (accounting for things like compression, reservations and quotas, and so on)

most likely the difference in your case is that you have guest volumes with reservations (where the dataset view treats the full size as used, even though no data has been written to parts of it yet, so it is not really physically used on disk)

if you are using zvols for VMs, there's another layer involved - whatever the guest does inside, some parts of the disk might still count as used from ZFS's point of view, even though the guest sees them as free.
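
for anyone comparing the two views from the shell, a rough sketch (the pool name is taken from this thread, adjust as needed):

Code:
# pool view: raw allocation on the disks backing the pool
zpool list rpool-data

# dataset view: logical accounting; the "space" shortcut expands to
# AVAIL, USED, USEDSNAP, USEDDS, USEDREFRESERV and USEDCHILD
zfs list -r -o space rpool-data

if USEDREFRESERV is large for the zvols, the gap between the two Proxmox views comes from reservations rather than from data that was actually written.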
 
Hi Fabian,
Is this reservation coming from the ESXi setup, or is it a configuration in ZFS on Proxmox?
Is this reservation important, and can it be removed somehow? Right now it's a problem because the reservation makes the Proxmox server look like it has far less free space than it actually does.
 
it's a property of the storage on the PVE side (https://pve.proxmox.com/pve-docs/chapter-pvesm.html#_configuration_6), but you can also change it after the fact for existing datasets. there is a risk associated with it - if you don't reserve the full space for each dataset/volume, you are overprovisioning storage: if every guest attempts to use its disks/mountpoints to the full extent, you will run out of space and run into I/O errors.
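
for reference, a minimal sketch of what a thin-provisioned zfspool entry in /etc/pve/storage.cfg could look like (storage and pool names taken from this thread, the rest is illustrative):

Code:
zfspool: rpool-data
        pool rpool-data
        content images,rootdir
        sparse 1

with sparse 1, newly created volumes get no refreservation; without it (or with sparse 0), PVE reserves the full volume size up front.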
 
We've already done about six migrations from ESXi, and I just can't understand why this issue happened now and not during the other five migrations. Everything seems normal and fine.

I’ve compared both configurations, and they appear to be the same, so I don’t understand why they’re behaving differently.
 
you could check storage.cfg on both systems, and "zfs list -o name,space,reservation,refreservation"
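
when comparing the outputs, the column to watch is probably USEDREFRESERV - the part of USED that comes from a refreservation rather than from written data. restricting the listing to volumes keeps it readable:

Code:
# zvols only, with the space breakdown plus both reservation properties
zfs list -t volume -r -o name,space,reservation,refreservation rpool-data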
 
I've attached two files with the results of the command:
zfs list -o name,space,reservation,refreservation
One file shows the output from an older migration, and the other is from the new migration.
The only difference I noticed in the storage.cfg file is that the newly migrated server does not have the sparse variable under rpool, while the older one has sparse 0.
 

yes, sparse is exactly what controls whether (new) volumes will be thin-provisioned (no reservation) or fully provisioned (reservation for the full size)
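
to switch new volumes to thin provisioning going forward, the sparse flag can be set on the storage definition - I believe pvesm can do it directly (storage name assumed from this thread), otherwise add sparse 1 to the entry in /etc/pve/storage.cfg:

Code:
# enable thin provisioning for volumes created from now on
# (existing volumes keep their current refreservation)
pvesm set rpool-data --sparse 1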
 
We may have found the cause of the issue. We believe that because our ZFS pool is thick provisioned and we use ZFS auto-snapshots, Proxmox is reserving space up front for the snapshots.

For example, take the disk from VM118:
We have one disk with a size of 2 TB. The VM is using 1.78 TB of that space, and Proxmox is reserving an additional 2 TB for snapshots. That results in:
Used + Reserved = 3.78 TB,
which matches the value we see when running the zfs list command.
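
One way to double-check that would be to let ZFS break down the USED value for that zvol itself (volume name taken from the zfs list output above):

Code:
# split USED into data, snapshot and reservation components
zfs get used,usedbydataset,usedbysnapshots,usedbyrefreservation rpool-data/vm-118-disk-1

If usedbyrefreservation accounts for roughly the 2 TB, the extra space is being held back for future writes by the reservation rather than occupied by snapshot data.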

Now the question is:
If I disable auto-snapshots and convert the entire datastore to thin provisioning, will the "used" space reflect the actual usage more accurately?

Also, is it possible to make this change when VM disks already exist on the datastore?
 
Changing the sparse/thin-provisioning option for the datastore will not affect existing volumes, but you can just do
Code:
zfs set refreservation=none rpool-data/vm-106-disk-0
and so on.
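
If there are many volumes, something along these lines should clear the refreservation on every zvol under the pool in one go (an untested sketch - review the volume list before running the loop):

Code:
# list the zvols first to make sure only the intended ones are touched
zfs list -H -o name -t volume -r rpool-data

# then drop the refreservation on each of them
for vol in $(zfs list -H -o name -t volume -r rpool-data); do
    zfs set refreservation=none "$vol"
done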
 