ZFS 2.4.0 Special Device + Data VDEV Report Very Wrong Remaining Space

gogito

Member
Jan 12, 2022
Distribution Name | Proxmox VE (Debian 13 "Trixie")
Distribution Version | 9.1.5
Kernel Version | Linux 6.17.9-1-pve
Architecture | x86_64
OpenZFS Version | zfs-2.4.0-pve1 - zfs-kmod-2.4.0-pve1

My zpool has two devices: sdb (HDD) and sda (NVMe).

The issue is that all datasets report the remaining free space as "HDD (data) size - (HDD + NVMe usage) = 448GB", while hardware-wise the HDD has 1.6T left and the NVMe has 800G left.

My expectation would be for it to consider the remaining HDD space (1.6T) as the actual free space, i.e. report free space by what's actually left to write to, rather than using that subtraction, since it gives the wrong info to other programs.

Is this a bug?
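For context, here is how the per-vdev numbers can be compared against the dataset-level AVAIL that applications see (standard `zpool`/`zfs` commands; the pool name matches mine):

```shell
# Per-vdev capacity and allocation, including the special vdev.
zpool list -v zfs_main

# Dataset-level free space, which is what other programs get told.
zfs list -o name,used,avail zfs_main
```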

[screenshot of the reported free space attached]
 
What is the use case for using ZFS and disks in this way? This is not a recommended setup; if I had only two disks like this, I would use bcachefs.
 
What is the use case for using ZFS and disks in this way? This is not a recommended setup; if I had only two disks like this, I would use bcachefs.
Well, ZFS allows a special device to store metadata as well as small files to speed up the pool. Previously, I split my NVMe in two: 1800G for an NVMe-only pool for zvols, and 200GB as the special device for the HDD pool (zfs_main).

With ZFS 2.4.0, zvol writes can now be allocated to the special device as well, so I combined them.
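For anyone reading along: as I understand it, the knob that steers blocks onto the special vdev is the `special_small_blocks` property. The threshold below is only an illustrative value, not what I actually run:

```shell
# Blocks at or below this size go to the special vdev.
# 64K is an example value, not a recommendation.
zfs set special_small_blocks=64K zfs_main

# Verify the setting and where it is inherited from.
zfs get special_small_blocks zfs_main
```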


If I kept the previous two-pool setup, it would be very inconvenient to add more special device capacity to the HDD pool: I would first need to shrink the NVMe pool and then add to the HDD pool, which with ZFS is never straightforward. With 2.4.0, the idea is that this NVMe can be shared much more easily.


I did try bcachefs, but it had a lot of issues for me, and now that it's no longer in the kernel it's not really appealing maintenance-wise.
 
This is not a setup I would trust my data with. Since it lacks redundancy, your pool will be lost if one of the devices breaks.
 
This is not a setup I would trust my data with. Since it lacks redundancy, your pool will be lost if one of the devices breaks.
Ah, forgot to mention: it's kind of like my backup pool. My main pool is a mirrored special device + raidz1 of 4 HDDs.
 
Ah, forgot to mention: it's kind of like my backup pool. My main pool is a mirrored special device + raidz1 of 4 HDDs.

This would work if I understand the documentation section correctly:
The redundancy of the special device should match the one of the pool, since the special device is a point of failure for the whole pool.

https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_special_device

So you can lose one HDD or one SSD and your pool will still be OK (although you should replace the failed disk as soon as possible), if I didn't miss something.
 
I reread the OP and, for the life of me, I don't understand what you're asking.

Actual free space is exactly what you think it is, but it's not USABLE. Usable free space depends on the makeup of your data, since you have a special device that (presumably at default settings) is eating all <128k writes. And that's assuming you haven't enabled compression (pro tip: you should).
 
I reread the OP and, for the life of me, I don't understand what you're asking.

Actual free space is exactly what you think it is, but it's not USABLE. Usable free space depends on the makeup of your data, since you have a special device that (presumably at default settings) is eating all <128k writes. And that's assuming you haven't enabled compression (pro tip: you should).
Basically the issue is:
My SSD has 780G available.
My HDD has 1.6TB available.

But only 450GB of free space is reported.

My expectation is that it should say 1.6TB available, since that's how much data my "data" vdev can hold.
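The numbers line up with AVAIL being computed across both vdevs rather than from the data vdev alone. A rough sketch of that arithmetic, where only the free-space figures above are real and the NVMe usage is an assumed round number:

```shell
hdd_free_g=1600    # free on the HDD data vdev (real figure from above)
nvme_used_g=1150   # space consumed on the NVMe special vdev (assumed)

# If AVAIL is total capacity minus total usage across both vdevs,
# the NVMe usage effectively gets subtracted from the HDD headroom:
avail_g=$((hdd_free_g - nvme_used_g))
echo "${avail_g}G"   # roughly the ~450G the pool reports
```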

Compression is zstd-4.

My data makeup is normal; there are no snapshots or reservations.
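For anyone who wants to double-check that on their own pool, the standard space-accounting properties show where the usage goes (quoting the property names from memory):

```shell
# Per-dataset space consumed by snapshots and by refreservations.
zfs list -r -o name,used,usedbysnapshots,usedbyrefreservation zfs_main
```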
 
zfs list -o name,quota,reservation
Here's the output

Code:
root@beta:~# zfs list -o name,quota,reservation
NAME                                    QUOTA  RESERV
rpool                                    none    none
rpool/ROOT                               none    none
rpool/ROOT/pve-1                         none    none
rpool/data                               none    none
rpool/var-lib-vz                         none    none
zfs_main                                 none    none
zfs_main/WORM                            none    none
zfs_main/WORM/PBS_Datastore_backup       none    none
zfs_main/zvol_vm_lxc                     none    none
zfs_main/zvol_vm_lxc/subvol-100-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-101-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-103-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-105-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-106-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-107-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-108-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-109-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-110-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-111-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-112-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-115-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-116-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-117-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-118-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-119-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-120-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-122-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-123-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-124-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-125-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-126-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-128-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-129-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-132-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-133-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-134-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-136-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-137-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-138-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-140-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-142-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-143-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-146-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-148-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-150-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-151-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-152-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-154-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-155-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-156-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-157-disk-1   none    none
zfs_main/zvol_vm_lxc/subvol-159-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-160-disk-1   none    none
zfs_main/zvol_vm_lxc/subvol-167-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-168-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-169-disk-0   none    none
zfs_main/zvol_vm_lxc/subvol-170-disk-0   none    none
zfs_main/zvol_vm_lxc/vm-102-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-102-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-113-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-113-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-114-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-114-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-121-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-121-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-130-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-130-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-130-disk-2          -    none
zfs_main/zvol_vm_lxc/vm-131-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-131-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-135-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-135-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-135-disk-2          -    none
zfs_main/zvol_vm_lxc/vm-135-disk-3          -    none
zfs_main/zvol_vm_lxc/vm-139-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-139-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-141-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-144-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-144-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-145-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-145-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-147-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-147-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-149-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-149-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-153-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-153-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-158-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-158-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-158-disk-2          -    none
zfs_main/zvol_vm_lxc/vm-161-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-161-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-162-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-162-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-162-disk-2          -    none
zfs_main/zvol_vm_lxc/vm-163-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-163-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-165-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-165-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-166-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-166-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-172-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-172-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-173-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-173-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-174-disk-0          -    none
zfs_main/zvol_vm_lxc/vm-174-disk-1          -    none
zfs_main/zvol_vm_lxc/vm-175-disk-0          -    none
 
It's likely that your particular configuration escaped testing. Like others noted, it's not really a sane configuration.
I think the issue is not the vdev redundancy level. The same would happen even with the following pool layout, which should be sane:

3-way mirror special vdev
raidz3 8xHDD data vdev

ZFS 2.4 specifically allows zvol writes to land on the special vdev, so it's kind of odd that this missed testing, since it's the most basic setup for the feature.

Though I can see how testers just saw the zvol writes land on the special device and didn't check the free space.
 