Nonsensical behaviour from zfs

Mar 19, 2025
Hey,

I know you guys have a bug tracker and whatnot but I'm really not willing to put in the effort and time to read all of the guidelines and look up where I actually should post this.

Please treat this as a drive by report from a lightly sleep deprived nerd.


I just wanted to let you know that there is an edge case(?) in which it is borderline impossible to actually access a zfs dataset after trying to mount it. You might want to report it upstream at OpenZFS or even the kernel itself, I have no idea who is responsible for this mess.

So the edge case is this:

- you can create an encrypted zfs dataset, in this case bulk01/backups
- then you can zfs send | zfs recv dataset(s) from another pool into bulk01/backups
- what happens is that zfs has a garbage UX and will happily put UNENCRYPTED data as children of encrypted dataset, I expected this level of stupid only on ntfs/bitlocker
- then you reboot the system a few times
- now some of the child datasets have undefined mountpoints, so you set them where you'd expect and tell zfs to mount them, but you skip one of the datasets in the chain
- then you also decrypt the encrypted dataset and mount it

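For anyone trying to reproduce this, a minimal sketch for spotting which children ended up unencrypted under the encrypted parent. The helper name unencrypted_datasets is made up for illustration; it only filters the tab-separated output of zfs get -H and does not touch the pool.

```shell
#!/bin/sh
# Hypothetical helper: reads tab-separated "name<TAB>encryption" lines, as printed by
#   zfs get -H -r -o name,value encryption bulk01/backups
# and prints the names of datasets whose encryption property is "off".
unencrypted_datasets() {
    awk -F '\t' '$2 == "off" { print $1 }'
}

# Intended use on the NAS itself (needs the zfs CLI, so it is left commented out):
#   zfs get -H -r -o name,value encryption bulk01/backups | unencrypted_datasets
```

Anything this prints sits in plaintext below the encrypted bulk01/backups.
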
Now all of the mount state(s) are completely mangled and it seems like the kernel and zfs have no idea what's going on.

ls and stat will report that some of the mountpoints are empty or that the directories don't exist.
When you try unmounting with zfs umount or plain umount, it tells you that the path doesn't exist, is not mounted, or something else.
You cannot mount the datasets because they are in a superposition: if you try to mount them, they already are mounted.

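A rough sketch of what "unmount everything under bulk01/backups" could look like: take the mountpoints the kernel itself reports and order them deepest first, so children go before their parents. The function name deepest_first is made up, and I have no evidence this would have untangled the state described above.

```shell
#!/bin/sh
# Hypothetical helper: reads one mountpoint per line and prints them
# deepest path first, so nested mounts come before the ones they sit on.
deepest_first() {
    # Prefix each path with its number of "/"-separated fields,
    # sort numerically descending on that prefix, then drop it again.
    awk -F '/' '{ print NF "\t" $0 }' | sort -t "$(printf '\t')" -k1,1nr | cut -f2-
}

# Intended use on the NAS itself (left commented out, it really unmounts things):
#   awk '$3 == "zfs" && ($1 == "bulk01/backups" || index($1, "bulk01/backups/") == 1) { print $2 }' /proc/mounts \
#     | deepest_first \
#     | while read -r mp; do umount "$mp"; done
```

Reading /proc/mounts instead of asking zfs is deliberate here, since the two disagreed about what was mounted.
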
I would normally just reboot the system because it's just my playground home nas, but it's 2026 and we got clankers, so I poked ChatGPT for scripts that could fix this mess by forcefully unmounting everything under bulk01/backups. It generated a few scripts, nothing came out of it, so I just rebooted the system.

I have no idea if this is useful at all but here is some output from the script(s), it could also be the fact that it's nearly 2 AM as I type this.
 


- you can create an encrypted zfs dataset, in this case bulk01/backups
- then you can zfs send | zfs recv dataset(s) from another pool into bulk01/backups
Just to be clear, you don't encrypt the source pool and then just pull ZFS snapshots to your encrypted backup pool and want encryption there, right?
Because simply encrypting on the source would be easier otherwise.
what happens is that zfs has a garbage UX and will happily put UNENCRYPTED data as children of encrypted dataset, I expected this level of stupid only on ntfs/bitlocker
ZFS is a server filesystem that lets you do almost anything, no matter how stupid.
If you create an encrypted dataset, its children will by default also be encrypted. Depending on how you use ZFS send, you can set it to adapt to the destination or copy the settings from the source. Just like you can copy paste text into Word with or without formatting.

Unfortunately I can't help you troubleshoot your issues further, just my two cents of advice.
Notice how the encrypted datasets get mounted at boot without you entering a key? That is because the key to decrypt them is stored on the boot pool. Unless you are willing to input a password or plug in a hardware key every time at boot, IMHO encryption is not really worth it for most users.
Second, Proxmox is a great hypervisor, not so great NAS. Run a second host with TrueNAS as NAS and it could save you a lot of headache.
Lastly, running ChatGPT scripts to "fix" ZFS is probably not a good idea.
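
On the first point, a minimal sketch for checking where each key actually comes from, based on the keylocation property. The helper name file_keyed_datasets is made up; it only filters the output of zfs get -H and assumes a key read from a file shows up as a file:// value.

```shell
#!/bin/sh
# Hypothetical helper: reads tab-separated "name<TAB>keylocation" lines, as printed by
#   zfs get -H -r -o name,value keylocation bulk01
# and prints the datasets whose key is loaded from a file on disk.
file_keyed_datasets() {
    awk -F '\t' '$2 ~ /^file:\/\// { print $1 }'
}

# Intended use on the host (needs the zfs CLI, so it is left commented out):
#   zfs get -H -r -o name,value keylocation bulk01 | file_keyed_datasets
```

Datasets listed here can be unlocked by anyone who can read that file, while a keylocation of prompt means the key has to be entered by hand.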
 