io-error on all VMs - Storage Full (ZFS Pool 0B free)

silverstone

Active Member
Apr 28, 2018
As per the title, I have a host (running Podman/Docker containers) that periodically hits 100% ZFS pool usage because of the snapshots that keep getting created automatically on the host.

I enabled TRIM on the GUEST ZFS Pool (yes ... I'm running ZFS in the GUEST on top of a ZVOL that Proxmox VE manages on the HOST), but I keep hitting 100% Zpool Usage on the HOST (0B free left).

All VMs then fail (as expected) with "io-error". They are left "on", but I cannot SSH in or do anything about it.

The only solution is to periodically delete ALL snapshots for the affected VM (e.g. `rpool/data/vm-101-disk-1@zfs-auto-snap_daily-2024-04-24-0425`, but also the ones that I create manually e.g. in case of Proxmox VE Host Upgrades etc) on the HOST and reboot Proxmox VE.
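A minimal sketch of that cleanup as a dry run, assuming the automatic snapshots follow the `zfs-auto-snap_` naming shown above. The helper name `print_destroy_cmds` is hypothetical; it only prints the `zfs destroy` commands so they can be reviewed before anything is removed.

```shell
#!/bin/sh
# Hypothetical helper: read snapshot names (one per line) on stdin and print
# a "zfs destroy" command for each zfs-auto-snapshot snapshot.
# Nothing is deleted by this function.
print_destroy_cmds() {
    while IFS= read -r snap; do
        case "$snap" in
            *@zfs-auto-snap_*) printf 'zfs destroy %s\n' "$snap" ;;
        esac
    done
}

# Usage on the host (review the output before piping it to sh):
# zfs list -H -t snapshot -o name -r rpool/data/vm-101-disk-1 | print_destroy_cmds
```

Manually created snapshots do not match the pattern and are left alone, so they would still have to be removed by hand.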

Is there a better way to handle this ?

Maybe it's to go away from `zfs-auto-snapshot` and rather use `sanoid` with `syncoid` ?
And in both cases I guess it's possible (and recommended) to set an exclusion rule on the host so the snapshots are not created there in the first place, since the guest already takes its own.

What's the best practice in this regard ?
And why isn't `zed` / `zfs-zed` sending in any warnings when the Pool is [almost] Full ? Is it possible to configure it to do so ?
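As far as I know, `zed` reacts to pool events (device faults, scrub results and so on) rather than to fill level, so a capacity warning probably needs a separate check. A rough sketch, assuming it runs from cron on the host; the helper name `check_capacity` and the 80% threshold are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>capacity" lines on stdin (the format of
# "zpool list -H -o name,capacity") and print a warning for every pool whose
# fill level is at or above the given percentage.
check_capacity() {
    threshold="$1"
    tab="$(printf '\t')"
    while IFS="$tab" read -r name cap; do
        cap="${cap%\%}"                      # drop a trailing "%" if present
        case "$cap" in
            ''|*[!0-9]*) continue ;;         # skip anything that is not a number
        esac
        if [ "$cap" -ge "$threshold" ]; then
            printf 'WARNING: pool %s is %s%% full\n' "$name" "$cap"
        fi
    done
}

# Usage on the host, e.g. from a cron job that mails its output:
# zpool list -H -o name,capacity | check_capacity 80
```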

Thank you for your help :).
 
What problem are you solving with automatic snapshots? Or is this not a Proxmox thing and happens to everybody running Podman/Docker?
Have you considered using PBS (in a container, for example)? It can back up whole running VMs really quickly, deduplicates automatically, and can sync to another (off-site) system.
 
> What problem are you solving with automatic snapshots? Or is this not a Proxmox thing and happens to everybody running Podman/Docker?
> Have you considered using PBS (in a container, for example)? It can back up whole running VMs really quickly, deduplicates automatically, and can sync to another (off-site) system.

What problem ... Right now ... none I guess ... except possibly user error deleting a whole dataset/Container inside the VM or something.

Going forward probably backups (using sanoid + syncoid) but not there yet ...

I plan to use Proxmox Backup Server (PBS) in the future but as I said ... not there yet. Will run it bare metal on several dedicated Backup Servers though.

EDIT: maybe I can just issue
Code:
zfs set com.sun:auto-snapshot=false rpool/data/vm-101-disk-1
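A small sketch of how that could be applied and checked, assuming `zfs-auto-snapshot` honours the `com.sun:auto-snapshot` user property as in the command above. The helper name `exclude_from_auto_snapshot` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: mark each given dataset as excluded from
# zfs-auto-snapshot, then print the property value so the change
# can be confirmed.
exclude_from_auto_snapshot() {
    for ds in "$@"; do
        zfs set com.sun:auto-snapshot=false "$ds" || return 1
        zfs get -H -o name,value com.sun:auto-snapshot "$ds"
    done
}

# Usage on the host:
# exclude_from_auto_snapshot rpool/data/vm-101-disk-1
```

Setting the property does not remove snapshots that already exist, so those keep holding space until they are destroyed. Since user properties are inherited, setting it once on a parent dataset such as `rpool/data` should also cover disks created later.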
 
> enabled TRIM on the GUEST ZFS Pool (yes ... I'm running ZFS in the GUEST on top of a ZVOL that Proxmox VE manages on the HOST), but I keep hitting 100% Zpool Usage on the HOST (0B free left)

A) Migrate your zpool to larger disks, so you have more free space. If snapshots are filling up your pool then I would start deleting all of them after about 5-7 days.

A1) If you're using a mirror pool, add 2x more disks in a "column" to expand after setting autoexpand=on

B) ZFS-on-ZFS is a bad idea. I recommend moving the guest disk to lvm-thin; you can still have ZFS in the guest, but without the I/O slowdowns, possible double compression attempts, and write amplification (which wears out your disks faster).
 
A) The Pool would fill up anyways ... For 70GB used inside the guest, it would accumulate 700GB (!!!) of used space on the HOST. Now it's "only" using 181GB ....
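To see where that space actually goes, the `used*` properties of the ZVOL can be compared on the host. A rough sketch; the helper name `snapshot_share` is made up, and it only does the arithmetic on the output of `zfs get`.

```shell
#!/bin/sh
# Hypothetical helper: read "property<TAB>value" lines on stdin (byte values,
# as printed by "zfs get -Hp -o property,value ...") and report which share
# of the dataset's used space is held by snapshots.
snapshot_share() {
    awk -F '\t' '
        { v[$1] = $2 }
        END {
            if (v["used"] > 0)
                printf "snapshots hold %d%% of used space\n",
                       v["usedbysnapshots"] * 100 / v["used"]
        }'
}

# Usage on the host:
# zfs get -Hp -o property,value used,usedbydataset,usedbysnapshots \
#     rpool/data/vm-101-disk-1 | snapshot_share
```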

A1) No extra bay capacity

B) I'm getting confused ... All the space / disk on the HOST is occupied by ZFS, there is no space for another partition. Or do you mean ZFS HOST Pool -> LVM VDISK on the HOST -> ZFS GUEST Pool ?

Anyway, I do not have any experience with LVM. I remember trying to recover some "default" Debian installations on LVM+MDRAID back then, and it was a total nightmare working out which command (lv... , pv...) to use. I don't really want to go there.

ZFS compression is enabled on the HOST and ... I thought I had disabled it on the guest but apparently not.
Only the ROOT dataset (basically empty) had it turned off. All the children Datasets in the guest still have it on ...

I'll turn it off now on the GUEST.

EDIT 1: maybe it's just easier to turn it off on the HOST, or ?

EDIT 2: disabled compression on the HOST. Now both compression and snapshots are managed by the GUEST. After issuing a `zpool trim zdata` inside the GUEST, all the space on the host got freed, apart from the used space that is listed inside the GUEST (so basically used space HOST = used space GUEST). In theory it should stay like this going forward, as the host should only provide a ZVOL, without any compression or snapshotting.
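Instead of running the trim by hand, the guest pool could probably trim continuously via the `autotrim` pool property. A minimal sketch, run inside the GUEST; the helper name `enable_autotrim` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn on automatic TRIM for the given pool and print
# the resulting property value so the change can be confirmed.
enable_autotrim() {
    pool="$1"
    zpool set autotrim=on "$pool" || return 1
    zpool get -H -o name,property,value autotrim "$pool"
}

# Usage inside the guest:
# enable_autotrim zdata
```

This assumes the VM disk has the Discard option enabled in Proxmox VE; otherwise the trim requests would not reach the ZVOL on the host.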

Anything else I should disable to prevent this situation from re-occurring ?
 
> A1) No extra bay capacity

Since you're running a ZFS rpool, you could replace the disk(s) one by one with a larger one in-place with no extra bays, just do `zpool set autoexpand=on zpoolnamehere` beforehand. You may (probably) have to run the proxmox boot utility to update your boot config afterward.

> Anyway I do not have any experience with LVM. I remember back then trying to recover some "default" installations of Debian on LVM+MDRAID etc it was a total nightmare

Yah, I understand. LVM basically died for me when ZFS + Samba started working together. Haven't used it since until Proxmox came around. Command-line LVM is the worst, ZFS is so much easier to admin.

SuSE has a nice LVM front-end in Yast, and there's also Webmin (runs on port 10000) if you want to experiment with it in a VM. Makes it much easier to visualize. I actually tested an LVM resize in a VM with webmin before doing it on the host level.
 
Yeah but I'd rather not replace the disks if it's not really needed. It's also $$$.

It's not like it was "real" used space. It was probably caused by ZFS doing snapshots on top of snapshots or compression on top of compression.

Do you think there are other features I should disable on the ZFS HOST ?
 
