Hello Everyone,
This could be TL;DR
I have used Proxmox from the very beginning, and I have used ZFS for ages, back when I had to compile it myself and official PVE did not support ZFS. I run smaller clusters (2 - 6 nodes, but a cluster is a cluster once it has 3+ nodes, so if you have seen one, you have seen them all).

History of my backup tasks
- In the beginning I took backups using vzdump on lvm-ext4: it was horribly slow, suspended VMs/OpenVZ CTs for a long time, and the server hardware was working hard every night.
- When I turned to zfs, I started using its snapshots as backups (copied elsewhere): it just worked every time, with no downtime, no hammering of the hardware and no slowdown.
- Now I use zfs-based backups with simplesnap and pve-zsync, which are low-level tools without a fancy GUI, log analysis and so on:
- they just work, but if something goes wrong and there is no monitoring (zabbix etc.), somebody needs to check it
- another thing: replicating to more than one backup server can be tricky, because every zfs tool starts making its own new snapshots, which makes master->backup1->backup2 difficult (in this case, the backup1->backup2 step needs a custom solution)
- A consultancy company advised me to use some other solution for backups; they recommended PBS, which they already use, so I installed it and started to evaluate it.
- So I put together a new local backup server with a lot of capacity to store backup data and installed the latest and greatest release (PBSlocal).
- I also put one at Hetzner for remote backup (PBSremote).
- Over the weekend I did some test backups of smaller VMs/CTs (a few GBytes per VM/CT) from the production cluster (PVEcluster) to PBSremote. It went fine, I was surprised.
- I started to make some bigger backups from PVEcluster to PBSlocal, and that went fine, too.
- I also started to test the sync, which makes life really easy for backup1->backup2 sync tasks. It seems it just works.
- Now that I have almost 3 TByte of compressed backups, I started to sync them from PBSlocal to PBSremote. It has been running for 4 days now because of the uplink speed, and it still needs several days for the initial upload.
- I have to say, I am very impressed with the PBS-PVE ecosystem, but...
- I found that during the backup, VMs (especially Windows guests) stopped responding for a shorter or longer time; one of them I had to kill and reboot.
- I tried to run backups as regularly as I did with pve-zsync, every 4 hours during work time, but I was surprised: users immediately started reporting that their RDP sessions hung or stopped working, and I caused a little panic for some minutes. Cancelling the backup jobs solved it. I rescheduled the backup to the night, once per day.
- Last night the VM backups failed for no apparent reason (I mean, there is information in the logs, but it tells me nothing, and at this moment that is not important; what is important is that it just did not work).
- We run services which need to run 0-24, all the time. Maybe on Saturday night we could find a time gap to stop the services in a determined order, make the backup and start them again, because these services run on several VMs/CTs that communicate with each other, and if one drops out, it could cause problems (it already caused some noise in our monitoring system; the quality of the applications used is not the question here).
- After I experienced these issues, I read several threads on the forum to figure out what had happened in the last 10+ years while I was living under a zfs-rock.
- There are still various problems with vzdump on different filesystems, and repeated questions regarding zfs (its CoW nature, how that relates to the zvols of VMs etc.).
- Vzdump is a filesystem-agnostic backup tool by design, which sounds good at first, but after reality punches you in the face, one should reconsider forcing only this backup method.
My demands for a backup solution
- Causing zero downtime, where zero is exactly zero.
- Does not slow down the server, or only minimally (e.g. network transport of the backup).
- Idiot-proof: just works, no need to worry.
- Could run frequently, during work time too.
- Whatever else is important that I missed.
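The snapshot-then-copy workflow from the history above, and the awkward backup1->backup2 hop, can be sketched roughly as follows. This is a minimal Python sketch that only builds the command lines and does not run them; it is not how simplesnap or pve-zsync work internally. The dataset names, the host names (`backup1`, `backup2`), the `backup-` snapshot prefix and the helper functions are all assumptions made up for illustration. The idea for the chain is that backup1 forwards the snapshots it already received instead of creating new ones.

```python
import shlex
from datetime import datetime, timezone
from typing import List, Optional


def snapshot_name(dataset: str, now: datetime) -> str:
    """Build a timestamped snapshot name; expects a timezone-aware datetime."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{dataset}@backup-{stamp}"  # "backup-" prefix is an assumption


def snapshot_cmd(snapshot: str) -> List[str]:
    """Command that creates the snapshot on the source host."""
    return ["zfs", "snapshot", snapshot]


def send_pipeline(prev: Optional[str], new: str, host: str, target: str) -> str:
    """Shell pipeline that streams `new` to `target` on `host`.

    With `prev` set, only the delta between the two snapshots is sent
    (zfs send -i); without it, a full stream is sent.
    """
    send = ["zfs", "send"]
    if prev is not None:
        send += ["-i", prev]
    send.append(new)
    recv = ["ssh", host, "zfs", "receive", target]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


def on_backup(snapshot: str, target_dataset: str) -> str:
    """Name of the same snapshot as it appears under the receiving dataset."""
    return f"{target_dataset}@{snapshot.split('@', 1)[1]}"


if __name__ == "__main__":
    src = "rpool/data/subvol-101-disk-0"   # hypothetical CT dataset
    copy1 = "tank/backup/subvol-101-disk-0"  # hypothetical dataset on backup1
    copy2 = "vault/backup/subvol-101-disk-0"  # hypothetical dataset on backup2
    prev = f"{src}@backup-20240101T000000Z"
    new = snapshot_name(src, datetime.now(timezone.utc))

    # On the PVE node: take a snapshot, then send the increment to backup1.
    print(shlex.join(snapshot_cmd(new)))
    print(send_pipeline(prev, new, "backup1", copy1))
    # On backup1: forward the received snapshots to backup2, no new snapshot.
    print(send_pipeline(on_backup(prev, copy1), on_backup(new, copy1), "backup2", copy2))
```

Because the second hop reuses the snapshot names that already exist on backup1, all three machines end up with the same snapshot history, which is what makes later incremental sends along the chain possible.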
I know this could start a flamewar about filesystems, what counts as a backup etc., but my main goal is to start a discussion on how to achieve an integrated, zfs-based backup utility that is as easy to use as PBS/vzdump, except that it should work.

Pros
- ZFS is mature, just works, is reliable, is not a joke (like btrfs), PVE/PBS support installing onto zfs natively etc.
- pve-zsync is already written and just works. I have a feeling pve-zsync was written by the resistance against vzdump.
- zero downtime.
- zero CPU overhead.
- the content is directly accessible (datasets or zvols are mountable).
- the backup machine could use dedup/zstd.
Cons
- zfs could be a no-go for a lot of users (due to various reasons, pick one)
- 3-2-1 and its descendant backup strategies do not exist for it, or I do not know about them (there is a high chance, I missed, never looked
- it forces the use of zfs everywhere, making ceph and other non-zfs storage useless
- a redundant solution for an already solved problem: backup (anyway, as a lot of users have experienced already, vzdump is a great tool, but it has had handicaps since the beginning)
- good base to start a flamewar
Some quick notes regarding this:
- a zfs snapshot could capture an inconsistent filesystem on running VMs/CTs
- based on my own experience, this is not a problem with zfs. For CTs it is even less of a problem.
Of course, when we start a new instance from a snapshot, from the perspective of the VM/CT it is like booting after a crash. I am surprised, but Windows guests are pretty stable and their filesystem recovered every time. Linux VMs (ext4) can also recover.
- CTs are the best, because their filesystem is native zfs, so it will never be inconsistent.
- Of course, some in-memory, never written data could be lost. If you need this kind of backup, you have already lost hours/days of data, so who cares.
- If one needs to clone in an idiot-proof, always-working way, the important services or the machine itself should be turned off and the snapshot made in the offline state.
- not all hardware is suitable for zfs; those users could use vzdump, no problem
- but for those who need their data safe and use local storage, I strongly recommend zfs; others could stick with lvm/ext4
- restore strategies/checklists are not discussed: it seems most users do not have one, they just create backups somehow and hope for the best, so there is no difference. Whoever has one is already prepared for the demands and will not be surprised.
- regular backup checks: hands in the air, please, who does this regularly (after every backup)?
- I simply forgot everything else; maybe others could add missing things/POVs or link to discussions already written somewhere on the net.
Questions
- Do you know a working backup solution which could be used with PVE? (please check my demands above)
- Do you know how to make a zero-downtime backup with vzdump?
- Do you know how to make vzdump idiot (me) proof, so that it always works?
- Do you have working backup strategies based on PVE? (3-2-1, 4-3-2-1-0, whatever fancy names you know)
- Did you ever experience a non-bootable or totally useless backup? (backup type and non-recoverable DB and type, unreadable old files, filesystems)
- Did you ever feel you need another backup solution than the one you already have? (the reason could be useful)
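On the points about monitoring and regular backup checks in the notes above: a small freshness check can at least flag datasets whose newest snapshot is too old. A minimal sketch, assuming the text comes from `zfs list -H -p -t snapshot -o name,creation` run on the backup server (tab-separated, creation time as epoch seconds); the function names, the dataset names and the 4-hour threshold are illustrative assumptions. It only checks that recent snapshots exist and says nothing about whether a restore would actually work.

```python
import time
from typing import Dict, List


def newest_snapshot_per_dataset(listing: str) -> Dict[str, int]:
    """Map each dataset to the creation time (epoch seconds) of its newest snapshot."""
    newest: Dict[str, int] = {}
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, creation = line.split("\t")
        dataset = name.split("@", 1)[0]
        newest[dataset] = max(newest.get(dataset, 0), int(creation))
    return newest


def stale_datasets(listing: str, now_epoch: int, max_age_seconds: int) -> List[str]:
    """Return the datasets whose newest snapshot is older than max_age_seconds."""
    newest = newest_snapshot_per_dataset(listing)
    return sorted(
        dataset
        for dataset, created in newest.items()
        if now_epoch - created > max_age_seconds
    )


if __name__ == "__main__":
    # Hypothetical listing; in practice this would be the captured zfs output.
    sample = (
        "tank/backup/vm-100-disk-0@backup-a\t1700000000\n"
        "tank/backup/vm-100-disk-0@backup-b\t1700014400\n"
        "tank/backup/subvol-101-disk-0@backup-a\t1700000000\n"
    )
    for dataset in stale_datasets(sample, int(time.time()), 4 * 3600):
        print(f"WARNING: no snapshot newer than 4 hours for {dataset}")
```

The output of such a check could be fed into whatever monitoring is already in place (zabbix or a plain cron mail), so that a silently broken replication job does not go unnoticed for days.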
Thank you for your attention. I hope this thread helps me find a better solution, which could help others, too.
István