Replication, HA and Backups

dixie2000

Member
May 16, 2023
Before setting up a 2-node cluster (with a Raspberry Pi serving as a QDevice) along with ZFS, replication and HA, I was backing up my VMs and CTs to a Proxmox Backup Server daily using "stop" mode.

My question is: since HA only allows snapshots, should I still continue to do backups? How might they differ from the replicated VMs and CTs?

Hope my question makes sense.

Thanks!
 
should I still continue to do backups?
Yes! Definitely.

First, the obvious: if the original is tampered with or its data is damaged, this unusable data will be included in the next snapshot, and that state will then be replicated to the other node.

And if the physical disk dies, all snapshots stored on it die at the very same moment.

If your understanding is that snapshots are sufficient for backup purposes, I must say that I disagree. Snapshots are no backups. For me, snapshots are a way to go back a short time easily and quickly, while backups are for the "long run": I can go back to yesterday, last week, last month or last year, hierarchically pruned by PBS to fit my requirements.
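To make "hierarchically pruned" a bit more concrete, here is a rough Python sketch of the idea: keep the newest backup from each of the most recent days, weeks and months. It is a simplified illustration and not the algorithm PBS actually uses; the function name select_kept and its keep_daily, keep_weekly and keep_monthly parameters are made up for this example.

```python
from datetime import datetime, timedelta


def select_kept(backups, keep_daily=0, keep_weekly=0, keep_monthly=0):
    """Return the backup timestamps to keep, oldest first.

    For each rule, walk the backups from newest to oldest and keep the
    newest backup in each of the most recent `count` distinct periods.
    """
    rules = [
        (keep_daily, lambda t: t.strftime("%Y-%m-%d")),
        (keep_weekly, lambda t: "%d-W%02d" % t.isocalendar()[:2]),
        (keep_monthly, lambda t: t.strftime("%Y-%m")),
    ]
    newest_first = sorted(backups, reverse=True)
    kept = set()
    for count, period_of in rules:
        seen_periods = set()
        for ts in newest_first:
            if len(seen_periods) >= count:
                break
            period = period_of(ts)
            if period not in seen_periods:
                seen_periods.add(period)
                kept.add(ts)
    return sorted(kept)


# Hypothetical history: one backup per evening for 40 days.
backups = [datetime(2024, 1, 1, 23) + timedelta(days=i) for i in range(40)]
kept = select_kept(backups, keep_daily=3, keep_weekly=2, keep_monthly=2)
for ts in kept:
    print(ts.date())
```

The point is only that a handful of retained backups can still span days, weeks and months, which a chain of short-lived snapshots typically does not.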

Also: search for "backup strategy 3-2-1", which usually does not talk about snapshots at all.
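For reference, the 3-2-1 rule is usually stated as: at least three copies of the data, on at least two different types of media, with at least one copy offsite. A small Python sketch of that check follows; the DataCopy class and the example copies are hypothetical and only there to show the rule, they do not describe any particular setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str      # free-form label for the copy
    medium: str    # type of storage medium, e.g. "ssd" or "hdd"
    offsite: bool  # True if stored at a different physical location


def satisfies_3_2_1(copies):
    """Check the 3-2-1 rule: 3 copies, 2 media types, 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical example: live data plus a replica on the same kind of
# medium in the same room does not satisfy the rule on its own.
live_and_replica = [
    DataCopy("live VM disk", "ssd", False),
    DataCopy("replica on second node", "ssd", False),
]
with_backups = live_and_replica + [
    DataCopy("local backup datastore", "hdd", False),
    DataCopy("remote backup datastore", "hdd", True),
]
print(satisfies_3_2_1(live_and_replica))
print(satisfies_3_2_1(with_backups))
```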

Of course I do use snapshots. But for backups I run more than one physically separate backup destination, both at work and at home.

Just my thoughts...
 
@UdoB

Thank you for replying. However, when running a cluster with High Availability the only option for backup in the scheduler for HA is snapshot. The job will error and I can't find a way around this.
 
the only option for backup in the scheduler for HA is snapshot. The job will error and I can't find a way around this.
Oh!
I've never experienced this. All of my PVE nodes run on ZFS (mirrored SSD/NVMe). Some of my VMs have HA enabled. Several are replicated to one (or two or three) neighbor nodes. I run the vast amount of the backups in snapshot mode.

Can you show us /etc/pve/storage.cfg, the config of an example VM with that problem (/etc/pve/local/qemu-server/<vmid>.conf) and the actual task log of a failing backup? (Everything in [CODE]...[/CODE]-blocks please.)
 
@UdoB

Your first response said "Snapshots are no backups" and in your last you said "I run the vast amount of the backups in snapshot mode." I am confused :)

I just tried to run a backup of one VM and one CT in "stop" mode through the backup scheduler. See the log below.

INFO: starting new backup job: vzdump 109 303 --mailto xxxx@gmail.com --node pve --mailnotification always --all 0 --notes-template '{{guestname}}' --mode stop --storage PBS --prune-backups 'keep-all=1' --fleecing 0
ERROR: Backup of VM 109 failed - Cannot execute a backup with stop mode on a HA managed and enabled Service. Use snapshot mode or disable the Service.
INFO: Failed at 2024-04-25 14:26:16
ERROR: Backup of VM 303 failed - Cannot execute a backup with stop mode on a HA managed and enabled Service. Use snapshot mode or disable the Service.
INFO: Failed at 2024-04-25 14:26:16
INFO: Backup job finished with errors
INFO: notified via target `<xxxxx@gmail.com>`
TASK ERROR: job errors
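
For anyone who wants to check task logs like this automatically, here is a minimal Python sketch that pulls the failed guests and the reason out of the log text. It assumes the failure lines look exactly like the ones above; the function name failed_guests is made up for this example and is not part of any Proxmox tooling.

```python
import re

# Matches the per-guest failure lines as they appear in the log above.
FAILED_RE = re.compile(r"^ERROR: Backup of VM (\d+) failed - (.+)$")


def failed_guests(log_text):
    """Return {vmid: reason} for every guest whose backup failed."""
    failures = {}
    for line in log_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            failures[int(match.group(1))] = match.group(2)
    return failures


SAMPLE_LOG = """\
ERROR: Backup of VM 109 failed - Cannot execute a backup with stop mode on a HA managed and enabled Service. Use snapshot mode or disable the Service.
INFO: Failed at 2024-04-25 14:26:16
ERROR: Backup of VM 303 failed - Cannot execute a backup with stop mode on a HA managed and enabled Service. Use snapshot mode or disable the Service.
INFO: Failed at 2024-04-25 14:26:16
INFO: Backup job finished with errors
TASK ERROR: job errors
"""

for vmid, reason in failed_guests(SAMPLE_LOG).items():
    print(vmid, "->", reason)
```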
 
Your first response said "Snapshots are no backups" and in your last you said "I run the vast amount of the backups in snapshot mode." I am confused
Sorry. Terminology is confusing when one word is used for different things. My understanding:

You can "snapshot" a VM (and an LXC container) via "<vm> - Snapshots - Take Snapshot" (button). This creates a high-level "PVE snapshot", handled by the PVE middleware. This top-level view tries to be filesystem-agnostic by presenting an abstracted view, but the underlying storage must be able to support snapshots: LVM-thin(!), QCOW2 for files and of course ZFS can do this.

For backups you already mentioned stop and snapshot mode. Here "snapshot" means the QEMU mechanism that freezes the state of a VM at a given moment in time. The QEMU guest agent is important for this, as it handles some tasks inside the guest to prepare for the freeze. Usually it is fine to use this mode. For some applications (like databases) it may be better to stop the VM to maximize consistency and create the backup from that moment. (I have a MariaDB instance which tends to get into trouble with snapshot mode; the newly introduced "fleecing" will probably solve this for me. Not checked yet.)

Your log excerpt shows "--mode stop", which does not work for HA-enabled guests, as the error message itself says. What happens if you try snapshot mode?
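
As a rough illustration of that rule, here is a small Python sketch that builds a vzdump command line and falls back to snapshot mode when any of the guests is HA-managed. The helper build_vzdump_command and the ha_managed_vmids argument are hypothetical; how you find out which guests are HA-managed is left to the caller. Only the --mode and --storage options already visible in the log above are used, and the sketch only builds the argument list without running anything.

```python
def build_vzdump_command(vmids, storage, preferred_mode, ha_managed_vmids):
    """Return a vzdump argument list, avoiding stop mode for HA guests.

    vmids            -- guest IDs to back up
    storage          -- name of the backup storage, e.g. "PBS"
    preferred_mode   -- the mode you would like to use, e.g. "stop"
    ha_managed_vmids -- set of guest IDs known to be HA-managed (assumed
                        to be supplied by the caller)
    """
    mode = preferred_mode
    if preferred_mode == "stop" and any(v in ha_managed_vmids for v in vmids):
        # Stop mode is rejected for HA-managed, enabled services,
        # so use snapshot mode for this job instead.
        mode = "snapshot"
    return ["vzdump", *[str(v) for v in vmids], "--mode", mode, "--storage", storage]


# The job from the log above, with both guests assumed to be HA-managed.
print(" ".join(build_vzdump_command([109, 303], "PBS", "stop", {109, 303})))
```

The other way out named in the error message, disabling the HA service for the duration of the backup, is not covered by this sketch.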

Best regards
 
@UdoB

Thank you for the excellent explanation!

When I run in snapshot mode it all works perfectly. I am updating my backup schedule to use "snapshot" mode as I move my VMs to HA.

I don't have any "mission critical" VMs running, so I replicate them about every hour. I then do a daily backup late in the evening, currently keeping the last three.

I guess my original question was not well stated. I was just wondering how backups differ from the replication process. I think I understand now. Replication along with HA makes sure my VMs are always available. Backups give me a prior point in time should something happen to a VM, so I can restore it and continue on.
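
To put that difference into a toy example: below is a small Python simulation of hourly replication plus a daily backup that keeps the last three. It is only a sketch of the idea; the function simulate and the "healthy"/"damaged" states are invented for illustration and have nothing to do with how PVE or PBS work internally.

```python
from collections import deque


def simulate(hours, damage_at_hour, keep=3):
    """Simulate hourly replication and daily backups of one guest.

    Returns (replica_state, backups), where backups is a list of
    (day, state) tuples for the retained daily backups.
    """
    state = "healthy"
    replica = None
    backups = deque(maxlen=keep)  # older backups fall off automatically
    for hour in range(1, hours + 1):
        if hour == damage_at_hour:
            state = "damaged"
        replica = state  # replication always mirrors the current state
        if hour % 24 == 0:
            backups.append((hour // 24, state))
    return replica, list(backups)


# Four days of operation, with the guest damaged during day 4.
replica, backups = simulate(hours=96, damage_at_hour=80)
restorable_days = [day for day, state in backups if state == "healthy"]
print("replica:", replica)
print("backups:", backups)
print("restorable days:", restorable_days)
```

The replica faithfully follows the guest into the damaged state, while the retained backups still contain earlier healthy points in time.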

Thanks again for taking the time to respond, it is appreciated!
Al...
 