backup vm configuration / cluster db

RolandK · Feb 14, 2023

i see, there is some backup of /var/lib/pve-cluster/config.db at /var/lib/pve-cluster/backup.

how can i create backups of vm and cluster configuration at a regular interval ?

i see there is a cfs_backup_database() function in https://github.com/proxmox/pve-cluster/blob/master/data/PVE/Cluster.pm , but it seems it's not meant to be used by an end-user or a cronjob !?

root@pve-pc5:/var/lib/pve-cluster/backup# ls -la
total 26
drwxr-xr-x 2 root root 3 Mar 18 2022 .
drwxr-xr-x 3 root root 7 Dec 22 17:40 ..
-rw-r--r-- 1 root root 13574 Mar 18 2022 config-1647623846.sql.gz

fabian · Feb 14, 2023

the backup is created when joining the cluster (in case something goes wrong). it's not really meant to be user-callable at the moment; all it does is dump the backing sqlite3 DB. if you want to do the same, you can just do that (for consistency you want pmxcfs to be stopped so no concurrent writes can happen; the dump, and thus the downtime, should be rather fast).
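A minimal sketch of that approach, assuming the paths mentioned in this thread and that the `sqlite3` command-line tool is installed. The function name `dump_config_db` is hypothetical, and the output name only imitates the `config-<epoch>.sql.gz` pattern seen in the listing above:

```shell
#!/bin/sh
# Sketch only: dump a SQLite database to a gzip-compressed SQL file.
# dump_config_db is a hypothetical helper, not part of Proxmox VE.

dump_config_db() {
    db="$1"        # e.g. /var/lib/pve-cluster/config.db
    outdir="$2"    # e.g. a directory on a different disk
    tmp="$outdir/config-$(date +%s).sql"

    mkdir -p "$outdir" || return 1
    # Write the plain dump first so a failing sqlite3 is actually detected.
    sqlite3 "$db" .dump > "$tmp" || { rm -f "$tmp"; return 1; }
    gzip "$tmp" || return 1
    printf '%s\n' "$tmp.gz"    # print the path of the finished backup
}

# On a node, as root, the surrounding steps would roughly be:
#   systemctl stop pve-cluster      # stops pmxcfs, /etc/pve is unavailable meanwhile
#   dump_config_db /var/lib/pve-cluster/config.db /root/pve-db-backup
#   systemctl start pve-cluster
```

The target directory `/root/pve-db-backup` is just an example. Guests keep running while pve-cluster is stopped, but anything that needs /etc/pve (GUI, API, starting guests) will not work until it is started again.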

RolandK · Feb 14, 2023

ok, thanks.
as there are so many users out there who run proxmox servers standalone, i'm curious whether such an important file (i.e. config.db) should get more care/attention ?

why not back it up at a regular interval, if there is no easy way to do a consistent online backup ?

what about backing it up on every reboot or service restart - with some backup rotation/retention ?

it's such a small (but important) file, it would be easy to store a couple of backups without any waste of diskspace. that would not do any harm.
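The retention idea could look roughly like the sketch below. It is not an existing Proxmox feature; `prune_backups` is a hypothetical helper, and it assumes backups are named `config-<epoch>.sql.gz` with timestamps of equal width, so that sorting by name equals sorting by age:

```shell
#!/bin/sh
# Sketch only: keep the newest N dumps in a directory and delete the rest.

prune_backups() {
    dir="$1"     # directory holding config-<epoch>.sql.gz files
    keep="$2"    # number of newest backups to keep

    # Reverse name sort puts the newest timestamp first;
    # tail then selects everything after the first $keep entries.
    ls -1 "$dir"/config-*.sql.gz 2>/dev/null |
        sort -r |
        tail -n +"$((keep + 1))" |
        while IFS= read -r f; do
            rm -f -- "$f"
        done
}

# Example: keep the 7 newest dumps
#   prune_backups /root/pve-db-backup 7
```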

apparently, there even seem to be methods for doing an online backup, e.g. https://www.sqlite.org/lang_vacuum.html#vacuuminto
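For illustration, `VACUUM INTO` can be run through the `sqlite3` command-line tool as sketched below. This assumes SQLite 3.27 or newer and a target path without single quotes; `online_copy_db` is a hypothetical name. Whether this is safe to run against the database of a running pmxcfs is not confirmed anywhere in this thread:

```shell
#!/bin/sh
# Sketch only: copy a SQLite database into a new file with VACUUM INTO.

online_copy_db() {
    db="$1"     # source database
    out="$2"    # target file, must not exist yet

    # Refuse to touch an existing file instead of relying on SQLite's error.
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    sqlite3 "$db" "VACUUM INTO '$out';"
}

# Example:
#   online_copy_db /var/lib/pve-cluster/config.db "/root/config-$(date +%s).db"
```

Unlike the `.sql.gz` dump, the result is a regular SQLite database file rather than SQL text.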

fabian · Feb 15, 2023

I've only ever seen two kinds of corruption of the DB (sqlite3 is really stable!):
- users manually messing with it without knowing what they are doing (well..)
- disk failure (in which case a backup on the same disk won't really help)

in almost all cases it's a better idea to back up the contents of /etc/pve, since that allows restoring individual config files if something accidentally gets messed up or deleted. for standalone hosts that even makes it trivial to recover from a destroyed backing DB - just remove the DB file, restart pmxcfs, you get a blank /etc/pve and can restore the config files that you need. for clustered nodes you likely want to remove the node from the cluster, remove the local corosync config, re-initialize pmxcfs, rejoin, and then restore.
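A file-level backup of /etc/pve can be sketched with plain `tar`, for example from a cron job. `archive_pve_config` and the archive naming are hypothetical; the only assumption is that /etc/pve is mounted and readable:

```shell
#!/bin/sh
# Sketch only: archive a config directory into a timestamped tar.gz.

archive_pve_config() {
    src="$1"       # e.g. /etc/pve
    outdir="$2"    # e.g. a directory on another disk or host mount
    out="$outdir/pve-config-$(date +%Y%m%d-%H%M%S).tar.gz"

    mkdir -p "$outdir" || return 1
    # -C keeps archive paths relative (pve/...), which makes partial restores easier.
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$out"    # print the path of the finished archive
}

# Example:
#   archive_pve_config /etc/pve /root/pve-config-backup
#
# Restoring a single file into a scratch directory for inspection:
#   mkdir /tmp/restore
#   tar -xzf <archive> -C /tmp/restore pve/nodes/<node>/qemu-server/<vmid>.conf
```

The single-file example uses the path under `nodes/<node>/` because /etc/pve/qemu-server is a symlink into that directory, and tar stores the symlink itself rather than the files behind it.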

patefoniq · Nov 24, 2023

OK.
How to restore this backup from sql file?

tempacc375924 · Nov 24, 2023

patefoniq said:
OK.
How to restore this backup from sql file?

It's just an sqlite dump:
https://github.com/proxmox/pve-clus...f24c05a11b0f864f5b9dc/src/PVE/Cluster.pm#L884

It gets dumped just before it goes on to wait for quorum (with started corosync and restarted pve-cluster):
https://github.com/proxmox/pve-clus...b0f864f5b9dc/src/PVE/Cluster/Setup.pm#L772C54
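Since the file is a gzip-compressed SQL dump, loading it into a fresh database file could be sketched as follows. `restore_dump` is a hypothetical helper; the steps around it (stopping pve-cluster, moving the old DB aside) are assumptions and not an officially documented procedure:

```shell
#!/bin/sh
# Sketch only: rebuild a SQLite database file from a .sql.gz dump.

restore_dump() {
    dump="$1"    # e.g. /var/lib/pve-cluster/backup/config-1647623846.sql.gz
    db="$2"      # database file to create, must not exist yet

    # Loading a dump into an existing DB would collide with its tables.
    if [ -e "$db" ]; then
        echo "refusing to overwrite $db" >&2
        return 1
    fi
    gzip -dc "$dump" | sqlite3 "$db"
}

# On a node, as root, the surrounding steps would roughly be:
#   systemctl stop pve-cluster
#   mv /var/lib/pve-cluster/config.db /root/config.db.old
#   restore_dump /var/lib/pve-cluster/backup/config-1647623846.sql.gz \
#       /var/lib/pve-cluster/config.db
#   systemctl start pve-cluster
```

It is worth restoring into a scratch location first and inspecting the result with `sqlite3`, and checking that owner and permissions of the new file match the old one before starting pve-cluster.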

What are you trying to achieve? Recover standalone node or a cluster?

patefoniq · Nov 24, 2023

Restore standalone node after failed cluster join.

tempacc375924 · Nov 24, 2023

patefoniq said:
Restore standalone node after failed cluster join.

Have you been trying to add to a cluster a standalone node with VMs on it already? (I can imagine coming from VMware you might have, but PVE is not built for that.)

I only ask because I saw your other thread where you complained of corrupt /etc/pve. I suppose this is a standalone node that could not get added so never finished joining the cluster, but you had stuff running on it that you would like back in business? (And I presume no backups of the VMs at hand?)

In case you were adding a fresh (empty) node, it's basically easier to reinstall it - but I suppose you would not be asking here if that was the case.

tempacc375924 · Nov 24, 2023

I take no responsibility now if you break something in your production setup, but as for experimenting (literally if I were you I would do this somewhere on the side testing with PVEs as VMs in a dry-run):

I would first make a backup of at least everything in /etc/pve on that standalone node, then I would do the usual "unsupported" separating it from a cluster (mentality):

Code:

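systemctl stop pve-cluster     # stop pmxcfs
systemctl stop corosync        # stop cluster communication
pmxcfs -l                      # start pmxcfs in local mode
rm /etc/pve/corosync.conf      # remove the cluster config stored in pmxcfs
rm -r /etc/corosync/*          # remove the local corosync config
killall pmxcfs                 # stop the local-mode instance
systemctl start pve-cluster    # start pmxcfs normally again
rm /var/lib/corosync/*         # remove leftover corosync state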

What state will that end up in?

patefoniq · Nov 25, 2023

tempacc375924 said:
Have you been trying to add to a cluster a standalone node with VMs on it already? (I can imagine coming from VMware you might have, but PVE is not built for that.)

I only ask because I saw your other thread where you complained of corrupt /etc/pve. I suppose this is a standalone node that could not get added so never finished joining the cluster, but you had stuff running on it that you would like back in business? (And I presume no backups of the VMs at hand?)

In case you were adding a fresh (empty) node, it's basically easier to reinstall it - but I suppose you would not be asking here if that was the case.

Before adding the node to the cluster, I removed all VMs (I made a backup and wanted to restore after joining a cluster). I'll create a new thread and try to get all of my problems in it.

PS
It's not my first time creating a Proxmox cluster (I've used it since about 2018). As I wrote in other topics, earlier it was simpler, less destructive, and it was possible to roll back to the previous state without losing data.

patefoniq · Nov 25, 2023

tempacc375924 said:
I take no responsibility now if you break something in your production setup, but as for experimenting (literally if I were you I would do this somewhere on the side testing with PVEs as VMs in a dry-run):

I would first make a backup of at least everything in /etc/pve on that standalone node, then I would do the usual "unsupported" separating it from a cluster (mentality):

Code:

systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l
rm /etc/pve/corosync.conf
rm -r /etc/corosync/*
killall pmxcfs
systemctl start pve-cluster
rm /var/lib/corosync/*

What state will that end up in?

I'll try to do that. Thanks.

PS
I created the new thread:
https://forum.proxmox.com/threads/problems-with-cluster-adding-nodes-and-qdevices.137063/

patefoniq · Nov 25, 2023

tempacc375924 said:
I take no responsibility now if you break something in your production setup, but as for experimenting (literally if I were you I would do this somewhere on the side testing with PVEs as VMs in a dry-run):

I would first make a backup of at least everything in /etc/pve on that standalone node, then I would do the usual "unsupported" separating it from a cluster (mentality):

Code:

systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l
rm /etc/pve/corosync.conf
rm -r /etc/corosync/*
killall pmxcfs
systemctl start pve-cluster
rm /var/lib/corosync/*

What state will that end up in?

Thank you. That procedure brought the damaged node back up in standalone mode. I'll keep trying.
