total disaster recovery procedure

michabbs

Member
May 5, 2020
71
12
13
I would like to keep a copy of my most important datastore somewhere in the cloud, in case of an earthquake or a Martian attack.
What is the best way to restore my VMs as quickly as possible? Imagine I have new hardware and nothing else.

Procedure:
1. Download the latest copy of the datastore from the cloud.
2. ???
...
X. Run the virtual machines.
 

oguz

Proxmox Retired Staff
Retired Staff
Nov 19, 2018
5,207
692
118
hi,

1. download backups
2. reconfigure pve (you can have a backup of /etc/pve for this, other configurations like /etc/network/interfaces can also be relevant) on new server
3. restore backups
4. run the virtual machines
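
For step 2, the configuration backup can be taken ahead of time with a small script. This is only a minimal sketch, not an official Proxmox tool: the function name archive_configs and the output file name are made up for illustration, and the two paths are simply the ones named above. It assumes it runs as root on the PVE host while /etc/pve is mounted.

```python
import os
import tarfile

# Paths mentioned in the reply above; extend the list for your own setup.
DEFAULT_PATHS = ["/etc/pve", "/etc/network/interfaces"]


def archive_configs(paths, dest):
    """Pack every existing path into a gzip-compressed tar archive.

    Returns the list of paths that were actually archived, so that
    missing ones are easy to spot.
    """
    archived = []
    with tarfile.open(dest, "w:gz") as tar:
        for path in paths:
            if os.path.exists(path):
                # Store the entry without the leading slash.
                tar.add(path, arcname=path.lstrip("/"))
                archived.append(path)
    return archived


# Example call (hypothetical output name):
# archive_configs(DEFAULT_PATHS, "pve-config-backup.tar.gz")
```

The archive would then go to the cloud together with the datastore copy, so both are available on the new hardware.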
 

michabbs

Well... How can I extract a backup from a PBS datastore? I assume I have to set up a new backup server first, don't I?
 

oguz

you configure the datastore on your new PVE and restore using the GUI. having the backup client installed should be enough.
 

michabbs

I do not understand. How could I configure a "raw" datastore in PVE?
I thought I must first set up a new PBS server, then set up the datastore in PBS, then configure access to the new backup server in PVE, and finally restore the VMs from there.
By raw datastore I mean the folder with ".chunks" and plenty of subfolders...
 

oguz

michabbs said: I thought I must first set up new PBS server, then set up the datastore in PBS, then configure access to new backup server in PVE and finally restore vm's from there.
yes, correct, but you can set up PVE and PBS on the same machine :)

michabbs said: By raw datastore I mean the folder with ".chunks" and plenty of subfolders....
at the moment it's not possible to do this another way, but there are some patches on the mailing list pending review
 

eoinkim

Member
May 26, 2020
84
8
8
37
Hi @oguz,

Regarding your comment, is there any way to skip restoring backups? For example, if the backup storage is hooked as a datastore on the stand-by cluster, can VMs be powered on? In my case, time is more important than historical data.

Eoin
 

fabian

Proxmox Staff Member
Staff member
Jan 7, 2016
8,122
1,583
164
@eoinkim a feature called 'live-restore' is being worked on - it allows starting a VM directly from a backup snapshot, restoring the data transparently in the background while the VM is already running. obviously you need a fast enough connection and a beefy PBS instance for this to work well in practice.
 

abranca

Active Member
Mar 6, 2017
47
5
28
36
Hi, I'd like to add my experience to this thread.
I set up this configuration:
- Proxmox VE with virtual machines;
- a PBS VM backing up to an NFS share on my QNAP NAS;
- rclone installed on PBS, which copies the datastore to the cloud (mega.nz).

I tried a disaster recovery as you describe:
- I deleted the PBS VM and created a new one;
- I deleted the datastore from my QNAP;
- I installed rclone on the new PBS and downloaded my backup from mega.nz to the QNAP NFS folder mounted on PBS;
- I changed the owner of the folder that contains the datastore with chown backup:backup -R /path/to/datastore;
- I restored the VMs/CTs.

One problem remains, however: I can no longer make backups on that datastore.
So once the restore is done, I create a new datastore (in my case always on NFS drives) and run the new backups there.

I don't know whether the problem occurs because the datastore resides on NFS. I haven't actually tried with a local datastore on PBS.
 

fabian

do you get an error? which PBS version are you using?
 

abranca

I am including here the link to a post of mine regarding this issue.
I have not tested again since that date (December 17, 2020).

For me, the important thing was to be able to restore from the cloud and import the VMs/CTs.
Creating a new datastore for new backups is not a priority for me.

I have not tested with the newer updates either.
I can do some testing over the weekend and give you more information.
 

fabian

yeah, please retry with 1.0.11-1 and post the PVE and PBS logs for the failing backup if it still occurs.
 

abranca

Ok!
Now I've installed proxmox-backup-client: 1.0.11-1.

Over the weekend I can run some tests and post what is required.
 

abranca

Hi,
I tried again as promised.

Here are my steps:
1. created a datastore on PBS with an NFS share on the QNAP as destination;
2. connected it in PVE;
3. backed up a 2.25 GB VM;
4. moved the backup to the cloud with rclone;
5. deleted the folder from the QNAP to simulate a disaster;
6. created a new NFS folder on the QNAP;
7. mounted the share on PBS;
8. downloaded the previous backup with rclone and ran chown backup:backup -R on the mount point (/mnt/pbs-android in the logs);
9. added the path to /etc/proxmox-backup/datastore.cfg;
10. restored the backup from PVE successfully;
11. tried to make a new backup of the VM and got the error in the logs below.

PVE log:
Code:
INFO: starting new backup job: vzdump 108 --storage pbs-android --remove 0 --mode snapshot --node pve
INFO: Starting Backup of VM 108 (qemu)
INFO: Backup started at 2021-03-26 17:37:25
INFO: status = running
INFO: VM Name: android
INFO: include disk 'scsi0' 'local-lvm:vm-108-disk-0' 8G
  /dev/sdc: open failed: No medium found
  /dev/sdd: open failed: No medium found
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/108/2021-03-26T16:37:25Z'
INFO: started backup task 'c283d894-5f9c-45ef-b555-9d6f06d6cf31'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: OK (260.0 MiB of 8.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 260.0 MiB dirty of 8.0 GiB total
INFO:   1% (4.0 MiB of 260.0 MiB) in 1s, read: 4.0 MiB/s, write: 4.0 MiB/s
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: No such file or directory (os error 2)
INFO: aborting backup job
ERROR: Backup of VM 108 failed - backup write data failed: command error: write_data upload error: pipelined request failed: No such file or directory (os error 2)
INFO: Failed at 2021-03-26 17:37:27
INFO: Backup job finished with errors
TASK ERROR: job errors

PVE Package version:
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.103-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.103-1-pve: 5.4.103-1
pve-kernel-5.4.101-1-pve: 5.4.101-1
pve-kernel-5.4.98-1-pve: 5.4.98-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: not correctly installed
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.11-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.2.0-4
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-8
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1


PBS log:
Code:
2021-03-26T17:37:25+01:00: starting new backup on datastore 'pbs-android': "vm/108/2021-03-26T16:37:25Z"
2021-03-26T17:37:25+01:00: download 'index.json.blob' from previous backup.
2021-03-26T17:37:25+01:00: register chunks in 'drive-scsi0.img.fidx' from previous backup.
2021-03-26T17:37:25+01:00: download 'drive-scsi0.img.fidx' from previous backup.
2021-03-26T17:37:26+01:00: created new fixed index 1 ("vm/108/2021-03-26T16:37:25Z/drive-scsi0.img.fidx")
2021-03-26T17:37:26+01:00: add blob "/mnt/pbs-android/vm/108/2021-03-26T16:37:25Z/qemu-server.conf.blob" (312 bytes, comp: 312)
2021-03-26T17:37:26+01:00: POST /fixed_chunk: 400 Bad Request: No such file or directory (os error 2)
2021-03-26T17:37:26+01:00: backup failed: connection error: bytes remaining on stream
2021-03-26T17:37:26+01:00: removing failed backup
2021-03-26T17:37:26+01:00: TASK ERROR: removing backup snapshot "/mnt/pbs-android/vm/108/2021-03-26T16:37:25Z" failed - Directory not empty (os error 39)

PBS Package version:
Code:
proxmox-backup: 1.0-4
proxmox-backup-server: 1.0.1-1
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.4.65-1-pve: 5.4.65-1
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
proxmox-backup-docs: 1.0.1-1
proxmox-backup-client: 1.0.1-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.3-10
pve-xtermjs: 4.7.0-2
smartmontools: 7.1-pve2
zfsutils-linux: 0.8.4-pve2


I've noticed that some prune jobs that I run on PBS also fail, and the TASK ERROR entry always has the same wording: Directory not empty (os error 39)

PRUNE log:

Code:
2021-03-26T00:00:00+01:00: Starting datastore prune on store "pbs-nas"
2021-03-26T00:00:00+01:00: task triggered by schedule 'daily'
2021-03-26T00:00:00+01:00: retention options: --keep-last 3
2021-03-26T00:00:00+01:00: Starting prune on store "pbs-nas" group "vm/100"
2021-03-26T00:00:00+01:00: remove vm/100/2021-03-21T13:21:02Z
2021-03-26T00:00:00+01:00: remove vm/100/2021-03-22T13:21:03Z
2021-03-26T00:00:00+01:00: TASK ERROR: removing backup snapshot "/mnt/pbs-nas/vm/100/2021-03-22T13:21:03Z" failed - Directory not empty (os error 39)
 

fabian

can you confirm that the total number of files and directories is the same before you do the first rclone (source) and after you restore the datastore via rclone? is it possible that empty directories are skipped/ignored? the .chunks directory has a lot of sub-directories, those are only created on initial datastore creation.. if your backup/restore skips them, you'll run into strange errors.
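
One way to do this comparison, as a rough sketch: count directories and files with a few lines of Python on the source before the first rclone run, and again on the restored copy. count_entries is a hypothetical helper, not part of PBS or rclone.

```python
import os


def count_entries(root):
    """Return (directories, files) found below root, not counting root itself."""
    n_dirs = 0
    n_files = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_dirs, n_files


# Example call (path taken from the logs in this thread):
# print(count_entries("/mnt/pbs-android"))
```

If the two results agree on the file count but not on the directory count, empty directories were most likely skipped somewhere along the way.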
 

abranca

Hi Fabian,
thanks for your reply!

Indeed, rclone does not synchronize empty directories unless told to.
On the datastore "pbs-android", in my case, I have 65540 directories and 1244 files.
On mega I have the same number of files but only 1232 directories.

In the documentation I found that with rclone you can use the option --create-empty-src-dirs that should synchronize also the empty directories.
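
As a rough sketch of how this could be checked and scripted: the first helper lists the directories that would be skipped because they are empty, the second builds the rclone call with the option. Both function names are made up for illustration, and the remote name in the example is a placeholder, not my real configuration.

```python
import os


def empty_dirs(root):
    """List directories below root that contain no files and no subdirectories."""
    return sorted(
        dirpath
        for dirpath, dirnames, filenames in os.walk(root)
        if dirpath != root and not dirnames and not filenames
    )


def rclone_sync_cmd(src, dst):
    """Build an rclone sync command that also creates empty source directories."""
    return ["rclone", "sync", src, dst, "--create-empty-src-dirs"]


# Example (hypothetical remote name "mega:pbs-android"):
# import subprocess
# print(len(empty_dirs("/mnt/pbs-android")), "empty directories")
# subprocess.run(rclone_sync_cmd("/mnt/pbs-android", "mega:pbs-android"), check=True)
```

Keep in mind that rclone sync makes the destination match the source, so it deletes files on the destination that no longer exist on the source.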

I'll try again and tell you more.
Thanks for your support.
 

abranca

Hello everyone, hello Fabian,
I have re-tested as I mentioned in my previous message.

I backed up a 2.26 GB machine and used rclone to sync it to the remote.
Without the --create-empty-src-dirs option, rclone doesn't synchronize the empty folders, so the restore works but new backups on the same datastore fail because some folders are missing.
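
To illustrate why a missing folder produces the "No such file or directory (os error 2)" message from the log, here is a small simulation. It is not PBS code: write_chunk is a hypothetical stand-in, and it assumes that a chunk is stored under .chunks/<first four hex digits of its digest>/<digest>.

```python
import errno
import os


def write_chunk(datastore, digest_hex, data):
    """Try to store data as a chunk file; return the errno on failure, None on success."""
    target = os.path.join(datastore, ".chunks", digest_hex[:4], digest_hex)
    try:
        # open() does not create missing parent directories.
        with open(target, "wb") as fh:
            fh.write(data)
    except OSError as exc:
        return exc.errno
    return None


# With the subdirectory missing, the result equals errno.ENOENT (value 2),
# which matches the "os error 2" wording in the PVE and PBS logs.
```

Once the subdirectory exists, the same call succeeds, which fits the observation that restores work (they only read existing chunks) while new backups fail.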

Using the above option I was able to back up everything, including the empty folders, and got a mirror copy of the datastore.

The 2.26 GB uploaded in about 15 minutes (I have an FTTH line, 1 Gbit/s download / 300 Mbit/s upload), while the empty folders took about 2.35 hours.
The restore should also be done with the above option; that way I was able to restore the backup and make new ones on the same datastore.

If it would be helpful, I can make a PDF tutorial with screenshots and descriptions of this process.

ATTENTION: Google Drive does not create empty folders despite the rclone option. It seems that the detected empty folders are ignored and therefore not created.


In my personal experience, I prefer to back up in 15 minutes and be able to restore in no time. Even if I can't reuse the restored datastore, that's not important to me.

Time certainly matters in disaster recovery, and waiting 3 hours to copy an empty folder structure is too long as far as I'm concerned.

I value disaster recovery as a matter of time: I need to restore as soon as possible. Being able to reuse the datastore is, in my experience and in my field, not important.

But that's my experience, and it certainly won't be everyone's.
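
An untested idea for anyone who wants both the fast upload and a reusable datastore: skip the empty folders during upload and recreate them locally after the download. The sketch below assumes that .chunks holds 65536 subfolders named 0000 to ffff, which would fit the 65540 directories I counted; ensure_chunk_dirs is a made-up helper, not a PBS command, so please verify against your own datastore before relying on it.

```python
import os


def ensure_chunk_dirs(datastore):
    """Create any missing .chunks/0000 ... .chunks/ffff folders; return how many were created."""
    created = 0
    for i in range(0x10000):
        path = os.path.join(datastore, ".chunks", f"{i:04x}")
        if not os.path.isdir(path):
            os.makedirs(path)
            created += 1
    return created


# Example call (path taken from the logs in this thread):
# print(ensure_chunk_dirs("/mnt/pbs-android"), "folders created")
```

The new folders would still need the same chown backup:backup -R as the rest of the datastore.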
 