[SOLVED] live recovery progress

DrillSgtErnst

Hi,
after a fatal crash yesterday I had to recover all VMs from backup.

I tried Live Recovery and all servers are running fine so far, but I have no idea how far the restore has progressed.
The task finished quite early and the machines started.
I cannot back them up because they are locked (create), and I fear that shutting them down will destroy the machines.
Is there any way to verify that the recovery has completed?
[Attached screenshot: 1655891701484.png]
 
what do the task logs say? if those are finished, the restore should be finished and the locks should be removed
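
One way to check whether the lock mentioned above is still set is to look for a `lock` entry in the VM configuration. The following is only a minimal sketch: it assumes the configuration is available as `key: value` text (for example as printed by `qm config <vmid>`), and the sample text and the helper name `find_lock` are made up for illustration.

```python
def find_lock(config_text):
    """Return the value of the 'lock' key in 'key: value' config text, or None."""
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "lock":
            return value.strip()
    return None


# Hypothetical sample, not taken from the thread.
sample = """boot: order=virtio0
lock: create
memory: 4096
name: example-vm
"""

lock = find_lock(sample)
if lock is None:
    print("no lock set")
else:
    print(f"VM is still locked ({lock})")
```

A missing `lock` line only shows that the lock is gone; it does not by itself prove that every block of the disk has been copied.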
 
seems the task was interrupted... did you reboot the server while restoring? if not, does the journal/syslog maybe contain info on what might have happened?
 
Yeah, the task always failed on the first try. I had to start each restore twice.

Using encryption key from file descriptor..
Fingerprint: fc:30:23:0f:e8:ec:13:22
Using encryption key from file descriptor..
Fingerprint: fc:30:23:0f:e8:ec:13:22
rbd rm 'vm-205-disk-1' error: interrupted by signal

This went on for 2 hours straight.

This happened on the second try:
Using encryption key from file descriptor..
Fingerprint: fc:30:23:0f:e8:ec:13:22
Using encryption key from file descriptor..
Fingerprint: fc:30:23:0f:e8:ec:13:22
new volume ID is 'ceph_fail1:vm-205-disk-0'
new volume ID is 'ceph_fail1:vm-205-disk-1'
restore proxmox backup image: /usr/bin/pbs-restore --repository backup@pbs@172.20.14.18:cephbackup vm/205/2022-06-21T10:30:04Z drive-efidisk0.img.fidx /dev/zvol/ceph_fail1/vm-205-disk-0 --verbose --format raw --keyfile /etc/pve/priv/storage/pvebackup.enc --skip-zero
connecting to repository 'backup@pbs@172.20.14.18:cephbackup'
open block backend for target '/dev/zvol/ceph_fail1/vm-205-disk-0'
starting to restore snapshot 'vm/205/2022-06-21T10:30:04Z'
download and verify backup index
progress 100% (read 540672 bytes, zeroes = 0% (0 bytes), duration 0 sec)
restore image complete (bytes=540672, duration=0.14s, speed=3.78MB/s)
rescan volumes...
got interrupt - ignored
rbd error: got signal 15
starting VM for live-restore
repository: 'backup@pbs@172.20.14.18:cephbackup', snapshot: 'vm/205/2022-06-21T10:30:04Z'
restoring 'drive-virtio0' to 'ceph_fail1:vm-205-disk-1'
restore-drive-virtio0: transferred 0.0 B of 400.0 GiB (0.00%) in 0s
restore-drive-virtio0: transferred 508.0 MiB of 400.0 GiB (0.12%) in 1s
restore-drive-virtio0: transferred 532.0 MiB of 400.0 GiB (0.13%) in 2s
restore-drive-virtio0: transferred 636.0 MiB of 400.0 GiB (0.16%) in 3s


The machines are running. I am just scared of data loss. If I shut them down now, will they come back?
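
The `restore-drive-virtio0: transferred ... of ...` lines in the log above carry the actual copy progress, so the last such line in a task log shows how far the background restore got. Below is a rough sketch of parsing them. It assumes exactly the line format shown in this thread; the function names and the unit table are illustrative and not part of any Proxmox tool.

```python
import re

# Matches lines such as:
# restore-drive-virtio0: transferred 508.0 MiB of 400.0 GiB (0.12%) in 1s
PROGRESS_RE = re.compile(
    r"^(?P<job>[\w-]+): transferred (?P<done>[\d.]+) (?P<done_unit>\w+) "
    r"of (?P<total>[\d.]+) (?P<total_unit>\w+) \([\d.]+%\) in (?P<elapsed>.+)$"
)

# Binary units as they appear in the log lines above (assumed complete enough).
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_progress(line):
    """Parse one progress line into a dict, or return None if it does not match."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None
    done_factor = UNITS.get(match["done_unit"])
    total_factor = UNITS.get(match["total_unit"])
    if done_factor is None or total_factor is None:
        return None
    done = float(match["done"]) * done_factor
    total = float(match["total"]) * total_factor
    return {
        "job": match["job"],
        "done_bytes": done,
        "total_bytes": total,
        "fraction": done / total if total else 0.0,
        "elapsed": match["elapsed"],
    }


def last_progress(log_text):
    """Return the last parsable progress entry in a task log, or None."""
    result = None
    for line in log_text.splitlines():
        parsed = parse_progress(line)
        if parsed is not None:
            result = parsed
    return result


log = """restoring 'drive-virtio0' to 'ceph_fail1:vm-205-disk-1'
restore-drive-virtio0: transferred 0.0 B of 400.0 GiB (0.00%) in 0s
restore-drive-virtio0: transferred 508.0 MiB of 400.0 GiB (0.12%) in 1s
restore-drive-virtio0: transferred 636.0 MiB of 400.0 GiB (0.16%) in 3s
"""

latest = last_progress(log)
print(f"{latest['job']}: {latest['fraction'] * 100:.2f}% after {latest['elapsed']}")
```

If the last progress line is far below 100% and the task has ended, the log alone cannot tell whether the copy finished afterwards; it only shows where the logging stopped.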
 
So the answer is: so far every one of them came back after a shutdown, and I could move the disks to another storage. Everything works fine, even though it does not look like it.
 
DrillSgtErnst said:
Yeah, the task always failed on the first try. I had to start each restore twice.

did only the task fail? or did the qemu process crash?
can you post the task log of such a failed restore?

DrillSgtErnst said:
rbd error: got signal 15

seems like something killed the rbd process? can you post the journal from that time period?

DrillSgtErnst said:
So the answer is: so far every one of them came back after a shutdown, and I could move the disks to another storage. Everything works fine, even though it does not look like it.

great, i'd still investigate why the tasks fail in the first place...
 
So to be honest, I think I know the reason. The machines' hard drives reside on a crashed Ceph system.
I guess this deadlocks the process in the first place, because it cannot get information about the machine from the drive.
Code:
Using encryption key from file descriptor..
Fingerprint: fc:30:23:0f:e8:ec:13:22
Using encryption key from file descriptor..
Fingerprint: fc:30:23:0f:e8:ec:13:22
rbd error: 'storage-ptvceph'-locked command timed out - aborting
rbd error: 'storage-ptvceph'-locked command timed out - aborting
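
To find lines like these quickly in a long task log or journal export, a simple text scan can help. This is only a sketch: the marker strings are taken from the error messages posted in this thread and are certainly not a complete list of failure messages, and `find_failures` is a made-up helper name.

```python
# Substrings taken from the error lines posted in this thread (not exhaustive).
FAILURE_MARKERS = (
    "interrupted by signal",
    "got signal",
    "timed out - aborting",
)


def find_failures(log_text):
    """Return (line_number, line) pairs for lines containing a known marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in FAILURE_MARKERS):
            hits.append((number, line.strip()))
    return hits


sample = """Using encryption key from file descriptor..
rbd rm 'vm-205-disk-1' error: interrupted by signal
rescan volumes...
rbd error: got signal 15
rbd error: 'storage-ptvceph'-locked command timed out - aborting
"""

for number, line in find_failures(sample):
    print(f"line {number}: {line}")
```

The line numbers make it easier to match a hit against the journal entries from the same time period.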
 
