VM Crashed, cannot restore backup!

lpallard

Mar 22, 2014
Hello, I am in serious trouble here, so I hope you can help me.

I have a CentOS VM that I backed up twice to a small 300GB backup storage. What I didn't realize when I backed up the VM the second time was that the backup storage was running low on space; nevertheless, the backup finished without errors.

I then started the VM and made some modifications that I didn't want to keep, so I decided to restore the latest backup. The restore finished (100%), but when I started the VM, CentOS crashed at boot, saying that a partition or drive was not found, and dropped me to a CLI. The partition or drive that couldn't be found was in fact a logical volume on the node, assigned to the VM via its .conf file like this:

Code:
virtio1: /dev/datastore/datastore, backup=no

Then I deleted older backups of other VMs to free up some space and restored the latest backup of the CentOS machine again:

Code:
restore vma archive: vma extract -v -r /var/tmp/vzdumptmp11313.fifo /mnt/backups/dump/vzdump-qemu-101-2014_05_16-20_28_31.vma /var/tmp/vzdumptmp11313
CFG: size: 327 name: qemu-server.conf
DEV: dev_id=1 size: 128849018880 devname: drive-virtio0
CTIME: Fri May 16 20:28:32 2014
Formatting '/var/lib/vz/images/101/vm-101-disk-1.raw', fmt=raw size=128849018880
new volume ID is 'local:101/vm-101-disk-1.raw'
map 'drive-virtio0' to '/var/lib/vz/images/101/vm-101-disk-1.raw' (write zeros = 0)
progress 1% (read 1288503296 bytes, duration 2 sec)
progress 2% (read 2577006592 bytes, duration 5 sec)
progress 3% (read 3865509888 bytes, duration 8 sec)
progress 4% (read 5154013184 bytes, duration 12 sec)
progress 5% (read 6442450944 bytes, duration 15 sec)
progress 6% (read 7730954240 bytes, duration 19 sec)
progress 7% (read 9019457536 bytes, duration 23 sec)
...
progress 92% (read 118541123584 bytes, duration 575 sec)
progress 93% (read 119829626880 bytes, duration 582 sec)
progress 94% (read 121118130176 bytes, duration 590 sec)
progress 95% (read 122406567936 bytes, duration 598 sec)
progress 96% (read 123695071232 bytes, duration 606 sec)
progress 97% (read 124983574528 bytes, duration 614 sec)
progress 98% (read 126272077824 bytes, duration 622 sec)
progress 99% (read 127560581120 bytes, duration 632 sec)
progress 100% (read 128849018880 bytes, duration 642 sec)
total bytes read 128849018880, sparse bytes 17465548800 (13.6%)
space reduction due to 4K zero bocks 0.175%
TASK OK
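
The figures in the restore log above can be cross-checked with a short script. This is only a minimal sketch: the regular expressions are assumptions based on the log lines pasted in this post, not a documented `vma` output format, and `summarize` is a hypothetical helper name. It lists the devices contained in the archive and recomputes the sparse percentage and average read rate.

```python
import re

# Lines copied from the restore log in this post.
LOG = """\
DEV: dev_id=1 size: 128849018880 devname: drive-virtio0
progress 100% (read 128849018880 bytes, duration 642 sec)
total bytes read 128849018880, sparse bytes 17465548800 (13.6%)
"""


def summarize(log: str) -> dict:
    """Extract device names and recompute the summary figures of a restore log."""
    # Patterns are guesses derived from the log above (assumption).
    devices = re.findall(r"^DEV: dev_id=\d+ size: \d+ devname: (\S+)$", log, re.M)
    progress = re.search(r"progress 100% \(read (\d+) bytes, duration (\d+) sec\)", log)
    totals = re.search(r"total bytes read (\d+), sparse bytes (\d+)", log)
    if progress is None or totals is None:
        raise ValueError("log does not contain a completed restore summary")

    read_bytes, seconds = int(progress.group(1)), int(progress.group(2))
    total, sparse = int(totals.group(1)), int(totals.group(2))
    return {
        "devices": devices,                          # disks stored in the archive
        "size_gib": total / 2**30,                   # restored image size in GiB
        "sparse_pct": 100 * sparse / total,          # share of sparse (zero) data
        "avg_mib_per_s": read_bytes / seconds / 2**20,
        "complete": read_bytes == total,             # all bytes were read
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value}")
```

Run against this log, the only device reported is drive-virtio0, so the archive holds a single disk and virtio1 is not part of it.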

Then I started the VM again, only to see the same error (partition or drive cannot be found). I looked in the web GUI to see whether the missing partition or drive was still assigned to the VM and saw that it was missing! I then SSH'd into the Proxmox node and looked in the .conf file: the drive was commented out! Why?

I uncommented it, but that didn't help.

What can I do?

Thanks
 
Update:

I updated PVE with the latest packages (as found under the Updates tab), rebooted the node, and started the VM again. It worked fine this time.

I am concerned about this. What could I provide to the PVE devs so we can get to the bottom of it?
 
You excluded the drive from backup using the "backup=no" option:

Code:
virtio1: /dev/datastore/datastore,backup=no
 
Thank you Dietmar for replying.

I just don't understand your response. Yes, I excluded the drive from the backup, but that shouldn't mean the drive is no longer available to the VM after a restore, especially since the VM mounts the drive by its UUID.

Also, I tried twice (before the Proxmox update and node reboot) to start the VM, and each time the VM complained that it couldn't find a drive: the first time it was one of the node's LVs, the second time the other LV. This makes me believe there was a bug or glitch in the way Proxmox handled LVs for a short period of time.

Somehow, the logical volumes were no longer present to the VM until the node was rebooted. This is probably why the entries in VM.conf were commented out, at least I did not comment these out myself!

I have backed up VMs before that had disks passed through, with those disks excluded from the backup, and all was fine.

I have checked for SAS controller failures: nothing to report. Same for the SAS drives. No hardware anomaly whatsoever.
 
"Somehow, the logical volumes were no longer present to the VM until the node was rebooted. This is probably why the entries in VM.conf were commented out"

Maybe you started the VM directly after restore (restore comments out drives with backup=no).
The drive is there if you add the old drive before starting the VM.
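
To check for this before starting a restored VM, one can scan the VM's config for drive entries that were commented out. A minimal sketch follows. It assumes a commented-out entry is simply the original line prefixed with `#`, which is a guess at the format and not confirmed here. `SAMPLE` is a made-up config for illustration, and `commented_out_drives` is a hypothetical helper name.

```python
import re

# Matches a commented-out drive line such as "#virtio1: <volume>,<options>".
# The "#" prefix format is an assumption about what restore writes.
DRIVE_LINE = re.compile(r"^\s*#\s*((?:ide|sata|scsi|virtio)\d+):\s*(.+)$")


def commented_out_drives(conf_text: str) -> list[tuple[str, str]]:
    """Return (drive key, value) pairs for every commented-out drive line."""
    found = []
    for line in conf_text.splitlines():
        match = DRIVE_LINE.match(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found


# Hypothetical config contents, modelled on the drive line quoted in this thread.
SAMPLE = """\
bootdisk: virtio0
virtio0: local:101/vm-101-disk-1.raw
#virtio1: /dev/datastore/datastore,backup=no
"""

if __name__ == "__main__":
    for key, value in commented_out_drives(SAMPLE):
        print(f"{key} is commented out: {value}")
```

To use it on a real node, read the VM's .conf file into a string and pass it to `commented_out_drives`. Any drive it reports would need to be re-added before the VM is started.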
 
"Maybe you started the VM directly after restore (restore comments out drives with backup=no).
The drive is there if you add the old drive before starting the VM."

Dietmar,

I did uncomment the drives before starting the VM. I tried 3 or 4 times. That's how I could see that on one attempt the VM halted because one of the LVs was not found, and on another attempt it was the other LV.

Now that you mention that the restore comments out the other drives, maybe the problem is not with the LVs but with the backup feature?

Anyway, if you feel there is no chance of a problem with my node, we can consider this topic closed. I trust you more than I trust myself on that!

Thanks for your feedback!
 
