Recover VMs from previous corrupt Proxmox installation

garethw

New Member
Feb 17, 2021
My predecessor installed Proxmox on an SD memory card (apparently he didn't read that this isn't recommended!) and, of course, the SD card has failed. The VMs were stored on a 16TB RAID-5 volume, which is intact. I installed the most recent version of Proxmox VE on a new hard drive, hoping to resurrect these VMs. I can see the volume (/dev/sda1), which is called vm-data1, but I can't figure out how to access the data on the disk / mount these VMs.

[Screenshot: qemu.JPG]

This was the 2nd server in a cluster; the 1st one is still operating. Here is what the same screen looks like on that system:

[Screenshot: qemu2.JPG]

Just wondering if there is a guide or any documentation for this type of scenario. I have a little bit of experience with Linux but am far from an expert.

Edit: I forgot to add that I tried clicking 'Create VM', thinking I could point it to the existing VM image / file, but didn't see an option to do so.
 
Here is the output of lvs / vgs / pvs. It appears there are some LVM volumes; I'm just not sure how to mount them, and I don't really understand LVM.

Code:
root@server2:/mnt# lvs
  /dev/sdc: open failed: No medium found
  LV            VG       Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve      twi-a-tz--  <3.49t             0.00   0.19
  root          pve      -wi-ao----  96.00g
  swap          pve      -wi-ao----   8.00g
  vm-101-disk-0 vm-data1 -wi-a----- 200.00g
  vm-101-disk-1 vm-data1 -wi-a-----   1.00t
  vm-102-disk-0 vm-data1 -wi-a----- 500.00g
  vm-105-disk-0 vm-data1 -wi-a----- 200.00g
  vm-105-disk-1 vm-data1 -wi-a-----   5.07t
root@server2:/mnt# vgs
  /dev/sdc: open failed: No medium found
  VG       #PV #LV #SN Attr   VSize  VFree
  pve        1   3   0 wz--n- <3.64t <16.38g
  vm-data1   1   5   0 wz--n- 16.78t   9.83t
root@server2:/mnt# pvs
  /dev/sdc: open failed: No medium found
  PV         VG       Fmt  Attr PSize  PFree
  /dev/sda1  vm-data1 lvm2 a--  16.78t   9.83t
  /dev/sdb3  pve      lvm2 a--  <3.64t <16.38g
 
Watching this, as I'm very interested in the answer. I'm thinking of deploying a system that will have 3 SATA slots plus an internal NVMe in a PCIe adaptor card. The plan is that SATA 1 will be an SSD as the Proxmox / boot disk, SATA 2 and 3 will be large HDDs as a ZFS mirror, plus the NVMe (partitioned into two parts) as the read and write cache. So the ZFS pool will be the VM storage, and as mirrors are now recommended over raid5/raidz1, we have redundancy in the VM storage. But what if the Proxmox SSD itself fails - how would we go about restoring the VMs off the ZFS, or would it have to be a case of flatten it and restore from the backups?

In this particular case the hardware only supports 3 SATA ports and 1 PCIe slot, and there are no power supply headers to allow internal SSDs, hence why I'm planning this config. The BIOS also doesn't recognise the NVMe as a potential boot disk, so I'm not sure the system would remain bootable if I were to set up RAIDZ1 on three identical HDDs at Proxmox install, then from the CLI add the NVMe partitions as log/cache for it.

I'm guessing the issue is the VM config files that live under /etc/pve/qemu-server/ - if those are gone, can the VMs be reloaded from the storage at all?
 
The VMs' virtual HDDs (vm-101-disk-0 and so on) are stored on an LVM volume group. First you need to add that existing LVM as a "storage" in Proxmox with "Disk image" as the content type. After that, your Proxmox knows where to look for disk images, and you can create new VMs (the VM configs were stored on the failed boot drive) and select the existing disk images as their virtual HDDs.

Like Pyromancer said, the configs were stored under "/etc/pve/qemu-server/". So if your second working node has copies of the same VMs, you might look there.
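
For reference, once that LVM storage is added (via the GUI or pvesm add lvm), the entry that ends up in /etc/pve/storage.cfg looks roughly like this. This is only a sketch: the storage ID is arbitrary, and the VG name "vm-data1" is taken from the lvs output above.
Code:
lvm: vm-data1
        vgname vm-data1
        content images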
@Pyromancer: Read a little bit more about L2ARC, SLOG, ZIL and "special device". A read and write cache on an SSD is in most cases not useful, and you should get a really decent SSD and not consumer stuff, because it won't last long. Also keep in mind that everything on the HDDs might get lost if that single cache SSD dies while the power is failing. It would be better to run the VMs on the NVMe SSD instead if you need the speed, and back them up to the HDDs regularly if you can't add another NVMe SSD to mirror it.

And it's no problem to install Proxmox on a USB drive, but you should use durable enterprise sticks or at least SSDs with a USB-to-SATA converter. If you've got enough internal drive slots and a free USB3 header on the mainboard, you can even use them internally.
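
Just to illustrate what the original plan would involve (not a recommendation, per the caveats above), adding the two NVMe partitions to an existing pool as SLOG and L2ARC would look roughly like this; the pool name and partition paths are placeholders:
Code:
# attach one NVMe partition as a SLOG (sync write log) and the other as L2ARC (read cache)
zpool add tank log /dev/nvme0n1p1
zpool add tank cache /dev/nvme0n1p2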
 

Thanks for the response. I don't believe replication / backup was enabled to the other node, so I've got nothing but the RAID-5 volume.

So I'm in the Proxmox web interface; how do I add an LVM storage exactly? I went to the 'Storage View', clicked on 'LVM' under Disks in the tree to the right, and tried clicking 'Create: Volume Group', but all it says is 'No Disks Unused'. I assume I'm not in the right place, but I really can't figure out this software...

 
OK, I figured out how to add the LVM storage and can now see vm-101-disk-0, etc.


I can see the volume as an option when I go to Create a VM, but I can't see what to select to use an existing VM disk.

 
You don't want to create a new LVM, you just want to add a new "storage" based on that existing LVM.
Look at: Datacenter -> Storage -> Add

Not sure how to find out if the old LVM was "LVM" or "LVM thin". You might want to find that out first.
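
One way to check from the shell (using the lvs output posted earlier in the thread): the first character of the LV attribute field distinguishes plain LVs from thin pools and thin volumes.
Code:
# first character of lv_attr: 't' = thin pool, 'V' = thin volume, '-' = plain (thick) LV
lvs -o vg_name,lv_name,lv_attr
In the output above, the vm-data1 LVs all show "-wi-a-----", which indicates plain LVs, i.e. "LVM" rather than "LVM thin".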
 
OK I think I solved my own problem ;)

I added a new VM with the ID 201 on the vm-data1 volume. Once it was added, I logged into the console, edited 201.conf, and changed the ID/size of the IDE0 device to match the old vm-101-disk-0.

VM is now booting.
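
In other words, the edited line in /etc/pve/qemu-server/201.conf ends up looking something like this (a sketch; the exact options will differ per VM, and the 200G size is taken from vm-101-disk-0 in the lvs output above):
Code:
ide0: vm-data1:vm-101-disk-0,size=200G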
 
I can see the volume as an option when I go to Create a VM, but I can't see what to select to use an existing VM disk.
Not sure if that is possible using the GUI. But you can create a VM without starting it and then manually edit the config files so they point to the right virtual disk: nano /etc/pve/qemu-server/[VMID].conf
VMID is the number you select as the ID while creating the VM. Make sure that this ID matches the ID of the old disk images (vm-105-disk-0 is the first disk of the VM with ID 105, vm-105-disk-1 is the second disk... and so on). You also need to test what type of disk controller was used. If, for example, Windows was installed with "IDE" as the controller, it won't boot if you choose "VirtIO SCSI".
In that conf file you need to look for lines like:
Code:
scsi0: VMpool7_VM:vm-107-disk-2,cache=none,discard=on,iothread=1,size=16G,ssd=1
scsi1: VMpool8_VM:vm-107-disk-0,cache=none,size=2G,ssd=1
"VMpool7_VM" is in my case the name of the storage (should be "vm-data1" in your case) and "vm-107-disk-2" is the name of the disk image you want to use.

By the way...IDE is nothing I would choose because it is very slow. You might want to use virtio SCSI if you create new VMs and don't need to import old ones.
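
As an alternative to hand-editing the conf file, you could also attach an existing disk image with qm set. A sketch using the names from this thread (VM 201 and vm-101-disk-0):
Code:
# attach the existing disk image as the VM's first IDE disk, then check the generated config line
qm set 201 --ide0 vm-data1:vm-101-disk-0
cat /etc/pve/qemu-server/201.conf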
 
By the way...IDE is nothing I would choose because it is very slow. You might want to use virtio SCSI if you create new VMs and don't need to import old ones.

Thanks, I tried recreating the VM using virtio and swapping in the disk image, but it won't boot. Would it depend on how the VM was originally set up?

Strange, because all the VMs on the other node are set up using virtio. Tried SCSI as well, same issue.
 
Would it depend on how the VM was originally set up?

Strange, because all the VMs on the other node are set up using virtio. Tried SCSI as well, same issue.
Yes, Windows doesn't like switching that later. It needs to be set up that way while installing. And you need to install the VirtIO drivers while installing, or Windows won't be able to find the drives.
 
Makes sense. I figured out why the last VM wasn't booting: my predecessor had created it using OVMF (UEFI) instead of SeaBIOS. God knows why...
 
Makes sense. I figured out why the last VM wasn't booting: my predecessor had created it using OVMF (UEFI) instead of SeaBIOS. God knows why...
Maybe he passed through some PCIe devices? For that OVMF is recommended.
 
How did you get from post 5 to post 6 above?

I can’t figure out how to add an existing lvm disk on /dev/sda.
 
How did you get from post 5 to post 6 above?

I can’t figure out how to add an existing lvm disk on /dev/sda.
In Datacenter, then Storage, I had to remove the entry ID data | LVM | Disk image, Container.
Then click Add > LVM.
In my case I named it the same as it was, "data", chose the volume group "data", and OK'd out of it.

It then magically appears down the left-hand side under Storage, alongside the existing local and local-lvm storages. Now my VMs power on!
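
For reference, the CLI equivalent of those GUI steps would be roughly the following (a sketch; the storage ID and VG name "data", and the Disk image / Container content types, are taken from this post):
Code:
# remove the stale storage entry, then re-add the existing volume group as LVM storage
pvesm remove data
pvesm add lvm data --vgname data --content images,rootdir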
 
