Recover VMs from previous corrupt Proxmox installation

garethw

Feb 17, 2021
My predecessor installed Proxmox on an SD memory card (apparently he didn't read that this isn't recommended!), and of course the SD card has failed. The VMs were stored on a 16TB RAID-5 volume, which is intact. I installed the most recent version of Proxmox VE on a new hard drive, hoping to resurrect these VMs. I can see the volume (/dev/sda1), which is called vm-data1, but I can't figure out how to access the data on the disk / mount these VMs.

qemu.JPG

This was the 2nd server in a cluster; the 1st one is still operating. Here is what the same screen looks like on that system:

qemu2.JPG

Just wondering if there is a guide or any documentation for this type of scenario. I have a little bit of experience with Linux but am far from an expert.

Edit: I forgot to add that I tried clicking 'Create VM', thinking I could point it to the existing VM image / file, but didn't see an option to do so.
 
Here is the output of lvs / vgs / pvs. It appears there are some LVM filesystems; I'm just not sure how to mount them, as I don't really understand LVM.

root@server2:/mnt# lvs
/dev/sdc: open failed: No medium found
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
data pve twi-a-tz-- <3.49t 0.00 0.19
root pve -wi-ao---- 96.00g
swap pve -wi-ao---- 8.00g
vm-101-disk-0 vm-data1 -wi-a----- 200.00g
vm-101-disk-1 vm-data1 -wi-a----- 1.00t
vm-102-disk-0 vm-data1 -wi-a----- 500.00g
vm-105-disk-0 vm-data1 -wi-a----- 200.00g
vm-105-disk-1 vm-data1 -wi-a----- 5.07t
root@server2:/mnt# vgs
/dev/sdc: open failed: No medium found
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- <3.64t <16.38g
vm-data1 1 5 0 wz--n- 16.78t 9.83t
root@server2:/mnt# pvs
/dev/sdc: open failed: No medium found
PV VG Fmt Attr PSize PFree
/dev/sda1 vm-data1 lvm2 a-- 16.78t 9.83t
/dev/sdb3 pve lvm2 a-- <3.64t <16.38g
 
Watching this as I'm very interested in the answer. I'm thinking of deploying a system that will have 3 SATA slots plus an internal NVMe in a PCIe adaptor card. The plan is that SATA 1 will be an SSD as the Proxmox / boot disk, SATA 2 and 3 will be large HDDs as a ZFS mirror, plus the NVMe (partitioned into two parts) as the read and write cache. So the ZFS pool will be the VM storage, and as mirrors are now recommended over raid5/raidz1, we have redundancy in the VM storage. But what if the Proxmox SSD itself fails - how would we go about restoring the VMs off the ZFS, or would it have to be a case of flatten it and restore from the backups?

In this particular case the hardware only supports 3 SATA and 1 PCIe slot, and there are no power supply headers to allow internal SSDs, hence why I'm planning this config. The BIOS also doesn't recognise the NVMe as a potential boot disk, so I'm not sure the system would remain bootable if I were to set up RaidZ1 on three identical HDDs at Proxmox install, then from the CLI add the NVMe partitions as log/cache for it.

I'm guessing the issue is the VM config files that live under /etc/pve/qemu-server/ - if those are gone can the VMs be reloaded from the storage at all?
 
The VMs' virtual HDDs (vm-101-disk-0 and so on) are stored on LVM. First you need to add that existing LVM volume group as a "storage" in Proxmox with "Disk image" as content type. After that, your Proxmox knows where to look for disk images and you can create new VMs (the VM configs were stored on the failed boot drive) and select the existing disk images as their virtual HDDs.

As Pyromancer said, the configs were stored under "/etc/pve/qemu-server/". So if your second working node has copies of the same VMs, you might look there.
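
If you prefer the CLI, something like this should add the existing volume group as a storage (just a sketch; I'm assuming the VG really is called vm-data1, as your vgs output suggests, and that it is plain LVM rather than LVM-thin):
Code:
# confirm the volume group name
vgs
# add the existing VG as an LVM storage with content type "Disk image"
pvesm add lvm vm-data1 --vgname vm-data1 --content images
# the old disk images should then show up here
pvesm list vm-data1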
But what if the Proxmox SSD itself fails - how would we go about restoring the VMs off the ZFS, or would it have to be a case of flatten it and restore from the backups?
Read a little bit more about L2ARC, SLOG, ZIL and "special device". A read and write cache on an SSD is in most cases not useful, and you should get a really decent SSD and not consumer stuff, because it won't last long. Also keep in mind that everything on the HDDs might get lost if that single cache SSD dies while the power is failing. It would be better to run the VMs on the NVMe SSD instead if you need the speed, and back them up to the HDDs regularly if you can't add another NVMe SSD to mirror it.
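
For the regular backups a scheduled vzdump job would do; a minimal sketch, assuming a VM with ID 100 and an HDD-backed storage that I'll call "hdd-backup" here (both hypothetical):
Code:
# one-off backup of VM 100 to the HDD-backed storage
vzdump 100 --storage hdd-backup --mode snapshot --compress zstd
Recurring jobs can be set up under Datacenter -> Backup in the GUI.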

And it's no problem to install Proxmox on a USB drive, but you should use durable enterprise sticks or at least SSDs with a USB to SATA converter. If you've got enough internal drive slots and a free USB3 header on the mainboard, you can even use them internally.
 

Thanks for the response. I don't believe replication / backup was enabled to the other node so I've got nothing but the RAID-5 volume.

So I'm in the Proxmox web interface; how do I add an LVM storage exactly? I went to the 'Storage View' and clicked on 'LVM' under Disks on the tree to the right, then tried clicking on 'Create: Volume Group', but all it says is 'No Disks Unused'. I assume I'm not in the right place, but I really can't figure out this software...

1613598787748.png
 
OK, I figured out how to add the LVM storage and can now see vm-101-disk-0, etc.

1613599302612.png

I can see the volume as an option when I go to Create a VM, but I can't see what to select to use an existing VM disk?

1613599407481.png
 
You don't want to create a new LVM, you just want to add a new "storage" based on that existing LVM.
Look at: Datacenter -> Storage -> Add

Not sure how to find out if the old LVM was "LVM" or "LVM thin". You might want to find that out first.
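
One hint: in the lvs output above, the "data" LV in the pve VG shows twi-a-tz-- (a thin pool), while the vm-data1 LVs show -wi-a-----, which looks like plain LVM. A quick way to double-check (just a sketch):
Code:
# show the attributes and pool membership of the LVs in vm-data1
lvs -o vg_name,lv_name,lv_attr,pool_lv vm-data1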
 
OK I think I solved my own problem ;)

I added a new VM with the ID 201 on the vm-data1 volume, and once it was added I logged into the console and edited 201.conf, changing the ID/size of the IDE0 device to match the old vm-101-disk-0.

VM is now booting.
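
For anyone hitting the same situation: the edited line in /etc/pve/qemu-server/201.conf ends up looking roughly like this (assuming the LVM storage was added under the name vm-data1; the exact options depend on how the VM was created):
Code:
ide0: vm-data1:vm-101-disk-0,size=200G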
 
I can see the volume as an option when I go to Create a VM, but I can't see what to select to use an existing VM disk?
Not sure if that is possible using the GUI. But you can create a VM without starting it and then manually edit the config file so it points to the right virtual disk: nano /etc/pve/qemu-server/[VMID].conf
VMID is the number you select as ID while creating the VM. Make sure this ID matches the ID of the old disk images (vm-105-disk-0 is the first disk of the VM with ID 105, vm-105-disk-1 is the second disk, and so on). You also need to test what type of disk controller was used. If, for example, Windows was installed with "IDE" as the controller, it won't boot if you choose "VirtIO SCSI".
In that conf file you need to look for lines like:
Code:
scsi0: VMpool7_VM:vm-107-disk-2,cache=none,discard=on,iothread=1,size=16G,ssd=1
scsi1: VMpool8_VM:vm-107-disk-0,cache=none,size=2G,ssd=1
"VMpool7_VM" is in my case the name of the storage (should be "vm-data1" in your case) and "vm-107-disk-2" is the name of the disk image you want to use.

By the way, IDE is not something I would choose because it is very slow. You might want to use VirtIO SCSI if you create new VMs and don't need to import old ones.
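
As an alternative to editing the file by hand, the qm CLI can attach an existing volume too; a sketch, assuming VM ID 101 and a storage named vm-data1:
Code:
# attach the existing disk image as the first IDE disk of VM 101
qm set 101 --ide0 vm-data1:vm-101-disk-0
# or let Proxmox scan the storages and add unreferenced images to the config as "unused" disks
qm rescan --vmid 101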
 
By the way, IDE is not something I would choose because it is very slow. You might want to use VirtIO SCSI if you create new VMs and don't need to import old ones.

Thanks, I tried recreating the VM using VirtIO and swapping in the disk image, but it won't boot. Would it depend on how the VM was originally set up?

Strange, because all the VMs on the other node are set up using VirtIO. Tried SCSI as well, same issue.
 
Would it depend on how the VM was originally set up?

Strange, because all the VMs on the other node are set up using VirtIO. Tried SCSI as well, same issue.
Yes, Windows doesn't like switching that later. It needs to be set up that way while installing, and you need to install the VirtIO drivers during the installation or Windows won't be able to find the drives.
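
One common way is to attach the virtio-win driver ISO as an additional CD-ROM drive during the Windows installation; in the VM config that would look roughly like this (assuming the ISO was uploaded to a storage called "local" under that file name):
Code:
ide2: local:iso/virtio-win.iso,media=cdrom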
 
Makes sense. I figured out why the last VM wasn't booting: my predecessor had created it using OVMF (UEFI) instead of SeaBIOS. God knows why...
 
Makes sense. I figured out why the last VM wasn't booting: my predecessor had created it using OVMF (UEFI) instead of SeaBIOS. God knows why...
Maybe he passed through some PCIe devices? For that OVMF is recommended.
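
You can see which firmware a VM uses in its config file; if there is no "bios" line, SeaBIOS is the default. An OVMF VM typically looks something like this (the efidisk volume name and size here are just examples):
Code:
bios: ovmf
efidisk0: vm-data1:vm-201-disk-1,size=4M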
 
