[SOLVED] VM Drive Recovery

MrPilote
Jan 21, 2023
So I have a PVE 6.4.1 installation:
- 1x root drive
- 2x drives configured into a ZFS pool

My root drive died and is not recoverable.
I did not have the VM configurations backed up.

I got a new root drive and reinstalled PVE 6.4.1.
I read the forums and managed to get the ZFS pool recovered (lots of good stuff out there).

So now I have a number of ZFS drives and I am going to try to recreate the virtual machines.
My thought was to build a new Linux VM and mount some of the drives as READ-ONLY so I could determine which machine each one belonged to. Most are Linux, one is Windows, and I know which drives belong to it.

My question is: can I just build a new machine and then swap out its drive for the original one from the pool?
Is that the best way to get this operational again? I'm looking for a good way to recover here.


Thanks again for the help. I'm learning more each day.
 
I got a new root drive and reinstalled PVE 6.4.1.
PVE 6 has been end of life since summer and isn't receiving any security patches. I would start with a fresh PVE 7.3 install.

My thought was to build a new Linux VM and mount some of the drives as READ-ONLY so I could determine which machine each one belonged to. Most are Linux, one is Windows, and I know which drives belong to it.
You can also mount those zvols directly on the PVE host, like you would mount a physical disk, to have a look at what's on them. Have a look at "/dev/zvol/YourPoolName/NameOfYourZvol". Just make sure to unmount them before starting a VM.
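For example, a read-only mount could look roughly like this (pool and zvol names are just placeholders, and the partition suffix assumes the default udev rules that expose zvol partitions as "-partN"):
Code:
# list the zvols and their partitions
ls /dev/zvol/YourPoolName/
# mount one partition read-only to inspect it
mkdir -p /mnt/inspect
mount -o ro /dev/zvol/YourPoolName/NameOfYourZvol-part1 /mnt/inspect
# unmount again before starting any VM that uses this disk
umount /mnt/inspect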

My question is: can I just build a new machine and then swap out its drive for the original one from the pool?
Is that the best way to get this operational again? I'm looking for a good way to recover here.
Yes, but if you use the wrong settings the VM won't be able to boot. The storage protocol, disk controller, BIOS/UEFI and so on need to match the old VM.
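For reference, these are the kinds of lines in the VM config that have to line up; this is just a hypothetical fragment with placeholder values, not something to copy as-is:
Code:
# hypothetical fragment of /etc/pve/qemu-server/<vmid>.conf
# BIOS type has to match: "seabios" (legacy BIOS) or "ovmf" (UEFI)
bios: seabios
# disk controller the guest was installed with
scsihw: virtio-scsi-pci
# storage ID, volume name and bus type (scsi/sata/ide/virtio) of the attached disk
scsi0: YourStorageName:vm-100-disk-0,size=32G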
 
PVE 6 has been end of life since summer and isn't receiving any security patches. I would start with a fresh PVE 7.3 install.
I was concerned about whether it would be compatible with my existing setup. If it's not a problem then I can move to the latest; I was planning to do that after this recovery anyway, but moving now would be easier.
You can also mount those zvols directly on the PVE host, like you would mount a physical disk, to have a look at what's on them. Have a look at "/dev/zvol/YourPoolName/NameOfYourZvol". Just make sure to unmount them before starting a VM.
I thought about that but wasn't really sure it was completely safe. I know that I have lost all my virtual machine configurations, so going that route would make it easier to look over the "drive" and determine which virtual machine it was. I don't think any of the VM information is stored there, is that true?
Yes, but if you use the wrong settings the VM won't be able to boot. The storage protocol, disk controller, BIOS/UEFI and so on need to match the old VM.
I knew that would probably be a problem, but I hope it won't do any damage to the disks. Would this be the easiest way to recover the machines or is there a better way?

I moved to Proxmox from an ESXi environment and I've been slowly learning it. I really appreciate your help. I noticed that you have many answers up here and it's been good information; thanks for that.
 
I don't think any of the VM information is stored there, is that true?
No, VM configs are only stored in "/etc/pve/" on the system disk. If you have guest backups, those also contain the VM's config file.
Would this be the easiest way to recover the machines or is there a better way?
There is no other way if you don't have any backups.
 
I thought that I would try to mount them directly on the PVE host but I wasn't able to get that to work.
I wasn't sure what to set the filesystem type to in the mount command. I know it is a Linux machine and probably a Debian-based one.

Any help here?

Code:
fdisk -l vm-550-disk-0
Disk vm-550-disk-0: 200 GiB, 214748364800 bytes, 419430400 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x2300c78f

Device          Boot   Start       End   Sectors   Size Id Type
vm-550-disk-0p1 *       2048   1050623   1048576   512M  b W95 FAT32
vm-550-disk-0p2      1052670 419428351 418375682 199.5G  5 Extended
vm-550-disk-0p5      1052672 419428351 418375680 199.5G 83 Linux
 
Yes, that's what I tried without success.
I have taken your suggestion and I am doing a fresh install with PVE 7.3.1. I will then go back to this approach. Are you aware of any issues with ZFS between versions?
Do you have a good resource for how to swap out the virtual drives on a VM if I go down that road?

Again, thanks for the help. I have gone back to some videos to learn more about Proxmox.
 
Are you aware of any issues with ZFS between versions?
ZFS is backwards compatible, so it shouldn't be a problem to use that pool with PVE 7.3. The most common problems when switching from PVE 6 to 7 are differences with PCI passthrough and the switch to cgroup v2 (so very old LXCs might not work anymore without extra steps).

Do you have a good resource for how to swap out the virtual drives on a VM if I go down that road?
You rename the zvol to match the naming scheme of the new VM ("zfs rename" command). Then you can use the "qm rescan" command to tell PVE to look for new disks. The renamed disks should then show up in the VM as unused disks and you can attach them using the webUI.
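A rough sketch with placeholder names (assuming the new VM got ID 322 and your pool is called YourPool):
Code:
# rename the old zvol so it follows the new VM's naming scheme
zfs rename YourPool/vm-550-disk-0 YourPool/vm-322-disk-1
# tell PVE to scan the storages for volumes it doesn't know about yet
qm rescan
# the disk should now show up as an unused disk of VM 322 in the webUI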
 
You rename the zvol to match the naming scheme of the new VM ("zfs rename" command). Then you can use the "qm rescan" command to tell PVE to look for new disks. The renamed disks should then show up in the VM as unused disks and you can attach them using the webUI.
So I've been configuring v7.3 and I'm a little nervous about connecting these drives from my ZFS pool. Is there a way to copy or back up these "drives" in case things go wrong? I've been reading about zfs send, but I am not sure where to write that output. I have a NAS attached now so I have space. Can I just make a copy in the same location?

For instance, instead of renaming the old drive could I zfs send|zfs recv to the same location and use a new name on the recv side?

Again, thanks for any help here.
 
For instance, instead of renaming the old drive could I zfs send|zfs recv to the same location and use a new name on the recv side?
Yep.
And if you are running out of space you could also pipe the output of zfs send into a file on an SMB/NFS share on your NAS.
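A rough sketch of that (dataset and snapshot names are placeholders, and the NAS share is assumed to be mounted at /mnt/nas):
Code:
# stream a snapshot compressed into a file on the NAS share
zfs send YourPool/YourDataset@YourSnapshot | gzip > /mnt/nas/YourDataset.zfs.gz
# restore it later into a new dataset:
# gunzip -c /mnt/nas/YourDataset.zfs.gz | zfs recv YourPool/RestoredDataset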
 
Yep.
And if you are running out of space you could also pipe the output of zfs send into a file on an SMB/NFS share on your NAS.

So I have read through all the different man pages for zfs send and zfs recv and came up with the following command:
Code:
zfs send VM4tbR1/vm-300-disk-2 | zfs recv -v VM4tbR1/vm-322-disk-1
It did not appear that I needed any of the flags or options.

I have created a new virtual machine with the ID 322 and it has one disk. I have not started the machine yet, as my plan is to copy the old original drive over to a new drive with the new name. Then I am going to attach the disk in the WebUI. If all goes according to plan, I should have my old VM back with a new ID. This is a Windows Server machine, so I am trying to be careful.

Here is my zfs list output:
Code:
# zfs list
NAME                    USED  AVAIL     REFER  MOUNTPOINT
VM4tbR1                1.57T  1.94T       96K  /VM4tbR1
VM4tbR1/vm-100-disk-0  33.0G  1.97T     6.92G  -
VM4tbR1/vm-200-disk-1  1.03T  2.07T      932G  -
VM4tbR1/vm-300-disk-0   103G  2.02T     21.6G  -
VM4tbR1/vm-300-disk-2   103G  2.02T     22.2G  -
VM4tbR1/vm-322-disk-0    56K  1.94T       56K  -
VM4tbR1/vm-500-disk-0   103G  2.03T     20.1G  -
VM4tbR1/vm-550-disk-0   206G  2.13T     17.6G  -


Have I missed anything? This does not appear to be working.
Do I need to include the path?
 
Have I missed anything? This does not appear to be working.
Do I need to include the path?
You can't send datasets directly; you need to send snapshots. So you will have to create a snapshot of that dataset first. It should be something like: "zfs send pool/dataset@snapshot"
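With the dataset names from your listing, that would look roughly like this ("@copy" is just an example snapshot name):
Code:
# create a snapshot of the source zvol, then send that snapshot into a new dataset
zfs snapshot VM4tbR1/vm-300-disk-2@copy
zfs send VM4tbR1/vm-300-disk-2@copy | zfs recv -v VM4tbR1/vm-322-disk-1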
 
You can't send datasets directly; you need to send snapshots. So you will have to create a snapshot of that dataset first. It should be something like: "zfs send pool/dataset@snapshot"

Thanks for that confirmation. I started reading up on the ZFS stuff and things were starting to look that way.
I found a great resource for ZFS HERE.

It looks like I might be able to do this with a clone too.

Again, thanks for all your help and guidance. When I get this working I intend to post a final summary here with what I did and how it worked.
 
Yes, clones should work too. But then you can't destroy the old dataset later on, once you've decided that everything works fine.
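For reference, a clone also starts from a snapshot and stays dependent on that origin snapshot, which is why the old dataset can't simply be destroyed afterwards. Roughly (names are just examples):
Code:
# create a snapshot and a writable clone of it
zfs snapshot VM4tbR1/vm-300-disk-2@base
zfs clone VM4tbR1/vm-300-disk-2@base VM4tbR1/vm-322-disk-1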
 
Thanks for your help.
I was able to get my Windoze Server back up and running. I still need to test it out but now I have a clean copy of the drive just in case something goes bad. I will write up a whole piece for this and put it in this thread.

The next thing to do is get a couple of my Linux machines recovered. I plan to pretty much follow the same process.

How can I make a backup of just the VM configuration for now?
 
VM configs are stored in "/etc/pve/qemu-server". It's best to set up a daily or weekly backup task for the whole "/etc" folder (especially "/etc/pve") in case your system disk ever fails again. You could, for example, add a cronjob that uses tar and gzip to write an archive to a NAS, or use the proxmox-backup-client to store a backup snapshot of the "/etc" folder on a PBS server.
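A minimal sketch of such a cronjob (the NAS mountpoint "/mnt/nas-backup" is just a placeholder path):
Code:
# /etc/cron.d/backup-pve-config -- weekly archive of /etc to a NAS mount
# the % in the date format has to be escaped inside a crontab entry
0 3 * * 0 root tar -czf /mnt/nas-backup/pve-etc-$(date +\%F).tar.gz /etc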
 
VM configs are stored in "/etc/pve/qemu-server". It's best to set up a daily or weekly backup task for the whole "/etc" folder (especially "/etc/pve") in case your system disk ever fails again. You could, for example, add a cronjob that uses tar and gzip to write an archive to a NAS, or use the proxmox-backup-client to store a backup snapshot of the "/etc" folder on a PBS server.
Right, that makes sense. So there is no place in the WebUI to do this type of backup for just the configuration or other files?
 
So here is a summary of my situation and this thread.

I had a PVE 6.4.1 installation as follows:
- 1x root drive
- 2x drives configured into a ZFS pool

My root drive failed and I did not have any backups.

Here is the process that I went through to recover the virtual machine drives that were still out on the ZFS pool.
Do not blindly follow this process. I spent hours reading documentation, mostly ZFS stuff that is linked in this thread.
When I had questions or wanted confirmation, I posted here and @Dunuin was kind enough to help.

I got a new root drive and installed the latest version of Proxmox Virtual Environment (PVE), v7.3.4.
I went to the Shell and started checking.

Code:
# lsblk
NAME               MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                  8:0    1 232.9G  0 disk
├─sda1               8:1    1  1007K  0 part
├─sda2               8:2    1   512M  0 part
└─sda3               8:3    1 232.4G  0 part
  ├─pve-swap       253:0    0     8G  0 lvm  [SWAP]
  ├─pve-root       253:1    0    58G  0 lvm  /
  ├─pve-data_tmeta 253:2    0   1.5G  0 lvm 
  │ └─pve-data     253:4    0 147.4G  0 lvm 
  └─pve-data_tdata 253:3    0 147.4G  0 lvm 
    └─pve-data     253:4    0 147.4G  0 lvm 
sdb                  8:16   1   3.7T  0 disk
├─sdb1               8:17   1   3.7T  0 part
└─sdb9               8:25   1     8M  0 part
sdc                  8:32   1   3.7T  0 disk
├─sdc1               8:33   1   3.7T  0 part
└─sdc9               8:41   1     8M  0 part
sr0                 11:0    1  1024M  0 rom

So my ZFS drives were still there.
From there I imported the ZFS pool, which was named VM4tbR1.

Code:
# zpool import -f VM4tbR1
# zfs list
NAME                    USED  AVAIL     REFER  MOUNTPOINT
VM4tbR1                1.57T  1.94T       96K  /VM4tbR1
VM4tbR1/vm-100-disk-0  33.0G  1.97T     6.92G  -
VM4tbR1/vm-200-disk-1  1.03T  2.07T      932G  -
VM4tbR1/vm-300-disk-0   103G  2.02T     21.6G  -
VM4tbR1/vm-300-disk-2   103G  2.02T     22.2G  -
VM4tbR1/vm-500-disk-0   103G  2.03T     20.1G  -
VM4tbR1/vm-550-disk-0   206G  2.13T     17.6G  -

Then you can run a zpool status and it should show the state of the pool.
Mine was clean and didn't need a scrub.

I had 6 VMs but I could only remember about half of them.
I dug through my DNS and found what names I could. Only one of them was a Windows machine, and it was my big concern: if the boot went wrong there, it could make the disk unusable. So my plan was to copy that drive from the ZFS pool to a new name within the same pool. Then I could build a new VM with the closest configuration I could remember.

I built a new VM with a different ID and the closest settings I could remember.

I then used zfs send and zfs recv to copy the drive. zfs send needs a snapshot to work, so I took a snapshot and duplicated it.
Code:
// We need to create a snapshot first
# zfs snapshot VM4tbR1/vm-300-disk-2@last

# zfs list -t snapshot
NAME                         USED  AVAIL     REFER  MOUNTPOINT
VM4tbR1/vm-300-disk-2@last     0B      -     22.2G  -

//Duplicate the drive into the same directory
# zfs send VM4tbR1/vm-300-disk-2@last | zfs recv -v VM4tbR1/vm-322-disk-1
receiving full stream of VM4tbR1/vm-300-disk-2@last into VM4tbR1/vm-322-disk-1@last
received 30.9G stream in 1086 seconds (29.2M/sec)

# zfs list
NAME                    USED  AVAIL     REFER  MOUNTPOINT
VM4tbR1                1.61T  1.90T       96K  /VM4tbR1
VM4tbR1/vm-100-disk-0  33.0G  1.93T     6.92G  -
VM4tbR1/vm-200-disk-1  1.03T  2.02T      932G  -
VM4tbR1/vm-300-disk-0   103G  1.98T     21.6G  -
VM4tbR1/vm-300-disk-2   125G  2.00T     22.2G  -
VM4tbR1/vm-322-disk-0    56K  1.90T       56K  -
VM4tbR1/vm-322-disk-1  22.2G  1.90T     22.2G  -
VM4tbR1/vm-500-disk-0   103G  1.98T     20.1G  -
VM4tbR1/vm-550-disk-0   206G  2.09T     17.6G  -
# zfs list -t snapshot
NAME                         USED  AVAIL     REFER  MOUNTPOINT
VM4tbR1/vm-300-disk-2@last     0B      -     22.2G  -
VM4tbR1/vm-322-disk-1@last     0B      -     22.2G  -

// Rename the original disk that was created for the new VM to something else
// This is not really necessary. You can just detach the drive from the WebUI and then delete it.
# zfs rename VM4tbR1/vm-322-disk-0 VM4tbR1/vm-322-disk-0old

// Rename the newly copied old disk to the new VM disk name
// The snapshot will follow the rename so no need to worry about it.
# zfs rename VM4tbR1/vm-322-disk-1 VM4tbR1/vm-322-disk-0

// Rescan disks against the VM so they show up in the WebUI
# qm rescan

I went to the WebUI and detached the drive from the new machine that I had created. The new copied drive showed up after the qm rescan, so I attached it as the sole drive and booted. The machine booted, I made some changes and had to reboot, and now it's up and running.

I am now working on my Linux machines and I hope they will be much easier. I realize that I didn't need to duplicate the disks, but I was concerned about corrupting them.

BIG THANKS to @Dunuin. Hopefully this helps someone else.
 
