Help needed to recover VM def after dead HD on proxmox V8

sbs

Member
Mar 30, 2021
Hi,

One of the drives has died. SMART reports a failed state, and any disk operation fills the logs with errors and slows everything down.
This is not a system disk. The VM using this disk is stopped pending restore.

Disk has just 1 LVM 8T partition, and will be replaced with a bigger disk.

Is there a restart/reinstall procedure so that the replacement disk can be configured like the previous drive? Or should I just remove the old drive, install the new one, and add a new LVM in Proxmox with the same size as the original one?
 
Disk has just 1 LVM 8T partition
How was this disk set up in PVE? What are its storage.cfg details? Was anything other than the above-mentioned VM using it?

But in principle, if the old disk is dead, just remove it and put the new one in, set up the desired storage configuration for that disk in PVE, and restore the VM (from the restorable backup that I assume you have) to that newly created storage.
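
For the LVM case, that sequence could look roughly like the sketch below from the PVE shell. All names here are placeholders and not from this thread: /dev/sdX for the new disk, vmdata for the volume group and storage ID, and the backup path. It only prints the commands unless DRY_RUN=0 is set, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only. Every value below is a placeholder, not taken from this thread.
# With DRY_RUN=1 (the default) the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_and_restore() {
    disk="$1"; vg="$2"; storage="$3"; backup="$4"; vmid="$5"
    run pvcreate "$disk"                                          # make the new disk an LVM physical volume
    run vgcreate "$vg" "$disk"                                    # create a volume group on it
    run pvesm add lvm "$storage" --vgname "$vg" --content images  # register the VG as a PVE storage
    # Note: --storage places every disk of the restored VM on that storage,
    # and qmrestore refuses to overwrite an existing VMID unless forced.
    run qmrestore "$backup" "$vmid" --storage "$storage"
}

replace_and_restore /dev/sdX vmdata vmdata /path/to/backup.vma.zst 139
```

The GUI steps (Disks, then LVM, then restore from the backup storage) do the same job; the sketch is only meant to show the order of operations.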

I assume from your post the disk is not part of a raid array/ZFS etc. That would be a different story.
 
How was this disk set up in PVE? What are its storage.cfg details? Was anything other than the above-mentioned VM using it?
Not sure. I used "add lvm" in the Proxmox GUI to create an LVM on the disk.
Nothing else is on this partition except for that one VM.
But in principle, if the old disk is dead, just remove it and put the new one in, set up the desired storage configuration for that disk in PVE, and restore the VM (from the restorable backup that I assume you have) to that newly created storage.
Ok I will try that. Thank you.
I assume from your post the disk is not part of a raid array/ZFS etc. That would be a different story.
Yes, it is a simple setup.
 
If you don't need to partition or resize it, LVM is overkill. Just make one partition on the replacement disk and format it with XFS to keep it simple.
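
As a rough illustration of that simpler layout, the sketch below prints a possible command plan. It does not touch any disk. The device /dev/sdX, the mount point /mnt/vmdata and the storage ID vmdata are made-up placeholders, and the partition name assumes sdX-style naming.

```shell
#!/bin/sh
# Sketch only: prints a command plan, it does not run anything.
# The device, mount point and storage ID are placeholders.
xfs_plan() {
    disk="$1"; mnt="$2"; storage="$3"
    part="${disk}1"    # assumes sdX-style naming; NVMe devices use a "p1" suffix instead
    cat <<EOF
parted -s $disk mklabel gpt
parted -s $disk mkpart primary xfs 0% 100%
mkfs.xfs $part
mkdir -p $mnt
mount $part $mnt
pvesm add dir $storage --path $mnt --content images
EOF
}

xfs_plan /dev/sdX /mnt/vmdata vmdata
```

A matching /etc/fstab entry would also be needed so the mount survives a reboot.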
 
Hi,

The HDD was replaced, but the VM definition seems to have been wiped in the operation. The VM shows only the VM number and almost no hardware.
1719908399962.png


The VM was running despite the dead HDD, since it was using other VM disks where the OS was stored. I can still see my VM storage disks in the console:
1719908518087.png
Is there a file somewhere I could use to restore the VM hardware definition as it was before stopping the system? Then I would just need to find the command to reattach the storage.
 
Well, I managed to mess up my backup and let backup rotation delete the backup file.
I guess my current option, short of creating a new VM and reinstalling everything, is to reconfigure the hardware and reattach the existing drives?
 
Is there a file somewhere I could use to restore the VM hardware definition
By that I assume you mean the VM configuration file. Yes, such a file exists on your PVE host and should be in the following location:
Code:
/etc/pve/qemu-server/139.conf  #Assuming you are referring to VMID: 139
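
A quick way to see whether that file still holds any disk definitions is sketched below. The function names are just for illustration, and the key list (ide, sata, scsi, virtio, efidisk, tpmstate, unused) is what a qemu-server config normally uses for volumes.

```shell
#!/bin/sh
# Sketch: report whether a qemu-server config file still contains volume entries.
# CONF defaults to the path given above; point it at another file to try it elsewhere.
CONF="${CONF:-/etc/pve/qemu-server/139.conf}"

list_volume_entries() {
    # Attached disks (ide0, sata1, scsi2, virtio0, efidisk0, tpmstate0) and detached ones (unused0).
    grep -E '^(ide|sata|scsi|virtio|efidisk|tpmstate|unused)[0-9]+:' "$1"
}

check_conf() {
    conf="$1"
    if [ ! -s "$conf" ]; then
        echo "missing or empty: $conf"
        return 1
    fi
    if list_volume_entries "$conf" >/dev/null; then
        echo "volume entries found in $conf:"
        list_volume_entries "$conf"
    else
        echo "no volume entries in $conf"
        return 2
    fi
}

check_conf "$CONF" || true
```

If it reports no volume entries, the disks may still exist on the storage; they are just no longer referenced by the config.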

I must be honest though, I'm actually not quite sure what you are attempting to do.
 
OK, I am trying to save myself a reinstall of the VM OS and software. I have 3 disks still available, which contain the UEFI partition, OS partition and software. The lost partition on the lost disk was just hosting a development database (and this data can be retrieved from the live DB if needed).

So if I can reattach them to a VM definition with the correct hardware, it should boot back into the VM OS.

The 139.conf file exists, but its content only lists:
memory: 128

I guess I have to create a new VM and add the existing disks to it via the GUI.
 
If VM 140 was similar to VM 139, you can copy the 140.conf file, replacing 140 with 139 inside. Also remember to change the MAC address and set consistent disk sizes.
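
A minimal sketch of that copy, assuming the disks follow the usual vm-<id>-disk-N naming. The helper names are made up for this example. It writes to stdout, so nothing under /etc/pve changes until you redirect the result yourself, and it only rewrites volume names and the MAC on netN lines rather than every "140" in the file.

```shell
#!/bin/sh
# Sketch: derive a config for one VMID from a similar VM's config.

random_mac() {
    # Locally administered unicast MAC (first octet 02), remaining octets random.
    printf '02:%s\n' "$(od -An -N5 -tx1 /dev/urandom | tr -s ' ' ':' | sed 's/^://; s/:$//' | tr 'a-f' 'A-F')"
}

clone_conf() {
    src="$1"; old_id="$2"; new_id="$3"; mac="$4"
    # 1) Rename volume references vm-<old>-disk-N to vm-<new>-disk-N.
    # 2) Replace the MAC address, but only on netN lines.
    sed -E \
        -e "s/vm-${old_id}-disk-/vm-${new_id}-disk-/g" \
        -e "/^net[0-9]+:/ s/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/${mac}/" \
        "$src"
}

# Example (not run here):
#   clone_conf /etc/pve/qemu-server/140.conf 140 139 "$(random_mac)" > /tmp/139.conf
```

After reviewing and installing the result, qm rescan --vmid 139 should bring the size= values back in line with the real volumes, since rescanning also updates disk sizes.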
 
The 139.conf file exists, but its content only lists:
memory: 128
Why is this? In the GUI image of the VM hardware you posted above, I noticed the same thing: the drives are missing. This config file, which is located on the PVE host and not on the failing HDD, also seems to have been corrupted/misconfigured/mishandled.

So far what we know:

1. 1 HDD has been corrupted.
2. Your backups are not in order - due to an incorrect retention policy.
3. You have a corrupted/mismanaged PVE system.

Please understand - I'm not trying to be hard on you - but I'm trying to clear things up so that they don't happen again in future.


What I would do in your case (in principle) is create a new VM with similar config to the old VM 139 without adding any disks & then import/move/assign the old virtual disks: vm-139-disk-0, vm-139-disk-1 & vm-139-disk-2 to that new VM.

The first step is to discover on which PVE storage these disks are located. You do not provide these details (although from your image I would bet that it is local-lvm; if this is indeed the case, the procedure will be trickier).

You can follow this official guide to help you along.

One other thing: I would suggest you do a block-level backup of your PVE host before you continue. This is not the easiest of procedures. I do this by following this procedure on a SHUTDOWN PVE system. Don't try it if you don't know what you are doing!

WARNING: I take no responsibility for any data loss/corruption on any procedure you try.
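
To make the disk steps a bit more concrete, here is a sketch that prints the commands for review rather than running them. It assumes the VM keeps ID 139, that the volumes are on local-lvm, and that they go on the SCSI bus in disk-number order. None of that is confirmed in this thread, so adjust before using any of it.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of executing them.
# Assumptions: VMID, storage name and bus are guesses; slot N gets the Nth volume listed.
attach_plan() {
    vmid="$1"; storage="$2"; bus="$3"; shift 3
    echo "pvesm list $storage --vmid $vmid"     # confirm the volumes exist on this storage
    echo "qm rescan --vmid $vmid"               # re-adds orphaned volumes as unusedN entries
    slot=0
    for vol in "$@"; do
        echo "qm set $vmid --${bus}${slot} ${storage}:${vol}"
        slot=$((slot + 1))
    done
}

attach_plan 139 local-lvm scsi vm-139-disk-0 vm-139-disk-1 vm-139-disk-2
```

If one of the volumes turns out to be a small EFI vars disk, it would belong on efidisk0 instead of a data slot. The bus also matters for the guest: it should match what the VM used before.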
 
Why is this? In the GUI image of the VM hardware you posted above, I noticed the same thing: the drives are missing. This config file, which is located on the PVE host and not on the failing HDD, also seems to have been corrupted/misconfigured/mishandled.
Yes, something happened to this file between the last attempt to restore onto the faulty HDD and the restart after changing the HDD. Can't tell what or why.
So far what we know:

1. 1 HDD has been corrupted.
2. Your backups are not in order - due to an incorrect retention policy.
3. You have a corrupted/mismanaged PVE system.
Yes, you are right.
Please understand - I'm not trying to be hard on you - but I'm trying to clear things up so that they don't happen again in future.
No problem, I appreciate your efforts to understand my setup and suggest solutions.
I did the following:
- Created a new VM (got ID 149)
- Copied its config over VM 139's
- Deleted the new config (VM 149)
- Ran qm rescan to rediscover all unattached disks (it found them all)
- Added the disks back to VM 139
- When booting, Windows does not see the OS, and repair does not see the disks, even after adding the VirtIO drivers.

I will reinstall the OS, and restart from a clean situation.
There is no point spending more of your time on this, since a clean reinstall will take a few hours with a guaranteed result.

Thank you again for everything.
 
When booting
Did you check the boot order and selection in the GUI (VMID > Options > Boot Order > Edit)?
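
The same setting can be changed from the shell. The sketch below only builds and prints the qm command. The helper name is made up, and scsi0 as the boot disk is an assumption, since the thread does not say which slot holds the OS.

```shell
#!/bin/sh
# Sketch: build the qm command that sets the boot order; it is printed, not executed.
# The current value can be checked with: qm config <vmid> | grep '^boot'
boot_order_cmd() {
    vmid="$1"; shift
    order=""
    for dev in "$@"; do
        order="${order:+$order;}$dev"    # join the devices with ';'
    done
    printf "qm set %s --boot 'order=%s'\n" "$vmid" "$order"
}

boot_order_cmd 139 scsi0 ide2
# prints: qm set 139 --boot 'order=scsi0;ide2'
```

The quotes around the order value matter, because an unquoted ; would end the command in the shell.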

Also, Windows can be very finicky about having identical hardware, so maybe you configured something slightly differently.
That's what backups are for (sorry, there I go again!).

Anyway, good luck & happy Proxmoxing, whatever you do.
 