[SOLVED] Help requested with recovery after SSD failure

gadaph

New Member
Aug 23, 2025
Background: My father recently died suddenly back in the UK. He ran a home Dell server with an up-to-date Proxmox install containing several VMs, including TrueNAS, Plex, an Audiobookshelf server for my mother (who is legally blind), and a recipe server for my sister. Whilst I was in the UK for the funeral, a brownout destroyed the server's SSD. I am now back in Australia but have remote desktop and SSH access to my Dad's stuff. I am reasonably familiar with Linux, especially Debian and Ubuntu, and have a basic understanding of virtual machines, but have never used Proxmox. Dad was a retired IBM Systems Engineer and had a Backblaze account where I'm reasonably confident his backups will be stored. However, the location of the passwords etc. for this is currently unclear.

Actions taken:
1. Attempted SSD recovery using an Ubuntu Live CD. Multiple attempts at superblock repair failed. I assume the drive is beyond repair.
2. Removed the SSD from the server and replaced it with a new one. The old SSD is stored safely, with a plan to dd it to a fresh SSD from Linux to preserve as much of its data as possible.
3. Successfully installed Proxmox to the new SSD, and therefore have a fresh running Proxmox system.
4. lsblk shows me the following:

Screenshot 2025-08-26 193033.png

sdb is the new SSD
sdc appears to contain virtual machine discs?
My understanding of sda and sdd is that they contain his local backups.

My questions are:

1. Do I possess the necessary files/HDD data etc to perform a system recovery of his virtual machines or is this a lost cause?
2. If a system recovery is possible, what steps do I follow? I'm nervous about experimenting and making things worse rather than better.

I have not yet attempted to mount any of the non-SSD partitions in Proxmox. If I go to Datacenter > Storage and select Add > LVM, then Proxmox appears to "see" dell-c as a mountable partition, but sda1 and sdd1 do not appear.

My understanding is that this was a default Proxmox setup and didn't use any exotic filesystems or ZFS.

My plan B if recovery proves impossible is to perform a bit for bit copy of the failed SSD using dd and to attempt to recover the original disc.

I would REALLY appreciate any help on this so that I can get my mother and sister setup again and back on track.
Thank you in advance.
 
You appear to still have the virtual disks for VMs 100, 101, 102, 106 and 107 (assuming all of them had one disk, which is common) but the VM configurations are lost.
You need to add the existing Thin LVM storage on sdc to your new Proxmox. If you create new VMs with a configuration that matches the original closely enough (and scan for and add the existing virtual disks), then you can probably boot them and access the virtual disks.
Making a byte-for-byte backup of drive sdc is recommended to prevent data loss if anything goes wrong (your original SSD is probably beyond repair, especially if you already tried fixing the filesystem). I have good hopes that the still-existing virtual drives can be recovered, unless there has been silent data corruption because of the power outage.
There are lots of threads on this forum about people who lost their configurations and tried to recover with the still-existing virtual disks. I don't have time now to guide you step by step, but maybe someone else can (or maybe you can find similar threads here).
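
A minimal sketch of the byte-for-byte copy mentioned above, written for bash. It assumes the copy goes to an image file on a separate disk with enough free space; the function name `clone_disk` and both paths in the usage comment are placeholders, not something taken from this system.

```shell
# Sketch: copy a source disk (or image file) byte for byte to a destination.
# dd overwrites the destination without asking, so confirm both paths
# with lsblk before running anything.
clone_disk() {
    local src="$1" dst="$2"
    # bs=4M keeps large copies reasonably fast; conv=fsync flushes the data
    # to the destination before dd exits; status=progress shows a running total.
    dd if="$src" of="$dst" bs=4M conv=fsync status=progress
}

# Hypothetical usage (device and mount point are assumptions):
#   clone_disk /dev/sdc /mnt/external/sdc.img
#   cmp /dev/sdc /mnt/external/sdc.img   # optional check, reads both in full
```

Plain dd is only reasonable here because sdc itself still reads cleanly; for a drive with read errors a tool designed for failing media is the safer choice.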
 
Have you managed to add the existing LVM-Thin storage in the Proxmox web GUI > Datacenter > Storage?
Can you create a new VM without virtual disks with a fresh number, that can provide the basic VM configuration for the other VMs?
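
If the GUI route gives trouble, the same storage can also be added from the shell with `pvesm`. The following is a cautious bash sketch, not a tested recipe for this machine: the storage ID, volume group and thin pool names are all parameters, and the `run` wrapper only prints the command unless `DRY_RUN=0` is set. The wrapper and the function name `add_thin_storage` are illustrative.

```shell
# Sketch: register an existing LVM-thin pool as Proxmox storage.
# DRY_RUN=1 (the default) only prints the command for review.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

add_thin_storage() {
    local storage_id="$1" vg="$2" pool="$3"
    run pvesm add lvmthin "$storage_id" --vgname "$vg" --thinpool "$pool" --content images,rootdir
}

# Read-only commands to find the volume group and thin pool names first:
#   vgs
#   lvs -o lv_name,vg_name,pool_lv,lv_size
# Hypothetical call (all three names are placeholders):
#   add_thin_storage <storage-id> <volume-group> <thinpool-name>
```

Adding a storage entry only writes to the Proxmox storage configuration; it does not modify the volumes in the pool.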
 
Thanks for the reply.

I believe that I've added the existing storage and given it the label "old".

Screenshot 2025-08-27 172011.png

I've also created a new virtual machine, mounted Ubuntu Server LTS CD on it, and given it number 150 to avoid overlap with 100-107 as listed above.

Screenshot 2025-08-27 172103.png

So this represents a fresh virtual machine with no attached disk?
Is it now as simple as somehow mounting one of the vm-x-disk-0 volumes on the new VM?

Thanks for the guidance.
 
You could copy the file /etc/pve/qemu-server/150.conf to /etc/pve/qemu-server/107.conf and then run qm disk rescan --vmid 107. That will add the vm-107-disk-0 to VM 107 as unused. You can then use the Proxmox web GUI to connect the unused drive (as SCSI, SATA or IDE). Change the VM Options > Boot order accordingly and see if you can boot the VM. You'll have to figure out how to connect the drives and boot them by trial and error, I guess.
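
The copy-and-rescan step above can be sketched in bash as follows. The helper `copy_vm_config` is illustrative only; it refuses to overwrite an existing configuration, and its optional third argument (the config directory) exists mainly so it can be tried against a scratch directory first.

```shell
# Sketch: reuse the config of a freshly created VM as the starting point
# for a VM whose configuration was lost.
copy_vm_config() {
    local src_id="$1" dst_id="$2" conf_dir="${3:-/etc/pve/qemu-server}"
    local src="$conf_dir/$src_id.conf" dst="$conf_dir/$dst_id.conf"
    # Never clobber a configuration that already exists.
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    cp "$src" "$dst"
}

# On the Proxmox host, for the IDs used in this thread:
#   copy_vm_config 150 107
#   qm disk rescan --vmid 107
#   qm config 107    # the rescanned volume should show up as an unusedN entry
```

Note that the copy carries over everything from 150.conf, including the network MAC address and the attached ISO, so those are worth reviewing before both VMs run at the same time.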

I notice a base-105-disk-0 in your screenshot that I did not see in your previous post. I don't know where it suddenly came from or whether you created a template and maybe one or more of the VMs are based on the 105 template. That would complicate things.

Alternatively, you could try to boot a VM from a Linux Live CD (like GParted or Ubuntu installer live environment) with one of the existing virtual disks connected. Connect them either via the GUI as above or manually in the VM configuration file in /etc/pve/qemu-server/. Then you can probably access the virtual disk and see if you can read the files that are on there.

Since you are somewhat familiar with Debian-based Linux, I would suggest treating this as an "I have several disks and don't know what's on them" situation, as if you had a physical system (with a Debian or Ubuntu Live CD), and just trying to read and recover the data from the disks. The differences are that you now have one (or more) VM(s) instead of a physical system, and that you don't know how the drives were connected (probably SCSI, but could be SATA or IDE), so they probably won't boot straight away.
 
Thank you!

I think I may have accidentally created base-105-disk-0 whilst creating VM 150. I went past the create disk screen and then realised that a zero byte sized disk was not the same as no disk, so went backwards and did the VM creation wizard again without having fully completed it the first time.

I've ordered a 10TB external disk and plan to do a byte-for-byte dd copy of /dev/sdc before I continue.

Once this is done, I will try the above.

Thank you for your help!
 
base-105-disk-0 is a virtual disk that is created when a virtual machine is converted to a template: vm-105-disk-0 is renamed to base-105-disk-0.

Disks are named after the VMID they belong to, so this one would not have been created while creating VM 150.

The base disk is needed if other virtual machines were created from virtual machine template 105, which owns base-105-disk-0.
 
Hmmm, OK. Interesting.

I wonder why that has suddenly made an appearance.

Thank you.

I will investigate once I have a backup....
 
I think that if you can retrieve the config.db file described below from the old system, replacing that file on the new host should work.

https://pve.proxmox.com/wiki/Proxmox_Cluster_File_System_(pmxcfs)

If you have major problems with your Proxmox VE host, for example hardware issues, it could be helpful to copy the pmxcfs database file /var/lib/pve-cluster/config.db, and move it to a new Proxmox VE host. On the new host (with nothing running), you need to stop the pve-cluster service and replace the config.db file (required permissions 0600). Following this, adapt /etc/hostname and /etc/hosts according to the lost Proxmox VE host, then reboot and check (and don’t forget your VM/CT data).
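
The file replacement described in the wiki excerpt could look roughly like this in bash. This is a sketch under assumptions: the helper name `install_config_db` and the recovered file's path are made up for illustration, and keeping a `.bak` copy of the fresh database is an extra precaution that the wiki text does not mention.

```shell
# Sketch: put a recovered pmxcfs database in place with mode 0600.
install_config_db() {
    local src="$1" dest="${2:-/var/lib/pve-cluster/config.db}"
    # Keep the fresh host's database so the swap can be undone.
    if [ -e "$dest" ]; then
        cp -p "$dest" "$dest.bak"
    fi
    # install copies the file and sets the permissions in one step.
    install -m 0600 "$src" "$dest"
}

# Hypothetical sequence on the new host, with no guests running:
#   systemctl stop pve-cluster
#   install_config_db /path/to/recovered/config.db
#   (adapt /etc/hostname and /etc/hosts to the old host's values, then reboot)
```

This only helps if config.db can actually be read off the failed drive, which is the open question in this thread.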
 
Thanks. That's potentially another useful line of approach.

I'm stuck in a bit of a holding pattern at the moment whilst waiting for the USB drive to arrive. Australia Post are not the quickest......
 
So, a brief update for the record.

I purchased a new 8TB SAS drive and also a 960GB SAS SSD and they are now installed in Dad's machine.

OpenSuperClone successfully made a bit for bit copy of the 8TB drive containing the VM discs, so I now have an exact backup of this.

The old boot SSD, as predicted, appears to be toast. It is detected by his Dell machine's BIOS but does not show up under lsblk or sudo fdisk -l. Repairing this is now well beyond my knowledge of both Linux and filesystems.

Whilst browsing the filesystem from the OpenSuperClone LiveCD, however, I believe that I may have found his backups!

It occurred to me that there was another drive (1TB) visible, which I had not mounted or explored, so I did both.......

These are the contents:

Screenshot 2025-09-07 174732.png

If I'm correct, then logically it would seem easier to figure out how to mount this drive partition within Proxmox and then restore from these backups, would it not? Or is the process more complex?
 
Backups usually contain all (important) virtual disks as well as the VM configuration (which can be viewed before restoring). This is a great find and even if they are out of date they will still be of help.
Mount the parent directory (not dump itself, as Proxmox expects a certain directory structure) on the Proxmox host, just as you would on Linux. Then go to the Proxmox web GUI > Datacenter > Storage > Add > Directory, use the mount path for the Directory entry, and make sure to select Backup in the Content pull-down menu.
You should then be able to restore backups from the storage you just added in the left pane of the Proxmox web GUI.
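
For reference, roughly the same steps from the shell, as a bash sketch. It assumes the archives are VM backups following the usual vzdump naming (vzdump-qemu-<vmid>-<timestamp>.vma plus an optional compression suffix); the partition, mount point and storage ID in the comments are placeholders, and `latest_backup` is an illustrative helper rather than a Proxmox tool.

```shell
# Sketch: print the newest VM backup archive for one VMID in a dump directory.
latest_backup() {
    local dump_dir="$1" vmid="$2" f newest=""
    # The timestamp in the file name sorts chronologically, and the shell
    # expands globs in sorted order, so the last match is the newest.
    for f in "$dump_dir"/vzdump-qemu-"$vmid"-*.vma*; do
        [ -e "$f" ] || continue
        # Skip sidecar files stored next to the archives.
        case "$f" in *.notes|*.protected) continue ;; esac
        newest="$f"
    done
    [ -n "$newest" ] || return 1
    echo "$newest"
}

# Hypothetical sequence on the Proxmox host:
#   mkdir -p /mnt/old-backups
#   mount /dev/sdX1 /mnt/old-backups
#   pvesm add dir old-backups --path /mnt/old-backups --content backup
#   qmrestore "$(latest_backup /mnt/old-backups/dump 107)" 107
```

Restoring through the GUI has the advantage of showing the stored VM configuration before anything is written.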
 
I am currently looking at a fully restored Proxmox server.

The last backups appear to have been automatically performed the day before the SSD failure, so I believe that data loss has been minimal.

Thank you so much for all your help. I am eternally grateful to you, as are my family.