Disaster recovery

stsinc

Member
Apr 15, 2021
I am in a difficult situation. Here is what happened:
  1. I needed to get more space on the boot partition of each of my 5 Proxmox v7 nodes
  2. So I foolishly applied a cleaning script that has made ALL 5 nodes unbootable
  3. Basically, when I boot each one I get the Memory test entry and NO Proxmox entry (when I run `proxmox-boot-tool status` I get `proxmox-boot-uuids` not found)
I have made image backups of the boot SSD of each of the 5 nodes.
I can mount the LVs inside each image (except for the `data` LV, which has no path), and I know the data are there, in `ext4` format.

So:
  1. Is there a way I can reinstall Proxmox v7 on each node without erasing the `data` partition where the CTs and VMs are stored?
  2. If that is not possible, is there a way I can recover these CTs and VMs and "reinject" them into a brand-new Proxmox install?
Could you send me pointers to the solution I could use?

Thanks
Stephen
 
Hi,
show us your "cleaning" script. I would guess it removed the active PVE kernel. You could reinstall it by booting into a Debian rescue mode, mounting the Proxmox disk, adding the Proxmox repositories and installing with apt.

Your questions:
1. No, not if you can't boot into that SSD and use apt. The alternative is to boot into rescue mode from a Debian ISO/CD, mount the SSD and reinstall with apt from there.
2. If you install Proxmox on new SSDs, you can find the configs, including those of the containers and VMs, in /etc/pve; if you are running Ceph, also in /etc/ceph.
2.1 Before you reinstall on the same SSDs, check that your images really work and that you can extract the configs from them. You could also just restore the images, but preferably onto new SSDs.
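As a sketch of point 2, assuming the image (or the original SSD) is attached to a rescue system and the volume group has the default name `pve` (device names and paths here are examples only). Note that on a running node `/etc/pve` is a FUSE mount provided by pmxcfs, so on an offline disk that directory is usually empty; the guest configs actually live in the cluster database file:

```shell
# Activate the LVM volume group from the attached Proxmox disk
vgscan
vgchange -ay pve
mkdir -p /mnt/pveroot
mount /dev/pve/root /mnt/pveroot

# /mnt/pveroot/etc/pve will likely be empty (pmxcfs mountpoint);
# the underlying guest/cluster configuration is stored here:
cp -a /mnt/pveroot/var/lib/pve-cluster/config.db /root/pve-config.db
```

A common recovery approach is then to stop the pve-cluster service on a fresh install, put the saved `config.db` back into `/var/lib/pve-cluster/`, and start pve-cluster again.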
 
Hi, and thank you for your prompt response.
  • I will try to find the "cleaning script", which I have since deleted from all my computers ;o)) BTW, I found it in a user's response on this very forum.
  • I am convinced solution #1 is the right one. Could you give me the full procedure, please, or direct me to an existing link?
Thanks again,
Stephen
 
I need to know more about your setup. How did you originally install Proxmox? ZFS boot or the default LVM? Are your VMs on local/local-lvm/local-zfs storage (same disk as the OS), or is there dedicated storage, such as a separate ZFS pool, a Ceph cluster or something else? The more info you can give, the better we can help.

What exactly happens when you boot from the current SSD? Screenshot?
 
Here are my answers @flames :
  1. default lvm
  2. same disk as OS
  3. no ceph, zfs, etc. Plain simple install on SSDs
When I boot, GRUB no longer displays any Proxmox entry, just different MemoryTest entries (which fail, BTW, despite the fact that I checked the memory with Lenovo's own tools and it is fine).
I then tried to use the DEBUG install mode and was able to execute `proxmox-boot-tool status`
 
  1. Basically, when I boot each I get the Memory test and NO Proxmox entry (when applying `proxmox-boot-tool status` I get `proxmox-boot-uuids` not found)
Did you already boot into a rescue disk and look at your SSD to see if all the data is still there? In particular, /etc/pve should still contain all the configs.
 
Here is what I did:
  1. I removed the SSDs from each of the physical machines and Clonezilla-ed each of them
  2. Then I inserted each into an enclosure and mounted them on my workstation using `vgscan`, then `vgdisplay`, `vgchange -ay` and so on. The report showed that every LV was fine.
Meaning: I had `swap`, `pve-root` and `data` (which I was unable to mount, but that is for yet another reason)
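For what it's worth, on a default PVE install the `data` LV is an LVM thin pool rather than a mountable filesystem, which would explain why it has no device path; the guest disks are thin volumes inside it. A sketch of inspecting it, assuming the VG is named `pve` and a guest volume `vm-100-disk-0` exists (both names are examples):

```shell
# List the LVs; a thin pool shows a 't' in the attribute column
lvs -o lv_name,lv_attr,pool_lv pve
# Activate one guest's thin volume (-K overrides the activation skip flag)
lvchange -ay -K pve/vm-100-disk-0
# Inspect what is on it (partition table, filesystem, ...)
file -s /dev/pve/vm-100-disk-0
```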
 
Sounds good so far.
Missing boot-uuids actually means that you are booting in legacy mode (GRUB), not UEFI. If I remember correctly, proxmox-boot-tool is for UEFI boot.
Did you perhaps change something in the boot options in the BIOS/UEFI?
 
No, I do not think so; I stayed in mixed Legacy/UEFI mode with a preference for Legacy
 
If that does not work and the PVE kernel is also gone, you could mount pve-root again, chroot into it and run "apt install proxmox-ve"
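The chroot procedure above can be sketched like this from the rescue shell (the VG name and paths assume a default install; adjust to your layout):

```shell
# Activate the VG and mount the Proxmox root LV
vgchange -ay pve
mount /dev/pve/root /mnt
# Bind the virtual filesystems the package tools need
for fs in dev proc sys; do mount --bind /$fs /mnt/$fs; done
# Make DNS resolution work inside the chroot
cp /etc/resolv.conf /mnt/etc/resolv.conf
chroot /mnt /bin/bash
# Inside the chroot:
apt update
apt install proxmox-ve
```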
 
OK, that is the procedure I was looking for.
I also forgot to tell you that I tried a boot repair from the live ISO Boot Repair Disk, and it did not work
 
OK @flames, here is the latest update from the recovery front.
  1. I boot from the Proxmox live installer
  2. I access the DEBUG mode
  3. `exit` to go back to the shell
  4. I mount `/dev/pve/root`
  5. But when I execute `apt install proxmox-ve`, I get an error message saying it is not found.
The reason is that I am not connected to the Internet in that mode (despite the fact that I am physically connected to my router).
I did an `ip a` and my link is, in effect, down.

So my question is: how do I connect to the Internet from the debug mode?
 
When you chroot into the volume, your rescue system acts as if pve-root were its own root volume, but running on the rescue kernel. So you can issue commands "as if" you had booted the offline system (not everything works, of course, but it helps).
 
I get it now.
The other mistake I was making was to have booted from a Ventoy-based Proxmox installer. I am redoing the whole process with the Proxmox installer written plainly to a USB stick.
 
I am hitting a blocking point when I execute
Code:
mount /dev/sda1 /media/RESCUE/boot
I get an error message saying that sda1 is already mounted or the mount point is busy
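A quick way to diagnose that error is to check whether the device or the target directory is already in use (the device and path below are taken from the post above; adjust as needed):

```shell
# Where is sda1 already mounted, if anywhere?
findmnt /dev/sda1 || echo "sda1 not mounted anywhere"
# Is something already mounted on the target directory?
findmnt /media/RESCUE/boot || echo "mount point is free"
# If either shows an existing mount, reuse it or unmount first:
#   umount /dev/sda1    (only if it is safe to do so)
```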
 
