Need some help recovering/rebuilding my pve+pbs home server

gctwnl

Member
Aug 24, 2022
The controller of my SSD (Samsung 980 Pro...) died, so I have to rebuild my setup from backups. The first step is to install PVE, install PBS on top of that, and update everything. After that I need to redo my LUKS setup on my external RAID and recover /etc/pve from a backup.

The original setup is:
Hardware: Internal Samsung 980 Pro NVMe SSD (now dead)
Encryption at rest: SSD Password
PV: /dev/nvme0n1p3
VG: pve
LV-thin: YES
LV (size): vm-100-disk-0 (32GB)
Used as: CLIENT boot disk
LV: vm-100-disk-1 (32GB)
Used as: CLIENT: /var/lib/docker UUID:a74f54a6-7a85-4c3b-839f-c034ef280d0b
LV: vm-100-disk-2 (500GB)
Used as: CLIENT: /mnt/ServerBackup (500GB) UUID:e7639f38-e488-46fb-bd95-64c930c30603

Hardware: Mercury Elite Pro Dual Mini hardware RAID1
Encryption at rest: luks-fa1483bd-f599-4dcf-9732-c09069472150
PV: /dev/mapper/luks-fa1483bd-f599-4dcf-9732-c09069472150
VG: rna-mepdm-1
LV-thin: NO
LV (size): vm-100-disk-0 (801GB)
Used as: CLIENT: /mnt/ServerData (300GB) ext4 UUID:109bd659-811d-442e-9539-ebf3673d9ad3
LV: rna-pbs-mepdm-1 (200GB)
Used as: HOST: /mnt/pbs-backup-1 (200GB) ext4 UUID:fb75e648-561d-47a1-948c-83d9d72df80f

The server data (of the single VM I am currently using) lives on the RAID, and gets backed up to a LV on the internal SSD (as well as to the cloud). I have not lost server data, I lost the internal backup of server data, but I still have one external restic backup of server data on B2.
The VM itself lives on the internal SSD and gets backed up with PBS to the RAID. I lost that VM, but I have a PBS backup on the RAID.

I had:
Code:
root@pve:~# vgs -o +lv_size,lv_name
  VG          #PV #LV #SN Attr   VSize    VFree   LSize    LV             
  pve           1   6   0 wz--n- <931.01g  15.99g <794.79g data           
  pve           1   6   0 wz--n- <931.01g  15.99g    8.00g swap           
  pve           1   6   0 wz--n- <931.01g  15.99g   96.00g root           
  pve           1   6   0 wz--n- <931.01g  15.99g   32.00g vm-100-disk-0 
  pve           1   6   0 wz--n- <931.01g  15.99g   32.00g vm-100-disk-1 
  pve           1   6   0 wz--n- <931.01g  15.99g  500.00g vm-100-disk-2
(and the RAID entries)

After installing PVE and PBS and unlocking the external RAID I have
Code:
root@pve:~# lsblk
NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
loop0                                           7:0    0   300G  0 loop 
`-loop0p1                                     259:4    0   300G  0 part 
sda                                             8:0    0   1.7T  0 disk 
`-sda1                                          8:1    0   1.7T  0 part 
  `-luks-fa1483bd-f599-4dcf-9732-c09069472150 252:5    0   1.7T  0 crypt
    |-rna--mepdm--1-vm--100--disk--0          252:6    0   300G  0 lvm   
    `-rna--mepdm--1-rna--pbs--mepdm--1        252:7    0   200G  0 lvm   
nvme0n1                                       259:0    0 931.5G  0 disk 
|-nvme0n1p1                                   259:1    0  1007K  0 part 
|-nvme0n1p2                                   259:2    0     1G  0 part  /boot/efi
`-nvme0n1p3                                   259:3    0   930G  0 part 
  |-pve-swap                                  252:0    0     8G  0 lvm   [SWAP]
  |-pve-root                                  252:1    0    96G  0 lvm   /
  |-pve-data_tmeta                            252:2    0   8.1G  0 lvm   
  | `-pve-data                                252:4    0 793.8G  0 lvm   
  `-pve-data_tdata                            252:3    0 793.8G  0 lvm   
    `-pve-data                                252:4    0 793.8G  0 lvm

I lost the internal SSD, so I lost my VM (which I should be able to restore from the PBS backup), the docker cache of the VM, and the restic backup of server data (of which I still have external backups).

Now, the bad news is that I have probably lost a .tar.gz of /etc on the pve host, and the one I still have might be from PVE 8, i.e. from before I upgraded to PVE 9 a few months ago (I hope that upgrade is not what finally killed my SSD controller for some reason, but I don't expect it).

Basically, I need to recreate the LV thin VG and the LVs, but I recall that doing that from the command line did not work years ago when I first set it up (2022, PVE 7). What is my best way forward at this point?
 
Your situation and environment are fairly complex, so there is no single approach that can be called the "best". If the goal is a stable and reliable recovery, though, the following approach is worth considering.
Since a rebuild is required, this assumes that both PVE and PBS will be newly installed.

To restore the VM, you would first rebuild PVE and PBS, and then restore the VM from the PBS backup data.
The PBS datastore is likely located on the LUKS device. When adding the datastore, you can enable the "Reuse existing datastore" option to make the existing backup data available again.
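A rough sketch of that sequence on the command line, in case it helps. The device, LUKS UUID, VG, LV and mount point are taken from the first post. The datastore name `pbs-backup-1` and the PVE storage ID `pbs-local` are made-up placeholders, and I am assuming that `--reuse-datastore` is the CLI counterpart of the GUI checkbox on your PBS version, so please check `proxmox-backup-manager datastore create --help` first. The script only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default: commands are printed with a leading "+", not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

LUKS_DEV="/dev/sda1"                               # from the lsblk output; verify with blkid
LUKS_UUID="fa1483bd-f599-4dcf-9732-c09069472150"   # from the first post
VG="rna-mepdm-1"
LV="rna-pbs-mepdm-1"
MOUNTPOINT="/mnt/pbs-backup-1"
DATASTORE="pbs-backup-1"                           # hypothetical datastore name

# Unlock the RAID, activate its VG, mount the PBS LV and register it in PBS.
attach_datastore() {
    run cryptsetup open "$LUKS_DEV" "luks-$LUKS_UUID"
    run vgchange -ay "$VG"
    run mkdir -p "$MOUNTPOINT"
    run mount "/dev/$VG/$LV" "$MOUNTPOINT"
    # Assumption: this flag exists on your PBS release (it mirrors the GUI option).
    run proxmox-backup-manager datastore create "$DATASTORE" "$MOUNTPOINT" --reuse-datastore true
}

# Restore VM 100 from a snapshot volume ID onto local-lvm.
# "pbs-local" is a hypothetical ID for the PBS storage added in PVE beforehand.
restore_vm() {
    snapshot="$1"   # a volume ID as printed by: pvesm list pbs-local --vmid 100
    run qmrestore "$snapshot" 100 --storage local-lvm
}

attach_datastore
```

After the datastore shows up in PBS and has been added as a storage in PVE, `pvesm list pbs-local --vmid 100` should list the available snapshots, and one of those volume IDs can be passed to `restore_vm`. It is worth checking which disks the backup actually contains before restoring.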

Regarding the PVE recovery, if your /etc backup may be outdated, it is safer to reconfigure the necessary parts manually and to use the old /etc and /etc/pve files only as a reference, rather than copying them back wholesale.
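One way to use the old archive purely as a reference is to unpack it into a scratch directory and diff it against the fresh install. This is only a sketch: `compare_etc` is a helper name I made up, and it assumes the archive has a top-level `etc/` directory (as produced by something like `tar -czf etc.tar.gz -C / etc`).

```shell
#!/bin/sh
# compare_etc ARCHIVE [LIVE_ROOT]
# Unpack an old /etc archive into a scratch directory and list the files
# that differ from the live tree. Nothing on the live system is modified.
compare_etc() {
    archive="$1"
    live_root="${2:-/}"
    scratch="$(mktemp -d)"
    tar -xzf "$archive" -C "$scratch"
    # diff exits non-zero when files differ, which is the expected case here.
    diff -rq "$scratch/etc" "${live_root%/}/etc" || true
    echo "old copy kept in $scratch/etc"
}
```

For example, `compare_etc /root/etc-backup.tar.gz` (hypothetical file name) lists what differs, after which individual files such as the old storage.cfg can be read from the scratch copy and re-entered by hand.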

You seem concerned about recreating the LV/VG/LVs, but what is the reason this step is necessary?
Since PVE is installed on a single disk, is this related to adjusting the sizes of local and local-lvm that are created by default during installation?
If so, these sizes can be adjusted using installation options:
https://pve.proxmox.com/pve-docs/chapter-pve-installation.html#advanced_lvm_options
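As a quick sanity check on whether the default layout is big enough, the sizes from the old `vgs` output can simply be added up. This is a small sketch using only numbers from the first post; the pool size is the 793.8G shown for pve-data in the new lsblk output, rounded down.

```shell
#!/bin/sh
# LV names and sizes (in G) copied from the old `vgs` output in the first post.
old_lvs='data 794.79
swap 8.00
root 96.00
vm-100-disk-0 32.00
vm-100-disk-1 32.00
vm-100-disk-2 500.00'

# Sum the provisioned sizes of the VM disks only.
vm_total="$(printf '%s\n' "$old_lvs" | awk '$1 ~ /^vm-/ { sum += $2 } END { printf "%d", sum }')"
pool_size=793   # pve-data on the fresh install, rounded down from 793.8G

echo "VM disks need at most ${vm_total}G of the ${pool_size}G thin pool"
if [ "$vm_total" -le "$pool_size" ]; then
    echo "default local-lvm layout is large enough"
else
    echo "adjust the LVM options at install time"
fi
```

With the numbers above, the three VM disks add up to 564G, so the thin pool created by the default installation should be able to hold them.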

If you absolutely need to reintegrate the existing disk layout, it may be less problematic to attempt a hardware-level recovery, such as replacing the failed controller of the internal SSD, rather than rebuilding everything from scratch in PVE.
 