This is a long story, so I'll just summarize it. This week my Proxmox server started having issues with its SSD. I managed to fix it by running fsck a few times and then lvconvert --repair. I thought it was a one-time issue, so I left it alone. Today the issues returned.
The server has 3 drives, call them sda (the SSD), sdb (HDD), and sdc (HDD), and it hosts 2 VMs.
Figuring sda was a lost cause, I installed a fresh copy of Proxmox 7 on sdc.
During the install, the volume group on sda was renamed to pve-OLD-059360C7 by the installer.
Now I'm booted into the new Proxmox install on sdc and I want to recover as much from sda as possible. For instance, I'm pretty sure I can salvage most of the 50GB Windows install. I created an LVM-Thin storage from the pve-OLD-059360C7 volume group and its data thin pool, attached the old Windows disk to my VM, and then tried to "Move" the disk over to my sdc HDD, where it would be safe from further corruption.
At around 42% into the transfer, qemu-img failed with an input/output error. Understandable, since the storage is corrupt after all:
Code:
qemu-img: error while reading at byte 21967662080: Input/output error
TASK ERROR: storage migration failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve-OLD-059360C7/vm-200-disk-0 zeroinit:/dev/pve/vm-200-disk-1' failed: exit code 1
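The copy above stops at the first unreadable byte. For illustration only, here is a minimal Python sketch of the alternative idea: copy chunk by chunk and zero-fill any chunk that fails with an I/O error, so the rest of the disk still gets copied. This is not what the Proxmox "Move" task does. The function name and the read_at/write_at callables are hypothetical, and zero-filling a whole chunk is an assumption made to keep the example short.

```python
import errno


def copy_skipping_errors(read_at, write_at, total_size, chunk_size=1024 * 1024):
    """Copy total_size bytes in chunks, zero-filling chunks that fail with EIO.

    read_at(offset, length) -> bytes and write_at(offset, data) are supplied
    by the caller (hypothetical helpers, e.g. wrappers around os.pread and
    os.pwrite). Returns the (offset, length) ranges that could not be read.
    """
    bad_ranges = []
    offset = 0
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        try:
            data = read_at(offset, length)
            if not data:
                break  # unexpected end of source
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise  # only input/output errors are skipped
            data = b"\x00" * length  # placeholder for the unreadable chunk
            bad_ranges.append((offset, length))
        write_at(offset, data)
        offset += len(data)  # also handles short reads
    return bad_ranges
```

On a real device, read_at could be something like `lambda off, n: os.pread(src_fd, n, off)` and write_at `lambda off, data: os.pwrite(dst_fd, data, off)`, with both file descriptors opened by the caller. The returned ranges show which parts of the image ended up as zeros.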
So I figured I could try to repair the pve-OLD-059360C7/data thin pool:
Code:
# lvchange -an pve-OLD-059360C7/data -ff
# lvconvert --repair pve-OLD-059360C7/data
Active pools cannot be repaired. Use lvchange -an first.
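One possible reason the pool still counts as active is that a thin volume inside it (for example the disk attached to the VM) is still active. That is an assumption; the output above does not confirm it. As a rough sketch of how to check, the Python function below parses the text printed by `lvs --noheadings --separator , -o lv_name,pool_lv,lv_attr pve-OLD-059360C7` and lists the pool and any of its thin volumes whose lv_attr state character (the fifth one) is `a`. The function name and the sample rows in the comment are hypothetical.

```python
def active_lvs_in_pool(lvs_output, pool_name, separator=","):
    """Return names of active LVs that are the pool itself or live in it.

    lvs_output is expected to have three fields per line:
    lv_name, pool_lv, lv_attr (pool_lv is empty for non-thin LVs).
    """
    active = []
    for line in lvs_output.splitlines():
        fields = [field.strip() for field in line.split(separator)]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, pool, attr = fields
        belongs = pool == pool_name or name == pool_name
        # The fifth lv_attr character is the state; "a" means active.
        if belongs and len(attr) >= 5 and attr[4] == "a":
            active.append(name)
    return active


# Hypothetical sample rows, not taken from the real server:
#   data,,twi-aotz--
#   vm-200-disk-0,data,Vwi-a-tz--
```

Anything the function returns would have to be deactivated before the pool itself can go inactive.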
Repairing pve-OLD-059360C7/vm-200-disk-0 doesn't work either:
Code:
# lvconvert --repair pve-OLD-059360C7/vm-200-disk-0
Command on LV pve-OLD-059360C7/vm-200-disk-0 does not accept LV type thin.
Command not permitted on LV pve-OLD-059360C7/vm-200-disk-0.
I tried many other commands - fsck, e2fsck, mount, and others I can't remember - and almost none of them ran successfully.
At this point I'm out of ideas. While I have most of my important data backed up, I would like to regain access to the files on the sda drive. If anything, it might be easier to repair the storage and clone it to sdc than to reinstall and reconfigure the OSes. It would also help with the files I don't have good backups for.
Any ideas?