Storage pool unavailable after clonezilla clone, how to clean up

KoenDG

Member
May 31, 2022
Disclaimer: Yes, I should have looked this up beforehand, as it is noted in various places that Clonezilla does not properly back up LVM thin pools. I might have missed it, but I didn't see it in the Proxmox documentation itself.

I had to replace the disks that hold the boot sector of my Proxmox machine. Also on these disks: a storage pool, which is no longer responsive after the restore.

I have a second storage pool, and that works fine. The boot sector works fine too.

The error message I'm getting is:

Code:
'pve/data' failed: Thin pool pve-data-tpool (253:23) transaction_id is 0, while expected 44. (500)

There's nothing crucial on it, but now I'm sitting here with a 1TB SSD (RAID1) that I basically cannot utilize anymore.

Googling and asking ChatGPT and the like, I'm getting suggestions to use "lvconvert --repair pve/data" or "vgcfgbackup pve".

Or to just delete the pool, but for that I'm only getting CLI commands like "lvremove pve/data" and "lvcreate -L 1T -T pve/data".

This seems very risky, and I would like to know whether I can just remove the pool via the UI instead of via the command line. Is there a clean, documented procedure for simply removing a storage pool when it's "damaged", so to speak?
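
For reference, the CLI route that keeps coming up seems to boil down to something like this (a sketch I have not run; "local-lvm" is just the default storage ID on a stock install and might be named differently here):

Code:
# Remove the storage entry from the Proxmox config (this edits /etc/pve/storage.cfg)
pvesm remove local-lvm
# Then delete the thin pool LV itself
lvremove pve/data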

The Proxmox version is 7.2, which isn't recent; I should look into updating it once I get everything running smoothly again.

Thanks for any help or advice.
 
If the pool metadata is corrupted, then there are only CLI commands available for fixing it.
You could have a look at the specific errors with thin_check /dev/mapper/pve-data_tmeta,
but in the end running lvconvert --repair pve/data is usually the way to go.
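
Roughly, the sequence would be (a sketch; the _tmeta device path is the stock default and may not exist on every setup):

Code:
# Inspect the thin metadata for specific errors
thin_check /dev/mapper/pve-data_tmeta
# The repair needs the pool deactivated first
lvchange -an pve/data
lvconvert --repair pve/data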

If you want to remove it, then CLI is your only option.
 

I tried the thin_check, but that path doesn't exist.

I also tried the repair; this is the output:

Code:
lvconvert --repair pve/data
  Transaction id 45 from pool "pve/data" does not match repaired transaction id 0 from /dev/mapper/pve-lvol0_pmspare.
  WARNING: LV pve/data_meta0 holds a backup of the unrepaired metadata. Use lvremove when no longer required.

Attempts at removing logical volumes also fail:

Code:
lvremove -y --force /dev/pve/vm-101-disk-0
Thin pool pve-data-tpool (253:23) transaction_id is 0, while expected 45.
Failed to update pool pve/data.

And I can't remove the physical volume, since it has the root partition on it.

It seems at this point my only option is to back up all the disks and settings and do a fresh install?
 
Please make a backup first, in any case.

If killing the thinpool is an option, you can try the following:
Code:
# Deactivate the thin pool
lvchange -an pve/data
# List the leftover device-mapper entries for it
dmsetup ls | grep pve-data
# Remove each one shown:
dmsetup remove pve-data-tpool
dmsetup remove pve-data_tdata
dmsetup remove pve-data_tmeta
# etc.

# Reactivate and refresh the pool
lvchange -ay pve/data --refresh

If there were any VM disks on it, you can try starting the VMs; if that doesn't work, you will need to remove the disks with lvremove pve/vm-* and remove the VMs as well.
Afterwards you can also remove the metadata backup created by the lvconvert command with lvremove pve/data_meta0.
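
A quick way to check the result afterwards (VMID 101 is just the example from this thread):

Code:
# List all LVs, including hidden ones, to confirm the pool state
lvs -a pve
# Try starting one of the affected guests
qm start 101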
 
"dmsetup ls" also came up empty for pve-data.

What I ended up doing was taking a copy of the pve volume group's metadata backup:

Code:
cp /etc/lvm/backup/pve /root/pve.meta.edit

I then modified that copy, removing the pool, its metadata, and the references to any disks that were on it, being careful not to touch root and swap.
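
To give an idea, the blocks I removed from the logical_volumes section looked roughly like this (a trimmed sketch, not my exact file; names and IDs will differ):

Code:
# inside the pve { ... logical_volumes { ... } } section of the backup copy:
data {            # the thin pool itself
        ...
}
data_tdata {      # its hidden data sub-LV
        ...
}
data_tmeta {      # its hidden metadata sub-LV
        ...
}
vm-101-disk-0 {   # every thin volume that lived on the pool
        ...
}
# root { ... } and swap { ... } stay untouched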

Then running:

Code:
vgcfgrestore -f /root/pve.meta.edit pve

That worked. Then I created a new thin pool with the same name:

Code:
lvcreate -L 600G -T pve/data

And that did it. It appeared under "LVM-Thin". Giving it a different name would require editing the name in /etc/pve/storage.cfg as well.
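
For reference, the matching entry in /etc/pve/storage.cfg on a stock install looks roughly like this (storage ID and content types may differ):

Code:
lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images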

It worked for me, but no guarantees for anyone else. Part of my situation was that I had copied all the disks that were on this pool before I encountered the problem, so I could safely remove them. That might not be the case for everyone who runs into something like this.
 