Not enough space on thin LVM

unam

Hi,

I have a node running the latest, up-to-date Proxmox version.

I configured LVM thin provisioning for users on a RAID0 array made of 2 SSDs, about 1 TB in total.

Maybe a misconfiguration or bad usage made it crash. A user provisioned a template with a 500 GB disk.

Everything was running fine for our tests and thin provisioning.
Then a change was introduced in the kickstart contained in the template: a filesystem was defined with the "--grow" option.
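
For illustration (not the exact template, just the kind of line I mean), a kickstart "part" entry with "--grow" lets the filesystem take all the remaining space on the 500 GB virtual disk:

Code:
# kickstart excerpt (illustrative, filesystems/sizes are examples)
part /boot --fstype=ext4 --size=1024
# --grow expands this partition to fill the rest of the disk
part / --fstype=xfs --grow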

A user tried to start two VMs based on this template and everything crashed.

For the moment I have two questions about this:

- I can't delete any volume contained in my volume group, because my thin pool needs a repair.

Code:
root@hypervisor04:~# lvremove /dev/VGRAID0/vm-103-disk-0 
Do you really want to remove and DISCARD logical volume VGRAID0/vm-103-disk-0? [y/n]: y
  Check of pool VGRAID0/lv_thin_build failed (status:1). Manual repair required!
  Failed to update pool VGRAID0/lv_thin_build.

When I try to repair, I don't have enough free space:

Code:
root@hypervisor04:~# lvconvert --repair -v VGRAID0/lv_thin_build
    Preparing pool metadata spare volume for Volume group VGRAID0.
  Volume group "VGRAID0" has insufficient free space (0 extents): 30 required.

And when I try to extend my VG, it's already full:

Code:
root@hypervisor04:~# lvextend -l +100%FREE VGRAID0/lv_thin_build
  New size (238345 extents) matches existing size (238345 extents).
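
For completeness, a quick way to double-check that the VG really has no free extents left (commands only, I haven't pasted the output):

Code:
# show free space / free extents in the VG and on each PV
vgs VGRAID0
pvs -o pv_name,vg_name,pv_size,pv_free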

So: how can I deal with this stuck situation? Are all the volumes lost?

- Is it possible to create 2 or more thin pools to avoid this kind of error?
I mean, if I create 3 thin pools of about 200 GB each on the same VG, and one VM grows too fat to fit in its thin pool, the others will not be impacted? Am I wrong?

Thanks for your advice and replies.

Regards,
 
Quick reply,

I tried another way: deleting the meta0 LV created by a previous repair attempt and starting another repair.

Code:
root@hypervisor04:~# lvremove VGRAID0/lv_thin_build_meta0
root@hypervisor04:~# lvconvert --repair -v VGRAID0/lv_thin_build
Preparing pool metadata spare volume for Volume group VGRAID0.
    Archiving volume group "VGRAID0" metadata (seqno 1481).
    Creating logical volume lvol0
  WARNING: Sum of all thin volume sizes (1,63 TiB) exceeds the size of thin pools and the size of whole volume group (<931,27 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
    Creating volume group backup "/etc/lvm/backup/VGRAID0" (seqno 1482).
    Activating logical volume VGRAID0/lvol0.
    activation/volume_list configuration setting not defined: Checking only host tags for VGRAID0/lvol0.
    Creating VGRAID0-lvol0
    Loading table for VGRAID0-lvol0 (253:1).
    Resuming VGRAID0-lvol0 (253:1).
    Initializing 4,00 KiB of logical volume "VGRAID0/lvol0" with value 0.
    Temporary logical volume "lvol0" created.
    Removing VGRAID0-lvol0 (253:1)
    Renaming lvol0 as pool metadata spare volume lvol0_pmspare.
    activation/volume_list configuration setting not defined: Checking only host tags for VGRAID0/lvol0_pmspare.
    Creating VGRAID0-lvol0_pmspare
    Loading table for VGRAID0-lvol0_pmspare (253:1).
    Resuming VGRAID0-lvol0_pmspare (253:1).
    activation/volume_list configuration setting not defined: Checking only host tags for VGRAID0/lv_thin_build_tmeta.
    Creating VGRAID0-lv_thin_build_tmeta
    Loading table for VGRAID0-lv_thin_build_tmeta (253:3).
    Resuming VGRAID0-lv_thin_build_tmeta (253:3).
    Executing: /usr/sbin/thin_repair  -i /dev/mapper/VGRAID0-lv_thin_build_tmeta -o /dev/mapper/VGRAID0-lvol0_pmspare
    Piping: /usr/sbin/thin_dump /dev/mapper/VGRAID0-lvol0_pmspare
  Transaction id 730 from pool "VGRAID0/lv_thin_build" does not match repaired transaction id 729 from /dev/mapper/VGRAID0-lvol0_pmspare.
    Removing VGRAID0-lv_thin_build_tmeta (253:3)
    Removing VGRAID0-lvol0_pmspare (253:1)
    Preparing pool metadata spare volume for Volume group VGRAID0.
  Volume group "VGRAID0" has insufficient free space (0 extents): 30 required.
  WARNING: LV VGRAID0/lv_thin_build_meta0 holds a backup of the unrepaired metadata. Use lvremove when no longer required.

Then I re-activated a volume and tried to delete it:
Code:
root@hypervisor04:~# lvchange -ay /dev/VGRAID0/vm-700-disk-0 
root@hypervisor04:~# lvremove /dev/VGRAID0/vm-700-disk-0 
Do you really want to remove and DISCARD active logical volume VGRAID0/vm-700-disk-0? [y/n]: y
  Logical volume "vm-700-disk-0" successfully removed

So it is OK for me; as it is a test server, I don't care about these VMs.

But the question remains: how do I deal with this if it happens in a production environment? Or, how do I avoid the pool filling up in the first place?
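
The repair output above already hints at it: there are autoextend settings for thin pools in /etc/lvm/lvm.conf. For reference, they look roughly like this (the values are just an example, and autoextend only helps as long as the VG still has free space):

Code:
# /etc/lvm/lvm.conf (excerpt)
activation {
    # extend a thin pool automatically once it is 80% full...
    thin_pool_autoextend_threshold = 80
    # ...by 20% of its current size, if the VG has free space left
    thin_pool_autoextend_percent = 20
}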

Any suggestion is welcome !

Regards
 
Hi,
glad you were able to sort out the issue.

Is it possible to create 2 or more thin pools to avoid this kind of error?
I mean, if I create 3 thin pools of about 200 GB each on the same VG, and one VM grows too fat to fit in its thin pool, the others will not be impacted? Am I wrong?

That would work, but there are two problems with it:
  1. You would need to add each thin pool as its own storage to PVE (see the rough sketch after this list).
  2. Over-provisioning can only be done within each thin pool. You can't over-provision within the VG: i.e. if space is reserved for a thin pool within the VG, it is actually reserved for that thin pool.
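
A minimal sketch of what that would look like (pool and storage names here are just examples, adjust sizes to your setup):

Code:
# create a second, fixed-size thin pool inside the same VG
lvcreate -L 200G --thinpool thinpool2 VGRAID0
# register it as its own LVM-Thin storage in PVE
pvesm add lvmthin thin2 --vgname VGRAID0 --thinpool thinpool2 --content images,rootdir
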
It's true that volumes sharing a thin pool will all be affected at the same time when the thin pool gets full. This is the inherent downside to thin provisioning in general. IMHO the best way to avoid such errors is to keep an eye on how much space you are actually using and to extend the thin pool (and, if needed, the VG) early enough. Avoid creating VMs if you're not sure they won't eat up all the remaining space.
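
For monitoring and extending, something along these lines is usually enough (sizes are examples; extending only works while the VG still has free extents, which is exactly what was missing in your case):

Code:
# Data% and Meta% show how full the thin pool and its metadata really are
lvs -a VGRAID0
# grow the thin pool data and metadata before they run full
lvextend -L +50G VGRAID0/lv_thin_build
lvextend --poolmetadatasize +1G VGRAID0/lv_thin_build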
 
Hi,

You are right when you say "Avoid creating VMs if you're not sure they won't eat up all the remaining space".

I take care of this because of my job as a sysadmin. The thing is, end users / casual Proxmox users don't care; they treat it as a sandbox and are not aware of all these technical/admin concerns.

I will experiment with multiple thin pools, so that if one of them fills up, it will not impact the others.

Anyway, thanks for your reply.

Regards,
 
Sounds like it's more of a user/permission management problem then. You might want to use Resource Pools (Datacenter > Permissions > Pools > Create), make your LVM-Thin storage available on that pool (the new pool should appear in the leftmost GUI pane below your nodes; then use Members > Add > Storage) and limit users' access to that pool (see also permission management in the docs). And you can also combine it with your approach and have one LVM-Thin storage for each Resource Pool.
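
Roughly the CLI equivalent, in case you prefer scripting it (pool, storage and user names are just examples; the exact pveum/pvesh syntax differs a bit between PVE versions):

Code:
# create a resource pool
pvesh create /pools --poolid sandbox
# add a storage (e.g. the dedicated thin pool storage) to that pool
pvesh set /pools/sandbox --storage thin2
# give a user access limited to that pool
pveum aclmod /pool/sandbox -user alice@pve -role PVEVMUser
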
 
