Volume sizes exceed the size of the thin pool and the free space in the volume group

amlanhldr

Member
Jan 10, 2022
Hello All,

I've been working with Proxmox for a few weeks, so please treat me as new to this.

Coming to the point: whenever I try to create a new LV, a snapshot, or a backup, I get these warnings...

Code:
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  WARNING: Sum of all thin volume sizes (<172.98 GiB) exceeds the size of thin pool pve/data and the amount of free space in volume group (<16.00 GiB).

But I'm pretty sure I have enough room inside the thin pool to hold a couple more.
Code:
root@pxmx:~# lvs
  LV                       VG  Attr       LSize    Pool Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  cache1                   pve Vwi-aotz--   10.00g data               99.94                                 
  cache2                   pve Vwi-aotz--   10.00g data               0.01                                   
  data                     pve twi-aotz-- <141.43g                    16.54  1.95                           
  root                     pve -wi-ao----   55.75g                                                           
  snap_vm-100-disk-1_Snap1 pve Vri---tz-k   16.00g data vm-100-disk-1                                       
  snap_vm-100-disk-1_Snap2 pve Vri---tz-k   16.00g data vm-100-disk-1                                       
  snap_vm-101-disk-0_Snap1 pve Vri---tz-k   20.00g data vm-101-disk-0                                       
  snap_vm-101-disk-0_Snap2 pve Vri---tz-k   20.00g data vm-101-disk-0                                       
  swap                     pve -wi-ao----    7.00g                                                           
  vm-100-disk-1            pve Vwi-aotz--   16.00g data               9.71                                   
  vm-100-state-Snap1       pve Vwi-a-tz--   <3.49g data               18.07                                 
  vm-100-state-Snap2       pve Vwi-a-tz--   <3.49g data               13.01                                 
  vm-101-disk-0            pve Vwi-aotz--   20.00g data               21.14                                 
  vm-114-disk-0            pve Vwi-aotz--    8.00g data               21.16                                 
  vm-124-disk-0            pve Vwi-a-tz--    6.00g data               67.80                                 
  vm-124-disk-1            pve Vwi-a-tz--    4.00g data               1.63                                   
root@pxmx:~# vgdisplay pve 
  --- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  127
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                16
  Open LV               7
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <223.07 GiB
  PE Size               4.00 MiB
  Total PE              57105
  Alloc PE / Size       53010 / 207.07 GiB
  Free  PE / Size       4095 / <16.00 GiB
  VG UUID               ug5DjO-FyBB-O0Gm-rxLW-ibQC-pPrS-jTgo1i
root@pxmx:~# df -h
Filesystem                  Size  Used Avail Use% Mounted on
udev                        3.8G     0  3.8G   0% /dev
tmpfs                       777M  1.1M  776M   1% /run
/dev/mapper/pve-root         55G  6.3G   46G  13% /
tmpfs                       3.8G   46M  3.8G   2% /dev/shm
tmpfs                       5.0M     0  5.0M   0% /run/lock
/dev/sda2                   511M  328K  511M   1% /boot/efi
zstore                      3.5T  128K  3.5T   1% /zstore
zstore/media                3.6T   63G  3.5T   2% /opt/media
/dev/fuse                   128M   20K  128M   1% /etc/pve
tmpfs                       777M     0  777M   0% /run/user/0

I can't find any clue as to what might have gone wrong.
Please shed some light. Thank you.
 
Hi,
Code:
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  WARNING: Sum of all thin volume sizes (<172.98 GiB) exceeds the size of thin pool pve/data and the amount of free space in volume group (<16.00 GiB).
This warns that your thin pool pve/data is currently over-provisioned, i.e. you have assigned more space to volumes than is actually available. As long as there is enough actual free space, nothing is wrong, but once the disks start filling up, it could happen that the pool runs out of space, which is bad.
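If you want the pool to be extended automatically into the free space of the VG (you currently have <16 GiB free there), you can enable the auto-extension the first two warnings refer to. A minimal sketch of the relevant settings in /etc/lvm/lvm.conf; the 80/20 values are just examples, and this relies on the lvm2 monitoring (dmeventd) being active:

Code:
  # /etc/lvm/lvm.conf, inside the activation { } section
  # auto-extend a thin pool once it is 80% full ...
  thin_pool_autoextend_threshold = 80
  # ... growing it by 20% of its current size each time,
  # as long as the VG still has free extents
  thin_pool_autoextend_percent = 20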

But I'm pretty sure I have enough room inside the thin pool to hold a couple more.
The LSize column shows the provisioned size of each volume, and the Data% column shows how much space is currently actually used. The most important thing to keep an eye on is the Data% of the thin pool data itself (currently 16.54).
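If you just want to watch that one value, you can, for example, limit the lvs output to the pool itself:

Code:
lvs -o lv_name,lv_size,data_percent,metadata_percent pve/data
  LV   LSize    Data%  Meta%
  data <141.43g 16.54  1.95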

 
Hi,
This warns that your thin pool pve/data is currently over-provisioned, i.e. you have assigned more space to volumes than is actually available. As long as there is enough actual free space, nothing is wrong, but once the disks start filling up, it could happen that the pool runs out of space, which is bad.
Hi @Fabian_E, thank you for the quick and detailed response. I have a few more queries though.
As far as I understand,
I initially created the VG, then the root LV and the thin pool 'data' under that VG.
Now the thin pool 'data' has around 151G, and all the other logical volumes are inside that thin pool.
Then where does the "Sum of all thin volume sizes (<172.98 GiB)" figure of 172.98G come from?

--Thank you
Amlan
 
Then where does the "Sum of all thin volume sizes (<172.98 GiB)" figure of 172.98G come from?
A thin volume is one that lives inside a thin pool and hence only occupies the space it actually needs, but it still has a size (the maximum it can occupy) when viewed as a block device. So if you sum up the LSize of all volumes whose Pool is data, you should arrive at that value.
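Summing the LSize of the volumes listed in your lvs output gives just under 153 GiB; the warning is printed while a new volume or snapshot is being created, so it presumably also counts the roughly 20 GiB volume that was being created at that moment, which would bring the total to just under 173 GiB. A quick way to compute that sum on the host (just a sketch using plain lvs and awk):

Code:
lvs --noheadings --units g --nosuffix -S 'pool_lv=data' -o lv_size pve \
    | awk '{sum += $1} END {printf "%.2f GiB\n", sum}'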
 
Hi, I am having a big issue here as well.

[attached screenshot]

but how can I solve this?

The disks in the VMs are not full at all, yet there are problems now. One VM is even unable to start.

[attached screenshots]
 
You didn't monitor your storage, so the situation that should never happen has happened: you completely filled up your LVM-thin pool, which can cause data loss and corruption.

First I would try a discard/TRIM/fstrim inside your guests. If that doesn't help, try destroying snapshots. If that also doesn't work, delete files on those virtual disks. And if that still isn't enough, back up vm-100-disk-0 using dd and then destroy it to free up space.
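A rough sketch of that last resort; /mnt/backup is only a placeholder for any storage with enough free space:

Code:
# activate the volume if it isn't active yet
lvchange -ay pve/vm-100-disk-0
# copy the raw disk somewhere with enough free space, then remove it from the pool
dd if=/dev/pve/vm-100-disk-0 of=/mnt/backup/vm-100-disk-0.raw bs=1M status=progress
lvremove pve/vm-100-disk-0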

I hope you have recent backups, as your data might already be damaged.
 
No, I'm sorry, I didn't monitor it. Why would it grow? On the VM disks there is plenty of room.

How do I do a discard/TRIM/fstrim inside my guest?
There are no snapshots.

I am afraid there are no backups, no.... If only I could access the D: drive in the VM, it would be all right though.
 
No, I'm sorry, I didn't monitor it. Why would it grow? On the VM disks there is plenty of room.
That's how thin provisioning works. You can have a 1TB physical disk and create two VMs with 5TB virtual disks each on it. Both VMs will then of course tell you there is plenty of space, as they think they can write 10TB of data, which of course isn't the case in reality: you can't store 10TB of data on a 1TB physical disk. No matter how much free space the guests report, all guests together can't store more than 1TB of data. It's your duty as admin to log in daily to the webUI or the shell and check that your thin pool still has plenty of space left, as nothing will prevent the guests from filling up that pool until you lose data. It's also highly recommended to set up some kind of monitoring tool like Zabbix so you get warning emails early when the space on the thin pool is getting low.
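As a rough sketch of such a check (assuming mail delivery is already set up on the host; the 80% threshold is just an example), a small script run from cron could look like this:

Code:
#!/bin/sh
# warn by mail when the data usage of the pve/data thin pool crosses a threshold
THRESHOLD=80
USAGE=$(lvs --noheadings -o data_percent pve/data | tr -d ' ')
if [ "${USAGE%.*}" -ge "$THRESHOLD" ]; then
    echo "Thin pool pve/data is at ${USAGE}% data usage" | mail -s "Thin pool warning on $(hostname)" root
fi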
How do I do a discard/TRIM/fstrim inside my guest?
Depends on your guest OS. For Linux you could run fstrim -a; for Windows you could run Optimize-Volume -DriveLetter YourDriveLetter -ReTrim -Verbose. This of course will only work if you didn't set up your storage the wrong way. The complete TRIM chain from the guest OS to the physical disk needs to work: "discard" flag set for all virtual disks, "VirtIO SCSI" used as the virtual disk controller and not IDE or "VirtIO Block", a physical disk controller that supports TRIM commands, SSD emulation enabled for the virtual disks, and so on.
Thin provisioning requires that this whole TRIM chain is working. Otherwise the thin pool will grow and grow, nothing that gets deleted inside the guests is ever actually freed, and the thin pool fills up because it can't release any space.
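A minimal sketch of that on a Linux guest and on the Proxmox host (the VM ID 100 is just an example):

Code:
# inside a Linux guest: trim all mounted filesystems that support discard
fstrim -av
# on the Proxmox host: check that the discard flag is actually set on the virtual disks
qm config 100 | grep -i discard
# and see whether the pool's Data% dropped afterwards
lvs pve/data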
 
That's how thin provisioning works. You can have a 1TB physical disk and create two VMs with 5TB virtual disks each on it. [...]

Depends on your guest OS. For Linux you could run fstrim -a; for Windows you could run Optimize-Volume -DriveLetter YourDriveLetter -ReTrim -Verbose. [...]

Hi,
this is on one server:

[attached screenshot]

The other has 50GB free of 80GB.

So the guests themselves are not using more than what was provided.


About this:
The complete TRIM chain from the guest OS to the physical disk needs to work: "discard" flag set for all virtual disks, "VirtIO SCSI" used as the virtual disk controller and not IDE or "VirtIO Block", a physical disk controller that supports TRIM commands, SSD emulation enabled for the virtual disks, and so on.


The host is Proxmox, so Linux. The guests are Windows Server.
But I can only get into the shell via the instance directly under Datacenter.

It is VirtIO SCSI.

The disk is:
[attached screenshot]

Can I still enable SSD emulation and the Discard option?
Or would it not help?
 
it could happen that the pool runs out of space, which is bad.
I had the situation yesterday.
It ended up with a corrupt filesystem in the LXC and I had to reinstall the LXC.

I expected that in this case the containers simply could not write to the disk anymore, but not such a disaster.
Is it by design that the filesystem can get corrupted in such a situation?
 
Is it by design that the filesystem can get corrupted in such a situation?
Yes, that's normal with thin provisioning. That's why you should set up proper monitoring with notifications and extend your storage early, so you don't run into this situation in the first place.
The filesystem in the guest OS can't know that space is running out and only realizes it once writes start to fail, which is similar to a disk that is suddenly failing.
 
I realize this is an old thread, but I'm stuck in this situation, too, and I'm not sure what to do. I tried running lvconvert --repair pve/data and got the warnings/errors in OP's initial post. I'm unable to access any of my VMs to check how much space they're using. I do have a couple VMs that I never use that I'd have no qualms about deleting. Would that maybe fix the issue? Or is that more like a band-aid solution?

As a side note, I'm also unable to access the web interface for some reason, even though yesterday it was just fine. I have to switch my keyboard into Bluetooth mode and access the console on the mobile app on my phone, which is a royal pain. If anyone's got a clue what that's about, I'm all ears.
 
I'm also unable to access the web interface for some reason, even though yesterday it was just fine.
Maybe your root filesystem is full as well? This could cause the webUI to fail because of a read-only filesystem. You can check that via the CLI with "df -h" and look for the line with the single "/" to see if it is at 100% utilization.
 
Maybe your root filesystem is full as well? This could cause the webUI to fail because of a read-only filesystem. You can check that via the CLI with "df -h" and look for the line with the single "/" to see if it is at 100% utilization.
I took a look last night and I don't recall exactly what it said, but df -h didn't report that the root filesystem was full. However, I did free up some space by moving some VMs to my JBOD via TrueNAS, which I didn't know you could do, and that did solve the web interface accessibility issue. As for my other issues, they seemed to fix themselves...?
 
