disk full?

fresh

Hello,

I'm getting random errors in different LXCs.
I think it's because I ran out of storage.

sda = 238GB SSD
sdb = 4TB HDD (nearly full, but that can be ignored, it's just media)

How can I prove this? Where exactly can I see whether sda is full?
And how can I reassign the data? Because in general, I don't think I'm out of space. If I am, I would like to see the files that are supposedly that large.
Code:
root@pve:~# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                            8:0    0 238.5G  0 disk
├─sda1                         8:1    0  1007K  0 part
├─sda2                         8:2    0   512M  0 part /boot/efi
└─sda3                         8:3    0   238G  0 part
  ├─pve-swap                 253:0    0     8G  0 lvm  [SWAP]
  ├─pve-root                 253:1    0  59.3G  0 lvm  /
  ├─pve-data_tmeta           253:2    0   1.6G  0 lvm
  │ └─pve-data-tpool         253:4    0 151.6G  0 lvm
  │   ├─pve-data             253:5    0 151.6G  1 lvm
  │   ├─pve-vm--102--disk--0 253:6    0    32G  0 lvm
  │   ├─pve-vm--104--disk--0 253:7    0     8G  0 lvm
  │   ├─pve-vm--105--disk--0 253:8    0     8G  0 lvm
  │   ├─pve-vm--103--disk--0 253:9    0     8G  0 lvm
  │   ├─pve-vm--101--disk--0 253:10   0   100G  0 lvm
  │   ├─pve-vm--199--disk--0 253:11   0     8G  0 lvm
  │   ├─pve-vm--106--disk--0 253:12   0    32G  0 lvm
  │   └─pve-vm--107--disk--0 253:13   0     4G  0 lvm
  └─pve-data_tdata           253:3    0 151.6G  0 lvm
    └─pve-data-tpool         253:4    0 151.6G  0 lvm
      ├─pve-data             253:5    0 151.6G  1 lvm
      ├─pve-vm--102--disk--0 253:6    0    32G  0 lvm
      ├─pve-vm--104--disk--0 253:7    0     8G  0 lvm
      ├─pve-vm--105--disk--0 253:8    0     8G  0 lvm
      ├─pve-vm--103--disk--0 253:9    0     8G  0 lvm
      ├─pve-vm--101--disk--0 253:10   0   100G  0 lvm
      ├─pve-vm--199--disk--0 253:11   0     8G  0 lvm
      ├─pve-vm--106--disk--0 253:12   0    32G  0 lvm
      └─pve-vm--107--disk--0 253:13   0     4G  0 lvm
sdb                            8:16   0   3.6T  0 disk
├─sdb1                         8:17   0    16M  0 part
└─sdb2                         8:18   0   3.6T  0 part /mnt/pve/data

Code:
root@pve:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                  7.5G  8.0K  7.5G   1% /dev
tmpfs                 1.5G  1.2M  1.5G   1% /run
/dev/mapper/pve-root   59G   21G   35G  37% /
tmpfs                 7.5G   46M  7.5G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
/dev/sda2             511M  328K  511M   1% /boot/efi
/dev/sdb2             3.7T  3.6T   57G  99% /mnt/pve/data
/dev/fuse             128M   20K  128M   1% /etc/pve
tmpfs                 1.5G     0  1.5G   0% /run/user/0

Thanks in advance!
fresh
 
I don't know.
Can you tell me if my thin pool is full?
Where does the 151 GB come from?
238.5GB (total) - 59.25GB (Proxmox root) - 8.00GB (swap) = 171.25GB (for data?)
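One way to check this, rather than doing the math by hand, might be to ask LVM itself; a minimal sketch using the VG name from the lsblk output above:
Code:
vgs pve      # shows the total size of the volume group and how much is still unallocated
lvs -a pve   # -a also lists hidden LVs such as data_tmeta/data_tdata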

Code:
root@pve:~# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotzD- <151.63g             100.00 4.36
  root          pve -wi-ao----   59.25g
  swap          pve -wi-ao----    8.00g
  vm-101-disk-0 pve Vwi-aotz--  100.00g data        91.95
  vm-102-disk-0 pve Vwi-aotz--   32.00g data        62.44
  vm-103-disk-0 pve Vwi-aotz--    8.00g data        65.22
  vm-104-disk-0 pve Vwi-a-tz--    8.00g data        56.86
  vm-105-disk-0 pve Vwi-aotz--    8.00g data        99.55
  vm-106-disk-0 pve Vwi-aotz--   32.00g data        54.20
  vm-107-disk-0 pve Vwi-aotz--    4.00g data        57.80
  vm-199-disk-0 pve Vwi-a-tz--    8.00g data        28.88

I checked LXC 105, for example.
lvs says 99.55% full.

Proxmox says 42% full:
[screenshot: Proxmox GUI summary for container 105 showing ~42% bootdisk usage]

I can't make sense of this storage information :-D
 
Please put console output in CODE tags, otherwise these tables aren't readable. But if I interpret that "table" right, the Data% of your "data" thin pool is 100, so it is full. That means you probably also lost some data, as this should NEVER happen. You should set up proper monitoring with notifications for the future so this won't happen again.
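To check the pool state directly on the host, something like this should do it (a minimal sketch; the pool in your output is pve/data):
Code:
lvs -o lv_name,lv_size,data_percent,metadata_percent pve/data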
 
I changed to code. Thanks.

How can I see which files are using so much storage?
ncdu on root "/" shows me nothing big.

Why did I "lose data"? Isn't the data just not written because the disk is full?
 
Any further ideas?
The SSD can't be completely full.
I just downloaded a 10GB test file to the 101 LXC and a 1GB test file to the 105 LXC.
It works. I would say this is proof that neither data nor vm-105-disk-0 is full and that the lvs Data% column is printing wrong values.

I'm a Windows guy, and Linux/LXC is driving me crazy with this problem. Why is it so hard to analyze the overall disk usage, including the biggest files...

PS: according to the following Reddit post, 100% should be normal...
https://www.reddit.com/r/Proxmox/co...?utm_source=reddit&utm_medium=web2x&context=3

[screenshot of the linked Reddit post]
 
I think you need to consider you're comparing apples to oranges there. The first image, of the bootdisk, shows the bootdisk... not the LVM pool.
That bootdisk is 7.78GB in size; 3.29GB is used, which is approximately 42.37% of it.
Your local-lvm is a different partition or drive; it is 162.81GB in size, of which 100% is occupied.

I note from the LVM allocations that you have assigned more space to your VMs than physically exists in your system... that's a most "exciting" approach. First, though, the output is telling you the truth: 100% has been allocated to data. Now, at a guess, this happened because no trim is being performed, so discard is probably not enabled on the VM disks. This means you can delete a file in one of the VMs without the space being freed in local-lvm, which also explains why a file can still be copied in and out.

However, that's playing a game of chicken, and it will go wrong randomly. I suggest backing up what you have left and reassessing your allocations. Ensure the discard checkmark is enabled for the disks allocated to each VM, check whether they perform trim, etc.

As there is no info on the OSes running, I can't recommend what you should do there.
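For the VM side, enabling discard is a per-disk option; a sketch of what that could look like on the CLI, where the VMID, bus slot and volume name are only examples (check qm config <vmid> for your actual values):
Code:
# enable discard on a hypothetical VM 100 whose disk is scsi0 on local-lvm
qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=on
# afterwards, inside the Linux guest, trim the mounted filesystems
fstrim -av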
 
I think I understood (mostly :p). Thanks for this information!

I thought the LXCs just take as much as they need. It seems my intention was too dynamic for Linux. I did this because I don't know in advance exactly what size every LXC needs, and furthermore I added more LXCs than planned and I can't shrink existing LXC volumes.

Is there no way to make this a bit more dynamic? Do I really have to allocate the storage exactly when creating the LXC? And if I need storage for a new LXC, do I have to delete an existing LXC and recreate it with a smaller volume?
 
You could also allocate 100TB to each LXC; that is basically no problem. But nothing will prevent the LXCs from trying to use the full 100TB. Once the pool is full, the LXCs will crash and you lose data, as data in the volatile RAM write cache is lost on the next reboot because there is no space left to flush the cache.

It's your job to monitor the pool usage and delete data or replace disks so it NEVER hits 100%. If you don't want to monitor your pool 24/7 or set up proper monitoring with messenger/mail notifications, you shouldn't overprovision thin-provisioned storage.

And as already said, you need to set up trimming/discard for every VM and LXC when using thin provisioning, so the pool can free up space when data is deleted in a guest.

"Install and forget" is not a valid way to manage a server.
 
I think the trim/discard function was the piece I was missing in my understanding of how Linux handles this.
I've only known trim in the context of SSDs, but as I understand it, it releases storage allocated to an LXC back to the host system, and thus also to other LXCs -> exactly what I want.

Now I would like to enable this option for all my LXCs.
Unfortunately, I don't know how.
I can't find a "discard" checkmark in the mount settings:
[screenshot: mount point settings of an LXC in the Proxmox GUI, with no discard checkbox]
Shouldn't that actually be there?
Is it even possible to do this retroactively?

Or do I just have to run "fstrim -a" via cron? If so, where? On the host? In every LXC?

A few more details about my setup:
1x 240GB SSD (Proxmox itself and all LXC thin volumes are stored on this SSD)
1x 4TB HDD (just media, only mounted on the host and passed through to one LXC)
 
Shouldn't that actually be there?
For VMs yes, but not for LXCs. You trim an LXC by running pct fstrim YourVMID on the host. This can be automated via crontab, but make sure it doesn't run while a backup job might be running, as pct fstrim locks the LXC and the backup would fail.
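As a sketch, a crontab entry on the host could look like this; the container ID and schedule are only examples, so pick a time that can't overlap with your backup jobs:
Code:
# root's crontab on the PVE host (crontab -e)
# trim container 105 every Sunday at 03:30
30 3 * * 0 pct fstrim 105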
 
Code:
root@pve:~# pct fstrim 102
/var/lib/lxc/102/rootfs/: 23 GiB (24665104384 bytes) trimmed
/var/lib/lxc/102/rootfs/mnt/syncthing: 150 MiB (157319168 bytes) trimmed

Code:
root@pve:~# pct fstrim 103
/var/lib/lxc/103/rootfs/: 5.5 GiB (5891125248 bytes) trimmed
/var/lib/lxc/103/rootfs/mnt/syncthing: 37.8 GiB (40628310016 bytes) trimmed

...

final lvs:
Code:
root@pve:~# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <151.63g             45.04  2.82
  root          pve -wi-ao----   59.25g
  swap          pve -wi-ao----    8.00g
  vm-101-disk-0 pve Vwi-aotz--  100.00g data        36.10
  vm-102-disk-0 pve Vwi-aotz--   32.00g data        28.39
  vm-103-disk-0 pve Vwi-aotz--    8.00g data        31.86
  vm-104-disk-0 pve Vwi-a-tz--    8.00g data        26.18
  vm-105-disk-0 pve Vwi-aotz--    8.00g data        56.12
  vm-106-disk-0 pve Vwi-aotz--   32.00g data        30.23
  vm-107-disk-0 pve Vwi-aotz--    4.00g data        49.96
  vm-199-disk-0 pve Vwi-a-tz--    8.00g data        28.88

Looks fantastic! I hope this will fix my storage issues!
Of course I will keep an eye on my storage usage in general :)

Thanks to all!

PS: in case somebody needs it: pct list | awk 'NR>1 {print $1}' | while read ct; do pct fstrim $ct; done trims every LXC (NR>1 skips the header line of pct list).
 
Thank you for this! This thread saved ~10TB of storage, lol. I didn't pay attention to my storage, and my server got locked up because I had never run sudo fstrim -a or these commands on my LXCs. Freed up ~4TB of space. Cheers
 
