Ceph thin provisioning for LXCs not working as expected?

lifeboy

Active Member
I have an LXC that is provisioned with a 100GB boot drive on Ceph RBD storage. However, see the following:

Code:
~# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd10       98G  8.8G   85G  10% /

This is in the running container.

Checking the disk usage in Ceph, however, shows that the whole volume is basically used.

Code:
FT1-NodeA:~# rbd du speedy/vm-192-disk-2
NAME           PROVISIONED  USED 
vm-192-disk-2      100 GiB  97 GiB

Why is this? There doesn't seem to be a tool with which to reclaim the unused blocks for an LXC...?
 

shanreich

Proxmox Staff Member
Is the container issuing TRIM commands? Can you try running pct fstrim <CTID> and see if that helps?
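For example, run on the node the container is currently on (192 taken from the disk name you posted):

Code:
pct fstrim 192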
 

shanreich

Proxmox Staff Member
You could try running fstrim -a from within the container - does that work? I'm not sure whether that would change anything though. Could you post your container config?
 

lifeboy

Active Member
You could try running fstrim -a from within the container - does that work? I'm not sure whether that would change anything though. Could you post your container config?
Of course that gives the same result. For some reason the container believes that the storage doesn't support trimming, i.e. that it isn't thin provisioned, even though some other volumes on the same Ceph storage pool trim just fine (a quick host-side check of the mapped device's discard support is sketched after the config below).

Could there be something set in the container config that prevents this?

Code:
~# cat /etc/pve/lxc/192.conf
arch: amd64
cores: 4
features: nesting=1
hostname: productive
memory: 8192
mp0: speedy:vm-192-disk-3,mp=/home/user-data/owncloud,backup=1,size=1900G
nameserver: 8.8.8.8
net0: name=eth0,bridge=VLAN11,firewall=1,gw=192.168.142.254,hwaddr=86:6B:32:CE:F0:D3,ip=192.168.142.101/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: speedy:vm-192-disk-2,size=100G
searchdomain: co.za
swap: 0
unprivileged: 1
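Here's the host-side check mentioned above - if the mapped RBD device advertises discard support, these show non-zero values (device name taken from the df output earlier):

Code:
# run on the Proxmox node, not inside the container
lsblk --discard /dev/rbd10
cat /sys/block/rbd10/queue/discard_granularity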
 

shanreich

Proxmox Staff Member
Of course that gives the same result.
Yeah, I figured as much.

I think the issue here is that either your container is unprivileged, or that you have a mountpoint (or both).
 

lifeboy

Active Member
Yeah, I figured as much.

I think the issue here is that either your container is unprivileged, or that you have a mountpoint (or both).
Does it mean that if you have a mountpoint (over and above the boot drive), thin-provisioning doesn't work?

Code:
~# cat /etc/pve/lxc/192.conf
arch: amd64
cores: 4
features: nesting=1
hostname: productive
memory: 8192
nameserver: 8.8.8.8
net0: name=eth0,bridge=VLAN11,firewall=1,gw=192.168.142.254,hwaddr=86:6B:32:CE:F0:D3,ip=192.168.142.101/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: speedy:vm-192-disk-2,size=100G
mp0: speedy:vm-192-disk-3,mp=/home/user-data/owncloud,backup=1,size=1900G
searchdomain: co.za
swap: 0
unprivileged: 1

In the above, the mp0 shows as thin-provisioned, but the rootfs does not. Can I do this differently to make both properly thin provisioned?
 

shanreich

Proxmox Staff Member
Does it mean that if you have a mountpoint (over and above the boot drive), thin-provisioning doesn't work?

Code:
~# cat /etc/pve/lxc/192.conf
arch: amd64
cores: 4
features: nesting=1
hostname: productive
memory: 8192
nameserver: 8.8.8.8
net0: name=eth0,bridge=VLAN11,firewall=1,gw=192.168.142.254,hwaddr=86:6B:32:CE:F0:D3,ip=192.168.142.101/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: speedy:vm-192-disk-2,size=100G
mp0: speedy:vm-192-disk-3,mp=/home/user-data/owncloud,backup=1,size=1900G
searchdomain: co.za
swap: 0
unprivileged: 1

In the above, the mp0 shows as thin-provisioned, but the rootfs does not. Can I do this differently to make both properly thin provisioned?
I tried it locally just now: I could not get the container to issue TRIM commands as long as it was unprivileged; as soon as I changed it to a privileged container, it worked. I think it is the kernel that doesn't let an unprivileged container issue TRIM commands to block devices (which Ceph RBD is).
 

lifeboy

Active Member
I tried it locally just now: I could not get the container to issue TRIM commands as long as it was unprivileged; as soon as I changed it to a privileged container, it worked. I think it is the kernel that doesn't let an unprivileged container issue TRIM commands to block devices (which Ceph RBD is).

I don't think it's a good idea to run privileged containers for clients, is it? If a UID in the container matches one of the host's UIDs that has rights to locations a client should not have access to, it could create a big problem...
 

shanreich

Proxmox Staff Member
I don't think it's a good idea to run privileged containers for clients, is it? If a UID in the container matches one of the host's UIDs that has rights to locations a client should not have access to, it could create a big problem...
Yes, on second thought it should also be run from the host. I just thought they were somehow related, but they shouldn't be - a little bit of confusion on my part, sorry.

fstrim: /: FITRIM ioctl failed: Operation not permitted
This output seems a bit weird to me. Is this the whole output of pct fstrim <CTID> running on the host? Because it should try to trim the directory /var/lib/lxc/<CTID>/rootfs/ and not /.

edit: Could you also post your storage.cfg ?
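For reference, an RBD entry in /etc/pve/storage.cfg usually looks something like the following (the storage/pool name is taken from your container config; the krbd setting and monitor addresses here are just placeholders):

Code:
rbd: speedy
    content images,rootdir
    krbd 1
    monhost 10.0.0.1 10.0.0.2 10.0.0.3
    pool speedy
    username admin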
 

lifeboy

Active Member
Yes, on second thought it should also be run from the host. I just thought they were somehow related, but they shouldn't be - a little bit of confusion on my part, sorry.


This output seems a bit weird to me. Is this the whole output of pct fstrim <CTID> running on the host? Because it should try to trim the directory /var/lib/lxc/<CTID>/rootfs/ and not /.

Of course, the command has to run on the node on which the container is running...!

Code:
~# pct fstrim 192
/var/lib/lxc/192/rootfs/: 88.9 GiB (95446147072 bytes) trimmed
/var/lib/lxc/192/rootfs/home/user-data/owncloud: 1.6 TiB (1795599138816 bytes) trimmed

However, when I ask rbd for the stats, I get:

Code:
~# rbd du speedy/vm-192-disk-2
NAME           PROVISIONED  USED 
vm-192-disk-2      100 GiB  75 GiB

In the container, however, I still see:

Code:
/dev/rbd10       98G  8.9G   84G  10% /

Does it take time to reflect correctly?

With QEMU guests the trimming happens periodically (e.g. in Windows it's once a week). I suppose if one wants that to run regularly for containers, it has to be scripted, right? How would that work with HA then, since a container could be running on a different node at some point in the future?
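Something like this rough sketch is what I have in mind - run from cron on every node; since pct list only shows containers on the local node, an HA-managed container would always get trimmed by whichever node it is currently running on:

Code:
#!/bin/sh
# trim every container currently running on this node
pct list | awk 'NR>1 && $2=="running" {print $1}' | while read ctid; do
    pct fstrim "$ctid"
done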

edit: Could you also post your storage.cfg ?

I guess it doesn't matter anymore now.
 

shanreich

Proxmox Staff Member
Ceph RBD has some peculiarities with regards to trimming, especially if the sector size of your disk isn't aligned with the RBD object size.

Could you try using the --exact flag with rbd du? It should give a more accurate number for the used space:
Code:
rbd du --exact speedy/vm-192-disk-2
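If you want to see how large the objects of that image are (relevant for the alignment mentioned above), rbd info shows it:

Code:
rbd info speedy/vm-192-disk-2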
 
