Reclaiming free space in VMs and Containers with Ceph RBD images

Gert

I used to be able to reclaim the free space of VMs when I was still using ZFS, but with RBD images on Ceph it does not seem to work properly; I manage to reclaim some of the free space, but not all. I have the same issue with both VMs and containers. I have discard enabled and follow these steps:

First I fill the container's free space with zeros using the following command:

dd if=/dev/zero of=/tmp/bigfile bs=1M; rm -f /tmp/bigfile
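
A quick way to confirm that discard is actually passed through on the mapped device, assuming the image is mapped as /dev/rbd0 on the Proxmox host as the df output below suggests, is to check its discard parameters:

lsblk --discard /dev/rbd0
cat /sys/block/rbd0/queue/discard_granularity

Non-zero DISC-GRAN/DISC-MAX and granularity values mean the device accepts discard/TRIM requests.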

Then I do:

fstrim -v /

and the output of that shows:

/: 66.7 GiB (71601565696 bytes) trimmed
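
As a side note, if the installed pve-container version already ships the fstrim subcommand, the same trim can also be triggered from the Proxmox host (container ID 105 assumed here):

pct fstrim 105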

and then "df -h" shows:

Filesystem Size Used Avail Use% Mounted on
/dev/rbd0 196G 99G 89G 53% /

But on the Proxmox CLI, "rbd du --pool ceph-ssd | grep 105" shows:

vm-105-disk-0 200 GiB 139 GiB


Why do I have 139 GiB used when only 99 GiB is used in the container? Am I doing something wrong? It works with images stored on ZFS.
 
What version are you on (pveversion -v)? And 99 + 67 != 139, either. The release of the data on Ceph might just be delayed. Did you check after some time whether the size shrank further?
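One way to keep an eye on how much of the image Ceph still considers allocated is to sum the extents reported by rbd diff; a rough sketch, reusing the pool and image names from your post:

rbd diff ceph-ssd/vm-105-disk-0 | awk '{ sum += $2 } END { printf "%.1f GiB allocated\n", sum/1024/1024/1024 }'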
 
You mean 99 + 67 != 200? After filling the image with zeros, the image size reported by RBD was maxed out at 200 GB; then, while fstrim was running, it came down in real time to 139. It is now days later and it is still at 139.

"pveversion -v"

root@node1:~# pveversion -v
proxmox-ve: 6.0-2 (running kernel: 5.0.21-5-pve)
pve-manager: 6.0-11 (running version: 6.0-11/2140ef37)
pve-kernel-helper: 6.0-12
pve-kernel-5.0: 6.0-11
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-5.0.21-2-pve: 5.0.21-7
pve-kernel-5.0.21-1-pve: 5.0.21-2
pve-kernel-5.0.18-1-pve: 5.0.18-3
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.4-pve1
ceph-fuse: 14.2.4-pve1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-3
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-7
libpve-guest-common-perl: 3.0-2
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-10
pve-docs: 6.0-8
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-4
pve-ha-manager: 3.0-3
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.1-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-13
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2
 
You mean 99 + 67 != 200? After filling the image with zeros, the image size reported by RBD was maxed out at 200 GB; then, while fstrim was running, it came down in real time to 139. It is now days later and it is still at 139.
Either way, 67 + 139 = 206, so Ceph seems to have discarded most of the data already. Ceph may not be able to remove all of the objects, as they may still contain data blocks in use by the image.
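The granularity involved can be seen with rbd info; assuming the same pool and image names:

rbd info ceph-ssd/vm-105-disk-0 | grep -E 'order|object'

The order line shows the object size (order 22 = 4 MiB by default); roughly speaking, an object can only be removed entirely when nothing in its range is still in use, so partially used objects keep their space allocated.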
 
So it should help if I defrag the file system first?
Ext4 tries not to fragment data allocation, so a defrag will not have the expected effect. Only an export/import will probably get rid of the remaining 40 GB, as all objects will be deleted and recreated.
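
A minimal sketch of what such a rewrite could look like on the CLI, with the container stopped and the copy still to be swapped in for the original afterwards (a vzdump backup and restore of the container would achieve the same full rewrite):

rbd export ceph-ssd/vm-105-disk-0 - | rbd import - ceph-ssd/vm-105-disk-0-copy

Since the trimmed regions read back as zeros, the import should skip them and only re-allocate the blocks that actually hold data.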
 
