Problem with trim/discard on Ceph storage

GabrieleV

Renowned Member
May 20, 2013
Hello,
I have a problem with various VMs that don't seem to release unused space.

Here is the VM Config:

agent: 1,fstrim_cloned_disks=1
boot: order=ide2;scsi0
cores: 2
ide2: none,media=cdrom
memory: 4096
meta: creation-qemu=7.2.0,ctime=1711617679
name: CRODC05
net0: virtio=22:9F:25:F3:22:96,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
scsi0: CRO-CEPH-SSD:vm-105-disk-0,aio=threads,cache=writeback,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=fe50e879-9b53-4bb0-aa61-78c583d98000
sockets: 1
vmgenid: 1f92ae2c-0125-4cf8-a3c4-5f72e423a433
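(For what it's worth, discard=on and ssd=1 only help if the guest actually sees a disk that advertises discard; a quick way to confirm that from inside the VM - the device name sda is simply what my guest uses - is:)

Code:
# non-zero DISC-GRAN / DISC-MAX means the virtual disk advertises discard
lsblk --discard /dev/sda

# for virtio-scsi disks this should report "unmap" rather than "full"/"disabled"
cat /sys/class/scsi_disk/*/provisioning_mode
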
Here is the fstab:
# cat /etc/fstab | grep ext4
/dev/mapper/sys-root / ext4 discard,errors=remount-ro 0 1
UUID=5a77d9ea-9fc8-4ee2-9391-3429084d0f3d /boot ext4 discard,defaults 0 2
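(The discard mount option does continuous discard on delete; many distros rely on the periodic fstrim.timer instead, and either should be enough here. Just to note how I checked that the timer is there as well:)

Code:
# periodic trim job shipped by util-linux/systemd on most distros
systemctl status fstrim.timer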

Here is the LVM config:
# pvs
PV         VG  Fmt  Attr PSize   PFree
/dev/sda2  sys lvm2 a--  <31.07g <16.17g

# vgs
VG  #PV #LV #SN Attr   VSize   VFree
sys   1   1   0 wz--n- <31.07g <16.17g

# lvs
LV   VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
root sys -wi-ao---- <14.90g

# cat /etc/lvm/lvm.conf | grep issue_discards
# Configuration option devices/issue_discards.
issue_discards = 1
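(If I read the lvm.conf documentation correctly, issue_discards only matters when LVM itself frees PV space, e.g. on lvremove/lvreduce; fstrim on the mounted filesystem passes through device-mapper regardless. To check that the LV really passes discards down, I looked at its queue parameters - non-zero values expected:)

Code:
# resolve the dm device behind the LV and show its discard limits
DM=$(basename "$(readlink -f /dev/mapper/sys-root)")
cat /sys/block/$DM/queue/discard_granularity /sys/block/$DM/queue/discard_max_bytes
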


Here is the Filesystem free disk space:
# df -h -t ext4
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/sys-root   15G  3.2G   11G  23% /
/dev/sda1             920M  108M  749M  13% /boot

I've also issued fstrim manually after powering the VM off and back on:

# fstrim -av
/boot: 812.6 MiB (852033536 bytes) trimmed on /dev/sda1
/: 11.4 GiB (12247777280 bytes) trimmed on /dev/mapper/sys-root

# fstrim -av
/boot: 0 B (0 bytes) trimmed on /dev/sda1
/: 0 B (0 bytes) trimmed on /dev/mapper/sys-root
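(Since the guest agent is enabled, the same trim can also be triggered from the PVE host, which at least rules out an in-guest problem; I then re-check the image right away - commands as I understand them from the qm/rbd man pages:)

Code:
# run fstrim inside the guest via the QEMU guest agent
qm agent 105 fstrim

# and re-check the usage reported by Ceph
rbd du -p rbd_ssd vm-105-disk-0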

But when I check the disk usage on Ceph, rbd du says that the VM disk occupies the full 32 GiB:
# rbd du -p rbd_ssd vm-105-disk-0
NAME PROVISIONED USED
vm-105-disk-0 32 GiB 32 GiB
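A cross-check I still want to try is summing the allocated extents reported by rbd diff (a commonly used one-liner, so take the exact formatting with a grain of salt):

Code:
# total of all allocated extents in the image, in GiB
rbd diff rbd_ssd/vm-105-disk-0 | awk '{sum += $2} END {printf "%.1f GiB\n", sum/1024/1024/1024}'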

What am I missing?
 
What am I missing?

I really have no idea, sorry. So this is just another random example, showing that "thin" behavior is the default on my installation; on my replication_rule pool I can see:

Code:
~# rbd du  -p ceph1   vm-2223-disk-0
NAME                                PROVISIONED  USED   
vm-2223-disk-0@auto-d-241104080715       32 GiB   23 GiB
vm-2223-disk-0@auto-d-241105080717       32 GiB  3.3 GiB
vm-2223-disk-0@auto-h-241105161000       32 GiB  1.8 GiB
vm-2223-disk-0@auto-h-241105170946       32 GiB  516 MiB
vm-2223-disk-0@auto-h-241105180953       32 GiB  544 MiB
vm-2223-disk-0                           32 GiB  392 MiB
<TOTAL>                                  32 GiB   30 GiB
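(To be clear about these numbers: the head image itself only uses 392 MiB; most of the 30 GiB in the TOTAL line is old data pinned by the snapshots, which would only be freed by removing them, e.g.:)

Code:
# removes ALL snapshots of the image and releases the space they still reference
rbd snap purge ceph1/vm-2223-disk-0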
 
The other strange thing is that thin provisioning seems to be working on my HDD pool:
Code:
# rbd du -p rbd
NAME                      PROVISIONED  USED   
base-501-disk-1@__base__       80 GiB  1.6 GiB
base-501-disk-1                80 GiB      0 B
vm-104-disk-1                  80 GiB   77 GiB
vm-106-disk-0                  32 GiB   15 GiB
vm-106-disk-1                  80 GiB   80 GiB
vm-110-disk-0                  32 GiB   30 GiB
vm-111-disk-0                  32 GiB   32 GiB
vm-111-disk-1                  80 GiB  5.6 GiB
vm-201-disk-1                   8 GiB  7.8 GiB
vm-205-disk-1                  80 GiB   35 GiB
vm-206-disk-2                 950 GiB  948 GiB
vm-301-disk-0                 512 GiB  508 GiB
vm-301-disk-1                  80 GiB   66 GiB
vm-301-disk-2                 512 GiB  508 GiB
vm-301-disk-3                 512 GiB  508 GiB
vm-301-disk-4                 512 GiB  508 GiB
vm-302-disk-0                  32 GiB  4.2 GiB
vm-302-disk-1                   1 TiB  178 GiB
vm-303-disk-1                  80 GiB   27 GiB
vm-304-disk-1                 320 GiB  320 GiB
vm-607-disk-0                  32 GiB   22 GiB
<TOTAL>                       5.0 TiB  3.8 TiB

But not on the SSD pool:
Code:
# rbd du -p rbd_ssd
NAME           PROVISIONED  USED   
vm-102-disk-0       32 GiB   32 GiB
vm-105-disk-0       32 GiB   32 GiB
vm-107-disk-0       64 GiB   64 GiB
vm-108-disk-0       74 GiB   74 GiB
vm-108-disk-1       10 GiB  9.8 GiB
vm-206-disk-0       80 GiB   80 GiB
vm-304-disk-0       32 GiB   32 GiB
vm-601-disk-0       65 GiB   65 GiB
vm-602-disk-0       65 GiB   65 GiB
vm-603-disk-0       65 GiB   65 GiB
vm-604-disk-0       65 GiB   65 GiB
vm-605-disk-0       65 GiB   65 GiB
vm-606-disk-0       65 GiB   65 GiB
<TOTAL>            714 GiB  713 GiB
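One more thing I can compare is the image metadata between the two pools (features like object-map/fast-diff, object size, striping), in case something differs there:

Code:
# compare image metadata between the HDD and SSD pool
rbd info rbd/vm-110-disk-0
rbd info rbd_ssd/vm-105-disk-0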
 
What am I missing?
Maybe fragmentation? I've observed a similar problem. AFAIK, Ceph uses an internal allocation unit of 4 MB, so if at least one 4k block inside it is written, the whole 4 MB block is allocated. I tested this on Windows and Linux, and after defragmenting the filesystem - not only defragmenting the files, but also moving them to the front after shrinking the filesystem - I could reclaim a lot more space. Not as much as I could reclaim e.g. on ZFS (I tested by live migration), but more than without it.
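To put rough numbers on that (assuming the default 4 MiB object size, which rbd info on your image should show as "order 22"):

Code:
# object size of the image (default order 22 = 4 MiB objects)
rbd info rbd_ssd/vm-105-disk-0 | grep -E 'objects|order'

# rough math: a 32 GiB image consists of 32 GiB / 4 MiB = 8192 objects.
# If every object has received at least one small write, all 8192 objects are
# allocated and "rbd du" reports the full 32 GiB, even though the filesystem
# inside the guest is mostly free.
# (For blocks that are zeroed but still allocated, "rbd sparsify <pool>/<image>"
# can reclaim space as well.)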

Yet I don't know how this could affect only your SSD pool and not your HDD pool. Have you tried migrating the VM between the pools (or even to other storage types)?
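If you want to try the migration, something along these lines should do it (the target storage name here is just a placeholder for your HDD-backed storage; the exact syntax is in "qm help move_disk"):

Code:
# move the disk to another storage and delete the source copy afterwards
qm move_disk 105 scsi0 <your-hdd-storage> --delete 1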
 
