Ceph rbd du shows usage 2-4x higher than inside VM

SteveITS

Renowned Member
Feb 6, 2025
I've noticed VMs that show much higher usage via rbd du than in the VM, for example:

Code:
NAME            PROVISIONED  USED
vm-119-disk-0       500 GiB  413 GiB
vm-122-disk-0       140 GiB  131 GiB

Inside the VMs, df shows 95G and 63G used, respectively. Both are Debian 12 with fstrim.timer running, and the disks have had ssd=1, discard=on since creation. I haven't looked into many other VMs, but from skimming the USED column, others may be similar.
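For reference, a disk set up that way would look roughly like this in the VM config (path and storage name here are illustrative, not from these VMs); discard only reaches Ceph when the disk sits on a controller that passes it through, e.g. virtio-scsi:

```
# /etc/pve/qemu-server/119.conf (illustrative)
scsihw: virtio-scsi-single
scsi0: ceph-pool:vm-119-disk-0,discard=on,ssd=1,size=500G
```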

I can run fstrim -av manually inside the VMs, and it does trim (18 and 25 GB), but the USED figure in rbd du doesn't immediately drop... in fact it increased by 1 GB on each image.
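One thing that might explain part of the gap: rbd stripes an image into objects (4 MiB by default), and as I understand it, rbd du without --exact uses the fast-diff feature, which accounts at object granularity — any object that has ever been written counts, and a discard only reclaims an object it covers entirely. So data scattered across the image by the filesystem can inflate USED well beyond what df reports. A minimal sketch of that accounting model (the 4 MiB object size is the rbd default; the write patterns are illustrative):

```python
OBJECT_SIZE = 4 * 2**20  # rbd default object size: 4 MiB

def rbd_used_estimate(extents):
    """Approximate object-granularity USED for a list of (offset, length)
    byte extents that have ever been written: every object touched by any
    extent counts at full object size."""
    touched = set()
    for offset, length in extents:
        first = offset // OBJECT_SIZE
        last = (offset + length - 1) // OBJECT_SIZE
        touched.update(range(first, last + 1))
    return len(touched) * OBJECT_SIZE

# 1 GiB written contiguously touches 256 objects -> 1 GiB USED
contiguous = [(0, 2**30)]
print(rbd_used_estimate(contiguous) / 2**30)  # 1.0

# 1 MiB written as 256 separate 4 KiB writes, one per object,
# touches 256 objects -> 1 GiB USED at object granularity
scattered = [(i * OBJECT_SIZE, 4096) for i in range(256)]
print(rbd_used_estimate(scattered) / 2**30)  # 1.0
```

In other words, a trim that frees 18 GB of filesystem blocks may release far fewer whole 4 MiB objects if the freed blocks are fragmented.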

It's not a problem for our Ceph capacity, but the difference seems unexpectedly large, and I'm not finding much online other than discussions of running a trim. I'd expect some gap in usage until a trim ran, but not this much.

Also, the totals from rbd don't seem to add up to an expected number: the total of the USED column from rbd du is about 70% of the PVE Ceph dashboard's "Usage" and about 200% of the PVE storage entry's usage line (which shows numbers 1/3 of the dashboard/summary page, due to the 3/2 replication). CephFS exists but has almost nothing in it.
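Worth noting that those two ratios may actually be the same discrepancy seen twice: if the dashboard shows raw usage (after 3x replication) and the storage entry shows roughly the pre-replication figure, then 70% of the dashboard is automatically about 210% of the storage entry. Quick arithmetic check (the raw figure here is a made-up placeholder, only the ratios come from the post):

```python
# Hypothetical raw usage from the Ceph dashboard, in GiB (placeholder).
dashboard_raw = 900.0
replication = 3  # size=3 pool (the 3/2 replication mentioned above)

# The PVE storage entry shows roughly the pre-replication view:
storage_entry = dashboard_raw / replication  # 300 GiB

# Observed: the rbd du USED total is ~70% of the dashboard number...
rbd_used_total = 0.70 * dashboard_raw  # 630 GiB

# ...which is ~210% of the storage entry — consistent with the
# "about 200%" observation, so both ratios describe one mismatch.
print(rbd_used_total / storage_entry)  # 2.1
```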

What am I missing?

Thanks,
All the ones I've looked at have the standard weekly fstrim timer (or Windows disk optimize), and I can run it manually. But as noted above, trimming 18 GB still leaves a wide gulf between 95 GB used "on disk" and ~400 GB USED.

I've found a few other threads around the Internet asking a similar question. Best I've got so far (without trying resize2fs yet) is that "it can take some time" for USED to update. :-/

Given it doesn't match either usage number in PVE, I'm starting to wonder if it's just wrong, or maybe not the same "usage" it sounds like.