[SOLVED] Ceph storage full although storage is still available

Okay, so unless data has been moved away from the Ceph cluster, things did improve a little bit. osd 6 is now "only" 80% full.

Still not great though. Has this cluster been running for a while and gone through a few updates since then?
I don't like the output of the balancer status. Can you please run ceph osd get-require-min-compat-client? It is possible that this is not set as it should be, preventing the balancer from actually doing its thing.
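For reference, that command is read-only and safe to run; a quick sketch of what it looks like, with the example output here only mirroring what the attached screenshot later showed:

Code:
# show the minimum client release the cluster currently requires
ceph osd get-require-min-compat-client
# example output in this case:
# jewel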
 
The cluster has been running for a while and the nodes are on different versions. Could this also be a reason?
 

Attachments

  • ceph osd get-require-min-compat-client.jpg
Ah okay.
nodes are on different versions.
I hope they only differ in minor versions, though you should keep the cluster in sync when installing updates.
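If it helps, a quick and safe way to see how far apart the nodes are is the standard ceph versions command; ideally every daemon type reports the same release:

Code:
# list which Ceph release each daemon type (mon, mgr, osd, ...) is running
ceph versions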

The min-compat-client setting is still on "jewel" and the balancer needs "luminous" to work.

Run
Code:
ceph osd set-require-min-compat-client luminous
to set the requirements for the clients to at least luminous. I do hope that all clients connecting are at least on that version or newer. Then the balancer should start working and actively mapping PGs to different OSDs.
It will do so slowly, but after a while you should see the usage of the OSDs align more evenly, and since data will be moved away from the full OSDs, the estimated free space for the pool should increase.
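As a rough sketch (standard Ceph CLI, adjust to your setup), you can verify the setting and the balancer afterwards like this:

Code:
# confirm the new requirement is active
ceph osd get-require-min-compat-client
# check whether the balancer is on and which mode it uses
ceph balancer status
# if it is not already using upmap, switch the mode and enable it
ceph balancer mode upmap
ceph balancer on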

It is possible that setting the min-compat-client was missed in one of the major version upgrades.

For completeness' sake, this is the relevant part of the Ceph docs: https://docs.ceph.com/en/pacific/rados/operations/upmap/
 
We now have more, but only 12 TB. The cluster still shows 50 TB of free space?

BR,
KC IT-Team
 

Attachments

  • ceph status new.jpg
Where exactly do you see that? You always have to differentiate between RAW storage capacity and storage capacity of the pool. What you see in the Ceph panels is usually raw storage capacity.
Since the pool has to store 3 replicas with the current size parameter, the number for the pool is lower.

50 TiB / 3 = 16.666 TiB. The current free space according to the screenshot is 14.8 TiB.
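If it helps, ceph df shows both views side by side; the per-pool MAX AVAIL column already has the replication factored in (column names can differ slightly between releases):

Code:
# RAW STORAGE: total / used / available across all OSDs
# POOLS: STORED vs USED, plus MAX AVAIL per pool (already divided by the pool's size)
ceph df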

What you should also keep in mind: how much free space do you need in the remaining nodes (spread over at least 3) if one of the large nodes fails? Is there still enough space available so that the OSDs don't get close to 90% full?

Ceph can tolerate a lot, but running out of space is one of the really painful experiences. Having a cluster with the storage distributed unequally makes it quite a bit harder to predict how the loss of a node will affect the remaining cluster.
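One way to estimate that headroom (just standard commands, nothing specific to this cluster):

Code:
# per-OSD and per-host utilization, including the %USE column; useful to judge
# whether the remaining hosts could absorb a failed node's data without any OSD
# getting close to the near-full/full thresholds
ceph osd df tree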
 
Yes, we know that we must have some space left if a node fails.
If you take a look at the attached screenshot, you can see that there are ~50 TB of space left. What is this going to be used for?
 

Attachments

  • ceph free space.jpg
This is the raw space. Different pools can have different replica counts, so with 50 TB of raw space free, you'll have 50/3 TB of free space for a pool with size=3, or 50/2 TB for a pool with size=2.
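To check which replica count applies to a given pool (replace <pool> with your actual pool name, it is just a placeholder here):

Code:
# show the replication factor (size) of a pool
ceph osd pool get <pool> size
# usable free space for that pool is then roughly: raw free space / size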
 
