Why "mon low disk space"?

samontetro

Hi,

I'm again getting a "mon low disk space" warning on my small Proxmox cluster. I have seen many threads about this warning, but I haven't been able to resolve the problem.
  • average OSD usage is only 5%
  • the / partition (where /var/lib/ceph is located) is 72% full (19 GB available), and this seems to be the problem for Ceph
    Code:
    # df -h /
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/dm-0        69G   47G   19G  72% /
  • /var/lib/ceph/mon is only 22MB
    Code:
    # du -sh /var/lib/ceph/mon/
    22M    /var/lib/ceph/mon/
  • inode usage on / is only 4%
    Code:
    # df -i /
    Filesystem      Inodes  IUsed   IFree IUse% Mounted on
    /dev/dm-0      4554752 174705 4380047    4% /
So is this just a threshold set too low somewhere, or is there a real risk for my Ceph cluster? My Proxmox is old (4.4), but...


Thanks in advance if someone can help me understand this warning and solve it.


Patrick
 
The Ceph option mon_data_avail_warn dictates the percentage of free space that must be available on the monitor's data disk for the health check to stay green. By default it is set to 30, so with 72% used you only have 28% free and fall below that threshold. In general, that warning exists for a reason, and the real solution is to expand your root disk (either extend the LV or add a physically bigger disk). If, however, you know what you are doing and have separate monitoring of free disk space in place, you can change the option using: ceph set cephpool mon_data_avail_warn <percentage>
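For reference, a quick way to see which monitor is raising the warning and what threshold the running daemon currently uses (run on the monitor node; the mon ID is usually the short hostname, adjust if yours differs):

Code:
# show the exact health warning and the affected mon
ceph health detail

# how much space is free where the mon data lives
df -h /var/lib/ceph/mon

# ask the running monitor for its current threshold via the admin socket
ceph daemon mon.$(hostname -s) config get mon_data_avail_warn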
 
Thanks Stefan for this detailed explanation. My OS disk is fully allocated; there is no free space to enlarge the logical volumes. I was not aware of the Proxmox storage requirements when I bought this hardware, as I was not yet a Proxmox user. I have only 2x280GB in RAID1 for the OS, and on this server there is no free slot for a new disk (I also have 2 OSDs + 4 disks in RAID5 for backup).
I will try to set the threshold a little bit lower, but is this command just for the current server, or is it a global setting for all the servers?

Maybe I can also try to create a new LV for /var/lib/ceph on the RAID5 array and move the data to it... if it can be done safely.
Patrick
 
I will try to set the threshold a little bit lower, but is this command just for the current server, or is it a global setting for all the servers?
Global.

Maybe I can also try to create a new LV for /var/lib/ceph on the RAID5 array and move the data to it... if it can be done safely.
That is possible, though not a supported configuration - it may work, or it may break everything. Do not do this on an important production setup; maybe in a homelab ;)

To do so, you'd have to stop all the Ceph services on the node before moving the folder and then mounting the new volume in its place. Make sure the mount is available before the Ceph services start on boot.
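A rough sketch of such a move (untested and unsupported; the volume group name pve-backup, the LV size, and the ext4 filesystem are placeholder choices, adjust them to your setup):

Code:
# stop every Ceph daemon on this node first
systemctl stop ceph.target

# create and format a new LV on the RAID5 array (names/sizes are examples)
lvcreate -L 20G -n ceph pve-backup
mkfs.ext4 /dev/pve-backup/ceph

# copy the data, then mount the new volume over /var/lib/ceph
mount /dev/pve-backup/ceph /mnt
cp -a /var/lib/ceph/. /mnt/
umount /mnt
echo '/dev/pve-backup/ceph /var/lib/ceph ext4 defaults 0 2' >> /etc/fstab
mount /var/lib/ceph

# bring Ceph back up
systemctl start ceph.target

Note that the old data hidden below the mountpoint still occupies space on / until you delete it after verifying everything works.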
 
The Ceph option mon_data_avail_warn dictates the percentage of free space that must be available on the monitor's data disk for the health check to stay green. By default it is set to 30, so with 72% used you only have 28% free and fall below that threshold. In general, that warning exists for a reason, and the real solution is to expand your root disk (either extend the LV or add a physically bigger disk). If, however, you know what you are doing and have separate monitoring of free disk space in place, you can change the option using: ceph set cephpool mon_data_avail_warn <percentage>
How do I set this to 20% please?
I have tried from the shell:
ceph set cephpool mon_data_avail_warn 20

but get

Code:
no valid command found; 10 closest matches:
pg stat
pg getmap
pg dump [<dumpcontents:all|summary|sum|delta|pools|osds|pgs|pgs_brief>...]
pg dump_json [<dumpcontents:all|summary|sum|pools|osds|pgs>...]
pg dump_pools_json
pg ls-by-pool <poolstr> [<states>...]
pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
pg ls [<pool:int>] [<states>...]
pg dump_stuck [<stuckops:inactive|unclean|stale|undersized|degraded>...] [<threshold:int>]
Error EINVAL: invalid command
 
How do I set this to 20% please?
I have tried from the shell:
ceph set cephpool mon_data_avail_warn 20

but get

Code:
no valid command found; 10 closest matches:
pg stat
pg getmap
pg dump [<dumpcontents:all|summary|sum|delta|pools|osds|pgs|pgs_brief>...]
pg dump_json [<dumpcontents:all|summary|sum|pools|osds|pgs>...]
pg dump_pools_json
pg ls-by-pool <poolstr> [<states>...]
pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
pg ls [<pool:int>] [<states>...]
pg dump_stuck [<stuckops:inactive|unclean|stale|undersized|degraded>...] [<threshold:int>]
Error EINVAL: invalid command
Hi,

Try :

Code:
ceph config set global mon_data_avail_warn 20
to set the warning threshold to 20%.
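If you want to double-check, you can read the value back afterwards (the config database commands assume a reasonably recent Ceph release):

Code:
# show the value stored in the monitors' config database
ceph config get mon mon_data_avail_warn

# the warning should clear once the mons re-evaluate their disk space
ceph health detail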
 
