Why "mon low disk space"?

samontetro

Hi,

I'm again getting a "mon low disk space" warning on my small Proxmox cluster. I have seen many threads about this warning, but I haven't been able to resolve the problem.
  • average OSD usage is only 5%
  • the / partition (where /var/lib/ceph is located) is 72% full (19 GB available), and this seems to be the problem for Ceph
    Code:
    # df -h /
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/dm-0        69G   47G   19G  72% /
  • /var/lib/ceph/mon is only 22MB
    Code:
    # du -sh /var/lib/ceph/mon/
    22M    /var/lib/ceph/mon/
  • inode usage on / is only 4%
    Code:
    # df -i /
    Filesystem      Inodes  IUsed   IFree IUse% Mounted on
    /dev/dm-0      4554752 174705 4380047    4% /
So is this just a threshold set too low somewhere, or is there a real risk for my Ceph cluster? My Proxmox is old (4.4), but...


Thanks in advance if someone can help me understand this warning and solve it.


Patrick
 
The Ceph option mon_data_avail_warn dictates the percentage of free space that must be available on the monitor's data disk for the health check to stay green. By default it is set to 30, so with 72% used you only have 28% free and fall below that threshold. In general, that warning exists for a reason, and the real solution is to expand your root disk (either extend the LV or add a physically bigger disk). If, however, you know what you are doing and have separate monitoring of free disk space in place, you can change the option using: ceph set cephpool mon_data_avail_warn <percentage>
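For reference, a quick way to see which monitor is raising the warning and what threshold the running daemon currently uses (run on the monitor node; the mon ID is usually the short hostname, adjust if yours differs):

Code:
# show the exact health warning and the affected mon
ceph health detail

# how much space is free where the mon data lives
df -h /var/lib/ceph/mon

# ask the running monitor for its current threshold via the admin socket
ceph daemon mon.$(hostname -s) config get mon_data_avail_warn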
 
Thanks Stefan for this detailed explanation. My OS disk is fully allocated; there is no free space to enlarge the logical volumes. I was not aware of the Proxmox storage requirements when I bought this hardware, as I was not yet a Proxmox user. I have only 2x280GB in RAID1 for the OS, and on this server there is no free slot for a new disk (I also have 2 OSDs + 4 disks in RAID5 for backup).
I will try to set the threshold a little bit lower, but is this command just for the current server, or is it a global setting for all the servers?

Maybe I can also try to create a new LV for /var/lib/ceph on the RAID5 array and move the data to it... if it can be done safely.
Patrick
 
I will try to set the threshold a little bit lower, but is this command just for the current server, or is it a global setting for all the servers?
Global.

Maybe I can also try to create a new LV for /var/lib/ceph on the RAID5 array and move the data to it... if it can be done safely.
That is possible, though not a supported configuration - it may work, or it may break everything. Do not do this on an important production setup; maybe in a homelab ;)

To do so, you'd have to stop all the Ceph services on the node before moving the folder and then mounting the new volume in its place. Make sure the mount is available before the Ceph services start on boot.
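A rough sketch of such a move (untested and unsupported; the volume group name pve-backup, the LV size, and the ext4 filesystem are placeholder choices, adjust them to your setup):

Code:
# stop every Ceph daemon on this node first
systemctl stop ceph.target

# create and format a new LV on the RAID5 array (names/sizes are examples)
lvcreate -L 20G -n ceph pve-backup
mkfs.ext4 /dev/pve-backup/ceph

# copy the data, then mount the new volume over /var/lib/ceph
mount /dev/pve-backup/ceph /mnt
cp -a /var/lib/ceph/. /mnt/
umount /mnt
echo '/dev/pve-backup/ceph /var/lib/ceph ext4 defaults 0 2' >> /etc/fstab
mount /var/lib/ceph

# bring Ceph back up
systemctl start ceph.target

Note that the old data hidden below the mountpoint still occupies space on / until you delete it after verifying everything works.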
 
The Ceph option mon_data_avail_warn dictates the percentage of free space that must be available on the monitor's data disk for the health check to stay green. By default it is set to 30, so with 72% used you only have 28% free and fall below that threshold. In general, that warning exists for a reason, and the real solution is to expand your root disk (either extend the LV or add a physically bigger disk). If, however, you know what you are doing and have separate monitoring of free disk space in place, you can change the option using: ceph set cephpool mon_data_avail_warn <percentage>
How do I set this to 20% please?
I have tried from the shell:
ceph set cephpool mon_data_avail_warn 20

but get

Code:
no valid command found; 10 closest matches:
pg stat
pg getmap
pg dump [<dumpcontents:all|summary|sum|delta|pools|osds|pgs|pgs_brief>...]
pg dump_json [<dumpcontents:all|summary|sum|pools|osds|pgs>...]
pg dump_pools_json
pg ls-by-pool <poolstr> [<states>...]
pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
pg ls [<pool:int>] [<states>...]
pg dump_stuck [<stuckops:inactive|unclean|stale|undersized|degraded>...] [<threshold:int>]
Error EINVAL: invalid command
 
How do I set this to 20% please?
I have tried from the shell:
ceph set cephpool mon_data_avail_warn 20

but get

Code:
no valid command found; 10 closest matches:
pg stat
pg getmap
pg dump [<dumpcontents:all|summary|sum|delta|pools|osds|pgs|pgs_brief>...]
pg dump_json [<dumpcontents:all|summary|sum|pools|osds|pgs>...]
pg dump_pools_json
pg ls-by-pool <poolstr> [<states>...]
pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
pg ls [<pool:int>] [<states>...]
pg dump_stuck [<stuckops:inactive|unclean|stale|undersized|degraded>...] [<threshold:int>]
Error EINVAL: invalid command
Hi,

Try :

Code:
ceph config set global mon_data_avail_warn 20
to set the warning threshold to 20%.
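If you want to double-check, you can read the value back afterwards (the config database commands assume a reasonably recent Ceph release):

Code:
# show the value stored in the monitors' config database
ceph config get mon mon_data_avail_warn

# the warning should clear once the mons re-evaluate their disk space
ceph health detail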
 
