[SOLVED] Ceph health warning: backfillfull

cmonty14

Hello,

in my cluster consisting of 4 OSD nodes there is an HDD failure.
This currently affects 31 disks.

Each node has 48 HDDs of 2TB each connected.

This results in this crushmap:
root hdd_strgbox {
id -17 # do not change unnecessarily
id -19 class hdd # do not change unnecessarily
id -21 class nvme # do not change unnecessarily
# weight 312.428
alg straw2
hash 0 # rjenkins1
item ld5505-hdd_strgbox weight 78.107
item ld5506-hdd_strgbox weight 78.107
item ld5507-hdd_strgbox weight 78.107
item ld5508-hdd_strgbox weight 78.107
}
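For reference, a bucket dump like the one above can be obtained by decompiling the crushmap; a minimal sketch, with the file names only used as examples:

# export the binary crushmap from the cluster
ceph osd getcrushmap -o crushmap.bin
# decompile it into the readable text form shown above
crushtool -d crushmap.bin -o crushmap.txt
# inspect the relevant root bucket
grep -A 10 'root hdd_strgbox' crushmap.txt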

This means there should be ~100TB of disk storage available, considering a replication factor of 3.

Checking the disk utilization, the related pool db_backup is shown with 94% used space, or 54.5TiB.
root@ld3955:~# ceph df detail
GLOBAL:
SIZE AVAIL RAW USED %RAW USED OBJECTS
398TiB 232TiB 166TiB 41.70 14.47M
POOLS:
NAME ID QUOTA OBJECTS QUOTA BYTES USED %USED MAX AVAIL OBJECTS DIRTY READ WRITE RAW USED
backup 4 N/A N/A 0B 0 3.33TiB 0 0 0B 0B 0B
nvme 6 N/A N/A 0B 0 7.36TiB 0 0 0B 0B 0B
db_backup 11 N/A N/A 54.5TiB 94.23 3.33TiB 14285830 14.29M 213KiB 36.2MiB 163TiB
pve_cephfs_data 21 N/A N/A 242GiB 0.68 34.3TiB 64232 64.23k 80.6KiB 69.5KiB 726GiB
pve_cephfs_metadata 22 N/A N/A 126MiB 0 34.3TiB 53 53 42B 2.09KiB 378MiB


Even if I consider that one node goes down, there should be more storage available with the remaining OSDs.

Can you please explain why Ceph displays 94% used disk space and a health warning?
root@ld3955:~# ceph health
HEALTH_WARN 3 backfillfull osd(s); 8 nearfull osd(s); 2 pool(s) backfillfull


THX
 
What does 'ceph osd df tree' give you? Some of the OSDs are over the 85% fill level, regardless of the overall usage of that root bucket on the cluster.
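For a quick check, a minimal sketch of the commands that show the per-OSD fill level and the configured thresholds (the Ceph defaults are nearfull 0.85, backfillfull 0.90, full 0.95, but they may have been changed on your cluster):

# per-OSD utilization within the CRUSH tree (%USE column)
ceph osd df tree
# which OSDs exactly triggered the nearfull/backfillfull warnings
ceph health detail
# the currently configured thresholds
ceph osd dump | grep ratio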
 
Hi Alwin,
the faulty host is ld5507-hdd_strgbox, meaning there are two enclosures connected with 24 HDDs of 2TB each.
One enclosure shows 15 failed HDDs, the other 16.
The investigation into why 31 disks failed at the same time is ongoing.
The errors on every single disk are the same:
- no SMART health values are available anymore
- creating a partition table with gdisk fails:
OK; writing new GUID partition table (GPT) to /dev/sdb.
Unable to save backup partition table! Perhaps the 'e' option on the experts'
menu will resolve this problem.
Warning! An error was reported when writing the partition table! This error
MIGHT be harmless, or the disk might be damaged! Checking it is advisable.



The requested output is attached.

THX
 


One enclosure shows 15 failed HDDs, the other 16.
The investigation into why 31 disks failed at the same time is ongoing.
I would suspect that the enclosure itself or the connected HBA might have an issue.

EDIT: But since not all OSDs on that host are down, Ceph will try to recover onto the remaining OSDs of that host. It is best to mark all OSDs on that host out; the data should then recover/rebalance to the remaining three nodes.
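A minimal sketch of how the OSDs of that host could be marked out; the OSD ID range below is only a placeholder and has to be replaced with the IDs that 'ceph osd tree' lists for ld5507:

# find the OSD IDs that belong to the failed host
ceph osd tree
# mark each of those OSDs out (the ID range 96-143 is only an example)
for id in $(seq 96 143); do ceph osd out "$id"; done
# follow the recovery/rebalance
ceph -w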
 
Hello!
The issue with the failed disks is resolved, meaning all disks are back in Ceph.
However, I would like to take this opportunity to ask you for clarification on the available and allocated disk space.

In my setup I have 4 OSD nodes.
Each node has 2 (storage) enclosures.
Each enclosure has 24 HDDs.
Each HDD is 1.6TB.

This gives me a total raw disk capacity of
4 x 2 x 24 x 1.6TB = 307TB

With a replication factor of 3 there should be 102TB available.
Considering that a single HDD should not be filled to more than 80-90% of its capacity, this gives me 82-92TB.
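A quick shell check of that arithmetic (decimal TB, as used above; keep in mind that ceph df reports TiB, so the numbers are not directly comparable):

# raw capacity: 4 nodes x 2 enclosures x 24 HDDs x 1.6TB
echo "4 * 2 * 24 * 1.6" | bc -l     # 307.2 TB raw
# usable capacity with replication factor 3
echo "307.2 / 3" | bc -l            # ~102.4 TB
# at an 80-90% target fill level per OSD
echo "102.4 * 0.80" | bc -l         # ~81.9 TB
echo "102.4 * 0.90" | bc -l         # ~92.2 TB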

However, the output of ceph df detail gives me different figures:
root@ld3955:/mnt/pve/pve_cephfs/template/cache# ceph df detail
GLOBAL:
SIZE AVAIL RAW USED %RAW USED OBJECTS
446TiB 279TiB 167TiB 37.46 14.52M
POOLS:
NAME ID QUOTA OBJECTS QUOTA BYTES USED %USED MAX AVAIL OBJECTS DIRTY READ WRITE RAW USED
backup 4 N/A N/A 0B 0 18.8TiB 0 0 0B 0B 0B
nvme 6 N/A N/A 0B 0 7.35TiB 0 0 0B 0B 0B
db_backup 11 N/A N/A 54.7TiB 74.43 18.8TiB 14337600 14.34M 219KiB 37.2MiB 164TiB
pve_cephfs_data 21 N/A N/A 242GiB 0.68 34.3TiB 64232 64.23k 80.7KiB 69.5KiB 726GiB
pve_cephfs_metadata 22 N/A N/A 126MiB 0 34.3TiB 53 53 42B 2.14KiB 378MiB
hdd 25 N/A N/A 61.0GiB 0.17 34.3TiB 15685 15.69k 3.87MiB 9.12MiB 183GiB
pve_default 26 N/A N/A 394GiB 1.11 34.3TiB 101245 101.25k 38.2MiB 36.8MiB 1.15TiB



Only the pools "db_backup" and "backup" are relevant here; all other pools store data on different drives that do not belong to the enclosures.

So here is my question:
Why is this output showing 74.43% used disk space?

THX
 
