Hello!
We run a Ceph cluster on four nodes with 32 OSD daemons. The cluster has 16 slow (HDD) and 16 fast (SSD) disks. The health information indicates the OSD_BACKFILLFULL and POOL_BACKFILLFULL status flags. "ceph health detail" shows:
Code:
HEALTH_WARN 1 failed cephadm daemon(s); 1 backfillfull osd(s); 2 pool(s) backfillfull; 4 slow ops, oldest one blocked for 189006 sec, mon.hat-ceph-01 has slow ops
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
daemon osd.32 on hat-ceph-01 is in error state
[WRN] OSD_BACKFILLFULL: 1 backfillfull osd(s)
osd.8 is backfill full
[WRN] POOL_BACKFILLFULL: 2 pool(s) backfillfull
pool 'device_health_metrics' is backfillfull
pool 'fastpool' is backfillfull
[WRN] SLOW_OPS: 4 slow ops, oldest one blocked for 189006 sec, mon.hat-ceph-01 has slow ops
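(Side note: I assume the failed osd.32 daemon is a separate issue that a restart should fix; I plan to try the following, assuming the daemons are managed by the cephadm orchestrator:)
Code:
# Restart the failed OSD daemon via the cephadm orchestrator
ceph orch daemon restart osd.32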
"ceph osd status" shows:
Code:
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
1 hat-ceph-01 590G 303G 122 2743k 0 24.7k exists,up
2 hat-ceph-01 514G 379G 0 0 0 0 exists,up
3 hat-ceph-01 442G 451G 8 1026k 4 320k exists,up
4 hat-ceph-01 238G 10.6T 1 19.1k 0 0 exists,up
5 hat-ceph-01 238G 10.6T 1 13.5k 0 0 exists,up
6 hat-ceph-01 203G 10.7T 0 5733 0 0 exists,up
7 hat-ceph-01 170G 10.7T 0 4914 0 0 exists,up
8 hat-ceph-02 811G 82.3G 186 11.0M 1 9829 backfillfull,exists,up
9 hat-ceph-02 219G 674G 0 0 0 0 exists,up
10 hat-ceph-02 442G 452G 55 2961k 0 51.2k exists,up
11 hat-ceph-02 519G 375G 103 4789k 5 284k exists,up
12 hat-ceph-03 440G 453G 132 4302k 2 52.7k exists,up
13 hat-ceph-02 169G 10.7T 0 4914 0 0 exists,up
14 hat-ceph-03 442G 452G 63 2762k 2 68.8k exists,up
15 hat-ceph-02 205G 10.7T 1 15.9k 0 0 exists,up
16 hat-ceph-03 149G 745G 0 0 0 0 exists,up
17 hat-ceph-02 204G 10.7T 1 10.3k 0 0 exists,up
18 hat-ceph-03 738G 156G 0 0 2 90.4k exists,up
19 hat-ceph-02 273G 10.6T 0 1638 0 0 exists,up
20 hat-ceph-03 272G 10.6T 1 9829 0 0 exists,up
21 hat-ceph-03 171G 10.7T 0 4095 0 0 exists,up
22 hat-ceph-03 136G 10.7T 0 0 0 0 exists,up
23 hat-ceph-03 271G 10.6T 0 5836 0 0 exists,up
24 hat-ceph-04 148G 745G 46 1171k 0 0 exists,up
25 hat-ceph-04 590G 304G 97 7709k 1 69.6k exists,up
26 hat-ceph-04 222G 671G 1 2474k 0 15.1k exists,up
27 hat-ceph-04 442G 451G 0 0 0 6552 exists,up
28 hat-ceph-04 68.7G 10.8T 0 0 0 0 exists,up
29 hat-ceph-04 271G 10.6T 1 8191 0 0 exists,up
30 hat-ceph-04 170G 10.7T 0 1945 0 0 exists,up
31 hat-ceph-04 204G 10.7T 0 1638 0 0 exists,up
33 hat-ceph-01 368G 525G 0 0 0 28.0k exists,up
"ceph df" shows:
Code:
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 175 TiB 171 TiB 3.2 TiB 3.2 TiB 1.83
ssd 14 TiB 7.1 TiB 6.9 TiB 6.9 TiB 49.51
TOTAL 189 TiB 178 TiB 10 TiB 10 TiB 5.36
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 1 91 MiB 32 274 MiB 0 2.6 TiB
fastpool 2 32 2.3 TiB 604.72k 6.9 TiB 92.13 201 GiB
slowpool 3 32 1.1 TiB 298.55k 3.2 TiB 1.93 54 TiB
How can the "backfillfull" status be cleared?
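For reference, here is what I have considered so far, as a sketch based on the Ceph documentation; please correct me if these are the wrong knobs:
Code:
# Show the currently configured full/backfillfull/nearfull ratios
ceph osd dump | grep -i ratio

# Temporarily raise the backfillfull threshold (default 0.90) so that
# backfill can proceed; assumes the SSD OSDs still have some headroom
ceph osd set-backfillfull-ratio 0.92

# Move PGs away from the most-utilized OSDs (here osd.8, which sits
# at roughly 91% used and triggers the warning)
ceph osd reweight-by-utilization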
Best,
René.