Hi all,
For the past couple of days we have been experiencing an outage of our Ceph cluster.
We are running a 3/2 (size/min_size) configuration with 3 nodes and 3 OSDs per node (all the same NVMe model and size).
At first our VMs became unreachable, most likely because the disks were full and I/O was blocked.
We then restarted the machines, noticed that 3 OSDs were down, and tried to restart them.
They failed to start, with BlueStore aborting on an ENOSPC error:
Code:
-1038> 2023-07-26T14:50:58.886+0200 7fbc8af6e540 -1 bluestore::NCB::__restore_allocator::No Valid allocation info on disk (empty file)
-3> 2023-07-26T14:51:23.565+0200 7fbc8af6e540 -1 bluefs _allocate allocation failed, needed 0x27b6
-2> 2023-07-26T14:51:23.565+0200 7fbc8af6e540 -1 bluefs _flush_range_F allocated: 0x0 offset: 0x0 length: 0x27b6
-1> 2023-07-26T14:51:23.577+0200 7fbc8af6e540 -1 ./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range_F(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7fbc8af6e540 time 2023-07-26T14:51:23.570570+0200
./src/os/bluestore/BlueFS.cc: 3137: ceph_abort_msg("bluefs enospc")
We removed the failed OSDs and added extra storage, but the cluster is now stuck and we cannot get it back up and running.
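For reference, the removal and replacement was done roughly along these lines (a sketch from memory; the OSD id and device name below are examples, not necessarily the exact ones we used):
Code:
# mark the OSD out and stop it (osd.5 used as an example id)
ceph osd out 5
systemctl stop ceph-osd@5
# remove it from the cluster
ceph osd purge 5 --yes-i-really-mean-it
# add the new disk as an OSD via Proxmox (example device name)
pveceph osd create /dev/nvme2n1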
We changed some of the full ratios to try to regain access to the disks, but that did not help.
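The ratio changes were along these lines (example values only; I don't recall the exact numbers we set):
Code:
# raise the thresholds slightly so the full OSDs accept I/O again (example values)
ceph osd set-nearfull-ratio 0.90
ceph osd set-backfillfull-ratio 0.92
ceph osd set-full-ratio 0.97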
We also let the cluster reweight by utilization and adjusted the ratios to shift data off the full disks; that resolved some of the warnings, but not all of them.
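The reweighting was basically the standard commands (sketch, run with default parameters):
Code:
# preview which OSD reweights would be applied
ceph osd test-reweight-by-utilization
# then apply them
ceph osd reweight-by-utilization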
What would be the best course of action? See the ceph health detail output below (trimmed because of the character limit).
Code:
HEALTH_ERR noout flag(s) set; 1 backfillfull osd(s); 1 full osd(s); 2 nearfull osd(s); Reduced data availability: 37 pgs inactive; Low space hindering backfill (add storage if this doesn't resolve itself): 36 pgs backfill_toofull; Degraded data redundancy: 196392/1449408 objects degraded (13.550%), 79 pgs degraded, 79 pgs undersized; 23 pgs not deep-scrubbed in time; 2 pool(s) full; 88 daemons have recently crashed
[WRN] OSDMAP_FLAGS: noout flag(s) set
[WRN] OSD_BACKFILLFULL: 1 backfillfull osd(s)
osd.3 is backfill full
[ERR] OSD_FULL: 1 full osd(s)
osd.5 is full
[WRN] OSD_NEARFULL: 2 nearfull osd(s)
osd.4 is near full
osd.8 is near full
[WRN] PG_AVAILABILITY: Reduced data availability: 37 pgs inactive
pg 2.0 is stuck inactive for 2d, current state undersized+degraded+remapped+backfilling+peered, last acting [0]
pg 2.1 is stuck inactive for 2d, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
pg 2.6 is stuck inactive for 2d, current state undersized+degraded+remapped+backfill_toofull+peered, last acting [0]
pg 2.a is stuck inactive for 2d, current state undersized+degraded+remapped+backfilling+peered, last acting [1]
pg 2.d is stuck inactive for 2d, current state undersized+degraded+remapped+backfill_toofull+peered, last acting [2]
[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 36 pgs backfill_toofull
pg 2.5 is active+undersized+degraded+remapped+backfill_toofull, acting [8,0]
pg 2.6 is undersized+degraded+remapped+backfill_toofull+peered, acting [0]
pg 2.7 is active+undersized+degraded+remapped+backfill_toofull, acting [2,7]
pg 2.8 is active+undersized+degraded+remapped+backfill_toofull, acting [0,7]
pg 2.d is undersized+degraded+remapped+backfill_toofull+peered, acting [2]
pg 2.f is active+undersized+degraded+remapped+backfill_toofull, acting [8,0]
[WRN] PG_DEGRADED: Degraded data redundancy: 196392/1449408 objects degraded (13.550%), 79 pgs degraded, 79 pgs undersized
pg 2.0 is stuck undersized for 10h, current state undersized+degraded+remapped+backfilling+peered, last acting [0]
pg 2.1 is stuck undersized for 10h, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
pg 2.3 is stuck undersized for 67m, current state active+undersized+degraded+remapped+backfilling, last acting [2,8]
pg 2.5 is stuck undersized for 10h, current state active+undersized+degraded+remapped+backfill_toofull, last acting [8,0]
pg 2.6 is stuck undersized for 10h, current state undersized+degraded+remapped+backfill_toofull+peered, last acting [0]
pg 2.7 is stuck undersized for 10h, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2,7]
pg 2.8 is stuck undersized for 10h, current state active+undersized+degraded+remapped+backfill_toofull, last acting [0,7]
pg 2.a is stuck undersized for 10h, current state undersized+degraded+remapped+backfilling+peered, last acting [1]
pg 2.c is stuck undersized for 67m, current state active+undersized+degraded+remapped+backfilling, last acting [0,7]
pg 2.d is stuck undersized for 10h, current state undersized+degraded+remapped+backfill_toofull+peered, last acting [2]
pg 2.e is stuck undersized for 67m, current state active+undersized+degraded+remapped+backfilling, last acting [2,8]
[WRN] PG_NOT_DEEP_SCRUBBED: 23 pgs not deep-scrubbed in time
pg 2.7e not deep-scrubbed since 2023-07-12T04:39:51.750223+0200
pg 2.35 not deep-scrubbed since 2023-07-12T16:25:19.553858+0200
pg 2.32 not deep-scrubbed since 2023-07-14T08:42:02.184452+0200
pg 2.2b not deep-scrubbed since 2023-07-11T15:31:56.792350+0200
pg 2.22 not deep-scrubbed since 2023-07-14T07:14:43.041368+0200
pg 2.a not deep-scrubbed since 2023-07-13T23:43:06.970018+0200
pg 2.5 not deep-scrubbed since 2023-07-13T22:52:05.079369+0200
pg 2.7d not deep-scrubbed since 2023-07-13T04:31:01.394803+0200
pg 2.1 not deep-scrubbed since 2023-07-13T03:46:17.360905+0200
pg 2.f not deep-scrubbed since 2023-07-13T11:58:44.528170+0200
pg 2.11 not deep-scrubbed since 2023-07-13T07:40:55.838036+0200
pg 2.13 not deep-scrubbed since 2023-07-14T13:55:49.448741+0200
pg 2.15 not deep-scrubbed since 2023-07-12T04:59:45.326263+0200
pg 2.3f not deep-scrubbed since 2023-07-13T22:33:55.154729+0200
pg 2.46 not deep-scrubbed since 2023-07-12T10:45:21.814722+0200
pg 2.47 not deep-scrubbed since 2023-07-12T18:03:00.237776+0200
pg 2.4f not deep-scrubbed since 2023-07-12T02:36:05.006218+0200
pg 2.55 not deep-scrubbed since 2023-07-12T10:40:12.190681+0200
pg 2.56 not deep-scrubbed since 2023-07-14T15:34:16.399247+0200
pg 2.59 not deep-scrubbed since 2023-07-13T10:49:19.802743+0200
pg 2.5d not deep-scrubbed since 2023-07-14T06:46:56.523599+0200
pg 2.6b not deep-scrubbed since 2023-07-14T09:45:44.006297+0200
pg 2.71 not deep-scrubbed since 2023-07-14T08:44:26.257269+0200
[WRN] POOL_FULL: 2 pool(s) full
pool '.mgr' is full (no space)
pool 'ceph-pool' is full (no space)
[WRN] RECENT_CRASH: 88 daemons have recently crashed
osd.4 crashed on host pve02 at 2023-07-26T20:38:27.868239Z
OSDs
Code:
root@pve01:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 5.67580 root default
-3 2.18300 host pve01
0 ssd 0.43660 osd.0 up 1.00000 1.00000
1 ssd 0.43660 osd.1 up 0.90002 1.00000
2 ssd 0.43660 osd.2 up 0.90002 1.00000
10 ssd 0.43660 osd.10 up 0.79999 1.00000
11 ssd 0.43660 osd.11 up 0.79999 1.00000
-5 1.30980 host pve02
3 ssd 0.43660 osd.3 up 0.79999 1.00000
4 ssd 0.43660 osd.4 up 0.79999 1.00000
5 ssd 0.43660 osd.5 up 0.79999 1.00000
-7 2.18300 host pve03
6 ssd 0.43660 osd.6 up 1.00000 1.00000
7 ssd 0.43660 osd.7 up 0.90002 1.00000
8 ssd 0.43660 osd.8 up 0.90002 1.00000
9 ssd 0.43660 osd.9 up 0.79999 1.00000
12 ssd 0.43660 osd.12 up 0.79999 1.00000
Thanks!