Ceph Cluster Broken

Wazaari

Dec 28, 2023
Dear all,

Yesterday we did some maintenance on one of our nodes (swapping the mainboard) and brought it back online afterwards. We then noticed that something was wrong: everything was painfully slow and kept getting worse. We had set the noout flag for the duration of the maintenance.

There are three nodes with four disks each. However, on all three nodes (including the two we did not touch during the maintenance), the same two disks are no longer available:

Code:
# Node 1
root@proxmox01:~# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                                                                                                     8:0    0 894.3G  0 disk
└─sda4                                                                                                  8:4    0    37G  0 part
sdb                                                                                                     8:16   0 894.3G  0 disk
└─sdb4                                                                                                  8:20   0    37G  0 part
sdc                                                                                                     8:32   0 894.3G  0 disk
└─ceph--f86b1366--7404--46fe--b9d5--d27b1af6dfd6-osd--block--5bc096cf--96d5--40fa--a289--e35907275994 252:1    0 894.3G  0 lvm 
sdd                                                                                                     8:48   0 894.3G  0 disk
└─ceph--f09229cd--fcab--4547--bda6--7342b5c138fa-osd--block--409c3800--6299--4528--93ec--745e3dbee671 252:0    0 894.3G  0 lvm

# Node 2
root@proxmox02:~# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                                                                                                     8:0    0 894.3G  0 disk
└─sda4                                                                                                  8:4    0    37G  0 part
sdb                                                                                                     8:16   0 894.3G  0 disk
└─sdb4                                                                                                  8:20   0    37G  0 part
sdc                                                                                                     8:32   0 894.3G  0 disk
└─ceph--fa8929ee--974d--4e0c--926a--c0a4885d7f8c-osd--block--434171d1--d900--4b57--abdc--e44dc1dfccac 252:1    0 894.3G  0 lvm 
sdd                                                                                                     8:48   0 894.3G  0 disk
└─ceph--359c8ff3--e1d2--4680--9293--128c13dc4b4c-osd--block--d5109736--8162--4444--af70--b33e099449a3 252:0    0 894.3G  0 lvm 
    
# Node 3
root@proxmox03:~# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                                                                                                     8:0    0 894.3G  0 disk
└─sda4                                                                                                  8:4    0    37G  0 part
sdb                                                                                                     8:16   0 894.3G  0 disk
└─sdb4                                                                                                  8:20   0    37G  0 part
sdc                                                                                                     8:32   0 894.3G  0 disk
└─ceph--819cc7d0--5259--4f48--97c2--0b7354245529-osd--block--c787bf62--555c--467b--930d--ead361c99001 252:1    0 894.3G  0 lvm 
sdd                                                                                                     8:48   0 894.3G  0 disk
└─ceph--6ea5c166--73d7--4ced--98bd--1385f770a875-osd--block--b3077adf--ee58--4bae--85e1--483aef62bc59 252:0    0 894.3G  0 lvm
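
To spot the affected disks without reading the tree by eye, something like the following could help. It is only a minimal sketch: `disks_without_osd` is a made-up helper name, and it assumes the OSDs were created as LVM volumes directly on the whole disk, as the sdc and sdd entries above suggest. It reads the output of `lsblk -rn -o NAME,TYPE,PKNAME` and prints every disk that has no `ceph--...` logical volume on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph or Proxmox): print whole disks that
# carry no Ceph OSD logical volume.
# Expects the output of `lsblk -rn -o NAME,TYPE,PKNAME` on stdin.
# Assumption: each OSD LV sits directly on a disk, so PKNAME is the disk name.
disks_without_osd() {
    awk '
        # Remember every whole disk in the order it appears.
        $2 == "disk" { order[++n] = $1 }
        # An LVM volume named ceph--... marks its parent disk as an OSD disk.
        $2 == "lvm" && $1 ~ /^ceph--/ { has_osd[$3] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in has_osd)) print order[i]
        }
    '
}

# Usage on a node (commented out so the file can be sourced safely):
# lsblk -rn -o NAME,TYPE,PKNAME | disks_without_osd
```

On the nodes shown above this would be expected to print sda and sdb, the two disks that only show the 37G partition.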

Current Ceph Status:

Code:
root@proxmox01:~# ceph -s
  cluster:
    id:     d88b5090-dde9-41c0-a3d4-e04a1e212112
    health: HEALTH_WARN
            2 filesystems are degraded
            clock skew detected on mon.proxmox03, mon.proxmox01
            1/366705 objects unfound (0.000%)
            norebalance,norecover flag(s) set
            2 osds down
            Reduced data availability: 57 pgs inactive, 13 pgs down
            Degraded data redundancy: 341985/1100115 objects degraded (31.086%), 188 pgs degraded, 198 pgs undersized
            5 daemons have recently crashed
            1 slow ops, oldest one blocked for 134 sec, osd.6 has slow ops
 
  services:
    mon: 3 daemons, quorum proxmox02,proxmox03,proxmox01 (age 5m)
    mgr: proxmox02(active, since 5m), standbys: proxmox01, proxmox03
    mds: 2/2 daemons up, 1 standby
    osd: 12 osds: 6 up (since 5m), 8 in (since 16h); 69 remapped pgs
         flags norebalance,norecover
 
  data:
    volumes: 0/2 healthy, 2 recovering
    pools:   7 pools, 289 pgs
    objects: 366.70k objects, 1.3 TiB
    usage:   2.5 TiB used, 2.7 TiB / 5.2 TiB avail
    pgs:     3.114% pgs unknown
             16.609% pgs not active
             341985/1100115 objects degraded (31.086%)
             6305/1100115 objects misplaced (0.573%)
             1/366705 objects unfound (0.000%)
             99 active+undersized+degraded
             67 active+clean
             48 active+undersized+degraded+remapped+backfill_wait
             19 undersized+degraded+peered
             13 down
             12 undersized+degraded+remapped+backfill_wait+peered
             9  unknown
             8  active+undersized
             6  active+recovery_wait+undersized+degraded+remapped
             3  undersized+peered
             1  active+recovery_wait+degraded
             1  active+remapped+backfill_wait
             1  undersized+degraded+remapped+backfilling+peered
             1  active+recovering+undersized+degraded
             1  active+undersized+degraded+remapped+backfilling

I admit I'm not as familiar with Ceph as I probably should be to fix this issue. We have backups for most VMs, so it wouldn't be the end of the world. Any idea whether this situation can be salvaged?
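
For reference, the "57 pgs inactive" figure in the status above matches the sum of all PG states that do not include the `active` flag (19 + 13 + 12 + 9 + 3 + 1). A rough sketch for re-checking that number while the cluster recovers is shown here; `count_inactive_pgs` is a made-up helper name, and it assumes the "COUNT STATE" layout of the `pgs:` list printed by `ceph -s`.

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph command): sum the PGs whose state string
# does not contain the "active" flag.
# Expects "COUNT STATE" lines, as in the pgs: list of `ceph -s`, on stdin.
count_inactive_pgs() {
    awk '
        # Only consider lines of the form "<integer> <state+state+...>".
        NF == 2 && $1 ~ /^[0-9]+$/ {
            is_active = 0
            n = split($2, flags, "+")
            for (i = 1; i <= n; i++)
                if (flags[i] == "active") is_active = 1
            if (!is_active) total += $1
        }
        END { print total + 0 }
    '
}

# Usage (commented out so the file can be sourced safely):
# ceph -s | count_inactive_pgs
```

Lines that do not start with a bare integer followed by a single state string, such as the percentage and object-count lines, are skipped.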
 
In case it helps, detailed health output:

Code:
root@proxmox01:~# ceph health detail
HEALTH_WARN 2 filesystems are degraded; clock skew detected on mon.proxmox03, mon.proxmox01; 1/366705 objects unfound (0.000%); norebalance,norecover flag(s) set; 2 osds down; Reduced data availability: 57 pgs inactive, 13 pgs down; Degraded data redundancy: 341985/1100115 objects degraded (31.086%), 188 pgs degraded, 198 pgs undersized; 5 daemons have recently crashed; 1 slow ops, oldest one blocked for 134 sec, osd.6 has slow ops
[WRN] FS_DEGRADED: 2 filesystems are degraded
    fs proxmox-isos is degraded
    fs k8s-fs is degraded
[WRN] MON_CLOCK_SKEW: clock skew detected on mon.proxmox03, mon.proxmox01
    mon.proxmox03 clock skew 1.17796s > max 0.05s (latency 0.00180987s)
    mon.proxmox01 clock skew 1.16264s > max 0.05s (latency 0.00405562s)
[WRN] OBJECT_UNFOUND: 1/366705 objects unfound (0.000%)
    pg 5.15 has 1 unfound objects
[WRN] OSDMAP_FLAGS: norebalance,norecover flag(s) set
[WRN] OSD_DOWN: 2 osds down
    osd.2 (root=default,host=proxmox01) is down
    osd.5 (root=default,host=proxmox02) is down
[WRN] PG_AVAILABILITY: Reduced data availability: 57 pgs inactive, 13 pgs down
    pg 2.2 is down, acting [13,11]
    pg 2.5 is stuck inactive for 5m, current state unknown, last acting []
    pg 2.6 is stuck inactive for 5m, current state unknown, last acting []
    pg 2.8 is stuck inactive for 5m, current state unknown, last acting []
    pg 2.10 is down, acting [10,13]
    pg 2.13 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.14 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [6]
    pg 2.15 is down, acting [11]
    pg 2.18 is down, acting [7,11]
    pg 2.26 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [11]
    pg 2.27 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [7]
    pg 2.39 is down, acting [11]
    pg 2.3a is down, acting [11,7]
    pg 2.42 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.48 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.4c is stuck inactive for 15h, current state undersized+degraded+peered, last acting [10]
    pg 2.4e is down, acting [13,10]
    pg 2.53 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.54 is stuck inactive for 16h, current state undersized+degraded+remapped+backfilling+peered, last acting [13]
    pg 2.67 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [10]
    pg 2.6a is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [6]
    pg 2.6f is stuck inactive for 16h, current state undersized+degraded+peered, last acting [10]
    pg 2.73 is stuck inactive for 16h, current state undersized+degraded+peered, last acting [10]
    pg 3.3 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [6]
    pg 3.4 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [11]
    pg 3.6 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 3.8 is stuck inactive for 5m, current state unknown, last acting []
    pg 3.14 is stuck inactive for 16h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 3.17 is down, acting [3,11]
    pg 3.1a is stuck inactive for 5m, current state unknown, last acting []
    pg 3.1d is down, acting [11,3]
    pg 4.2 is stuck inactive for 16h, current state undersized+degraded+peered, last acting [11]
    pg 4.d is stuck inactive for 5m, current state unknown, last acting []
    pg 4.18 is stuck inactive for 5m, current state unknown, last acting []
    pg 4.19 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [10]
    pg 5.7 is stuck inactive for 16h, current state undersized+degraded+peered, last acting [10]
    pg 5.8 is stuck inactive for 15h, current state undersized+peered, last acting [11]
    pg 5.19 is stuck inactive for 16h, current state undersized+peered, last acting [10]
    pg 5.1b is stuck inactive for 9w, current state undersized+peered, last acting [11]
    pg 6.e is stuck inactive for 16h, current state undersized+degraded+peered, last acting [11]
    pg 6.f is stuck inactive for 15h, current state undersized+degraded+peered, last acting [10]
    pg 6.12 is stuck inactive for 17h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 6.14 is down, acting [7,10]
    pg 6.17 is stuck inactive for 16h, current state undersized+degraded+peered, last acting [10]
    pg 6.19 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [10]
    pg 6.1a is stuck inactive for 17h, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 7.4 is stuck inactive for 5m, current state unknown, last acting []
    pg 7.a is stuck inactive for 15h, current state undersized+degraded+peered, last acting [11]
    pg 7.e is stuck inactive for 15h, current state undersized+degraded+peered, last acting [10]
    pg 7.1b is stuck inactive for 16h, current state undersized+degraded+peered, last acting [10]
    pg 7.1c is stuck inactive for 16h, current state undersized+degraded+peered, last acting [10]
[WRN] PG_DEGRADED: Degraded data redundancy: 341985/1100115 objects degraded (31.086%), 188 pgs degraded, 198 pgs undersized
    pg 2.41 is active+undersized+degraded+remapped+backfill_wait, acting [10,6]
    pg 2.42 is stuck undersized for 5m, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.43 is stuck undersized for 5m, current state active+undersized+degraded, last acting [7,11]
    pg 2.44 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [11,6]
    pg 2.45 is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,7]
    pg 2.46 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [13,7]
    pg 2.47 is stuck undersized for 5m, current state active+undersized+degraded, last acting [13,11]
    pg 2.48 is stuck undersized for 5m, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.49 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,13]
    pg 2.4a is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,13]
    pg 2.4b is stuck undersized for 5m, current state active+undersized+degraded, last acting [7,11]
    pg 2.4c is stuck undersized for 5m, current state undersized+degraded+peered, last acting [10]
    pg 2.4d is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,7]
    pg 2.4f is stuck undersized for 5m, current state active+undersized+degraded, last acting [7,11]
    pg 2.50 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,6]
    pg 2.51 is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,3]
    pg 2.52 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,6]
    pg 2.53 is stuck undersized for 5m, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 2.54 is stuck undersized for 5m, current state undersized+degraded+remapped+backfilling+peered, last acting [13]
    pg 2.55 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,3]
    pg 2.56 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [13,6]
    pg 2.57 is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,3]
    pg 2.59 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,6]
    pg 2.5a is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,11]
    pg 2.5b is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,6]
    pg 2.5c is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,7]
    pg 2.5d is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [11,7]
    pg 2.5e is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [6,10]
    pg 2.61 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,13]
    pg 2.62 is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,7]
    pg 2.64 is stuck undersized for 5m, current state active+undersized+degraded, last acting [13,10]
    pg 2.65 is stuck undersized for 5m, current state active+undersized+degraded, last acting [6,10]
    pg 2.67 is stuck undersized for 5m, current state undersized+degraded+peered, last acting [10]
    pg 2.68 is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,11]
    pg 2.69 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,7]
    pg 2.6a is stuck undersized for 5m, current state undersized+degraded+remapped+backfill_wait+peered, last acting [6]
    pg 2.6b is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,11]
    pg 2.6c is stuck undersized for 5m, current state active+undersized+degraded, last acting [6,11]
    pg 2.6d is stuck undersized for 5m, current state active+undersized+degraded, last acting [3,10]
    pg 2.6e is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,6]
    pg 2.6f is stuck undersized for 5m, current state undersized+degraded+peered, last acting [10]
    pg 2.70 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,3]
    pg 2.71 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,3]
    pg 2.73 is stuck undersized for 5m, current state undersized+degraded+peered, last acting [10]
    pg 2.74 is stuck undersized for 5m, current state active+undersized+degraded, last acting [7,11]
    pg 2.75 is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,6]
    pg 2.76 is stuck undersized for 5m, current state active+undersized+degraded, last acting [11,13]
    pg 2.77 is stuck undersized for 5m, current state active+undersized+degraded, last acting [10,7]
    pg 2.79 is stuck undersized for 5m, current state active+undersized+degraded, last acting [13,10]
    pg 2.7b is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,3]
    pg 2.7c is stuck undersized for 5m, current state active+undersized+degraded+remapped+backfill_wait, last acting [13,10]
[WRN] RECENT_CRASH: 5 daemons have recently crashed
    osd.1 crashed on host proxmox01 at 2025-09-24T17:14:13.350313Z
    osd.4 crashed on host proxmox02 at 2025-09-24T17:15:09.653364Z
    osd.9 crashed on host proxmox03 at 2025-09-24T17:24:16.380719Z
    osd.8 crashed on host proxmox03 at 2025-09-24T17:24:16.381189Z
    osd.2 crashed on host proxmox01 at 2025-09-24T17:25:16.134573Z
[WRN] SLOW_OPS: 1 slow ops, oldest one blocked for 134 sec, osd.6 has slow ops
 
As a first step, fix your time synchronization.
Code:
clock skew detected on mon.proxmox03, mon.proxmox01
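
To see how far each monitor is off, the per-monitor lines from `ceph health detail` can be reduced to name, measured skew and allowed maximum. This is only a sketch under assumptions: `skewed_mons` is a made-up helper name, and it relies on the line format shown in the health output above.

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph command): extract monitor name, measured
# skew and allowed maximum (both in seconds) from `ceph health detail`.
# Assumes lines like:
#   mon.proxmox03 clock skew 1.17796s > max 0.05s (latency 0.00180987s)
skewed_mons() {
    awk '
        $2 == "clock" && $3 == "skew" && $5 == ">" && $6 == "max" {
            skew = $4; sub(/s$/, "", skew)     # drop the trailing unit
            limit = $7; sub(/s$/, "", limit)
            print $1, skew, limit
        }
    '
}

# Usage (commented out so the file can be sourced safely):
# ceph health detail | skewed_mons
```

On each node, `ceph time-sync-status` and, if chrony is the time daemon in use (an assumption here), `chronyc tracking` show whether the clock is actually synchronized to a reachable source.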
Thanks, this resolved itself after we restored all machines from backup. The NTP server is virtualized as well and wasn't reachable; now it is.