Ceph - Reduced data availability: 1 pg inactive, 1 pg incomplete

FabioMesquita

New Member
Feb 9, 2024
Hi!

I have 1 PG incomplete and 2 OSDs down, and my VMs can't start.
I'm pretty sure that some of my data is gone, but I'd like to check if there's anything more I can do.

Code:
root@pve4:~# ceph -s
  cluster:
    id:     90f53852-cb9d-4391-aac5-1443f81480be
    health: HEALTH_WARN
            Reduced data availability: 1 pg inactive, 1 pg incomplete
            36 daemons have recently crashed
 
  services:
    mon: 3 daemons, quorum pve3,pve4,pve5 (age 3h)
    mgr: pve4(active, since 3h), standbys: pve3
    osd: 6 osds: 4 up (since 12m), 4 in (since 11m)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 150.08k objects, 547 GiB
    usage:   1.6 TiB used, 2.8 TiB / 4.4 TiB avail
    pgs:     3.030% pgs not active
             32 active+clean
             1  incomplete



root@pve4:~# ceph health detail
HEALTH_WARN Reduced data availability: 1 pg inactive, 1 pg incomplete; 36 daemons have recently crashed
[WRN] PG_AVAILABILITY: Reduced data availability: 1 pg inactive, 1 pg incomplete
    pg 2.d is incomplete, acting [5,0,3] (reducing pool ceph_pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
[WRN] RECENT_CRASH: 36 daemons have recently crashed
    osd.5 crashed on host pve5 at 2024-10-02T19:50:36.696259Z
    osd.5 crashed on host pve5 at 2024-10-02T19:51:25.948498Z
    osd.5 crashed on host pve5 at 2024-10-02T19:51:47.107306Z
    osd.5 crashed on host pve5 at 2024-10-02T19:52:08.950559Z
    osd.1 crashed on host pve4 at 2024-10-02T21:07:49.966173Z
    osd.4 crashed on host pve3 at 2024-10-02T21:07:57.813673Z
    osd.1 crashed on host pve4 at 2024-10-02T21:18:52.035987Z
    osd.4 crashed on host pve3 at 2024-10-02T21:18:53.732259Z
    osd.1 crashed on host pve4 at 2024-10-02T21:19:19.367670Z
    osd.4 crashed on host pve3 at 2024-10-02T21:19:21.948768Z
    osd.1 crashed on host pve4 at 2024-10-02T21:19:46.982653Z
    osd.4 crashed on host pve3 at 2024-10-02T21:19:48.597716Z
    osd.4 crashed on host pve3 at 2024-10-02T23:23:09.015400Z
    osd.4 crashed on host pve3 at 2024-10-02T23:23:28.177387Z
    osd.4 crashed on host pve3 at 2024-10-02T23:23:46.360725Z
    osd.1 crashed on host pve4 at 2024-10-02T23:40:01.037355Z
    osd.4 crashed on host pve3 at 2024-10-02T23:40:03.442478Z
    osd.1 crashed on host pve4 at 2024-10-02T23:40:20.505999Z
    osd.4 crashed on host pve3 at 2024-10-02T23:40:23.375901Z
    osd.1 crashed on host pve4 at 2024-10-02T23:40:39.545972Z
    osd.4 crashed on host pve3 at 2024-10-02T23:40:43.446893Z
    osd.1 crashed on host pve4 at 2024-10-03T03:09:05.418699Z
    osd.1 crashed on host pve4 at 2024-10-03T03:09:28.064790Z
    osd.1 crashed on host pve4 at 2024-10-03T03:09:47.134319Z
    osd.4 crashed on host pve3 at 2024-10-03T04:05:07.235691Z
    osd.4 crashed on host pve3 at 2024-10-03T04:05:29.596620Z
    osd.4 crashed on host pve3 at 2024-10-03T04:05:52.552804Z
    osd.1 crashed on host pve4 at 2024-10-03T04:23:23.994340Z
    osd.1 crashed on host pve4 at 2024-10-03T04:23:44.774001Z
    osd.1 crashed on host pve4 at 2024-10-03T04:23:00.920855Z
    osd.4 crashed on host pve3 at 2024-10-03T04:23:25.834193Z
    osd.4 crashed on host pve3 at 2024-10-03T04:23:46.800257Z
    osd.4 crashed on host pve3 at 2024-10-03T04:23:02.602489Z
    osd.4 crashed on host pve3 at 2024-10-03T07:46:17.277797Z
    osd.4 crashed on host pve3 at 2024-10-03T07:45:57.326995Z
    osd.4 crashed on host pve3 at 2024-10-03T07:45:38.570018Z
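
The crash list above is easier to read once it is grouped per OSD. Counting the lines by hand gives 19 reports for osd.4, 13 for osd.1 and 4 for osd.5, which adds up to the 36 in the warning. Below is a minimal sketch of how that tally could be automated. It assumes the `ceph health detail` text has been captured as a string; the function name `tally_crashes` and the shortened `sample` are only for illustration and are not part of Ceph.

```python
import re
from collections import Counter

# Matches lines such as:
#   osd.5 crashed on host pve5 at 2024-10-02T19:50:36.696259Z
CRASH_LINE = re.compile(r"^\s*(osd\.\d+) crashed on host (\S+) at (\S+)\s*$")


def tally_crashes(health_detail: str) -> Counter:
    """Count crash reports per (daemon, host) pair in `ceph health detail` text."""
    counts = Counter()
    for line in health_detail.splitlines():
        match = CRASH_LINE.match(line)
        if match:
            daemon, host, _timestamp = match.groups()
            counts[(daemon, host)] += 1
    return counts


# Shortened excerpt of the output above, used only to demonstrate the parser.
sample = """\
[WRN] RECENT_CRASH: 36 daemons have recently crashed
    osd.5 crashed on host pve5 at 2024-10-02T19:50:36.696259Z
    osd.1 crashed on host pve4 at 2024-10-02T21:07:49.966173Z
    osd.4 crashed on host pve3 at 2024-10-02T21:07:57.813673Z
    osd.4 crashed on host pve3 at 2024-10-02T21:18:53.732259Z
"""

# Print the daemons with the most crash reports first.
for (daemon, host), n in tally_crashes(sample).most_common():
    print(f"{daemon} on {host}: {n} crash(es)")
```

Feeding it the full output instead of the excerpt should reproduce the per-OSD counts given above.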
 


Hi @gurubert! Thanks for your reply.

Yes, the OSDs are gone. I can't get them back into the pool anymore.
I changed min_size to 1, but nothing changed: no replication took place and I still can't start my VMs.
I guess this cluster is gone. I'll start over from scratch.
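
For anyone retracing the min_size step, here is a rough sketch of the commands involved, wrapped in Python so they can be reviewed before anything is run. The pool name `ceph_pool` and PG id `2.d` come from the health output above, and the value 2 is the original min_size named in the health hint. The helper names are made up for illustration. When run as a script it only executes the two read-only commands, and only if the `ceph` CLI is on the PATH; the set command is built but never executed.

```python
import shutil
import subprocess

POOL = "ceph_pool"  # pool named in the health output above
PGID = "2.d"        # the incomplete PG named in the health output above


def get_min_size_cmd(pool: str) -> list[str]:
    """Read-only: show the pool's current min_size."""
    return ["ceph", "osd", "pool", "get", pool, "min_size"]


def set_min_size_cmd(pool: str, min_size: int) -> list[str]:
    """Build (but do not run) the command that changes min_size."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    return ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)]


def pg_query_cmd(pgid: str) -> list[str]:
    """Read-only: dump the peering state of a single PG."""
    return ["ceph", "pg", pgid, "query"]


def run(cmd: list[str]) -> str:
    """Echo a command, run it, and return its standard output."""
    print("$", " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    read_only = [get_min_size_cmd(POOL), pg_query_cmd(PGID)]
    if shutil.which("ceph") is None:
        # No cluster access here, so just list what would be run.
        print("ceph CLI not found; read-only commands that would be run:")
        for cmd in read_only:
            print("  " + " ".join(cmd))
    else:
        for cmd in read_only:
            print(run(cmd))
    # Shown for reference only: restoring the original value of 2.
    print("to restore:", " ".join(set_min_size_cmd(POOL, 2)))
```

Keeping the set command separate from the ones that are actually executed is deliberate: lowering min_size reduces the safety margin for writes, so it seems better to print it and run it by hand than to bury it in a script.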