Hi,
I had trouble with my Ceph cluster after rebooting the nodes sequentially.
This has since been fixed; however, ceph health detail still reports the following warnings:
root@ld3955:~# ceph health detail
HEALTH_WARN 2 pools have many more objects per pg than average; Reduced data availability: 3 pgs inactive; clock skew detected on mon.ld5506
MANY_OBJECTS_PER_PG 2 pools have many more objects per pg than average
    pool hdd objects per pg (29) is more than 29 times cluster average (1)
    pool pve_cephfs_data objects per pg (41) is more than 41 times cluster average (1)
PG_AVAILABILITY Reduced data availability: 3 pgs inactive
    pg 4.4b is stuck inactive since forever, current state unknown, last acting [76]
    pg 4.28a is stuck inactive since forever, current state unknown, last acting [76]
    pg 4.30f is stuck inactive since forever, current state unknown, last acting [76]
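(I assume the MANY_OBJECTS_PER_PG warning is a separate, pre-existing issue: nearly all of the cluster's objects live in these two pools, so their objects-per-pg counts are far above the cluster-wide average. I plan to review pg_num on those pools later, e.g. with ceph osd pool get hdd pg_num; my immediate concern is the three inactive PGs.)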
Checking one of the inactive PGs in detail:
root@ld3955:~# ceph pg 4.4b query
{
    "state": "unknown",
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "epoch": 10106,
    "up": [
        104,
        148,
        178
    ],
    "acting": [
        76
    ],
    [...]
    "peer_info": [],
    "recovery_state": [
        {
            "name": "Started/Primary/Peering/WaitActingChange",
            "enter_time": "2019-06-12 13:43:12.205232",
            "comment": "waiting for pg acting set to change"
        },
        {
            "name": "Started",
            "enter_time": "2019-06-12 13:43:12.195757"
        }
    ],
    "agent_state": {}
}
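If I read the recovery_state correctly, the PG is stuck in Started/Primary/Peering/WaitActingChange, i.e. peering is waiting for the acting set ([76]) to change to the up set ([104, 148, 178]), but that change never happens. I haven't touched osd.76 yet. Would it be safe to kick peering by restarting that OSD (assuming osd.76 runs under systemd on its host), e.g.:

root@<osd-host>:~# systemctl restart ceph-osd@76

or by briefly marking it down so the OSD map updates and peering is retried:

root@ld3955:~# ceph osd down 76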
Can you please advise how to fix this?
THX