Hi,
I'm looking for some help/ideas/advice to solve a problem that occurs on my metadata server after the server reboots.
"ceph status" warns that my MDS is "read only", but the filesystem and the data seem healthy.
It is still possible to access the content of my CephFS volumes since they are read-only, but I don't know how to make the filesystem writable again.
The logs keep showing the same error whenever I restart the MDS server:
Code:
2022-11-04T11:50:14.506+0100 7fbbf83c2700 1 mds.0.6872 handle_mds_map state change up:reconnect --> up:rejoin
2022-11-04T11:50:14.510+0100 7fbbf83c2700 1 mds.0.6872 rejoin_start
2022-11-04T11:50:14.510+0100 7fbbf83c2700 1 mds.0.6872 rejoin_joint_start
2022-11-04T11:50:14.702+0100 7fbbf83c2700 1 mds.0.6872 rejoin_done
2022-11-04T11:50:15.546+0100 7fbbf83c2700 1 mds.node3-5 Updating MDS map to version 6881 from mon.3
2022-11-04T11:50:15.546+0100 7fbbf83c2700 1 mds.0.6872 handle_mds_map i am now mds.0.6872
2022-11-04T11:50:15.546+0100 7fbbf83c2700 1 mds.0.6872 handle_mds_map state change up:rejoin --> up:active
2022-11-04T11:50:15.546+0100 7fbbf83c2700 1 mds.0.6872 recovery_done -- successful recovery!
2022-11-04T11:50:15.550+0100 7fbbf83c2700 1 mds.0.6872 active_start
2022-11-04T11:50:15.558+0100 7fbbf83c2700 1 mds.0.6872 cluster recovered.
2022-11-04T11:50:18.190+0100 7fbbf5bbd700 -1 mds.pinger is_rank_lagging: rank=0 was never sent ping request.
2022-11-04T11:50:18.190+0100 7fbbf5bbd700 -1 mds.pinger is_rank_lagging: rank=1 was never sent ping request.
2022-11-04T11:50:18.554+0100 7fbbf23b6700 1 mds.0.cache.dir(0x1000006cf14) commit error -22 v 1933183
2022-11-04T11:50:18.554+0100 7fbbf23b6700 -1 log_channel(cluster) log [ERR] : failed to commit dir 0x1000006cf14 object, errno -22
2022-11-04T11:50:18.554+0100 7fbbf23b6700 -1 mds.0.6872 unhandled write error (22) Invalid argument, force readonly...
2022-11-04T11:50:18.554+0100 7fbbf23b6700 1 mds.0.cache force file system read-only
2022-11-04T11:50:18.554+0100 7fbbf23b6700 0 log_channel(cluster) log [WRN] : force file system read-only
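For reference, this is how I restart the MDS daemon with systemd on the PVE node; the unit name below matches my MDS name (node3-5) and may differ on other setups:
Code:
# restart the MDS daemon on the node that runs it
systemctl restart ceph-mds@node3-5.service
# follow its log while it goes through reconnect -> rejoin -> active again
journalctl -fu ceph-mds@node3-5.service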
More info:
Code:
  cluster:
    id:     f36b996f-221d-4bcb-834b-19fc20bcad6b
    health: HEALTH_WARN
            1 MDSs are read only
            1 MDSs behind on trimming

  services:
    mon: 5 daemons, quorum node2-4,node2-5,node3-4,node3-5,node1-1 (age 22h)
    mgr: node2-4(active, since 28h), standbys: node2-5, node3-4, node3-5, node1-1
    mds: 3/3 daemons up, 3 standby
    osd: 112 osds: 112 up (since 22h), 112 in (since 2w)

  data:
    volumes: 2/2 healthy
    pools:   12 pools, 529 pgs
    objects: 8.54M objects, 1.9 TiB
    usage:   7.8 TiB used, 38 TiB / 46 TiB avail
    pgs:     491 active+clean
             29  active+clean+snaptrim
             9   active+clean+snaptrim_wait
All MDSs, MONs and OSDs are on version 16.2.9, running PVE 7.2-5.
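If it helps with debugging, I can gather more details on the directory object mentioned in the error. A rough sketch of what I'd run (the metadata pool name cephfs_metadata and the .00000000 frag suffix are assumptions on my part):
Code:
# overall filesystem / MDS state
ceph fs status
ceph health detail
# the failing dir 0x1000006cf14 should correspond to an object named <inode-hex>.<frag> in the metadata pool
rados -p cephfs_metadata stat 1000006cf14.00000000
rados -p cephfs_metadata listomapkeys 1000006cf14.00000000 | wc -l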