Hello,
One of my 3 servers crashed.
After the reboot everything is OK, except that the monitor on it wouldn't start.
I tried restarting the server, then destroying and recreating the monitor. No luck.
So now I only have two monitors:
Code:
root@p2:~# ceph -s
  cluster:
    id:     4124cd8e-01ed-4a0d-b97b-737100ffccd2
    health: HEALTH_WARN
            mon p1 is low on available space

  services:
    mon: 2 daemons, quorum p1,p3 (age 40h)
    mgr: p1(active, since 46h), standbys: p3, p2
    mds: cephfs:1 {0=p3=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 40h), 3 in (since 45h)

  data:
    pools:   3 pools, 250 pgs
    objects: 16.19k objects, 62 GiB
    usage:   184 GiB used, 173 GiB / 357 GiB avail
    pgs:     250 active+clean

  io:
    client:   62 KiB/s rd, 733 KiB/s wr, 1 op/s rd, 40 op/s wr
I think the monitor is running:
Code:
root@p2:/var/lib/ceph# service ceph-mon@p2 status
● ceph-mon@p2.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: active (running) since Sat 2019-11-23 18:09:53 CET; 1 day 16h ago
Main PID: 28053 (ceph-mon)
Tasks: 27
Memory: 778.9M
CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@p2.service
└─28053 /usr/bin/ceph-mon -f --cluster ceph --id p2 --setuser ceph --setgroup ceph
nov. 23 18:09:53 p2 systemd[1]: Started Ceph cluster monitor daemon.
nov. 24 00:00:00 p2 ceph-mon[28053]: 2019-11-24 00:00:00.833 7f96be5d2700 -1 Fail to open '/proc/266427/cmdline' error = (2) No such file or directory
nov. 24 00:00:00 p2 ceph-mon[28053]: 2019-11-24 00:00:00.861 7f96be5d2700 -1 received signal: Hangup from <unknown> (PID: 266427) UID: 0
nov. 24 00:00:00 p2 ceph-mon[28053]: 2019-11-24 00:00:00.885 7f96be5d2700 -1 received signal: Hangup from pkill -1 -x ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw (PID: 266429) UID:
nov. 25 00:00:00 p2 ceph-mon[28053]: 2019-11-25 00:00:00.751 7f96be5d2700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw (PID: 805408) UID
nov. 25 00:00:00 p2 ceph-mon[28053]: 2019-11-25 00:00:00.775 7f96be5d2700 -1 received signal: Hangup from pkill -1 -x ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw (PID: 805409) UID:
lines 1-17/17 (END)
But in the interface it shows status: stopped, address: unknown, quorum: no.
Any idea how to "clean up" the monitor so that I can recreate it correctly?
Thanks.
dark26
Solution: by following this guide:
https://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/
I succeeded in repairing the monitor.
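For anyone landing here later, here is a rough sketch of what the manual remove/re-add procedure from that page can look like for a monitor named p2. The post does not say which exact steps were run, so treat this as an illustration only. The temporary directory, the `.bak` backup name and the `DRY_RUN` switch are my own assumptions, and the script only prints the commands unless `DRY_RUN=0` is set. It also assumes that `ceph.conf` already tells the monitor which address to bind to (for example through `public_network`).

```shell
#!/bin/sh
# Sketch: remove a broken monitor from the monmap and rebuild its data store.
# MON_ID, TMP_DIR and DRY_RUN are illustrative assumptions, not from the post.
MON_ID="${MON_ID:-p2}"
TMP_DIR="${TMP_DIR:-/tmp/mon-rebuild}"
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands
MON_DIR="/var/lib/ceph/mon/ceph-${MON_ID}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_mon() {
    # Stop the stale daemon and drop it from the cluster's monitor map.
    run systemctl stop "ceph-mon@${MON_ID}"
    run ceph mon remove "${MON_ID}"

    # Keep the old store as a backup instead of deleting it (assumed name).
    run mv "${MON_DIR}" "${MON_DIR}.bak"

    # Fetch the monitor keyring and the current monmap from the quorum.
    run mkdir -p "${TMP_DIR}"
    run ceph auth get mon. -o "${TMP_DIR}/keyring"
    run ceph mon getmap -o "${TMP_DIR}/monmap"

    # Recreate the data directory and initialise a fresh monitor store.
    run mkdir -p "${MON_DIR}"
    run ceph-mon -i "${MON_ID}" --mkfs \
        --monmap "${TMP_DIR}/monmap" --keyring "${TMP_DIR}/keyring"
    run chown -R ceph:ceph "${MON_DIR}"

    # Start the daemon; it should then try to join the existing quorum.
    run systemctl start "ceph-mon@${MON_ID}"
}

rebuild_mon
```

Afterwards `ceph -s` should list three monitor daemons again once p2 has synced and joined the quorum. On Proxmox, `pveceph mon destroy` and `pveceph mon create` are meant to wrap most of this, but as described above they were not enough in this case.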