[SOLVED] ceph mgr and mon issue after upgrade to Luminous

RobFantini · Sep 4, 2017

After the upgrade, here is the ceph status:

Code:

# ceph -s
  cluster:
    id:     75bc38f7-d42c-449b-88ed-488c7778a551
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum 2,sys8,1
    mgr: no daemons active
    osd: 18 osds: 18 up, 18 in
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   2005 MB used, 7952 GB / 7954 GB avail
    pgs:

so no managers are active.

I tried to delete a mon and was then going to recreate it, but got this:

Code:

# pveceph destroymon sys8
ceph manager directory '/var/lib/ceph/mgr/ceph-sys8' not found

Is there another way to delete mons?

I figure delete and then add will create the managers.

dcsapak · Sep 4, 2017

RobFantini said:
I figure delete and then add will create the managers.

this is correct

RobFantini said:
# pveceph destroymon sys8
ceph manager directory '/var/lib/ceph/mgr/ceph-sys8' not found

did you do this on the host where the 'sys8' monitor is? (the web interface does this automatically on the correct host)

RobFantini · Sep 4, 2017

The mons were deleted; the missing mgr directory was just a warning.

Now, as in another recent thread, I cannot create a mon.

Code:

# pveceph createmon
got timeout

RobFantini · Sep 4, 2017

This also times out: ceph -s

RobFantini · Sep 4, 2017

dcsapak said:
this is correct

did you do this on the host where the 'sys8' monitor is? (the web interface does this automatically on the correct host)

Yes.

pabernethy · Sep 4, 2017

Did you destroy all monitors? Because then your cluster is no longer quorate. Then you have to extract the monmap, edit it to contain the monitors you want (at least a quorate amount) and inject it back into all nodes.
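
The steps described above (extract the monmap, edit it, inject it back) can be sketched as a shell script. This is a minimal sketch under assumptions, not a tested recovery tool: the monitor ID `1`, the stale monitor name `sys8`, and the path `/tmp/monmap` are illustrative values borrowed from this thread, and `run`, `fix_monmap`, and `DRY_RUN` are hypothetical names introduced here. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: rebuild a monmap so that it only lists monitors that still exist.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fix_monmap() {
    mon_id=$1      # ID of a surviving monitor on this node (assumed: 1)
    stale_mon=$2   # name of a monitor that no longer exists (assumed: sys8)
    monmap=$3      # scratch file for the extracted map

    run systemctl stop "ceph-mon@${mon_id}"                # the mon must be stopped first
    run ceph-mon -i "$mon_id" --extract-monmap "$monmap"   # dump the current map
    run monmaptool --print "$monmap"                       # inspect it before editing
    run monmaptool --rm "$stale_mon" "$monmap"             # drop the dead monitor
    run ceph-mon -i "$mon_id" --inject-monmap "$monmap"    # write the edited map back
    run systemctl start "ceph-mon@${mon_id}"
}

# Prints the six commands without touching the cluster (dry run).
fix_monmap 1 sys8 /tmp/monmap
```

Review the printed commands first, and only set `DRY_RUN=0` once the monitor names and IDs match your own cluster.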

RobFantini · Sep 4, 2017

OK, I am following: http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/?highlight=extract monmap

2-No quorum? Grab the monmap directly from another monitor (this assumes the monitor you are grabbing the monmap from has id ID-FOO and has been stopped): ceph-mon -i ID-FOO --extract-monmap /tmp/monmap

I am using the CLI on the only node that has a mon.

Code:

sys10  ~ # ls /var/lib/ceph/mon/
ceph-1/

sys10  ~ # ceph-mon -i  1 --extract-monmap /tmp/monmap
2017-09-04 10:45:57.943927 7ff4458c1f80 -1 IO error: lock /var/lib/ceph/mon/ceph-1/store.db/LOCK: Resource temporarily unavailable

2017-09-04 10:45:57.944052 7ff4458c1f80 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-1': (22) Invalid argument

Am I doing something wrong?

RobFantini · Sep 4, 2017

Now that ceph status times out, PVE is starting to have issues. For instance, when I click 'summary' for a node or VM, the 'status' info is missing.

A VM also reported a time issue, which can lead to data issues.

While I'd like to debug and fix this, I may need to scrap our ceph setup and start over.

Just in case we run into issues with VMs: what is the command to stop ceph?

pabernethy · Sep 4, 2017

RobFantini said:
this assumes the monitor you are grabbing the monmap from […] has been stopped

has it?

RobFantini · Sep 4, 2017

pabernethy said:
has it?

No. What is the command to do that? I am still learning systemd syntax here.

aderumier · Sep 4, 2017

RobFantini said:
No. What is the command to do that? I am still learning systemd syntax here.

systemctl (status|stop|start) ceph-mon@0 (replace 0 with the ID of the monitor; the same works for ceph-osd, etc.)

pabernethy · Sep 4, 2017

Code:

USAGE: pveceph stop [<service>]
  Stop ceph services.
  <service>  (mon|mds|osd|mgr)\.[A-Za-z0-9\-]{1,32}
             Ceph service name.
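
As a rough illustration of how the two naming schemes above relate, the sketch below validates a pveceph-style service name against the pattern from the usage text and turns it into a templated systemd unit name. It assumes that a name such as `mon.1` corresponds to the unit `ceph-mon@1`; `unit_for` is a hypothetical helper, not part of pveceph.

```shell
#!/bin/sh
# Sketch: map a pveceph-style service name (e.g. "mon.1") to a systemd unit name.
# Assumption: "<daemon>.<id>" corresponds to the unit "ceph-<daemon>@<id>".

unit_for() {
    # Reject anything that does not match (mon|mds|osd|mgr).<id>
    if ! printf '%s\n' "$1" | grep -Eq '^(mon|mds|osd|mgr)\.[A-Za-z0-9-]{1,32}$'; then
        echo "invalid service name: $1" >&2
        return 1
    fi
    daemon=${1%%.*}   # part before the first dot, e.g. "mon"
    id=${1#*.}        # part after the first dot, e.g. "1"
    echo "ceph-${daemon}@${id}"
}

unit_for mon.1      # prints ceph-mon@1
unit_for osd.12     # prints ceph-osd@12
```

The result can then be passed to systemctl, for example `systemctl status "$(unit_for mon.1)"`.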

RobFantini · Sep 4, 2017

Thanks, and another question.
After this:

Code:

sys10  ~ # systemctl stop ceph-mon@1
sys10  ~ # ceph-mon -i  1 --extract-monmap /tmp/monmap
2017-09-04 11:11:55.788874 7f44e5c6ff80 -1 wrote monmap to /tmp/monmap

How do I decode /tmp/monmap into an editable text file? I know how to do that for the CRUSH map, but I cannot find how to do it for the monmap yet. Still searching.

RobFantini · Sep 4, 2017

I think I found it:
monmaptool --print /tmp/monmap

Code:

monmaptool: monmap file /tmp/monmap
epoch 35
fsid 75bc38f7-d42c-449b-88ed-488c7778a551
last_changed 2017-09-04 10:13:46.194524
created 2017-02-26 09:11:33.212436
0: 10.11.12.8:6789/0 mon.sys8
1: 10.11.12.10:6789/0 mon.1

So I will try to just inject that file.
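
To see at a glance which monitors a map expects, the printed output above can be reduced to name and address pairs. This is a minimal sketch that assumes the output layout shown in this thread; `list_mons` is a hypothetical helper that reads the printed map on stdin.

```shell
#!/bin/sh
# Sketch: list monitor names and addresses from `monmaptool --print` output.
# Assumes monitor lines look like "0: 10.11.12.8:6789/0 mon.sys8".

list_mons() {
    awk '$1 ~ /^[0-9]+:$/ && $3 ~ /^mon\./ { sub(/^mon\./, "", $3); print $3, $2 }'
}

# Sample input copied from the output in this thread.
list_mons <<'EOF'
monmaptool: monmap file /tmp/monmap
epoch 35
fsid 75bc38f7-d42c-449b-88ed-488c7778a551
last_changed 2017-09-04 10:13:46.194524
created 2017-02-26 09:11:33.212436
0: 10.11.12.8:6789/0 mon.sys8
1: 10.11.12.10:6789/0 mon.1
EOF
```

On a live node the same helper could be fed directly with `monmaptool --print /tmp/monmap | list_mons`. Every monitor it lists is one the cluster will expect to find.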

RobFantini · Sep 4, 2017

Code:

sys10  ~ # ceph-mon -i 1 --inject-monmap /tmp/monmap
sys10  ~ # systemctl start ceph-mon@1

# systemctl status ceph-mon@1
● ceph-mon@1.service - Ceph cluster monitor daemon
   Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-mon@.service.d
           └─ceph-after-pve-cluster.conf
   Active: active (running) since Mon 2017-09-04 11:22:45 EDT; 21s ago
 Main PID: 21556 (ceph-mon)
    Tasks: 20
   CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@1.service
           └─21556 /usr/bin/ceph-mon -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Sep 04 11:22:45 sys10 systemd[1]: Started Ceph cluster monitor daemon.

However, ceph -s still times out.

pabernethy · Sep 4, 2017

If only mon@1 is running that's no surprise. mon@sys8 has to run too, for the cluster to be quorate.
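
The quorum requirement is plain arithmetic: a strict majority of the monitors listed in the monmap must be running. A minimal sketch, with `quorum_needed` and `has_quorum` as hypothetical helper names:

```shell
#!/bin/sh
# Sketch: how many monitors must be up for a monmap of a given size.

quorum_needed() {
    # Strict majority: floor(n / 2) + 1
    echo $(( $1 / 2 + 1 ))
}

has_quorum() {
    # $1 = monitors listed in the monmap, $2 = monitors actually running
    [ "$2" -ge "$(quorum_needed "$1")" ]
}

quorum_needed 2     # prints 2
quorum_needed 3     # prints 2

# Two monitors in the map, one running: no majority.
if has_quorum 2 1; then echo "quorate"; else echo "not quorate"; fi   # prints "not quorate"
```

With two monitors in the map, both have to be up, which is why a single running mon cannot answer `ceph -s`.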

RobFantini · Sep 4, 2017

Probably I need to inject the other mon. I will try to do so at the correct node.

Code:

ceph-mon -i  0  --inject-monmap /tmp/monmap

pabernethy · Sep 4, 2017

RobFantini said:
Probably I need to inject the other mon. I will try to do so at the correct node.

You really shouldn't run commands without understanding what they do.
Your monmap contains two mons: 1 and sys8. So the IPs assigned to them should be those of the nodes that run those monitors.

RobFantini · Sep 4, 2017

so at sys8:

Code:

# ceph-mon -i sys8  --inject-monmap /tmp/monmap
2017-09-04 11:31:42.511553 7f58f27bcf80 -1 monitor data directory at '/var/lib/ceph/mon/ceph-sys8' does not exist: have you run 'mkfs'?

is there a way around that?

pabernethy · Sep 4, 2017

It's basically telling you that the monitor doesn't exist. So you need to create it.
