Some OSDs didn't start after cluster upgrade

Lamarus

I upgraded my Ceph cluster from Nautilus to Octopus, and some OSD services didn't start automatically. Please help me understand why.

ceph --version
ceph version 15.2.16 (a6b69e817d6c9e6f02d0a7ac3043ba9cdbda1bdf) octopus (stable)
 

Attachments

  • Screenshot 2022-06-20 at 20-04-48 petr-stor1 - Proxmox Virtual Environment.png
osd.32 is still version 12. Have you restarted the OSD systemd unit after installing the latest packages?
Yes, I ran systemctl restart ceph-osd.target on all nodes, both when I upgraded from Luminous to Nautilus and on the upgrades after that.
 
Sometimes ceph-osd.target does not restart all OSDs for reasons unknown to me.
Have you tried systemctl restart ceph-osd@32.service ?
The restart failed. The logs say there is no keyring file:

Jun 21 18:15:18 petr-stor6 ceph-osd[995660]: 2022-06-21T18:15:18.340+0600 7fa87587ad80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-32/keyring: (2) No such file or directory
Jun 21 18:15:18 petr-stor6 ceph-osd[995660]: 2022-06-21T18:15:18.340+0600 7fa87587ad80 -1 AuthRegistry(0x55bcc20ca940) no keyring found at /var/lib/ceph/osd/ceph-32/keyring, disabling cephx
Jun 21 18:15:18 petr-stor6 ceph-osd[995660]: 2022-06-21T18:15:18.340+0600 7fa87587ad80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-32/keyring: (2) No such file or directory
Jun 21 18:15:18 petr-stor6 ceph-osd[995660]: 2022-06-21T18:15:18.340+0600 7fa87587ad80 -1 AuthRegistry(0x7ffff1dfcf20) no keyring found at /var/lib/ceph/osd/ceph-32/keyring, disabling cephx

The directory /var/lib/ceph/osd/ceph-32/ is empty for some reason.
 

This directory should be mounted.

Can you verify with df?
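The same check can be sketched in Python instead of reading df output by eye. This is only an illustration: the function name osd_dir_status is hypothetical, and the directory layout and keyring file name are taken from the log lines above.

```python
import os


def osd_dir_status(osd_id, base="/var/lib/ceph/osd"):
    """Report whether an OSD data directory exists, is a mount point,
    and contains a keyring file."""
    path = os.path.join(base, f"ceph-{osd_id}")
    return {
        "path": path,
        "exists": os.path.isdir(path),
        # False means nothing is mounted on the directory itself.
        "mounted": os.path.ismount(path),
        "has_keyring": os.path.isfile(os.path.join(path, "keyring")),
    }
```

Calling osd_dir_status(32) on the affected node would show whether the directory is merely an empty, unmounted mount point, which is what the log above suggests.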


Maybe related:

https://forum.proxmox.com/threads/directory-var-lib-ceph-osd-ceph-id-is-empty.57344/

"Did it create the /etc/ceph/osd/{OSDID}-GUID.json file(s)? If not then try to run the ceph-volume simple scan /dev/sdX1 and see if it gets created."

https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus#Restart_the_OSD_daemon_on_all_nodes
"
On each host, tell ceph-volume to adapt the OSDs created with ceph-disk using the following two commands:

ceph-volume simple scan
ceph-volume simple activate --all
"
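To see which OSDs already have a scan result on a host, the JSON files mentioned above can be listed with a short Python sketch. It assumes the files live in /etc/ceph/osd and are named "<OSD id>-<OSD fsid>.json", as the quoted post describes; the function name scanned_osds is hypothetical.

```python
import re
from pathlib import Path

# Assumed file name pattern: "<osd id>-<36-character uuid>.json".
_NAME = re.compile(r"^(\d+)-([0-9a-fA-F-]{36})\.json$")


def scanned_osds(conf_dir="/etc/ceph/osd"):
    """Return {osd_id: fsid} for every matching JSON file in conf_dir."""
    found = {}
    directory = Path(conf_dir)
    if not directory.is_dir():
        return found
    for entry in directory.iterdir():
        match = _NAME.match(entry.name)
        if match:
            found[int(match.group(1))] = match.group(2)
    return found
```

An OSD ID that is missing from the result has no scan file on that host, so ceph-volume simple activate --all would have nothing to activate for it.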
 
How do I find out which block device is bound to an OSD? I can't find OSD 32 in either lsblk or the Proxmox web GUI. I appreciate your help!
 
When I try to re-enable OSD 0, I get this error:
ceph-volume simple activate 0 41a5b45b-5b4c-4898-9ae8-458093085842
--> Required devices (data, and journal) not present for filestore
--> filestore devices found: ['data']
--> RuntimeError: Unable to activate filestore OSD due to missing devices
root@petr-stor1:~# nano /etc/ceph/osd/0-41a5b45b-5b4c-4898-9ae8-458093085842.json

OSD config file:
cat /etc/ceph/osd/0-41a5b45b-5b4c-4898-9ae8-458093085842.json
{
    "active": "ok",
    "ceph_fsid": "58350e35-dbea-448f-9445-bd3864a14b8e",
    "cluster_name": "ceph",
    "data": {
        "path": "/dev/sdb1",
        "uuid": "41a5b45b-5b4c-4898-9ae8-458093085842"
    },
    "fsid": "41a5b45b-5b4c-4898-9ae8-458093085842",
    "journal.bak": {
        "path": "/dev/disk/by-partuuid/ef5dbe63-4b32-4853-aa7f-6c87885b47c2",
        "uuid": "ef5dbe63-4b32-4853-aa7f-6c87885b47c2"
    },
    "journal_uuid": "ef5dbe63-4b32-4853-aa7f-6c87885b47c2",
    "keyring": "AQCQZSpUmMZrDRAASMQTZOlHfI++NEkUMWB3Mw==",
    "magic": "ceph osd volume v026",
    "ready": "ready",
    "require_osd_release": 15,
    "systemd": "",
    "whoami": 0,
    "type": "filestore"
}
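A rough Python sketch for checking such a config file against the error message follows. The assumption, inferred only from the "Required devices (data, and journal)" text and not from the ceph-volume source, is that a filestore OSD needs top-level "data" and "journal" entries that each carry a "path". The names REQUIRED_FILESTORE_KEYS, missing_filestore_devices and load_osd_config are hypothetical.

```python
import json

# Assumption: both entries are required for filestore, as the error
# message "Required devices (data, and journal)" suggests.
REQUIRED_FILESTORE_KEYS = ("data", "journal")


def missing_filestore_devices(config):
    """Return the required entries that are absent or lack a 'path'."""
    missing = []
    for key in REQUIRED_FILESTORE_KEYS:
        entry = config.get(key)
        if not isinstance(entry, dict) or not entry.get("path"):
            missing.append(key)
    return missing


def load_osd_config(json_path):
    """Load one OSD JSON config file into a dict."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)
```

For the file shown above, the journal entry is stored under "journal.bak" rather than "journal", so this check would report the journal as missing, which matches "filestore devices found: ['data']". Whether restoring the key name is enough for activation also depends on the journal partition actually being present, which this thread does not confirm.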