[SOLVED] Ceph OSD does not start on reboot

Hi,

I am running 3 Ceph nodes with 3 OSDs per node.
When I reboot the nodes, the third OSD on the third node cannot start:
bluestore(/var/lib/ceph/osd/ceph-8/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-8/block: (13) Permission denied

ls -l /dev/sd*
brw-rw---- 1 root disk 8, 32 júl 21 13.57 /dev/sdc
brw-rw---- 1 root disk 8, 33 júl 21 13.57 /dev/sdc1
brw-rw---- 1 ceph ceph 8, 34 júl 21 14.58 /dev/sdc2
brw-rw---- 1 root disk 8, 48 júl 21 13.57 /dev/sdd
brw-rw---- 1 root disk 8, 49 júl 21 13.57 /dev/sdd1
brw-rw---- 1 ceph ceph 8, 50 júl 21 14.58 /dev/sdd2
brw-rw---- 1 root disk 8, 64 júl 21 13.57 /dev/sde
brw-rw---- 1 root disk 8, 65 júl 21 13.57 /dev/sde1
brw-rw---- 1 root root 8, 66 júl 21 14.58 /dev/sde2
If I run:

chown ceph:ceph /dev/sde2
systemctl reset-failed ceph-osd@8

the OSD starts normally.
This happens on every reboot, and only with the third OSD on node 3.
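
To make the ownership problem easier to spot after a reboot, here is a minimal sketch that reports who owns the device behind an OSD's "block" symlink. The function names osd_block_owner and check_osd_owner are made up for illustration, and the default expected owner ceph:ceph is taken from the working devices in the listing above. It assumes GNU readlink and stat, as found on a Proxmox VE node.

```shell
#!/bin/sh
# Report the owner of the device behind an OSD's "block" symlink.
# Assumption: the path has the form /var/lib/ceph/osd/ceph-<id>/block,
# as in the error message above.

# osd_block_owner PATH: resolve PATH and print "user:group" of the target.
osd_block_owner() {
    target=$(readlink -f "$1") || return 1
    stat -c '%U:%G' "$target"
}

# check_osd_owner PATH [EXPECTED]: return 0 if the owner matches EXPECTED
# (default ceph:ceph), 1 on a mismatch, 2 if PATH cannot be resolved.
check_osd_owner() {
    expected=${2:-ceph:ceph}
    owner=$(osd_block_owner "$1") || { echo "cannot resolve $1"; return 2; }
    if [ "$owner" = "$expected" ]; then
        echo "OK: $1 is owned by $owner"
    else
        echo "MISMATCH: $1 is owned by $owner, expected $expected"
        return 1
    fi
}
```

Calling check_osd_owner /var/lib/ceph/osd/ceph-8/block before starting the OSD should show whether the chown workaround is needed again. It only reports the owner and does not change anything.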
Thank you in advance.

L,
 
Which Ceph version? (pveversion -v)

Have you tried to destroy and recreate the OSD? To do so, and to avoid a rebalance/recover until the OSD has been recreated, first enable the "norecover" and "norebalance" OSD flags. Then stop the OSD and set it to OUT. Once it is stopped and out, you can destroy it (make sure the "Cleanup Disk" checkbox is active).

Then recreate the OSD and once it is back UP and IN, disable the previously set OSD flags to let Ceph recreate the data on that OSD. Once the cluster is healthy again, you can try to reboot that node and see if the problem persists.
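
On the command line, the steps above could look roughly like the following sketch. It only prints the commands so they can be reviewed and run one at a time. The function name osd_recreate_plan is made up for illustration, and using "--cleanup 1" as the CLI counterpart of the "Cleanup Disk" checkbox is an assumption; check "pveceph help osd destroy" on your version first.

```shell
#!/bin/sh
# Print the command sequence for destroying and re-creating one OSD.
# Nothing is executed here; review the output and run the lines by hand.
# Assumption: the OSD id and the device are passed as arguments
# (8 and /dev/sde are the values from this thread).

osd_recreate_plan() {
    id=$1
    dev=$2
    cat <<EOF
ceph osd set norecover
ceph osd set norebalance
systemctl stop ceph-osd@${id}.service
ceph osd out ${id}
pveceph osd destroy ${id} --cleanup 1
pveceph osd create ${dev}
# wait until the new OSD is up and in (check with: ceph osd tree)
ceph osd unset norecover
ceph osd unset norebalance
EOF
}
```

For the OSD in this thread that would be osd_recreate_plan 8 /dev/sde. Printing instead of executing is deliberate, since destroying the wrong OSD is hard to undo.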
 
proxmox-ve: 7.0-2 (running kernel: 5.11.22-2-pve)
pve-manager: 7.0-10 (running version: 7.0-10/d2f465d3)
pve-kernel-5.11: 7.0-5
pve-kernel-helper: 7.0-5
pve-kernel-5.4: 6.4-3
pve-kernel-5.3: 6.1-6
pve-kernel-5.11.22-2-pve: 5.11.22-4
pve-kernel-5.11.22-1-pve: 5.11.22-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
ceph: 16.2.5-pve1
ceph-fuse: 16.2.5-pve1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.2.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-5
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-9
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.4-1
proxmox-backup-file-restore: 2.0.4-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-5
pve-cluster: 7.0-3
pve-container: 4.0-8
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-2
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-10
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

Thanks, I'll try
L,
 
I re-created the OSD. Unfortunately, it still doesn't start after reboot. There is a problem with the owner of the sde device:
brw-rw---- 1 root disk 8, 32 Jul 26 16.48 sdc
brw-rw---- 1 root disk 8, 33 Jul 26 16.48 sdc1
brw-rw---- 1 ceph ceph 8, 34 Jul 26 16.49 sdc2
brw-rw---- 1 root disk 8, 48 Jul 26 16.48 sdd
brw-rw---- 1 root disk 8, 49 Jul 26 16.48 sdd1
brw-rw---- 1 ceph ceph 8, 50 Jul 26 16.49 sdd2
brw-rw---- 1 root disk 8, 64 Jul 26 16.48 sde

I created the OSDs two years ago; now I have deleted and re-created the one on the sde device. The partition layout of the newly created OSD is completely different. Is this causing the problem? Should I re-create all the OSDs?
 

Attachments

  • osds.png
I'll ask again: do I need to recreate all OSDs?
You don't need to, but occasionally the on-disk format for new OSDs changes. Recreating all OSDs (one by one) can be done and might become necessary at some point in the future if the old OSD format is no longer supported.

Therefore, if you want to, yes, recreate them, one by one, waiting for the cluster to become healthy in between.
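
For the waiting part, a small polling loop around "ceph health" can help. This is only a sketch: the function name wait_for_health_ok, the 30 second interval, and the limit of 120 checks are arbitrary choices, and HEALTH_CMD is a hypothetical override that exists only so the loop can be tried without a cluster.

```shell
#!/bin/sh
# wait_for_health_ok [INTERVAL_SECONDS] [MAX_CHECKS]
# Poll the cluster until "ceph health" reports HEALTH_OK.
# HEALTH_CMD is a hypothetical override for testing; by default the
# real "ceph health" command is used.
wait_for_health_ok() {
    interval=${1:-30}
    tries=${2:-120}
    i=0
    while [ "$i" -lt "$tries" ]; do
        status=$(${HEALTH_CMD:-ceph health} 2>/dev/null)
        case $status in
            HEALTH_OK*) echo "cluster healthy"; return 0 ;;
        esac
        echo "waiting: ${status:-no response}"
        i=$((i + 1))
        sleep "$interval"
    done
    echo "gave up after $tries checks"
    return 1
}
```

Run wait_for_health_ok after each recreated OSD and only continue with the next one when it returns successfully.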
 
