[SOLVED] Ceph OSD issue

mada

Member
Aug 16, 2017
Hello,

I set up Ceph on 3 nodes, each with 3 x 5TB drives and 2 SSDs as journal devices. However, after I created the OSDs they are not visible in the GUI and show as down on the command line (via PuTTY). I searched Google but couldn't find a solution.
root@xxx:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0 root default
0 0 osd.0 down 1.00000 1.00000
1 0 osd.1 down 1.00000 1.00000
2 0 osd.2 down 1.00000 1.00000
3 0 osd.3 down 1.00000 1.00000
4 0 osd.4 down 1.00000 1.00000
5 0 osd.5 down 1.00000 1.00000
6 0 osd.6 down 1.00000 1.00000
root@xxx:~#


root@xxxx:/var/log/ceph# cat /etc/ceph/ceph.conf
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.1.0/24
fsid = fb03a602-6640-425d-adeb-0eb01ada6b27
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network = 10.10.1.0/24

[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.ceph3]
host = ceph3
mon addr = 10.10.1.3:6789

[mon.ceph4]
host = ceph4
mon addr = 10.10.1.4:6789

[mon.ceph2]
host = ceph2
mon addr = 10.10.1.2:6789


root@xxx:~# ceph health detail |more
HEALTH_WARN 7 osds down
OSD_DOWN 7 osds down
osd.0 () is down
osd.1 () is down
osd.2 () is down
osd.3 () is down
osd.4 () is down
osd.5 () is down
osd.6 () is down
root@xxx:~#



proxmox-ve: 5.2-2 (running kernel: 4.15.17-1-pve)
pve-manager: 5.2-1 (running version: 5.2-1/0fcd7879)
pve-kernel-4.15: 5.2-1
pve-kernel-4.15.17-1-pve: 4.15.17-9
ceph: 12.2.5-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: not correctly installed
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-31
libpve-guest-common-perl: 2.0-16
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-23
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-18
pve-cluster: 5.0-27
pve-container: 2.0-23
pve-docs: 5.2-4
pve-firewall: 3.0-9
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-5
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-5
qemu-server: 5.0-26
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
 
This usually happens when the disks were used before. See the thread below for how to remove the OSDs (starting with post #2).
https://forum.proxmox.com/threads/phantom-destroyed-osd.43794/

Once done, use dd to zero the first ~200 MB of the disk; this is usually enough to remove any leftovers. After that, creating the OSD should work.
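
As a rough sketch, assuming the OSD disk is /dev/sdX (adjust the device name to your setup; this destroys anything left on the disk):

# zero the first ~200 MB to clear leftover partition tables and Ceph metadata
dd if=/dev/zero of=/dev/sdX bs=1M count=200 conv=fdatasync
# make the kernel re-read the now empty partition table
partprobe /dev/sdX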
 

I tried both ways and it is still not fixed.

Remove the OSD:
# ceph osd out osd.{osd-num}
# ceph osd crush remove osd.{osd-num}
# ceph osd down osd.{osd-num}
# ceph auth del osd.{osd-num}
# ceph osd rm osd.{osd-num}

parted /dev/sdb
sgdisk -Z /dev/sdb

root@xx:~# dd if=/dev/sdb of=/fil.img
^C2233312+0 records in
2233312+0 records out
1143455744 bytes (1.1 GB, 1.1 GiB) copied, 7.74869 s, 148 MB/s
root@xx:~#

root@xx:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-2 0 host ceph2
-1 0 root default
0 0 osd.0 down 0 1.00000
root@xx:~#

I formatted again and did the following:

removed the OSD
parted /dev/sdb
sgdisk -Z /dev/sdb

root@xx:~# dd if=/dev/null of=/dev/sdb
0+0 records in
0+0 records out
0 bytes copied, 0.000122055 s, 0.0 kB/s
root@xx:~#

root@xx:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-2 0 host ceph2
-1 0 root default
0 0 osd.0 down 0 1.00000
root@xx:~#


Still the same issue.
 
More of an update: I just used

dd if=/dev/zero of=/dev/sdd bs=1M count=1024 conv=fdatasync

It works now.

Thanks.