cannot stop ceph OSD from command line

Magneto

Well-Known Member
Jul 30, 2017
Hi,

I am trying to stop a Ceph OSD from the command line, but it won't stop:

Code:
root@virt2:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.50461 root default
-3       21.83487     host virt1
 0   hdd  7.27829         osd.0      up  1.00000 1.00000
 1   hdd  7.27829         osd.1      up  1.00000 1.00000
 2   hdd  7.27829         osd.2      up  1.00000 1.00000
-5       21.83487     host virt2
 3   hdd  7.27829         osd.3      up  1.00000 1.00000
 4   hdd  7.27829         osd.4      up  1.00000 1.00000
 5   hdd  7.27829         osd.5      up  1.00000 1.00000
-7       21.83487     host virt3
 6   hdd  7.27829         osd.6      up  1.00000 1.00000
 7   hdd  7.27829         osd.7      up  1.00000 1.00000
 8   hdd  7.27829         osd.8      up  1.00000 1.00000
root@virt2:~# /etc/init.d/ceph stop osd.3
[ ok ] Stopping ceph (via systemctl): ceph.service.

It doesn't produce any errors, but it doesn't stop either.

Code:
root@virt2:~# ps aux  | grep osd
ceph      2297  1.5  1.8 2885032 2171084 ?     Ssl  Nov06  23:32 /usr/bin/ceph-osd -f --cluster ceph --id 4 --setuser ceph --setgroup ceph
ceph      2421  1.6  1.9 3007524 2243800 ?     Ssl  Nov06  26:17 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
ceph      2765  1.3  1.8 2837440 2112020 ?     Ssl  Nov06  21:22 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
root     26124  0.0  0.0 352440 30112 pts/0    Sl+  13:04   0:00 /usr/bin/python2.7 /usr/bin/ceph osd perf
root     26128  0.0  0.0  12788   988 pts/2    S+   13:04   0:00 grep osd
root@virt2:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.50461 root default
-3       21.83487     host virt1
 0   hdd  7.27829         osd.0      up  1.00000 1.00000
 1   hdd  7.27829         osd.1      up  1.00000 1.00000
 2   hdd  7.27829         osd.2      up  1.00000 1.00000
-5       21.83487     host virt2
 3   hdd  7.27829         osd.3      up  1.00000 1.00000
 4   hdd  7.27829         osd.4      up  1.00000 1.00000
 5   hdd  7.27829         osd.5      up  1.00000 1.00000
-7       21.83487     host virt3
 6   hdd  7.27829         osd.6      up  1.00000 1.00000
 7   hdd  7.27829         osd.7      up  1.00000 1.00000
 8   hdd  7.27829         osd.8      up  1.00000 1.00000

I can, however, stop it from within the Proxmox web interface:

Code:
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.50461 root default
-3       21.83487     host virt1
 0   hdd  7.27829         osd.0      up  1.00000 1.00000
 1   hdd  7.27829         osd.1      up  1.00000 1.00000
 2   hdd  7.27829         osd.2      up  1.00000 1.00000
-5       21.83487     host virt2
 3   hdd  7.27829         osd.3    down  1.00000 1.00000
 4   hdd  7.27829         osd.4      up  1.00000 1.00000
 5   hdd  7.27829         osd.5      up  1.00000 1.00000
-7       21.83487     host virt3
 6   hdd  7.27829         osd.6      up  1.00000 1.00000
 7   hdd  7.27829         osd.7      up  1.00000 1.00000
 8   hdd  7.27829         osd.8      up  1.00000 1.00000
root@virt2:~# ps aux  | grep osd
ceph      2297  1.5  1.8 2885032 2171084 ?     Ssl  Nov06  23:33 /usr/bin/ceph-osd -f --cluster ceph --id 4 --setuser ceph --setgroup ceph
ceph      2421  1.6  1.9 3007524 2264100 ?     Ssl  Nov06  26:18 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
root     27947  0.0  0.0  12788   996 pts/2    S+   13:05   0:00 grep osd

Code:
root@virt2:~# pveversion -v
proxmox-ve: 5.1-25 (running kernel: 4.13.4-1-pve)
pve-manager: 5.1-36 (running version: 5.1-36/131401db)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
pve-kernel-4.10.17-2-pve: 4.10.17-20
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.2-pve1~bpo90
ceph: 12.2.1-pve3


What command does Proxmox use to stop the OSD?
 
Code:
systemctl stop ceph-osd@ID
(replace ID accordingly)
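To pick a valid ID, it can help to list the OSDs that actually run on the current node. The following is a minimal sketch, not a Proxmox or Ceph tool: `local_osd_ids` is a hypothetical helper name, and it assumes the `ps aux` layout shown earlier in this thread, where each daemon appears as `/usr/bin/ceph-osd ... --id N`.

```shell
#!/bin/sh
# Hypothetical helper: print the ids of all ceph-osd processes found in
# `ps aux` output supplied on stdin, one per line, sorted numerically.
# Assumes the command line format seen in this thread:
#   /usr/bin/ceph-osd -f --cluster ceph --id 4 --setuser ceph --setgroup ceph
local_osd_ids() {
    awk '
        {
            is_osd = 0
            for (i = 1; i <= NF; i++) {
                # Only lines whose command is the ceph-osd binary count.
                if ($i ~ /\/ceph-osd$/) { is_osd = 1 }
                # The field following --id is the OSD id.
                if (is_osd && $i == "--id") { print $(i + 1) }
            }
        }
    ' | sort -n
}

# Usage (run on the node in question):
#   ps aux | local_osd_ids
#   systemctl stop ceph-osd@3    # only for an id printed above
```

Any id that the helper does not print has no local daemon, so there is nothing on this node for `systemctl stop ceph-osd@ID` to stop.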
 
Hi Fabian,

That doesn't seem to work:

Code:
root@virt2:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.50461 root default
-3       21.83487     host virt1
 0   hdd  7.27829         osd.0      up  1.00000 1.00000
 1   hdd  7.27829         osd.1      up  1.00000 1.00000
 2   hdd  7.27829         osd.2      up  1.00000 1.00000
-5       21.83487     host virt2
 3   hdd  7.27829         osd.3      up  1.00000 1.00000
 4   hdd  7.27829         osd.4      up  1.00000 1.00000
 5   hdd  7.27829         osd.5      up  1.00000 1.00000
-7       21.83487     host virt3
 6   hdd  7.27829         osd.6      up  1.00000 1.00000
 7   hdd  7.27829         osd.7      up  1.00000 1.00000
 8   hdd  7.27829         osd.8      up  1.00000 1.00000
root@virt2:~# systemctl stop ceph-osd@1
root@virt2:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.50461 root default
-3       21.83487     host virt1
 0   hdd  7.27829         osd.0      up  1.00000 1.00000
 1   hdd  7.27829         osd.1      up  1.00000 1.00000
 2   hdd  7.27829         osd.2      up  1.00000 1.00000
-5       21.83487     host virt2
 3   hdd  7.27829         osd.3      up  1.00000 1.00000
 4   hdd  7.27829         osd.4      up  1.00000 1.00000
 5   hdd  7.27829         osd.5      up  1.00000 1.00000
-7       21.83487     host virt3
 6   hdd  7.27829         osd.6      up  1.00000 1.00000
 7   hdd  7.27829         osd.7      up  1.00000 1.00000
 8   hdd  7.27829         osd.8      up  1.00000 1.00000
 
Are you running the command on the right host? What does "journalctl -b -u ceph-osd@ID" say?
 
Aha! So I can only run that command from the host node on which the OSD resides, not from any host node. Gotcha ;)
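
Since the stop command has to run on the node that hosts the OSD, that node can be read from the `ceph osd tree` output before running anything. The following is a rough sketch under stated assumptions: `osd_host_from_tree` is a hypothetical helper name, and it relies on the plain-text tree layout shown in this thread (a `host NAME` row followed by that host's `osd.N` rows).

```shell
#!/bin/sh
# Hypothetical helper: given an OSD id as $1 and `ceph osd tree` output on
# stdin, print the name of the host bucket that contains that OSD.
# Exits non-zero if the OSD does not appear in the tree.
osd_host_from_tree() {
    awk -v target="osd.$1" '
        {
            for (i = 1; i <= NF; i++) {
                # A "host NAME" row starts a new host bucket.
                if ($i == "host") { host = $(i + 1) }
                # The first matching osd.N row belongs to the current host.
                if ($i == target) { print host; found = 1; exit }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage (from any node that can run the ceph CLI):
#   ceph osd tree | osd_host_from_tree 1     # prints virt1 for this cluster
# Then run the stop command on that node, for example over SSH:
#   ssh root@virt1 systemctl stop ceph-osd@1
```

In the failed attempt above, osd.1 sits under host virt1 while the command was issued on virt2, which matches the conclusion reached in the thread.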