Hi,
pveversion:
pve-manager/6.2-12/b287dd27 (running kernel: 5.4.65-1-pve)
We have a 3-node PVE cluster with Ceph.
Recently, on one of the nodes (pxmx3), we pulled out and reinserted (hot-swapped) the same HDD.
Before the pull, the device name was /dev/sdg.
After reinsertion, the device name became /dev/sdi.
Because of this, our osd.16 on host pxmx3 went down (down and out). We tried starting osd.16 from the Proxmox GUI, but with no luck.
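For reference, this is roughly how we checked the state from the CLI (assuming the standard systemd unit name for the OSD daemon; the by-id lookup is just to confirm which physical disk now answers as /dev/sdi):

# confirm osd.16 is down/out in the cluster map
ceph osd tree | grep osd.16
# check the OSD daemon on pxmx3
systemctl status ceph-osd@16
# confirm which physical disk is now /dev/sdi
ls -l /dev/disk/by-id/ | grep sdi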
We destroyed osd.16 from the GUI, but we can still see osd.16 when we run the ceph-volume command.
When we checked the task log of the destroy command for osd.16, we found:
destroy OSD osd.16
Remove osd.16 from the CRUSH map
Remove the osd.16 authentication key.
Remove OSD osd.16
--> Zapping: /dev/ceph-78396b57-965e-497b-9c03-49e5a4747435/osd-block-12445079-ad19-4157-b7e5-f3cbb4ca71f9
--> Unmounting /var/lib/ceph/osd/ceph-16
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-16
stderr: umount: /var/lib/ceph/osd/ceph-16 unmounted
Running command: /bin/dd if=/dev/zero of=/dev/ceph-78396b57-965e-497b-9c03-49e5a4747435/osd-block-12445079-ad19-4157-b7e5-f3cbb4ca71f9 bs=1M count=10 conv=fsync
stderr: /bin/dd: fsync failed for '/dev/ceph-78396b57-965e-497b-9c03-49e5a4747435/osd-block-12445079-ad19-4157-b7e5-f3cbb4ca71f9': Input/output error
stderr: 10+0 records in
10+0 records out
stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0150309 s, 698 MB/s
--> RuntimeError: command returned non-zero exit status: 1
command '/usr/sbin/ceph-volume lvm zap --osd-id 16 --destroy' failed: exit code 1
command '/sbin/pvremove /dev/sdi' failed: Insecure dependency in exec while running with -T switch at /usr/share/perl/5.28/IPC/Open3.pm line 178.
TASK OK
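Since the task failed at the zap step, we are wondering whether we should simply retry the zap manually, roughly like this (device name taken from the log above; we have not run these yet, please confirm they are safe):

# retry the zap by OSD id (the same command the destroy task ran)
ceph-volume lvm zap --osd-id 16 --destroy
# or zap the raw device directly, which should also clear the LVM metadata
ceph-volume lvm zap /dev/sdi --destroy

We also noticed that the dd step failed with an Input/output error, so we are not sure whether the disk itself is still healthy.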
We then tried to create a new OSD from the GUI, but it shows "No Disk Unused".
When we run the command pveceph osd create, it gives the error "already in use".
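For completeness, the create attempt looked roughly like this (device name as it appears after reinsertion):

# try to recreate the OSD on the reinserted disk
pveceph osd create /dev/sdi
# -> fails with "already in use"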
Current Ceph OSD status:
ceph-volume lvm list:
someuser@pxmx3:~$ sudo ceph-volume lvm list
[sudo] password for someuser:
====== osd.12 ======

  [block]       /dev/ceph-45ec37f5-20cf-43ec-9972-774efaac5fdd/osd-block-07e247d6-3432-4aef-b882-ceed3422a1dd

      block device              /dev/ceph-45ec37f5-20cf-43ec-9972-774efaac5fdd/osd-block-07e247d6-3432-4aef-b882-ceed3422a1dd
      block uuid                t4j0ea-9eXx-npcI-t0y6-JdxJ-Arkq-DzYKLI
      cephx lockbox secret
      cluster fsid              57ba4b78-40ba-41f6-848d-16b51ab7447f
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  07e247d6-3432-4aef-b882-ceed3422a1dd
      osd id                    12
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sdb

====== osd.13 ======

  [block]       /dev/ceph-f36b0660-b413-48d9-a286-a871ce28f11c/osd-block-51d9c77f-7d66-497e-b343-7af05de6b9cd

      block device              /dev/ceph-f36b0660-b413-48d9-a286-a871ce28f11c/osd-block-51d9c77f-7d66-497e-b343-7af05de6b9cd
      block uuid                P3oC03-WRJT-2EMf-Tue6-xWi2-htR6-ueu2ZI
      cephx lockbox secret
      cluster fsid              57ba4b78-40ba-41f6-848d-16b51ab7447f
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  51d9c77f-7d66-497e-b343-7af05de6b9cd
      osd id                    13
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sdc

====== osd.14 ======

  [block]       /dev/ceph-c069b553-6114-40ef-98b8-fcb2fdd7b33b/osd-block-6d2d0a5b-ecdf-4107-afbb-7e8c78a3b4d6

      block device              /dev/ceph-c069b553-6114-40ef-98b8-fcb2fdd7b33b/osd-block-6d2d0a5b-ecdf-4107-afbb-7e8c78a3b4d6
      block uuid                lFd782-dRVn-G5hM-V5eJ-DO40-6wiY-3Re2ZO
      cephx lockbox secret
      cluster fsid              57ba4b78-40ba-41f6-848d-16b51ab7447f
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  6d2d0a5b-ecdf-4107-afbb-7e8c78a3b4d6
      osd id                    14
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sde

====== osd.16 ======

  [block]       /dev/ceph-78396b57-965e-497b-9c03-49e5a4747435/osd-block-12445079-ad19-4157-b7e5-f3cbb4ca71f9

      block device              /dev/ceph-78396b57-965e-497b-9c03-49e5a4747435/osd-block-12445079-ad19-4157-b7e5-f3cbb4ca71f9
      block uuid                WXP0D7-bj7r-MhJY-a7Jh-TQ08-pdEK-PNxpRY
      cephx lockbox secret
      cluster fsid              57ba4b78-40ba-41f6-848d-16b51ab7447f
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  12445079-ad19-4157-b7e5-f3cbb4ca71f9
      osd id                    16
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sdi

====== osd.17 ======

  [block]       /dev/ceph-ba5d669b-2ccb-4866-bd9a-a381a59f708e/osd-block-24f8a30c-7b8d-4333-b11f-cef677474ae8

      block device              /dev/ceph-ba5d669b-2ccb-4866-bd9a-a381a59f708e/osd-block-24f8a30c-7b8d-4333-b11f-cef677474ae8
      block uuid                HUaw80-YmSs-qRUn-OYYE-5loR-fK4s-qHasb2
      cephx lockbox secret
      cluster fsid              57ba4b78-40ba-41f6-848d-16b51ab7447f
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  24f8a30c-7b8d-4333-b11f-cef677474ae8
      osd id                    17
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sdh
lsblk:
Please help us understand and solve this issue.
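Our current thinking is that the LVM metadata left behind by the failed zap is what blocks re-creation, and that we may need to remove it manually before recreating the OSD. A rough sketch (VG and PV names taken from the output above; these commands are destructive and we have not run them yet):

# inspect leftover Ceph LVs/VGs/PVs on this node
lvs; vgs; pvs
# remove the stale volume group and physical volume from the old osd.16
vgremove ceph-78396b57-965e-497b-9c03-49e5a4747435
pvremove /dev/sdi
# wipe any remaining signatures before recreating the OSD
wipefs -a /dev/sdi

Is this the right approach, or is there a cleaner way via pveceph?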
Thanks,
Jayesh