osd move issue

RobFantini

trying to move an osd created using latest ceph nautilus on pve6

at pve > ceph > osd : stop the osd

physically move it

reload pve page. osd still shows on original node.

so moved the ssd back to the original node.

reload pve page, pressed start for the osd . fail:
Code:
Job for ceph-osd@26.service failed because the control process exited with error code.
See "systemctl status ceph-osd@26.service" and "journalctl -xe" for details.
TASK ERROR: command '/bin/systemctl start ceph-osd@26' failed: exit code 1

Code:
 # systemctl status ceph-osd@26.service
● ceph-osd@26.service - Ceph object storage daemon osd.26
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-osd@.service.d
           └─ceph-after-pve-cluster.conf
   Active: failed (Result: exit-code) since Tue 2019-08-13 10:13:42 EDT; 3min 3s ago
  Process: 2357817 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 26 (code=exited, status=0/SUCCESS)
  Process: 2357822 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 26 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
 Main PID: 2357822 (code=exited, status=1/FAILURE)

Aug 13 10:13:42 sys8 systemd[1]: Failed to start Ceph object storage daemon osd.26.
Aug 13 10:13:52 sys8 systemd[1]: ceph-osd@26.service: Start request repeated too quickly.
Aug 13 10:13:52 sys8 systemd[1]: ceph-osd@26.service: Failed with result 'exit-code'.
Aug 13 10:13:52 sys8 systemd[1]: Failed to start Ceph object storage daemon osd.26.
Aug 13 10:14:29 sys8 systemd[1]: ceph-osd@26.service: Start request repeated too quickly.
Aug 13 10:14:29 sys8 systemd[1]: ceph-osd@26.service: Failed with result 'exit-code'.
Aug 13 10:14:29 sys8 systemd[1]: Failed to start Ceph object storage daemon osd.26.
Aug 13 10:16:36 sys8 systemd[1]: ceph-osd@26.service: Start request repeated too quickly.
Aug 13 10:16:36 sys8 systemd[1]: ceph-osd@26.service: Failed with result 'exit-code'.
Aug 13 10:16:36 sys8 systemd[1]: Failed to start Ceph object storage daemon osd.26.


Am I doing something wrong?

Note this pool is not used so I can do plenty of testing today.
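
The status excerpt above mostly shows systemd's "Start request repeated too quickly" messages, which hide the reason ceph-osd exited with status 1. A minimal sketch of where to look next, written as a dry run that only prints commands. The function name `osd_debug_cmds` is made up for illustration, and the log path assumes the default Ceph log location:

```shell
#!/bin/sh
# Dry-run helper: prints diagnostic commands for a failed OSD unit instead of
# running them. osd_debug_cmds is a hypothetical name, not a Proxmox/Ceph tool.
osd_debug_cmds() {
    id="$1"
    # Full unit log; the status excerpt mostly shows the rate-limit messages.
    echo "journalctl -u ceph-osd@${id} --no-pager -n 50"
    # Default OSD log location on a stock install (assumption).
    echo "tail -n 50 /var/log/ceph/ceph-osd.${id}.log"
    # Clear the failed state so a later start is not rejected as
    # 'repeated too quickly'.
    echo "systemctl reset-failed ceph-osd@${id}"
    echo "systemctl start ceph-osd@${id}"
}

osd_debug_cmds 26
```

Running the printed commands by hand on the affected node should show the actual error from ceph-osd rather than only the systemd summary.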
 
What does the layout of the OSD look like? And was the OSD set out too?
 

on the 1st test I did not press out.

a few minutes ago i did another test.

pve: stop . out
move disk
reload osd page at pve
osd still shows at orig node.
moved ssd back to orig node.

press IN for the osd. reload. in showed ok/green.

press start: pve reported 'TASK OK'

reload page, shows osd as down/in

cli: dmesg is what i have to show:
Code:
[Tue Aug 13 10:57:23 2019] libceph: osd56 down
[Tue Aug 13 10:57:37 2019] libceph: osd56 weight 0x0 (out)



[Tue Aug 13 10:58:16 2019] sd 0:0:4:0: device_block, handle(0x000d)
[Tue Aug 13 10:58:17 2019] sd 0:0:4:0: device_unblock and setting to running, handle(0x000d)
[Tue Aug 13 10:58:17 2019] sd 0:0:4:0: [sdn] Synchronizing SCSI cache
[Tue Aug 13 10:58:17 2019] sd 0:0:4:0: [sdn] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[Tue Aug 13 10:58:17 2019] mpt3sas_cm0: removing handle(0x000d), sas_addr(0x4433221103000000)
[Tue Aug 13 10:58:17 2019] mpt3sas_cm0: enclosure logical id(0x500304801b872f01), slot(3)
[Tue Aug 13 10:58:17 2019] mpt3sas_cm0: enclosure level(0x0000), connector name(     )
[Tue Aug 13 11:00:03 2019] scsi 0:0:5:0: Direct-Access     ATA      INTEL SSDSC2BX40 DL22 PQ: 0 ANSI: 6
[Tue Aug 13 11:00:03 2019] scsi 0:0:5:0: SATA: handle(0x000d), sas_addr(0x4433221103000000), phy(3), device_name(0x0000000000000000)
[Tue Aug 13 11:00:03 2019] scsi 0:0:5:0: enclosure logical id (0x500304801b872f01), slot(3)
[Tue Aug 13 11:00:03 2019] scsi 0:0:5:0: enclosure level(0x0000), connector name(     )
[Tue Aug 13 11:00:03 2019] scsi 0:0:5:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: Power-on or device reset occurred
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: Attached scsi generic sg13 type 0
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: [sdq] 781422768 512-byte logical blocks: (400 GB/373 GiB)
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: [sdq] 4096-byte physical blocks
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: [sdq] Write Protect is off
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: [sdq] Mode Sense: 9b 00 10 08
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: [sdq] Write cache: enabled, read cache: enabled, supports DPO and FUA
[Tue Aug 13 11:00:03 2019] sd 0:0:5:0: [sdq] Attached SCSI disk
[Tue Aug 13 11:00:51 2019] libceph: osd56 weight 0x10000 (in)
[Tue Aug 13 11:01:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:22 2019] Buffer I/O error on dev dm-0, logical block 97517552, async page read
[Tue Aug 13 11:01:22 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:22 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Tue Aug 13 11:01:22 2019] Buffer I/O error on dev dm-0, logical block 0, async page read

status:
Code:
# systemctl status ceph-osd@56.service
● ceph-osd@56.service - Ceph object storage daemon osd.56
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-osd@.service.d
           └─ceph-after-pve-cluster.conf
   Active: failed (Result: exit-code) since Tue 2019-08-13 11:01:31 EDT; 5min ago
  Process: 1749832 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 56 (code=exited, status=0/SUCCESS)
  Process: 1749837 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 56 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
 Main PID: 1749837 (code=exited, status=1/FAILURE)

Aug 13 11:01:31 pve10 systemd[1]: ceph-osd@56.service: Service RestartSec=100ms expired, scheduling restart.
Aug 13 11:01:31 pve10 systemd[1]: ceph-osd@56.service: Scheduled restart job, restart counter is at 3.
Aug 13 11:01:31 pve10 systemd[1]: Stopped Ceph object storage daemon osd.56.
Aug 13 11:01:31 pve10 systemd[1]: ceph-osd@56.service: Start request repeated too quickly.
Aug 13 11:01:31 pve10 systemd[1]: ceph-osd@56.service: Failed with result 'exit-code'.
Aug 13 11:01:31 pve10 systemd[1]: Failed to start Ceph object storage daemon osd.56.
 
So I have done 3 tests so far. I am moving the OSDs back to their original nodes and attempting to get them back in and up.

ceph osd tree shows the drives as down.

on the 1st node - restarting the node brought the osd back up.

i'll try a few more things to get the osd's up, and if no luck will restart the nodes.
 
The place on the crushmap will only change once the OSD has started on another node. And depending on how your OSDs were created, they may need additional steps to sync the last content to the OSD.
 

I understand. And there should be a defined way for a Nautilus-created OSD to be moved from one node to another. If you or someone else has a suggested plan, I can test it in our lab.
 
Off the top of my head:
  1. set OSD out
  2. stop OSD service
  3. deactivate LVM (if OSD made with ceph-volume) / unmount OSD partition
  4. remove disk from server
  5. insert disk into the other server
  6. restart ceph-osd.service to restart/re-activate OSDs (or activate with ceph-volume)
Then the location on the crush map should also be on the other server.
 

Hello Alwin - How do I deactivate an OSD's LVM?
I read the ceph-volume man page and did not see an option to deactivate LVM.
 
You shouldn't need to, but just in case.
Code:
# logical volume
lvchange -an /dev/vg_name/lv_name

# volume group
vgchange -an vg_name
 
Hello
I tried moving an osd without lvm deactivate, that did not work.

So moved osd back, rebooted to activate it [ could not get it up otherwise]

1- at pve set osd out
2-at pve stop the osd

3- worked on deactivate lvm:
this seemed to work as no output resulted:
Code:
 lvchange -an  /dev/ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0/osd-block-76124e4d-8f42-4e82-97e3-6c67ef4a22c6


However I am not able to figure out how to make vgchange deactivate . I've read man page and searched for examples.

running the command outputs that the lv is active:
Code:
vgchange  -a n /dev/ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0
  0 logical volume(s) in volume group "ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0" now active

I'll continue to research and test, it has been a long time since using lvm...

Any suggestions on how to deactivate ?

PS: to get the lvm name for the osd:
Code:
 ls    -l      /var/lib/ceph/osd/ceph-68/block
then look for output from that command here:
Code:
lvs
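
The lookup above can also be done in one step by splitting the target of the block symlink into its VG and LV parts. A small sketch, assuming the symlink points at a path of the form /dev/<vg>/<lv> (which is how ceph-volume OSDs usually look). The helper name `vg_lv_from_path` is made up for illustration, and the sample path is the one from the lvchange command earlier in the thread:

```shell
#!/bin/sh
# Split an LVM device path of the form /dev/<vg>/<lv> into its two parts.
# vg_lv_from_path is a hypothetical helper name.
vg_lv_from_path() {
    path="$1"
    lv="${path##*/}"     # text after the last slash: the LV name
    rest="${path%/*}"    # drop the LV component
    vg="${rest##*/}"     # text after the remaining last slash: the VG name
    printf '%s %s\n' "$vg" "$lv"
}

# On a real node the path would come from the OSD's block symlink, e.g.:
#   vg_lv_from_path "$(readlink /var/lib/ceph/osd/ceph-68/block)"
vg_lv_from_path /dev/ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0/osd-block-76124e4d-8f42-4e82-97e3-6c67ef4a22c6
```

The first field printed is the name to pass to vgchange, the second is the LV name for lvchange.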
 
However I am not able to figure out how to make vgchange deactivate . I've read man page and searched for examples.
You can just specify the VG or LV name respectively, no patch needed.

PS: to get the lvm name for the osd:
Alternatively:
Code:
ceph-volume lvm list
 
I do not know what you mean by 'no patch needed'

Question - this should have worked ?
Code:
vgchange  -a n /dev/ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0
  0 logical volume(s) in volume group "ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0" now active
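
On the "now active" wording: vgchange reports how many logical volumes in the VG are active after the command has run, so a count of 0 after `-a n` indicates that deactivation worked. A small sketch that reads the count out of such a line. The helper name `active_lv_count` is made up for illustration:

```shell
#!/bin/sh
# Extract the leading count from a vgchange summary line.
# active_lv_count is a hypothetical helper name.
active_lv_count() {
    # Strip leading whitespace, then drop everything after the first field.
    printf '%s\n' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]].*$//'
}

line='  0 logical volume(s) in volume group "ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0" now active'
if [ "$(active_lv_count "$line")" -eq 0 ]; then
    echo "VG fully deactivated"
else
    echo "some LVs still active"
fi
```

With the line from the post this takes the first branch, i.e. the VG has no active LVs left.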
 
Hello

please note: the ' now active ' :
Code:
vgchange  -a n /dev/ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0
  0 logical volume(s) in volume group "ceph-eab7fc8d-051a-4756-a8e5-1a3acb3e92c0" now active

as of now the only way i can move an osd from one system to another is to

1- stop/out/destroy
2- move
3- add an osd.


I have spent 10-12 hours trying different ways.
there seems to be something sticking the lvm to the original node.


Or I could be missing some piece of information... perhaps an osd move could be tried in someone's lab?
 
I forgot to add the export/import steps:
  1. set OSD out
  2. stop OSD service
  3. deactivate LVM (if OSD made with ceph-volume) / unmount OSD partition
  4. export the VG (vgexport <VG-ID>)
  5. remove disk from server
  6. insert disk into the other server
  7. run pvscan to see if the disk is seen by LVM
  8. import the VG (vgimport <VG-ID>)
  9. then activate the single OSD: ceph-volume lvm activate <ID> <osd fsid>
  10. last but not least, ceph osd in <ID>
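
The steps above can be sketched as a dry run that only prints the commands for each side of the move, so the sequence can be reviewed before anything is executed. The function names and argument order are made up for illustration. The OSD ID, VG name and FSID in the usage lines are the ones from the activation log posted later in the thread, and would need to be replaced with the values of the OSD being moved:

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for each side of the move; nothing is
# executed. Function names and argument order are illustrative only.
osd_move_source_cmds() {
    id="$1"; vg="$2"
    echo "ceph osd out ${id}"               # step 1
    echo "systemctl stop ceph-osd@${id}"    # step 2
    echo "vgchange -an ${vg}"               # step 3: deactivate all LVs in the VG
    echo "vgexport ${vg}"                   # step 4
}

osd_move_target_cmds() {
    id="$1"; vg="$2"; fsid="$3"
    echo "pvscan"                                    # step 7
    echo "vgimport ${vg}"                            # step 8
    echo "ceph-volume lvm activate ${id} ${fsid}"    # step 9
    echo "ceph osd in ${id}"                         # step 10
}

echo "# on the original node"
osd_move_source_cmds 16 ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c
echo "# on the new node, after the disk has been moved"
osd_move_target_cmds 16 ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c 10cb7d13-893d-45d4-a711-0bd0a76194e6
```

Keeping the two halves separate mirrors the fact that they run on different nodes, with the physical move (steps 5 and 6) in between.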
 
OK, I followed the steps and was not able to move the drive.

the 1st error came after this:
Code:
 # vgimport ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c
  Volume group "ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c" successfully imported

then error here:
Code:
# ceph-volume lvm activate --all
--> OSD ID 3 FSID fdcc37da-c93e-4161-a4c3-45e82f695292 process is active. Skipping activation
--> Activating OSD ID 16 FSID 10cb7d13-893d-45d4-a711-0bd0a76194e6
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-16
--> Absolute path not found for executable: restorecon
--> Ensure $PATH environment variable contains common executable locations
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-16
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c/osd-block-10cb7d13-893d-45d4-a711-0bd0a76194e6 --path /var/lib/ceph/osd/ceph-16 --no-mon-config
 stderr: 2019-08-30 14:22:14.934 7f446e26e140 -1 bluestore(/dev/ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c/osd-block-10cb7d13-893d-45d4-a711-0bd0a76194e6) _read_bdev_label failed to open /dev/ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c/osd-block-10cb7d13-893d-45d4-a711-0bd0a76194e6: (2) No such file or directory
failed to read label for /dev/ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c/osd-block-10cb7d13-893d-45d4-a711-0bd0a76194e6: (2) No such file or directory
-->  RuntimeError: command returned non-zero exit status: 1

At pve > ceph > osd : The osd still shows at the original node as down and out.


I can post more notes on the steps I did
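
The activation error above says the LV device path does not exist on the new node. One possible reading, offered only as an assumption and not confirmed in this thread, is that the imported VG still had its LVs inactive, so no device node had been created yet. A small sketch for checking that; the helper name `lv_node_status` is made up for illustration:

```shell
#!/bin/sh
# Report whether a device path exists on this node.
# lv_node_status is a hypothetical helper name.
lv_node_status() {
    if [ -e "$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

vg=ceph-da1b7ac2-64fc-47e0-8c21-3ba9507da14c
dev="/dev/${vg}/osd-block-10cb7d13-893d-45d4-a711-0bd0a76194e6"
if [ "$(lv_node_status "$dev")" = "missing" ]; then
    # Assumption: LVs of a freshly imported VG may still be inactive.
    # The commands are printed only, not executed.
    echo "device node missing; inspect LV state with: lvs -o lv_name,vg_name,lv_active"
    echo "then try activating with: vgchange -ay ${vg}"
fi
```

If the node shows up after activating the VG, re-running the ceph-volume activation would be the next thing to try.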
 
ATM, the only thing I can think of is that there might have been data in the cache that was not written to disk. If you try again, you could run blockdev --flushbufs /dev/sdX once the OSD has been stopped, and again after the VG has been exported. That might write all outstanding data to disk.
 
For now we are done with the osd moves, so perhaps someone else could do further testing. In about 2 weeks we'll have 2 more to move. It would be nice if this just worked as before.
 