CEPH - OSD List virtual drives

watnow101

Hi all.

Is there a way I can get a list of the disk drives of VMs on a specific OSD?

Currently I am running the following.

ceph osd map SSD vm-100-disk-0

But that is when I already know the disk name.

I want to find out which VM is hogging disk IO on the OSD that is currently sluggish.
 
Is there a way I can get a list of the disk drives of VMs on a specific OSD?

Not directly.

Currently I am running the following.

ceph osd map SSD vm-100-disk-0

The above command does not show where the data is stored. To get this, a few more commands have to be run:

Code:
rbd -p <pool> info <virtual-disk>

This returns some basic information; note the field block_name_prefix, which looks e.g. like "rbd_data.1603ec6b8b4567".
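If you only need that prefix by itself, it can also be pulled out of the machine-readable output (just a sketch, assuming jq is installed; pool SSD and disk vm-100-disk-0 are the names from the example above):
Code:
rbd -p SSD info vm-100-disk-0 --format json | jq -r '.block_name_prefix'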

A list of all data objects in the pool can be created by
Code:
rados ls -p <pool>

There will be a huge amount of output; to get the objects assigned to the respective virtual disk, select the items containing the block_name_prefix determined above.
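For example, with the prefix from above:
Code:
rados ls -p SSD | grep rbd_data.1603ec6b8b4567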

Afterwards, the location where an object is stored (i.e. on which OSDs) can be determined with
Code:
ceph osd map <pool> <object>

Doing this for each object, you will get all OSDs on which the virtual disk is stored.

Here is an example of how to do all of this in one go for one disk:
Code:
rados ls -p SSD | grep  `rbd -p SSD info vm-100-disk-0 | grep block | cut -b 21-43` | sed 's/rbd_/ceph osd map SSD rbd_/g' > tempfile;chmod 777 tempfile;./tempfile
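The same can also be done without the temporary file, e.g. with a small loop (only a sketch, using the same pool and disk names as above; the awk call assumes the plain "block_name_prefix: ..." line format of rbd info):
Code:
# extract the block_name_prefix of the virtual disk
prefix=$(rbd -p SSD info vm-100-disk-0 | awk '/block_name_prefix/ {print $2}')
# map every object belonging to this disk to its placement group and OSDs
rados ls -p SSD | grep "$prefix" | while read -r obj; do
    ceph osd map SSD "$obj"
done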

But that is when I already know the disk name.
Yes, if you want to get the information for a certain OSD, you have to do this for all virtual disks and select the desired OSD (see the sketch at the end of this post).
I want to find out which VM is hogging disk IO on the OSD that is currently sluggish.
If you have a look at the data for one disk, you will probably see that it is spread across (almost) all OSDs, so your request cannot be fulfilled the way you expect. If you have a (too) slow disk, simply remove it from your OSDs; its content will be automatically redistributed to the remaining OSDs.
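If you really want to approach it from the OSD side, one possibility is to loop over all images of a pool, map their objects and look at the reported OSD sets (a rough sketch only, assuming pool SSD; on large pools this takes a long time because every object is mapped individually):
Code:
# for each virtual disk in the pool, show which OSD sets its objects map to
for img in $(rbd ls -p SSD); do
    prefix=$(rbd -p SSD info "$img" | awk '/block_name_prefix/ {print $2}')
    echo "== $img =="
    rados ls -p SSD | grep "$prefix" | while read -r obj; do
        ceph osd map SSD "$obj"
    done | grep -o 'up (\[[0-9,]*\]' | sort | uniq -c
done
The OSD numbers inside the square brackets are the OSDs the objects are stored on, so you can then pick out the images that touch the OSD you are interested in.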
 
@Richard

I came across this and wanted to know how to interpret the output. I ran the
ceph osd map <pool> <object>

command and got:
[attached screenshot of the ceph osd map output: 1578037721271.png]

I don't understand which part of the result indicates which OSD the VM data is currently on.
When cross-referencing the different numbers (e.g. pg 19.adf.., or ([21,22], p21)...), those numbers don't seem to match up with the OSD numbers. Any guidance here?

Thank you for this, it helps somewhat, but an issue I have now is with an HDD pool, which doesn't respond to "rbd ls -l PoolHDD", so I don't expect the "ceph osd map..." command to work at the moment. Ideally, my goal is to figure out which physical disk a VM is currently running on.

I've been having HDD errors and have used some roundabout ways to move VMs to avoid messing with their OSD.

Thank you!