Ceph Quincy - rbd: listing images failed after 17.2.4 to 17.2.5 upgrade

cyp

Hi,

I have upgraded a Ceph Quincy cluster from 17.2.4 to 17.2.5. It seems to be running fine after restarting all monitors, managers and OSDs, but I get an error if I try to back up or start a VM:

Code:
ERROR: Backup of VM 386 failed - no such volume 'global:vm-386-disk-0'

I also get an error if I try to browse the storage content in the UI:
Code:
rbd error: rbd: listing images failed: (2) No such file or directory (500)

And the same happens if I try to get the long listing with the CLI:

Code:
root@pve04:~# rbd -p .mgr list
vm-386-disk-0
vm-393-disk-0
vm-396-disk-0
vm-397-disk-0
vm-591-disk-0
vm-593-disk-0
vm-597-disk-0
vm-792-disk-0
vm-799-disk-0
vm-894-disk-0
root@pve24:~# rbd -p .mgr list --long
rbd: error opening vm-386-disk-0: (2) No such file or directory
rbd: error opening vm-393-disk-0: (2) No such file or directory
rbd: error opening vm-396-disk-0: (2) No such file or directory
rbd: error opening vm-397-disk-0: (2) No such file or directory
rbd: error opening vm-591-disk-0: (2) No such file or directory
rbd: error opening vm-593-disk-0: (2) No such file or directory
rbd: error opening vm-597-disk-0: (2) No such file or directory
rbd: error opening vm-792-disk-0: (2) No such file or directory
rbd: error opening vm-799-disk-0: (2) No such file or directory
rbd: error opening vm-894-disk-0: (2) No such file or directory
NAME  SIZE  PARENT  FMT  PROT  LOCK
rbd: listing images failed: (2) No such file or directory
root@pve04:~#
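
The same listing failure can also be triggered through the Proxmox storage layer, which is what the UI does. A quick way to reproduce it from the CLI (a sketch, assuming the storage in question is the 'global' entry from the backup error above):

Code:
# list the content of the 'global' RBD storage; presumably fails with the
# same "listing images failed" error as the UI
pvesm list global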

Despite that, VMs launched before the upgrade continue to run fine.

Any advice to fix that?

Thanks!
 
Some additional information: I tried the rados command to get more details about the pool data.
The number of rbd_header objects matches the number of volumes.

Code:
root@pve24:~# rados -p .mgr ls |grep rbd_header
rbd_header.ad6c3c3396b67
rbd_header.e348823bba989
rbd_header.12376b7eef267f
rbd_header.b898284269e2b
rbd_header.d46f219a2e559
rbd_header.95f666fcb0920
rbd_header.b96ae18421a0c
rbd_header.10f6f7f6396b40
rbd_header.18349cffaa11c
rbd_header.b9a1af9d9f3cc
root@pve24:~#

But I don't have any rbd_id objects, so I guess the problem could be here.
Is there any way to recover these rbd_id objects?
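
For reference, with the standard format-2 image layout each image normally also has an rbd_id.<image-name> object that stores its internal id. A minimal check (a sketch, assuming that naming convention; adjust the pool and image name as needed):

Code:
# no rbd_id objects show up on this cluster
rados -p .mgr ls | grep rbd_id
# check for the id object of a single image and dump it if it exists
rados -p .mgr stat rbd_id.vm-386-disk-0
rados -p .mgr get rbd_id.vm-386-disk-0 - | hexdump -C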
 
The rbd_id objects do not seem to be mandatory; according to the docs, the rbd_directory object maps image names to ids:
https://docs.ceph.com/en/quincy/dev/rbd-layering/#renaming

So I took a look at the rbd_directory metadata, and it also seems OK; the volume names match the existing headers.


Code:
root@pve24:~# rados -p .mgr listomapvals rbd_directory
id_10f6f7f6396b40
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 35  39 37 2d 64 69 73 6b 2d  |....vm-597-disk-|
00000010  30                                                |0|
00000011

id_12376b7eef267f
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 38  39 34 2d 64 69 73 6b 2d  |....vm-894-disk-|
00000010  30                                                |0|
00000011

id_18349cffaa11c
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 35  39 31 2d 64 69 73 6b 2d  |....vm-591-disk-|
00000010  30                                                |0|
00000011

id_95f666fcb0920
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 35  39 33 2d 64 69 73 6b 2d  |....vm-593-disk-|
00000010  30                                                |0|
00000011

id_ad6c3c3396b67
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 37  39 39 2d 64 69 73 6b 2d  |....vm-799-disk-|
00000010  30                                                |0|
00000011

id_b898284269e2b
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 33  38 36 2d 64 69 73 6b 2d  |....vm-386-disk-|
00000010  30                                                |0|
00000011

id_b96ae18421a0c
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 33  39 37 2d 64 69 73 6b 2d  |....vm-397-disk-|
00000010  30                                                |0|
00000011

id_b9a1af9d9f3cc
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 33  39 33 2d 64 69 73 6b 2d  |....vm-393-disk-|
00000010  30                                                |0|
00000011

id_d46f219a2e559
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 37  39 32 2d 64 69 73 6b 2d  |....vm-792-disk-|
00000010  30                                                |0|
00000011

id_e348823bba989
value (17 bytes) :
00000000  0d 00 00 00 76 6d 2d 33  39 36 2d 64 69 73 6b 2d  |....vm-396-disk-|
00000010  30                                                |0|
00000011

name_vm-386-disk-0
value (17 bytes) :
00000000  0d 00 00 00 62 38 39 38  32 38 34 32 36 39 65 32  |....b898284269e2|
00000010  62                                                |b|
00000011

name_vm-393-disk-0
value (17 bytes) :
00000000  0d 00 00 00 62 39 61 31  61 66 39 64 39 66 33 63  |....b9a1af9d9f3c|
00000010  63                                                |c|
00000011

name_vm-396-disk-0
value (17 bytes) :
00000000  0d 00 00 00 65 33 34 38  38 32 33 62 62 61 39 38  |....e348823bba98|
00000010  39                                                |9|
00000011

name_vm-397-disk-0
value (17 bytes) :
00000000  0d 00 00 00 62 39 36 61  65 31 38 34 32 31 61 30  |....b96ae18421a0|
00000010  63                                                |c|
00000011

name_vm-591-disk-0
value (17 bytes) :
00000000  0d 00 00 00 31 38 33 34  39 63 66 66 61 61 31 31  |....18349cffaa11|
00000010  63                                                |c|
00000011

name_vm-593-disk-0
value (17 bytes) :
00000000  0d 00 00 00 39 35 66 36  36 36 66 63 62 30 39 32  |....95f666fcb092|
00000010  30                                                |0|
00000011

name_vm-597-disk-0
value (18 bytes) :
00000000  0e 00 00 00 31 30 66 36  66 37 66 36 33 39 36 62  |....10f6f7f6396b|
00000010  34 30                                             |40|
00000012

name_vm-792-disk-0
value (17 bytes) :
00000000  0d 00 00 00 64 34 36 66  32 31 39 61 32 65 35 35  |....d46f219a2e55|
00000010  39                                                |9|
00000011

name_vm-799-disk-0
value (17 bytes) :
00000000  0d 00 00 00 61 64 36 63  33 63 33 33 39 36 62 36  |....ad6c3c3396b6|
00000010  37                                                |7|
00000011

name_vm-894-disk-0
value (18 bytes) :
00000000  0e 00 00 00 31 32 33 37  36 62 37 65 65 66 32 36  |....12376b7eef26|
00000010  37 66                                             |7f|
00000012
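
For what it's worth, each omap value above is just a 4-byte little-endian length prefix followed by the string itself: 0d 00 00 00 is 13, the length of "vm-597-disk-0", and 0e 00 00 00 is 14, the length of "10f6f7f6396b40". A single mapping can also be fetched directly (a sketch, using keys from the listing above):

Code:
# look up the id stored for one image name, and the name stored for one id
rados -p .mgr getomapval rbd_directory name_vm-386-disk-0
rados -p .mgr getomapval rbd_directory id_b898284269e2b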
 
I am beyond confused; you are using the .mgr pool for RBD?

In my experience, disks are referenced by the path poolName/vm-id-disk-0, etc.
 
In my experience, disks are referenced by the path poolName/vm-id-disk-0, etc.

As I understand it, these are two ways of writing the same thing:

Code:
root@pve22:~# rbd -p .mgr info vm-386-disk-0
rbd: error opening image vm-386-disk-0: (2) No such file or directory
root@pve22:~# rbd info .mgr/vm-386-disk-0
rbd: error opening image vm-386-disk-0: (2) No such file or directory
 
Hey cyp,

as alyarb mentioned, I think you are looking in the wrong pool? In any case, using .mgr for VM disks may lead to problems.
 
I don't find any other pools:


Code:
root@pve24:~# ceph osd lspools
1 .mgr
root@pve24:~#
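
For completeness, the pool flags and application tags can also be inspected; a sketch:

Code:
# show per-pool details including the application tag (.mgr should carry the mgr application)
ceph osd pool ls detail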

The name with a dot at the beginning is a bit weird, so I also thought something happened to the name during the upgrade, but a backup of /etc/pve/storage.cfg shows it has been named like that since the install.

Code:
rbd: global
        content images,rootdir
        krbd 0
        pool .mgr
 
I think .mgr is a Ceph-internal pool.
Maybe its contents have been erased by the update.
 
I think you are right; the problem must come from the name: https://docs.ceph.com/en/latest/rados/operations/pools/#pool-names

I don't remember how it was created, probably just by adding a storage on top of this already existing default pool (maybe the Proxmox interface should add something to prevent that).

I have created a new pool using the default rbd name. I will try to find out whether I can manage to export the current image objects from the .mgr pool to rbd.
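
A rough sketch of the setup, in case it helps others (the pool name rbd and the storage id global-new are just examples; double-check the commands before running them):

Code:
# create and initialize a dedicated RBD pool
ceph osd pool create rbd
rbd pool init rbd

# /etc/pve/storage.cfg entry pointing at the new pool
rbd: global-new
        content images,rootdir
        krbd 0
        pool rbd

# a per-image copy would look roughly like this, but it requires the source
# images to be openable again, which is exactly what fails here
rbd export .mgr/vm-386-disk-0 - | rbd import - rbd/vm-386-disk-0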
 
By default, RBD is not even a permitted application on the .mgr pool. You definitely want to create a dedicated RBD pool.
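
A quick way to check which applications a pool permits (a sketch; on a stock cluster the .mgr pool should only list the mgr application, while a pool initialized with rbd pool init carries rbd):

Code:
ceph osd pool application get .mgr
ceph osd pool application get rbd
# if needed, the rbd application can also be enabled manually on a pool
ceph osd pool application enable rbd rbd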