[SOLVED] rbd error: rbd: listing images failed: (2) No such file or directory (500)

skywavecomm

When viewing the content of our Ceph RBD storage in the Proxmox GUI, it displays this error message. There are no errors for VMs running on the Ceph cluster, for moving disks, or anything else.

It's more annoying than anything, but how can I resolve this issue without having to create a new pool and transfer the data over from the existing one? That's not really a fix.

Thanks!
 
I see the same error in the GUI after a fresh 3-node cluster install of 6.0.5.
Everything seems to run fine, but the above error is shown at GUI > node > cephpool > Content.
/etc/pve/priv/ceph.client.admin.keyring is identical on all nodes.

What can be done instead of recreating?

Thanks.
 
Unfortunately I'm not sure what you would like to point out.

There's no external cluster - just the 3 nodes with onboard SSD used to create the ceph cluster with a single pool called "cephpool".

/etc/pve/storage.cfg shows:
rbd: cephpool
        content rootdir,images
        krbd 0
        pool cephpool

And cat /etc/pve/priv/ceph/cephpool.keyring shows:
[client.admin]
key = same-on-all-3-nodes==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"

The file /etc/ceph/ceph.client.admin.keyring only exists on one node (the first created) though.

/etc/pve/priv/ceph/cephfs.secret contains the same key, so that seems ok.
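For what it's worth, here is a quick way to check that the key itself authenticates from each node (just a sketch; the --id/--keyring values assume the PVE-managed keyring shown above):

rbd ls -p cephpool   # on the node that has /etc/ceph/ceph.client.admin.keyring
rbd ls -p cephpool --id admin --keyring /etc/pve/priv/ceph/cephpool.keyring   # on the other nodes, using the PVE-managed copy

Both should list the images in the pool if the key is accepted.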

Any other idea?
 
Ok, solved...

rbd ls -l cephpool
showed a few correct images but also:
rbd: error opening vm-124-disk-1: (2) No such file or directory
NAME           SIZE     PARENT  FMT  PROT  LOCK
vm-100-disk-0   80 GiB            2
vm-101-disk-0   80 GiB            2
vm-102-disk-0   80 GiB            2
vm-124-disk-2  550 GiB            2

vm-124-disk-1 was created via GUI > Create CT for an unprivileged container.
I then made the mistake of trying to restore a privileged container's backup - which obviously failed.
I'm not sure, but I think the container disappeared when the restore failed, or maybe I deleted it myself.
I then recreated the container with ID 124, which resulted in the image vm-124-disk-2 being created.
But vm-124-disk-1 was obviously not deleted correctly, and that resulted in this problem.

I simply deleted the image manually:
rbd rm vm-124-disk-1 -p cephpool

So there could be a problem after a failed restore...
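If you want to double-check before deleting (just a sketch, reusing the image name from the error line above):

rbd info cephpool/vm-124-disk-1

For an orphaned entry like this one, opening the image directly should fail with the same (2) No such file or directory; if rbd info succeeds instead, the image is real and should not be removed blindly.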
 
Yeah, if I cancel a disk move to a Ceph pool, it does say `Removing image: 1% complete...` but is then cancelled at 2%, so it seems that cancelling a disk move also cancels the removal of the disk on the Ceph pool. @Alwin
 
This is an old thread, please open up a new one. Also, if you have not done so already, upgrade to the latest packages.
 
I have encountered the same kind of problem (Proxmox v6) with Ceph RBD. The problem came from a VM image on the RBD pool that wasn't destroyed when the VM was destroyed.
I found the problem by switching from "rbd -p my-pool list" to "rbd -p my-pool list --long". There was one line more in the short version: that was the faulty image, which I removed with "rbd -p my-pool rm my-faulty-file".
 
Just a quick follow-up: I've encountered this error when moving a disk from local storage to Ceph storage. All tasks finished without errors, but somehow, if you already have a disk with the same name on the pool, Proxmox will rename your new image and update the VM config.

This happens when you cancel migration jobs - PVE does not clean up after a cancelled job.

Just list the pool with --long and without, then compare the results in an Excel file or with diff.
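For example (a rough sketch, assuming a pool named cephpool and a bash shell):

diff <(rbd ls -p cephpool | sort) <(rbd ls -l -p cephpool | awk 'NR>1 {print $1}' | sort)

Names that only appear in the first (plain) listing are the orphaned entries; those are the ones to remove with rbd rm.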
 
Unfortunately I ran into this problem (again).
The last time, the info given by wiguyot helped (removing a failed image), but not this time.
I can add an external, newly created Ceph pool via the GUI (or the command line).
As long as the pool is empty, I can run a list command on one of the Proxmox servers ("rbd ls -p rbd" or "rbd ls -p rbd --long").
But as soon as I copy something over to the new pool on one of the Ceph servers (e.g. "rbd deep cp BackupPool/vm-107-disk-0 rbd/vm-107-disk-0"),
"rbd ls -p rbd" still works (and shows vm-107-disk-0), but "rbd ls -p rbd --long" no longer does (on Proxmox). No output.
The web interface shows no images either and gives a connection error ("Connection timed out (596)").
(On the Ceph cluster itself, "rbd ls -p rbd --long" always shows the expected images.)

I have been stuck here for some days now. Could somebody point me in the right direction or give a hint where to debug further?

(Proxmox is v7.2.4 and Ceph is 17.2.0.)
Last test for today: when creating a new VM, the disk is actually created on the cluster/pool (as "rbd list --long" shows on the Ceph servers), but the VM cannot start and I cannot remove it afterwards...

Update: after rebooting every Ceph node one by one, it worked again all of a sudden... I have no clue what happened.
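In case it happens again, two things that might narrow it down before resorting to reboots (a sketch; the debug flags are generic Ceph client options, and the levels are just a guess at something verbose enough):

rbd ls -p rbd --long --debug-ms 1 --debug-rbd 20   # verbose client-side logging to see where the long listing stalls
ceph health detail                                 # look for slow/blocked requests that would explain the timeout

The reboot presumably just restarted whatever daemon or session was holding things up.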
 

The solution above (removing the orphaned image manually with rbd rm) works for me too. The GUI can now correctly show the Ceph pool, and the error message "rbd error: rbd: listing images failed: (2) No such file or directory (500)" has disappeared. Thanks!
 
In my case, the error appeared after removing a VM. The task output showed that the image was being removed ("100% complete. Done.") but was followed by this error. The VM was still shown under the respective node, and removing it again did not help. This has happened more than once.

I checked the content of the respective Ceph pool, but the disk image was nowhere to be seen.

I then just deleted the VM's conf file (in /etc/pve/qemu-server) and the VM from the GUI.
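Roughly what that cleanup looked like (a sketch; <vmid> and <pool> are placeholders, not the actual IDs from my setup):

rbd ls -l <pool> | grep vm-<vmid>     # confirm there really is no leftover disk image for the VM
rm /etc/pve/qemu-server/<vmid>.conf   # remove the stale VM definition; it then disappears from the GUI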
 
Addition: the disk was eventually removed from the virtual machine's configuration first (leaving the VM without an HDD), which of course I had not done the first time.
The task was to delete vm-106-disk-1.
Note that the different listing commands display the contents of the storage in different ways - only some of them showed vm-106-disk-1.
================================
Linux pve3 5.15.83-1-pve #1 SMP PVE 5.15.83-1 (2022-12-15T00:00Z) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Feb 1 16:11:18 MSK 2023 on pts/0

root@pve3:~# rbd -p pool-seph list
vm-100-disk-0
vm-101-disk-0
vm-101-disk-1
vm-102-disk-0
vm-104-disk-0
vm-105-disk-0
vm-107-disk-0
vm-106-disk-1

root@pve3:~# rbd -p pool-seph rm vm-106-disk-1
2023-02-01T16:23:51.715+0300 7f45d2704700 -1 librbd::image::preRemoveRequest: 0x7f45b0068cb0 check_image_watchers: image has watchers - not removing
Removing image: 0% complete...failed.
rbd: error: image still has watchers
This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.

root@pve3:~# rbd -p pool-seph list
vm-100-disk-0
vm-101-disk-0
vm-101-disk-1
vm-102-disk-0
vm-104-disk-0
vm-105-disk-0
vm-107-disk-0
vm-106-disk-1

root@pve3:~# rbd ls -l pool-seph
rbd: error opening vm-106-disk-1: (2) No such file or directory
NAME           SIZE     PARENT  FMT  PROT  LOCK
vm-100-disk-0   50 GiB            2        excl
vm-101-disk-0  100 GiB            2        excl
vm-101-disk-1  500 GiB            2        excl
vm-102-disk-0  200 GiB            2        excl
vm-104-disk-0  100 GiB            2        excl
vm-105-disk-0  100 GiB            2        excl
vm-107-disk-0  100 GiB            2        excl
rbd: listing images failed: (2) No such file or directory

root@pve3:~# rbd -p pool-seph list
vm-100-disk-0
vm-101-disk-0
vm-101-disk-1
vm-102-disk-0
vm-104-disk-0
vm-105-disk-0
vm-107-disk-0
vm-106-disk-1

root@pve3:~# rbd -p pool-seph rm vm-106-disk-1
Removing image: 100% complete...done.

root@pve3:~# rbd -p pool-seph list
vm-100-disk-0
vm-101-disk-0
vm-101-disk-1
vm-102-disk-0
vm-104-disk-0
vm-105-disk-0
vm-107-disk-0

root@pve3:~# rbd ls -l pool-seph
NAME           SIZE     PARENT  FMT  PROT  LOCK
vm-100-disk-0   50 GiB            2        excl
vm-101-disk-0  100 GiB            2        excl
vm-101-disk-1  500 GiB            2        excl
vm-102-disk-0  200 GiB            2        excl
vm-104-disk-0  100 GiB            2        excl
vm-105-disk-0  100 GiB            2        excl
vm-107-disk-0  100 GiB            2        excl

root@pve3:~#
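For reference, a way to see which client is still holding the image before retrying the removal (a sketch; rbd status lists the watchers on an image):

rbd status pool-seph/vm-106-disk-1   # shows the watcher (client address) that blocks the rm

Once the watcher is gone - the client has closed/unmapped the image or timed out, as the error message says - the rbd rm goes through, which matches what happened above.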
 
