[SOLVED] rbd error: rbd: listing images failed: (2) No such file or directory (500)

skywavecomm

New Member
May 30, 2019
When viewing the content of our Ceph RBD storage within the Proxmox GUI, it displays the error message above. There are no errors for running VMs on the Ceph cluster, moving disks, or anything else.

It's more annoying than anything, but how can I resolve this issue without creating a new pool and transferring the data over from the existing one? That's not really a fix.
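
For reference, the error can usually be reproduced from a node shell with a long listing of the pool (a rough check; <pool> is a placeholder for the actual pool name):

rbd ls -l <pool>    # a broken image shows up here as something like
                    # "rbd: error opening ...: (2) No such file or directory"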

Thanks!
 

skywavecomm

New Member
May 30, 2019
I just removed the pool and rebuilt it and the issue was resolved.

Thank you though!
 

Ralf Zenklusen

New Member
Apr 24, 2016
I see the same error in the GUI after a fresh 3-node cluster install of 6.0.5.
Everything seems to run fine, but the above error is shown at GUI > node > cephpool > content.
/etc/pve/priv/ceph.client.admin.keyring is identical on all nodes.

What can be done instead of recreating?

Thanks.
 

Ralf Zenklusen

New Member
Apr 24, 2016
Unfortunately I'm not sure what you would like to point out.

There's no external cluster - just the 3 nodes with onboard SSDs used to create the Ceph cluster, with a single pool called "cephpool".

/etc/pve/storage.cfg shows:

rbd: cephpool
        content rootdir,images
        krbd 0
        pool cephpool

And cat /etc/pve/priv/ceph/cephpool.keyring shows:

[client.admin]
        key = same-on-all-3-nodes==
        caps mds = "allow *"
        caps mgr = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"

The file /etc/ceph/ceph.client.admin.keyring only exists on one node (the first created) though.

/etc/pve/priv/ceph/cephfs.secret contains the same key, so that seems ok.
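
For reference, a quick way to cross-check that the key PVE stores for the storage matches what the cluster itself reports (a rough sketch, using the names from this thread):

ceph auth get-key client.admin; echo                 # the key the cluster knows for client.admin
grep 'key = ' /etc/pve/priv/ceph/cephpool.keyring    # the key PVE uses for the "cephpool" storage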

Any other idea?
 

Ralf Zenklusen

New Member
Apr 24, 2016
Ok, solved...

rbd ls -l cephpool
showed a few correct images but also:
rbd: error opening vm-124-disk-1: (2) No such file or directory
NAME           SIZE     PARENT  FMT  PROT  LOCK
vm-100-disk-0   80 GiB            2
vm-101-disk-0   80 GiB            2
vm-102-disk-0   80 GiB            2
vm-124-disk-2  550 GiB            2

vm-124-disk-1 was created via "GUI > Create CT" for an unprivileged container.
I then made the mistake of trying to restore a privileged container's backup, which obviously failed.
I'm not sure, but I think the container disappeared when the restore failed, or maybe I deleted it myself.
I then recreated the container with ID 124, which resulted in the image vm-124-disk-2 being created.
But vm-124-disk-1 was obviously not deleted correctly, and that caused this problem.

I simply deleted the image manually:
rbd rm vm-124-disk-1 -p cephpool

So there could be a problem after a failed restore...
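
For anyone checking the same thing: the stale name should still show up in the short listing even though the long listing errors on it (a rough sketch, reusing the pool and image names from this post):

rbd ls cephpool | grep vm-124-disk-1    # the directory entry is still listed here
rbd ls -l cephpool                      # ...but opening it fails with (2) No such file or directory
rbd rm vm-124-disk-1 -p cephpool        # remove the stale image, then refresh the GUI content view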
 

skywavecomm

New Member
May 30, 2019
Ralf Zenklusen said: (see the solution post above)
Yeah, if I cancel a disk move to a Ceph pool, it does say `Removing image: 1% complete...`, but that removal itself gets cancelled at 2%, so it seems that cancelling a disk move also cancels the removal of the partially created disk on the Ceph pool. @Alwin
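
In that case it is probably worth making sure no guest config still references the leftover image before removing it by hand (a rough sketch; the ID 124 and pool "cephpool" are just the examples from this thread):

grep -r 'vm-124-disk-1' /etc/pve/nodes/    # check all VM/CT configs for references to the image
rbd rm vm-124-disk-1 -p cephpool           # only then remove the leftover image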
 

Alwin

Proxmox Staff Member
Aug 1, 2017
This is an old thread, please open a new one. Also, if you have not done so already, upgrade to the latest packages.
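
For completeness, the usual way to get onto the latest packages (a minimal sketch):

apt update
apt dist-upgrade    # use dist-upgrade rather than plain upgrade on Proxmox VE
pveversion -v       # verify the installed package versions afterwards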
 

wiguyot

New Member
Oct 12, 2021
I have encountered the same kind of problem (Proxmox v6) with Ceph RBD. The problem came from a VM image on the RBD pool that wasn't destroyed when the VM was destroyed.
I found it by switching from "rbd -p my-pool list" to "rbd -p my-pool list --long". The short version had one line more: that was the faulty image, which I removed with "rbd -p my-pool rm my-faulty-file".
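
One way to script that comparison instead of doing it by eye (a rough sketch; "my-pool" as in the post above):

rbd -p my-pool list | sort > short.txt
rbd -p my-pool list --long 2>/dev/null | awk 'NR>1 {print $1}' | sort > long.txt
diff short.txt long.txt    # names only in short.txt are the images that fail to open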
 

brosky

Member
Oct 13, 2015
Just a quick follow-up: I've encountered this error when moving a disk from local storage to Ceph storage. All tasks finished without errors, but if you already have a disk with the same name on the pool, Proxmox will rename your new image and update the VM config.

This happens when you cancel migration jobs - PVE does not clean up after a cancelled job.

Just list the pool with --long and without it, then compare the results in a spreadsheet or with diff.
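
If you suspect that rename happened, the VM config shows which image the guest actually points at now (a rough sketch; <vmid> and <pool> are placeholders):

qm config <vmid> | grep <pool>    # the disk lines show the image names really in use
rbd ls -l <pool>                  # anything listed here that no config references is a leftover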
 
