Got message [rbd error: rbd: listing images failed: (2) No such file or directory (500)] in Ceph pool after last upgrade

hape

Hello all,

after upgrading all Ceph cluster members to the latest 6.1 I get the message

[rbd error: rbd: listing images failed: (2) No such file or directory (500)]

when looking into the Ceph pool. As a consequence I cannot migrate a VM to another node in the cluster, neither online nor offline.

I have read about keyring problems in connection with this kind of incident. But why would this be the result of nothing more than an upgrade?
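(Side note on the keyring theory: a quick, hedged check is to verify that the keyring files PVE expects are still in place. On a hyper-converged setup that is usually /etc/pve/priv/ceph.client.admin.keyring, for an external RBD storage /etc/pve/priv/ceph/<storage-id>.keyring; the storage ID here is just a placeholder.)

# ls -l /etc/pve/priv/ceph.client.admin.keyring
# ls -l /etc/pve/priv/ceph/

(If these files are present and unchanged, a missing keyring is unlikely to be the cause.)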
 
When did you get this error? If it was during a task please share the full task log.
From which version did you upgrade to the latest?
 
I'm not sure that the failure really started with the last upgrade. I can see the overview of the Ceph pool, but cannot see its content; instead I get the message:

[rbd error: rbd: listing images failed: (2) No such file or directory (500)].

It's possible that this failure has been there for longer than just since the last upgrade.

My last upgrade was from 6.0 to 6.1.

Also, I'm not sure whether it is a mistake to give the hosts in a Ceph cluster overlapping hostnames, e.g.

virthost-1
virthost-1a
virthost-1b
...

In this scenario I saw the first one twice in the Ceph monitor overview, one of the entries marked as inactive.

Is it possible to repair the pool online?

What further information (logs) do you need?
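(For reference, a few read-only commands that usually give a useful baseline for this kind of report, assuming the standard Ceph CLI is available on the nodes:)

# ceph -s
# ceph health detail
# ceph osd pool ls

(The daemon logs themselves are normally found under /var/log/ceph/ on each node.)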
 
Sorry, I'm not sure what you are referring to. Which overview and which content do you mean? Please provide e.g. the commands you issued together with their outputs, or whatever else led to that error.

Please post your /etc/pve/ceph.conf and the output of:
# ceph quorum_status -f json-pretty
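(Optionally, if jq is installed, the monitor names that are actually in quorum can be pulled out of that JSON directly; this is only a convenience on top of the command above:)

# ceph quorum_status -f json-pretty | jq '.quorum_names'

(Duplicate or stale monitor entries, e.g. from the overlapping hostnames mentioned earlier, should become visible there.)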
 
I cannot see the content of the Ceph pool storage that is available on all Ceph cluster members, but I can see the Summary of that storage. On other clusters I can see the images of all VMs stored in such a Ceph pool storage. When I click on the Content tab in PVE I get the described message:

[rbd error: rbd: listing images failed: (2) No such file or directory (500)].

(Screenshot attached: Screenshot_20200103_131357.png)

Here is the output of the requested command and the complete ceph.conf.
 


Ok, thanks for clarification.
What does the following command output:
# rbd ls -l <name-of-your-pool>
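(Background on why this helps, as far as I understand the tooling: plain rbd ls only reads the pool's image directory, while rbd ls -l additionally opens every image to fetch its size and parent, so a single dangling directory entry is enough to make the long listing, and thereby the PVE content view, fail. The pool name is a placeholder:)

# rbd ls -p <name-of-your-pool>
# rbd ls -l -p <name-of-your-pool>

(Comparing the two outputs usually points to the broken image.)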
 
The name of the storage is different from the name of the pool itself. I hope that doesn't matter here.

Here is the output.
 


Here you go:
rbd: error opening vm-111001-disk-0: (2) No such file or directory

Edit:
Probably a task failed in the past and this is a leftover. Make sure the disk isn't used and remove it manually with:
# rbd rm vm-111001-disk-0 -p ceph-pool-1
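(A hedged way to double-check that nothing still references the image before deleting it; the VM ID and pool name are taken from the error above, adjust them if yours differ:)

# grep -r 'vm-111001-disk-0' /etc/pve/nodes/*/qemu-server/ /etc/pve/nodes/*/lxc/
# rbd status ceph-pool-1/vm-111001-disk-0

(No matches in any guest config means the image is not in use. Note that rbd status may itself fail with the same (2) No such file or directory, which fits a leftover whose header object is already gone.)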
 
Perfect!

That was it. Thanks a lot. Now I can see the whole content of the storage again.

Regards

Hans-Peter
 
