rbd map: "RBD image feature set mismatch"

Hello,

I am testing PVE 5 with Ceph (12.1) and wanted to "map" a ceph volume, but I get an error. Is this a bug? Did that work with other versions of PVE or Ceph?

Thanks,
esco

Code:
# rbd map <ceph-pool>/foo
rbd: sysfs write failed
RBD image feature set mismatch. Try disabling features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
# dmesg |tail -n1
[1355258.253726] rbd: image foo: image uses unsupported features: 0x38
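
For reference (a decoding sketch, not part of the original output): the hex value in the dmesg line maps onto the standard RBD feature bits, and the image's current features can be listed with rbd info:

Code:
# 0x38 = 0x08 (object-map) + 0x10 (fast-diff) + 0x20 (deep-flatten)
rbd info <ceph-pool>/foo | grep features
# expected to show something like: features: layering, exclusive-lock, object-map, fast-diff, deep-flatten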
 
how did you create that image? the rbd kernel driver does not support all the image features and tunables of the newer releases.
 
Just plain installation of PVE 5 with Ceph 12.1 (pveceph install..) and VMs (with volumes) created from the PVE web interface. Nothing special. So you can't reproduce this?
 
volumes created for VMs are not created with the rbd map limitations in mind - Qemu accesses them using librbd, which supports all the new features. volumes created for containers should be created with a reduced feature set, because PVE maps them using rbd map and mounts them before handing the mounted volume over to the container. if you want VMs to use the kernel rbd driver, you can set the krbd flag on the storage. but note that this will potentially reduce performance compared to the librbd-based configuration.
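
As an illustration (the storage name "ceph-vm" is a placeholder, not from this thread), the krbd flag should be settable on an existing RBD storage from the CLI, or via Datacenter -> Storage in the GUI:

Code:
# hypothetical storage name; adjust to your setup
pvesm set ceph-vm --krbd 1   # VM disks on this storage get mapped via the kernel driver
pvesm set ceph-vm --krbd 0   # back to librbd access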
 
I don't "want" VMs to use the kernel rbd driver. The question was if this is a bug? So, if I understand this right the rbd kernel module is not up to date? Will this stay this way?

I just want easy direct access from the host to the volumes. If this will stay this way I could script some workaround with "rbd-nbd" and "ln".
 
I don't "want" VMs to use the kernel rbd driver. The question was if this is a bug? So, if I understand this right the rbd kernel module is not up to date? Will this stay this way?

I just want easy direct access from the host to the volumes. If this will stay this way I could script some workaround with "rbd-nbd" and "ln".

no, this is not a bug. the rbd kernel module is up to date - it's just always a bit behind librbd regarding newly introduced features, because it is not directly maintained by the ceph project but needs to go through the regular kernel development process. if you want your VM volumes to support rbd map, you need to disable certain features on the associated rbd images (like PVE does for container volumes).
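
A minimal sketch of that, using the pool/image placeholders from the first post:

Code:
# drop the features the kernel client complained about; layering and exclusive-lock can stay
rbd feature disable <ceph-pool>/foo object-map fast-diff deep-flatten
rbd map <ceph-pool>/foo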
 
OK, so I added "rbd default features = 5" to the ceph.conf. The default was 61.

So I only have "layering" (1) and "exclusive-lock" (4). "object-map" (8), "fast-diff" (16) and "deep-flatten" (32) are now disabled by default.
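
For reference, a sketch of the resulting ceph.conf snippet (placing it under [global] is one option; a [client] section also works for client-side defaults):

Code:
# /etc/ceph/ceph.conf (on a PVE-managed cluster usually a symlink to /etc/pve/ceph.conf)
[global]
    # 1 (layering) + 4 (exclusive-lock) = 5
    rbd default features = 5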
 
I'm bringing this thread back because I just experienced this same sort of behavior, and wanted to elaborate on what happened so the devs can take notice.

I have a testing cluster of 4 nodes in my home lab. I have Ceph set up across 3 nodes with different hard drives. I have a file storage VM running Nas4Free in HA with its storage on Ceph.

We had a power outage, and when the cluster came back up I had to reboot each of the nodes again because the cluster ended up in an inconsistent state. After the reboot of each node, the cluster was working properly. However, the HA filestore wasn't coming up, and on further examination there were log entries showing a problem with mounting the hard drives from Ceph. I detached the Ceph hard drives and attempted to rbd map one of them and got this error:

Code:
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable test/vm-100-disk-1 object-map fast-diff deep-flatten".

The Ceph cluster reports that everything is working fine and there are no errors.

This was working before the power outage, and I realize this is an edge case... but I want to learn: what led to RBD seeing an image feature set mismatch? Is there corruption in the configuration somewhere that I should look for?
 
Forgot to mention: following the instructions in the original error, running this command fixes the issue:

Code:
rbd feature disable test/vm-100-disk-1 object-map fast-diff deep-flatten
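
A quick verification sketch (same pool/image as above):

Code:
rbd info test/vm-100-disk-1 | grep features
# should now list only: layering, exclusive-lock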

So what would have caused that image to report those unsupported features?
 
Hi,

Use NBD for mapping. It's slower than KRBD, but it works with all the new, modern features:

Code:
modprobe nbd
rbd-nbd  -m mon1,mon2,mon3 --user CEPHUSER -k /etc/pve/priv/ceph/whatever.keyring map pool/image
rbd-nbd unmap /dev/nbd0
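
For completeness, a usage sketch between map and unmap (the device node and mount point are assumptions; rbd-nbd map prints the device it actually attached, e.g. /dev/nbd0):

Code:
mount /dev/nbd0 /mnt/test     # or /dev/nbd0p1 if the image carries a partition table
umount /mnt/test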
 
Are you indicating that following your suggestion will get Proxmox to switch to using nbd? At what point did my installation change to needing this?

I started out with the latest version 5 download and installed each node with the no-subscription repository and have kept them updated... so was there an update at some point that I applied that could have caused this situation on the next boot?
 
No. I simply meant that if you want to map an RBD image with the extra feature sets, you can do it this way. QEMU uses librbd for a direct connection to RBD images, and LXC doesn't have these features enabled by default, because it uses KRBD.
 
Ok, I get that, and thank you.

However, what I'm after is why I might have received these errors in the first place when I was not attempting to use any extra features.
 
Sure...

So I already fixed the VM disk in question using the command I mentioned before; here's the fixed image info:

Code:
rbd image 'vm-100-disk-1':
        size 8192 MB in 2048 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.2beb5b643c9869
        format: 2
        features: layering, exclusive-lock
        flags:
        create_timestamp: Sun May 13 17:25:46 2018
But I have a couple of other, less important VM disks that have the same issue, and I can see from the output that those images still show the unsupported features:

Code:
rbd image 'vm-110-disk-1':
        size 40960 MB in 10240 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.24208174b0dc51
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
        create_timestamp: Wed May 9 12:50:14 2018

So, thinking back, I believe I had used the "Move disk" button on the VM Hardware page to migrate these from a local ZFS storage to the Ceph storage I had set up later. In addition, I don't think I had taken the whole cluster down between that time and now...

So, would the "Move disk" command possibly have resulted in this inconsistency issue?
 
KRBD doesn't support all features yet; this is a hard constraint for LXC containers, as those can't use librbd. For VMs this is no problem, as QEMU can use librbd and access the images directly. On a "Move disk", the features are disabled, at least in newer versions of PVE.
 
Interestingly, the VMs that were affected were all qemu. I have one lxc, and it came up without error.

I can certainly run the command on the affected disks and resolve the issue, but I'm a little concerned that this error only showed up after a reboot and not before.
 
Interestingly, the VMs that were affected were all qemu. I have one lxc, and it came up without error.
QEMU can use both ways, so it depends on how your storage is configured. The checkbox "krbd", when ticked, activates the use of mapped images through the kernel. To use a Ceph pool both ways, there should be two storages configured, one that uses krbd and one that doesn't. Both can point to the same pool.
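
As a sketch, /etc/pve/storage.cfg with two entries pointing at the same pool could look like this (the storage names "ceph-vm" and "ceph-ct" are placeholders):

Code:
rbd: ceph-vm
        pool rbd
        content images
        krbd 0

rbd: ceph-ct
        pool rbd
        content rootdir
        krbd 1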

I can certainly run the command on the affected disks and resolve the issue, but I'm a little concerned that this error only showed up after a reboot and not before.
Something must have happened before that copied an image or configured a VM differently. In any case, please update to the latest version, as not only PVE but also Ceph has newer packages available.
 
After upgrading a Ceph cluster tonight, creating new images (qemu & lxc) creates volumes with the object-map, deep-flatten and layering features enabled. This prevents the VM from starting, with the message described above. Today I upgraded Ceph again (a new update was available), but with no result.
After removing the features on the disk image with rbd feature disable pool/image, I was able to start it. The problem we will face is the automatic creation of images. I don't care about the (possibly missing) features on the disk image, but about not being able to automate image creation. How can I override the default image settings for rbd devices?

Thanks
 
