[SOLVED] Possible BlueStore bug in Ceph 17.2.8

Jan 17, 2025
Hi all,

Recently the latest version of PVE-Ceph Quincy (17.2.8) was released in the Enterprise repository.
However, on another (non-PVE) Ceph storage cluster we ran into quite a nasty bug related to BlueFS/BlueStore (https://tracker.ceph.com/issues/69764), which caused OSDs to crash and recover repeatedly, putting heavy load on the cluster.
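For anyone wanting to check whether a cluster is on the affected release before upgrading, a minimal sketch (the version string is hard-coded here for illustration; on a live cluster it would come from `ceph --version`):

```shell
# Minimal sketch: flag the affected release before upgrading.
# The version is hard-coded for illustration; on a live cluster it
# would come from: ver=$(ceph --version | awk '{print $3}')
ver="17.2.8"
affected="17.2.8"

if [ "$ver" = "$affected" ]; then
    status="affected"
    echo "WARNING: $ver carries the BlueFS regression (tracker #69764)"
else
    status="ok"
    echo "$ver looks unaffected by this particular bug"
fi
```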

Update day is coming up soon, and I'd like to make sure we don't have to roll back quite a few Ceph clusters. Looking at the tracker, the fix is targeted for the next Ceph Reef release (18.2.5).
So my question is: is this an active issue in the PVE-Ceph packages, or has it been resolved in this release?

Thanks in advance!
Best regards
 
AFAICT, 18.2.4 is the most current release. Once 18.2.5 is released (shouldn't be far away from what I can tell), we will package it for Proxmox VE and will slowly push it out through the repository chain, test->no-subscription->enterprise.

But I cannot make any guarantees when the 18.2.5 packages will be available in the respective repositories.
 
Hi Aaron,

We have a couple of clusters still running Ceph Quincy; my concern is specifically the minor-release upgrade from 17.2.7 to 17.2.8, where this bug is present.
It seems that Ceph 17.2.8 is exposed to this bug in the BlueFS.cc file:
https://git.proxmox.com/?p=ceph.git...0;hb=b009440314ae417689ea1c7d5d9e5874e7e3812b
Line 3116: _log_advance_seq();

The bug was introduced by this pull request: https://github.com/ceph/ceph/pull/57241

We had to roll back an entire cluster to 17.2.7 for it to become stable again, hence the concern.

We'd like to verify that this issue would not have any impact before upgrading.
FYI, I have opened a support ticket; I don't mind if we continue there.

Best Regards,
- Demian
 
Hi all,

Thanks to all the amazing staff at Proxmox!
I just received an answer regarding this concern: they have already fixed this in their PVE-Ceph build!

(https://git.proxmox.com/?p=ceph.git...d;hb=71ce71edd912bf31ab1ce63723f43911814c6e3a)

For anyone who encounters this issue on non-PVE Ceph clusters: my solution was to roll back to 17.2.7 and restart all daemons.
That seems to have done the trick!
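For reference, the rollback roughly followed the steps below. This is a dry-run sketch only (it prints the commands instead of executing them), and it assumes a Debian-style install; the exact package names and version pins are assumptions that depend on your distribution and repositories:

```shell
# Dry-run sketch of the rollback: nothing is executed, each step is
# only printed. The package pin below is an assumption for a
# Debian-style install; adapt it to your environment.
steps=""
run() { steps="$steps $1"; echo "+ $*"; }

run ceph osd set noout                                      # avoid rebalancing while daemons restart
run apt-get install --allow-downgrades 'ceph-osd=17.2.7*'   # pin the previous release
run systemctl restart ceph-osd.target                       # restart the OSD daemons on this node
run ceph osd unset noout                                    # once every node is back on 17.2.7
```

Repeat the downgrade and restart node by node, keeping `noout` set until the whole cluster is back on 17.2.7.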

With kind regards,
- Demian
 