Ceph 16.2.6 - CEPHFS failed after upgrade from 16.2.5

dlasher

TL;DR - After upgrading from 16.2.5 to 16.2.6, CephFS fails to start and all MDS daemons sit in "standby". It requires

Code:
ceph fs compat <fs name> add_incompat 7 "mds uses inline data"

to work again.


Longer version :

pve-manager/7.0-11/63d82f4e (running kernel: 5.11.22-5-pve)

Ran apt dist-upgrade, which took Ceph from 16.2.5 to 16.2.6, and the GUI then complained about differing versions. Following my usual method, which lines up with online advice:
* restart MGR
* restart MDS
* restart OSD

I followed that order, and the MDS daemons all went to standby and wouldn't start. I restarted everything, including the entire cluster(s) themselves. Ranks 0 and 1 both showed "failed" and every MDS sat in standby, with nothing active.
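If you want to confirm you are seeing the same symptoms, a rough sketch of the read-only checks is below. The function name and the DRY_RUN switch are my own additions for illustration; by default it only prints the commands so you can review them first.

```shell
#!/bin/sh
# Read-only checks of the CephFS/MDS state.
# Prints the commands unless DRY_RUN=0 (DRY_RUN is a made-up helper switch).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

show_cephfs_state() {
    run ceph versions     # which daemons are running which release
    run ceph fs status    # ranks and standby MDS daemons
    run ceph mds stat     # compact MDS summary
    run ceph fs dump      # full FSMap, including the compat flags
}
```

Call show_cephfs_state after sourcing the script, and set DRY_RUN=0 to actually query the cluster.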

Tried a lot of steps with "cephfs-journal-tool" (good education), but nothing brought them back online. Lots of Google results around "damaged" MDS, none of which applied or worked.

Finally stumbled across this thread: https://www.mail-archive.com/ceph-users@ceph.io/msg12463.html

* stop all MDS
* ceph fs set cephfs max_mds 1
* ceph fs set cephfs allow_standby_replay false
* ceph fs compat <fs name> add_incompat 7 "mds uses inline data"
* start MDS
* magic ensues!
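The steps above could be strung together roughly like this. Treat it as a sketch under assumptions, not a tested recovery script: FS_NAME defaulting to "cephfs", the DRY_RUN switch, and stopping/starting MDS through the ceph-mds.target systemd unit are my assumptions, not something from the linked thread. The stop/start only affects the node you run it on, so the MDS daemons on every other node need to be stopped by hand first.

```shell
#!/bin/sh
# Sketch of the recovery steps; prints commands unless DRY_RUN=0.
FS_NAME="${FS_NAME:-cephfs}"   # assumption: your filesystem name
DRY_RUN="${DRY_RUN:-1}"        # made-up safety switch, on by default

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_cephfs() {
    # Stop the MDS daemons on this node (all other MDS nodes must be stopped too).
    run systemctl stop ceph-mds.target
    # Drop to a single active rank and disable standby-replay.
    run ceph fs set "$FS_NAME" max_mds 1
    run ceph fs set "$FS_NAME" allow_standby_replay false
    # Add the missing incompat feature flag to the filesystem.
    run ceph fs compat "$FS_NAME" add_incompat 7 "mds uses inline data"
    # Bring the MDS daemons back and look at the result.
    run systemctl start ceph-mds.target
    run ceph fs status "$FS_NAME"
}
```

Source the script and call recover_cephfs to see what it would do. Once the printed commands look right for your cluster, set DRY_RUN=0 and call it again.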


Posting here for other PMX users. This is a nasty stone to trip over, and it cost me easily 20 hours this weekend plus mild amounts of panic.


(More details here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KQ5A5OWRIUEOJBC7VILBGDIKPQGJQIWN/)
 