I upgraded my CEPH cluster without properly following the mon upgrade so they were no longer on leveldb.
Proxmox and CEPH were updated to latest for current release.
https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy
The monitors were still running as leveldb.
I upgraded all nodes to the quincy release 17.2.6 and restarted the mons.
At this point the cluster stopped responding.
`ceph` commands do not work since the service fails to start.
Are there steps for recovery?
1) Roll back to Pacific without being able to use CEPH commands (ceph orch upgrade start --ceph-version <version>).
2) Rebuild the monitors using data from the OSDs while maintaining Quincy release.
3) Is this actually the bug shared: https://tracker.ceph.com/issues/58156 about 17.2.6 (which is what Proxmox/CEPH upgrades to?
I ran the upgrade on another cluster prior to this without issue. The Mons were set with RocksDB and running on Quincy 17.2.6.
Proxmox and CEPH were updated to latest for current release.
https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy
- The upgrade to Quincy states a recommendation that Mons are using RocksDB.
- Leveldb support has been removed from quincy.
The monitors were still running as leveldb.
- Does this mean the mons cannot work at all since they are levelDB?
I upgraded all nodes to the quincy release 17.2.6 and restarted the mons.
At this point the cluster stopped responding.
`ceph` commands do not work since the service fails to start.
Are there steps for recovery?
1) Roll back to Pacific without being able to use CEPH commands (ceph orch upgrade start --ceph-version <version>).
2) Rebuild the monitors using data from the OSDs while maintaining Quincy release.
3) Is this actually the bug shared: https://tracker.ceph.com/issues/58156 about 17.2.6 (which is what Proxmox/CEPH upgrades to?
I ran the upgrade on another cluster prior to this without issue. The Mons were set with RocksDB and running on Quincy 17.2.6.
Code:
ceph-crash[53537]: WARNING:ceph-crash:post /var/lib/ceph/crash/2023-10-06T21:58:29.079369Z_b01e4225-9d3b-4fa4-9599-a4b6928f845c as client.crash.NODE failed: Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')
-2> 2023-10-06T14:58:29.073-0700 7fa22e24ba00 -1 _open error initializing leveldb db back storage in /var/lib/ceph/mon/ceph-0/store.db
-1> 2023-10-06T14:58:29.073-0700 7fa22e24ba00 -1 ./src/mon/MonitorDBStore.h: In function 'void MonitorDBStore::_open(const string&)' thread 7fa22e24ba00 time 2023-10-06T14:58:29.075051-0700
./src/mon/MonitorDBStore.h: 634: ceph_abort_msg("MonitorDBStore: error initializing keyvaluedb back storage")
Last edited: