Proxmox 7 fresh install with Ceph Pacific: can I downgrade Ceph to Octopus?

philliphs

Member
May 22, 2021
I have a 3-node cluster running Ceph. When I upgraded node 3 to Proxmox 7, it lost network connectivity because of its bonded LACP network settings (solved by this thread: https://forum.proxmox.com/threads/u...ond-lacp-interface-not-working-anymore.92060/). Before I found that solution, I decided to do a fresh install of node 3 with a different IP and name. It later rejoined the cluster without problems.

The problem appeared when I installed Ceph Pacific on node 3. Joining the Ceph cluster and adding OSDs went fine, but when I created a monitor on node 3, the whole Ceph network went haywire. Note that nodes 1 and 2 are still on Octopus.
The monitor on node 3 stopped and could not be started again. It also could not be destroyed; the attempt failed with an error saying the monitor does not exist. I assume it became a ghost monitor.
Presumably because of this, the Ceph monitor log kept piling up in RAM and on the root disk of all 3 nodes. All 3 nodes crashed with 100% RAM usage (32 GB RAM per node) and full root disks. I later found that the ceph mon log had grown past 30 GB on disk. I found this out using this command:
Code:
lsof | sort -n -r -k8 | more
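A more direct way to spot a runaway log file is to list the largest files under the log directory. The following is only a sketch under assumptions: `largest_files` is a made-up helper name, and /var/log/ceph is assumed to be the log directory (it is the usual default, but it is not stated in this post).

```shell
#!/bin/sh
# List the N largest regular files under a directory, biggest first.
# largest_files is a hypothetical helper name, not a Ceph or Proxmox tool.
largest_files() {
    dir="${1:-/var/log/ceph}"   # assumed default: the usual Ceph log directory
    count="${2:-10}"            # how many files to show
    # du -k prints the size in KiB followed by the path;
    # sort numerically in descending order and keep the top entries.
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `largest_files /var/log/ceph 5` would show the five biggest files there, and `largest_files / 20` would scan the root filesystem only, since `-xdev` keeps `find` from crossing into other mounts.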
The only way I could disable the node 3 monitor was by following this thread: https://forum.proxmox.com/threads/ghost-monitor-in-ceph-cluster.58683/
Code:
systemctl disable ceph-mon@pve00-3
systemctl disable ceph-mon@pve00-3.service
Then I removed the directory /var/lib/ceph/mon/pve00-3 and re-ran the same commands:
Code:
systemctl disable ceph-mon@pve00-3
systemctl disable ceph-mon@pve00-3.service
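For reference, the cleanup steps above can be written down as a dry-run plan that only prints commands, so each one can be reviewed before it is typed by hand. This is a sketch, not a verified procedure: `ghost_mon_plan` is a hypothetical helper, the `ceph mon dump` and `ceph mon remove` steps are additions that were not part of what I ran above, and the monitor directory defaults to the path given in this post, which may differ on your node. Any leftover entry for the monitor in ceph.conf would still need to be checked separately.

```shell
#!/bin/sh
# Print (do not execute) the usual cleanup steps for a stale monitor.
# ghost_mon_plan is a hypothetical helper; review every line before running it manually.
ghost_mon_plan() {
    mon_id="$1"                                 # monitor ID, e.g. pve00-3
    mon_dir="${2:-/var/lib/ceph/mon/$mon_id}"   # path as written in this post; verify it first
    echo "ceph mon dump"                        # check whether the monitor is still in the monmap
    echo "systemctl stop ceph-mon@$mon_id.service"
    echo "systemctl disable ceph-mon@$mon_id.service"
    echo "ceph mon remove $mon_id"              # assumption: only needed if the monmap still lists it
    echo "rm -r $mon_dir"
}
```

Calling `ghost_mon_plan pve00-3` prints five lines and changes nothing on the system; a different data directory can be passed as the second argument.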

Later I managed to install another node (node 4) on a spare machine, with Ceph Octopus and no OSDs, just to provide a third monitor for quorum.

Would I be able to downgrade node 3 to Octopus so that I can make it a monitor and remove node 4?
I was planning to upgrade nodes 1 and 2 to Pacific, but I have decided to wait for Pacific to stabilize, as I have had enough headaches. I read about a cluster crash in this thread, https://forum.proxmox.com/threads/ceph-16-2-pacific-cluster-crash.92367/, which at the moment is fixed only in the Pacific test repository.

Sorry for the messy writing. I am just writing down what happened in case anyone else encounters the same problem.

Thanks
 
