[SOLVED] Remnants of ceph configuration after pveceph purge?

fartboner

New Member
Aug 3, 2023
I'm new to Proxmox and Ceph, and I messed up a cluster I have. I didn't have anything critical on it, so I decided to start fresh. The cluster was unhealthy, so I couldn't gracefully shut everything down and figured I'd need to purge. I read a few threads that explain how to purge Ceph and followed them. The Ceph installation wizard in the web GUI gives me the option to reinstall, but after installation I run into an error on the Configuration tab:

"Could not connect to ceph cluster despite configured monitors (500)"

Where can I find the leftover remnants so that I can do a clean reinstall?


EDIT:
I rebooted after running my purge script (compiled from the different threads) and it looks like I can now finish via the wizard. False alarm!
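In case it helps anyone else searching: here is a rough check for leftovers before re-running the wizard. It's only a sketch; the paths and unit patterns below are the defaults a Proxmox/Ceph install uses, so adjust them if your setup differs.

Bash:
# Cluster-wide config and keyrings (stored in pmxcfs under /etc/pve)
ls -l /etc/pve/ceph.conf /etc/pve/priv/ceph* 2>/dev/null

# Local config, data directories, and leftover systemd units
ls -ld /etc/ceph /var/lib/ceph /etc/systemd/system/ceph* 2>/dev/null

# Any Ceph daemons still running or known to systemd
ps aux | grep -E '[c]eph-(mon|mgr|mds|osd)'
systemctl list-units 'ceph*' --all --no-pager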
 
Can you tell me in more detail how you purged Ceph?
 
Can you tell me in more detail how you purged Ceph?
Because nothing was responding, I SSH'd into each host and ran this script:
Bash:
#!/bin/bash
# Restart the Proxmox status daemon so it drops any stale Ceph state
systemctl restart pvestatd
# Remove the systemd units Ceph created, then kill any daemons still running
rm -rf /etc/systemd/system/ceph*
killall -9 ceph-mon ceph-mgr ceph-mds
# Unmount the OSDs and wipe local and cluster-wide Ceph config and data
umount /var/lib/ceph/osd/ceph*
rm -rf /etc/ceph /etc/pve/ceph.conf /etc/pve/priv/ceph* /var/lib/ceph
pveceph purge
systemctl restart pvestatd
# Remove the Ceph packages and the old init script
apt purge ceph-mon ceph-osd ceph-mds -y
systemctl restart pvestatd
rm /etc/init.d/ceph
# my additions: recreate the empty directories that were removed above
mkdir /etc/ceph
mkdir /var/lib/ceph/
mkdir /var/lib/ceph/bootstrap-osd
mkdir /var/lib/ceph/mgr

After that, I was able to reinstall ceph but ran into that monitor issue. After a reboot that seems to be resolved, though.
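In case it helps: a quick way to double-check the monitors are actually reachable after a reinstall. This is just a sketch; pveceph status and ceph -s are the standard status commands, and the mon unit name assumes the usual Proxmox convention of using the node's hostname as the mon ID.

Bash:
# Status as Proxmox sees it
pveceph status

# Status from Ceph directly; the mons should respond here
ceph -s

# Local monitor service (mon ID is normally the node's hostname on Proxmox)
systemctl status ceph-mon@$(hostname).service --no-pager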
 
Did you run that script on all nodes?
Can you give me the output of journalctl -b?
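If the full boot log is too long to post, something like this narrows it down to the parts that usually matter here (the grep pattern is only a suggestion and can be widened):

Bash:
# Full boot log, saved to a file you can attach
journalctl -b > journal-boot.txt

# Or just the Ceph-/pvestatd-related lines from this boot
journalctl -b | grep -iE 'ceph|pvestatd|timeout'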
 
I'm intermittently receiving "got timeout (500)" messages when making changes or viewing the ceph section of the hosts. Is that something to be concerned about?
 
try
  • removing your mkdir additions from the script
  • running the script
  • running apt reinstall ceph ceph-common in the terminal
 
try
  • removing your mkdir additions from the script
  • running the script
  • running apt reinstall ceph ceph-common in the terminal
I'm currently reinstalling Ceph on the remaining nodes and setting up OSDs. Wouldn't that script destroy the cluster again? Or is this to resolve the "got timeout (500)" messages?
 
Oh.

Well, glad it works now :)

If you are concerned about the got timeout messages, you can run journalctl -f and see if something strange pops up.
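For example, something along these lines (the filter is just a starting point; widen it if nothing shows up):

Bash:
# Follow the journal live and show only Ceph- or timeout-related lines
journalctl -f | grep -iE --line-buffered 'ceph|timeout'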
 
