Proxmox Ceph Issue - Ceph Config Wiped

wrmanis

New Member
Oct 6, 2014
Hello,

Thanks for taking a look:

I had a running Ceph cluster with two nodes and a couple of TB of data containing VMs. I added a third node, then realized I wanted to remove it again to make some hardware changes. I followed what I thought was the standard procedure for removing a node from Ceph by running:

pveceph purge

Now, according to the Ceph documentation, this only affects the node it was run on. Unfortunately, Proxmox seems to implement it differently: all the config files on the other nodes were erased as well, and I have no direct backup of them. Apparently I need to recreate my Ceph cluster.

On my two original nodes I have run:

pveceph init --network 172.30.0.0/16

pveceph createmon

And now I can see my two original nodes in the GUI. My previous pool, of course, does not match up with my current pool (which shows as none). But my real trouble is the OSDs: I want to reattach them to the disks they were originally attached to, so I can get my cluster back up and running without losing any data. That last part is paramount. The GUI tells me that there are "no unused disks".

Any suggestions on where to begin?

Thanks a bunch for your help,
 
pveceph purge

This is NOT the standard procedure for removing a Ceph node.

This command deletes ALL Ceph configs, so what you see here is expected.

I hope you can restore your data from a backup.
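
Before any destructive command such as pveceph purge, it helps to archive the configuration first. A minimal sketch: the function name backup_ceph_config is hypothetical, and the paths in the usage line are assumptions about where a node keeps its Ceph config, so adjust them to the actual setup.

```shell
# Sketch: archive config paths before running anything destructive.
# Usage (paths are assumptions, adjust to your node):
#   backup_ceph_config /root/ceph-backup /etc/ceph /etc/pve/ceph.conf
backup_ceph_config() {
    dest_dir=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: backup_ceph_config DEST_DIR PATH..." >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/ceph-config-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -h follows symlinks, so a symlinked ceph.conf is stored as a real file.
    tar -czhf "$archive" "$@" || return 1
    # Print the archive path so the caller can record or copy it.
    echo "$archive"
}
```

The archive should be copied off the cluster nodes, since a config store shared between nodes would not protect against a cluster-wide wipe.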
 
Hi,
Did you overwrite your keys? That would be very bad!

Otherwise you must "simply" recreate your ceph.conf.

Look up the fsid on your OSD disk; a short ceph.conf then looks like this:
Code:
# cat /var/lib/ceph/osd/ceph-0/ceph_fsid
17c25f98-9f45-4b79-b0fa-95f11873a2cb
# cat /etc/ceph/ceph.conf
[global]
fsid = 17c25f98-9f45-4b79-b0fa-95f11873a2cb
mon_initial_members = ceph-01, ceph-02, ceph-03
mon_host = 192.168.2.61,192.168.2.62,192.168.2.63
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_pool_default_min_size = 1 # Allow writing one copy in a degraded state.

If your keys are still valid, this should be all you need.
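
The lookup can be scripted so that every OSD is checked, not just ceph-0. This is a rough sketch, assuming the OSD data directories are still mounted under /var/lib/ceph/osd as in the cat command above. The function name print_ceph_global is hypothetical, and the monitor names and addresses are passed in by hand because they cannot be read from the OSDs here. It only prints the [global] skeleton; it does not restore keys or monitors.

```shell
# Sketch: rebuild the [global] skeleton of ceph.conf from the fsid
# recorded on the OSD data directories.
# Usage (monitor values are placeholders):
#   print_ceph_global /var/lib/ceph/osd "ceph-01, ceph-02" "192.168.2.61,192.168.2.62"
print_ceph_global() {
    osd_root=$1      # directory holding the ceph-N OSD data dirs
    mon_members=$2   # value for mon_initial_members
    mon_hosts=$3     # value for mon_host

    # Collect the distinct cluster fsids recorded on the OSDs.
    fsids=$(cat "$osd_root"/*/ceph_fsid 2>/dev/null | sort -u)
    if [ -z "$fsids" ]; then
        echo "no ceph_fsid files found under $osd_root" >&2
        return 1
    fi

    # All OSDs of one cluster must report the same fsid.
    count=$(printf '%s\n' "$fsids" | grep -c .)
    if [ "$count" -ne 1 ]; then
        echo "OSDs disagree on the cluster fsid:" >&2
        printf '%s\n' "$fsids" >&2
        return 1
    fi

    cat <<EOF
[global]
fsid = $fsids
mon_initial_members = $mon_members
mon_host = $mon_hosts
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
EOF
}
```

Review the output before writing it to /etc/ceph/ceph.conf. If the function reports more than one fsid, the disks belong to different clusters and should not be mixed.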

Udo
 
Hey Udo,

By the way, I've read a bunch of your posts since I started using Ceph - You're the man! Thanks a bunch for your time and help on here.

I realize I've probably gone past the point of no return, and I'll be restoring from backups as I write this. But I'd like to take the time and try to understand where I went wrong and what I could have done to fix it.

Let's imagine I had just run "pveceph purge" and that was it. If I'd had a copy of my ceph.conf, could I have put it back in /etc/ceph and simply restarted the services? I take it that by running another pveceph init I overwrote the important part, the key. But I'm still uncertain about how I would have reconnected the OSDs to their original drives. How would I have gone about that?

Thanks again,
 
