Hello!
I'm trying to create a disaster recovery plan for our PVE cluster, including Ceph. Our current configuration has three monitors, one on each of our three servers.
We'll be using three monitors and the standard pool configuration (3 replicas). I'm trying to write a manual for removing the monitor configuration and running on only one monitor.
I got a monmap from the quorate cluster, removed the two stopped monitors from it, and edited ceph.conf accordingly:
Code:
root@nextclouda:~# ceph mon getmap -o /root/monmap
root@nextclouda:~# monmaptool --rm nextcloudb /root/monmap
root@nextclouda:~# monmaptool --rm nextcloudc /root/monmap
root@nextclouda:~# cat /etc/pve/ceph.conf
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 10.0.0.1/24
fsid = cf282c03-77a3-458d-8989-b4a477f121dd
mon_allow_pool_delete = true
mon_host = 10.0.1.1
#10.0.1.2 10.0.1.3
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.0.1.1/24
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[client.crash]
keyring = /etc/pve/ceph/$cluster.$name.keyring
[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring
[mds.nextclouda]
host = nextclouda
mds_standby_for_name = pve
[mds.nextcloudb]
host = nextcloudb
mds_standby_for_name = pve
[mds.nextcloudc]
host = nextcloudc
mds_standby_for_name = pve
[mon.nextclouda]
public_addr = 10.0.1.1
#[mon.nextcloudb]
# public_addr = 10.0.1.2
#
#[mon.nextcloudc]
# public_addr = 10.0.1.3
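For completeness, the edited map can be checked with monmaptool's print option (same file path as above); it should list only mon.nextclouda at this point:
Code:
root@nextclouda:~# monmaptool --print /root/monmap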
Using these commands I was able to remove the monitor configuration and recreate a single monitor that has quorum (the third host, nextcloudc, is fully shut down, and the monitor service on the second host, nextcloudb, is stopped manually):
Code:
root@nextclouda:~# systemctl stop ceph-mon@nextclouda
root@nextclouda:~# rm -rf /var/lib/ceph/mon/ceph-nextclouda
root@nextclouda:~# ceph-mon --monmap /root/monmap --keyring /etc/pve/priv/ceph.mon.keyring --mkfs -i nextclouda -m 10.0.1.1
root@nextclouda:~# chown -R ceph:ceph /var/lib/ceph/mon/ceph-nextclouda
root@nextclouda:~# systemctl start ceph-mon@nextclouda
root@nextclouda:~# ceph -s
  cluster:
    id:     cf282c03-77a3-458d-8989-b4a477f121dd
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim

  services:
    mon: 1 daemons, quorum nextclouda (age 50s)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
root@nextclouda:~# ceph osd tree
ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
-1              0  root default
But as you can see, the monitor now doesn't see any OSDs, pools, managers, or CephFS. I'm trying to do this without recreating everything manually, but I'll resort to that if I have to. I'd be very thankful for your help and/or insights on whether what I'm trying to do makes sense.
I have a backup of /var/lib/ceph and the original monmap in case it can help.
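In case it's relevant: the alternative I was thinking about (untested, just a sketch based on the ceph-mon man page) would be to restore the backed-up mon store and inject the trimmed monmap into it instead of running --mkfs, since that should keep the existing OSD/MGR/FS maps in the store:
Code:
# untested sketch - assumes /var/lib/ceph/mon/ceph-nextclouda has been restored
# from the backup of /var/lib/ceph before injecting the edited monmap
root@nextclouda:~# systemctl stop ceph-mon@nextclouda
root@nextclouda:~# ceph-mon -i nextclouda --inject-monmap /root/monmap
root@nextclouda:~# chown -R ceph:ceph /var/lib/ceph/mon/ceph-nextclouda
root@nextclouda:~# systemctl start ceph-mon@nextclouda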
Code:
package versions:
proxmox-ve: 8.3.0 (running kernel: 6.8.12-9-pve)
pve-manager: 8.3.5 (running version: 8.3.5/dac3aa88bac3f300)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.8: 6.8.12-9
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
proxmox-kernel-6.8.12-8-pve-signed: 6.8.12-8
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
ceph: 19.2.1-pve2
ceph-fuse: 19.2.1-pve2
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
dnsmasq: 2.90-4~deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve1
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.1
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.0
libpve-cluster-perl: 8.1.0
libpve-common-perl: 8.3.0
libpve-guest-common-perl: 5.2.0
libpve-http-server-perl: 5.2.0
libpve-network-perl: 0.10.1
libpve-rs-perl: 0.9.3
libpve-storage-perl: 8.3.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.3.7-1
proxmox-backup-file-restore: 3.3.7-1
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.8
pve-cluster: 8.1.0
pve-container: 5.2.5
pve-docs: 8.3.1
pve-edk2-firmware: 4.2025.02-3
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.15-3
pve-ha-manager: 4.0.6
pve-i18n: 3.4.1
pve-qemu-kvm: 9.2.0-5
pve-xtermjs: 5.5.0-1
qemu-server: 8.3.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2
Thank you very much and have a nice rest of the day!