[SOLVED] Ceph mirror not working

fxandrei

I followed the instructions at https://pve.proxmox.com/wiki/Ceph_RBD_Mirroring to mirror images from one Ceph cluster to another.
I have done this before and it worked, but not this time.

I think I have done everything correctly: preparing the pool, creating the user, copying the master cluster config and the keyring, etc.

I currently have only one image set to be mirrored on the master cluster.
But when I run this on the backup cluster, no images are reported:

rbd mirror pool status ssd_pool --verbose
health: OK
images: 0 total


The service seems to be OK:
systemctl status ceph-rbd-mirror@rbd-mirror.backup.service
ceph-rbd-mirror@rbd-mirror.backup.service - Ceph rbd mirror daemon
Loaded: loaded (/etc/systemd/system/ceph-rbd-mirror@.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-02-01 09:54:49 EET; 43min ago
Main PID: 4574 (rbd-mirror)
Tasks: 49
Memory: 26.2M
CGroup: /system.slice/system-ceph\x2drbd\x2dmirror.slice/ceph-rbd-mirror@rbd-mirror.backup.service
└─4574 /usr/bin/rbd-mirror -f --cluster ceph --id rbd-mirror.backup --setuser root --setgroup root


The thing is that the stop command takes a while (about a minute).

One time I also saw this in the logs, after issuing a stop command, trying to force-close it (Ctrl+C), and then trying to start the service again:

2022-02-01 10:42:17.454 7fa66348a700 0 rbd::mirror::LeaderWatcher: 0x55aa17eaac00 handle_get_locker: breaking leader lock after 3 failed attempts to acquire


All the nodes can see each other (I can ping from one to the other).

What could I check?
 
It turns out the problem was that not all nodes could communicate with each other, and it seems to be related to IPsec.
The clusters were behind gateways connected by an IPsec tunnel. All nodes could ping each other, but one node could not communicate at all with the nodes of the other cluster. I'm not actually sure why, but it seems to be related to the fact that all the nodes had to send this traffic through a gateway other than their default one.

To fix this I just replaced IPsec with OpenVPN.

Now everything works.
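
For anyone hitting the same symptom: a successful ping only proves ICMP gets through, not that the Ceph TCP ports are reachable. Below is a rough sketch of the kind of check that would have exposed this earlier. The function names check_tcp and report_mons are made up for illustration, and the sketch assumes the monitors listen on the default ports 3300 (msgr2) and 6789 (msgr1); adjust if your cluster differs.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Ceph or Proxmox.

# Default Ceph monitor ports (assumption: defaults were not changed).
MON_PORTS=${MON_PORTS:-"3300 6789"}

# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
check_tcp() {
    local host="$1" port="$2" wait="${3:-3}"
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the attempt.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# report_mons HOST...
# Prints one OK/FAIL line per host and monitor port.
report_mons() {
    local host port
    for host in "$@"; do
        for port in $MON_PORTS; do
            if check_tcp "$host" "$port"; then
                echo "OK   $host:$port"
            else
                echo "FAIL $host:$port"
            fi
        done
    done
}
```

Source this on every node of both clusters and run something like report_mons 10.0.1.11 10.0.1.12 10.0.1.13 with the monitor addresses of the other cluster (the addresses here are placeholders). A node that prints FAIL while ping still works points at the tunnel or routing rather than at the rbd-mirror configuration. The OSDs use their own port range (6800-7300 by default), which can be probed the same way with check_tcp.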
 
