I followed the instructions at https://pve.proxmox.com/wiki/Ceph_RBD_Mirroring to mirror images from one Ceph cluster to another.
I have done this before and it worked, but not this time.
I think I have done everything correctly: preparing the pool, creating the user, copying the master cluster's config and keyring, etc.
At the moment only one image on the master cluster is set to be mirrored.
But when I run this on the backup cluster, no images are reported:
rbd mirror pool status ssd_pool --verbose
health: OK
images: 0 total
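For reference, here is a minimal sketch of the read-only checks that seem relevant to an empty "images: 0 total" result. It is not a verified procedure. The pool name is the one above. The image name vm-100-disk-0 and the master cluster/user names (--cluster master, --id rbd-mirror.master) are assumptions based on the wiki's naming and must be adapted. The function names are hypothetical helpers.

```shell
#!/bin/sh
# Sketch only. POOL is the pool from this post; IMAGE is a hypothetical
# image name, replace it with the image that should be mirrored.
POOL="${POOL:-ssd_pool}"
IMAGE="${IMAGE:-vm-100-disk-0}"

# Read `rbd mirror pool status` output on stdin and print N from the
# "images: N total" line.
mirrored_image_count() {
    awk '$1 == "images:" { print $2; exit }'
}

# Run the read-only checks against one cluster. Any arguments are passed
# to rbd as global options, e.g. --cluster master --id rbd-mirror.master
# (assumed names, following the wiki's conventions).
check_mirroring() {
    # Mirroring mode (pool or image) and the configured peers.
    rbd "$@" mirror pool info "$POOL"
    # Feature list: journal-based mirroring needs "journaling" on the image.
    rbd "$@" info "$POOL/$IMAGE"
    # Per-image mirroring state, if mirroring is enabled for the image.
    rbd "$@" mirror image status "$POOL/$IMAGE"
    # Image count as seen by the pool-level status.
    count=$(rbd "$@" mirror pool status "$POOL" --verbose | mirrored_image_count)
    echo "images reported by pool status: ${count:-unknown}"
}
```

Running check_mirroring with no arguments on the backup cluster, and again with the master options on a node that holds the copied master config and keyring, should show whether both sides agree on the mirroring mode and peer, and whether the image has mirroring enabled at all.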
The service seems to be ok.
systemctl status ceph-rbd-mirror@rbd-mirror.backup.service
ceph-rbd-mirror@rbd-mirror.backup.service - Ceph rbd mirror daemon
Loaded: loaded (/etc/systemd/system/ceph-rbd-mirror@.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-02-01 09:54:49 EET; 43min ago
Main PID: 4574 (rbd-mirror)
Tasks: 49
Memory: 26.2M
CGroup: /system.slice/system-ceph\x2drbd\x2dmirror.slice/ceph-rbd-mirror@rbd-mirror.backup.service
└─4574 /usr/bin/rbd-mirror -f --cluster ceph --id rbd-mirror.backup --setuser root --setgroup root
One odd thing is that the stop command takes a while (about a minute).
And one time, after I had started a stop command, tried to force-close it (Ctrl+C), and then tried to start the service again, I saw this in the logs:
2022-02-01 10:42:17.454 7fa66348a700 0 rbd::mirror::LeaderWatcher: 0x55aa17eaac00 handle_get_locker: breaking leader lock after 3 failed attempts to acquire
All the nodes can see each other (I can ping from one to the other).
What could I check?