Ceph rbd mirror force promote

willybong

Hello everyone,

I’m currently setting up my first Ceph mirror configuration and have a few questions regarding its behavior.
For example, I’m uncertain about how to force-promote an image on my DR cluster (site-b) during the synchronization process.
From what I’ve read in the documentation, in a disaster scenario occurring during synchronization, a force-promote operation promotes the last snapshot received by the DR cluster. However, as noted:

"Since this mode is not as fine-grained as journaling, the complete delta between two snapshots will need to be synced prior to use during a failover scenario. Any partially applied set of deltas will be rolled back at the moment of failover."


When I attempt to force-promote an image, I encounter the following error:

Code:
root@pve1-b:~# rbd mirror image promote ceph-pool/vm-103-disk-1 --force
2025-01-09T09:42:40.412+0100 7983d4e006c0 -1 librbd::mirror::snapshot::util:  can_create_primary_snapshot: cannot rollback
2025-01-09T09:42:40.412+0100 7983d4e006c0 -1 librbd::mirror::snapshot::PromoteRequest: 0x7983b0001d40 send: cannot promote
2025-01-09T09:42:40.412+0100 7983d4e006c0 -1 librbd::mirror::PromoteRequest: 0x7983b401a810 handle_promote: failed to promote image: (22) Invalid argument
rbd: error promoting image to primary
2025-01-09T09:42:40.412+0100 7983d84f1780 -1 librbd::api::Mirror: image_promote: failed to promote image

I’ve checked the snapshots on my DR cluster (site-b) and always see the latest snapshot of the image present there.

I have configured periodic snapshots to run every 3 minutes.
On the main cluster (site-a), I always retain the last 5 snapshots, while on the DR cluster (site-b), only the most recent snapshot is kept.
I assume that this latest snapshot is overwritten during the synchronization process.
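For completeness, I set up the schedule roughly like this (the pool name is from my setup, and the exact invocation may differ):

Code:
root@pve1-a:~# rbd mirror snapshot schedule add --pool ceph-pool 3m
root@pve1-a:~# rbd mirror snapshot schedule ls --pool ceph-pool --recursive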

My main question is: How does Ceph handle promotion for an image when the data hasn’t been fully received on the DR cluster (site-b)?

Thank you!

Regards
 
Hi, we also have a Wiki page for Ceph Mirroring [0]. If site A is still available, you first need to demote it:

Promote images on site B

By promoting an image or all images in a pool, we can tell Ceph that they are now the primary ones to be used. In a planned failover, we would first demote the images on site A before we promote the images on site B (see the demote sketch below). In a recovery situation with site A down, we need to `--force` the promotion.
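For a planned failover, the demotion on site A would look roughly like this (single image or whole pool):

Code:
root@site-a $ rbd mirror image demote <pool>/<image>
root@site-a $ rbd mirror pool demote <pool>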

To promote a single image, run the following command:

Code:
root@site-b $ rbd mirror image promote <pool>/<image> --force

To promote all images in a pool, run the following command:

Code:
root@site-b $ rbd mirror pool promote <pool> --force
After this, our guests should start fine.
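To double-check that the images are now primary, something like the following can be used (rbd info reports a mirroring primary flag for mirrored images):

Code:
root@site-b $ rbd info <pool>/<image> | grep -A 3 mirroring
root@site-b $ rbd mirror pool status <pool> --verbose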



[0] https://pve.proxmox.com/wiki/Ceph_RBD_Mirroring#Failover_Recovery
 


Hi @KevinS,

Thank you for your response.
In a planned failover, the system works perfectly.

But the issue I’m facing is that, in a recovery scenario (with no connection between site-a and site-b, similar to a DR scenario), there might be ongoing synchronization processes.
As a result, the complete image may not be fully available on site-b.

For this reason, when I attempt to force-promote an image (VM disk), the command returns the error I mentioned earlier.

In general, during a disaster scenario, I cannot guarantee that all synchronization processes have been fully completed.

The message in the output, "cannot rollback," makes me think that Ceph doesn’t have a restore point to revert to the previous snapshot.
From what I’ve observed, Ceph RBD mirror in snapshot mode keeps, by default, 5 snapshots on site-a and 1 snapshot on site-b (the recovery site). However, if I interrupt the incremental sync process for the single snapshot on site-b, the image is no longer available, even if I use promote --force.
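For reference, the copy state of the single non-primary snapshot on site-b can be inspected with:

Code:
root@pve1-b:~# rbd snap ls --all ceph-pool/vm-103-disk-1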
Is what I’m saying correct?
I hope I’ve been as clear as possible.

Thank you
 
Hi @KevinS,
I wanted to share my findings with you:
When I attempt to force-promote an image whose latest snapshot is only 88% copied and still in a syncing state, the force promotion gets stuck.
Please see snapshot number 1247065 below:

Code:
Image: vm-103-disk-0
Snapshots:
SNAPID   NAME                                                                                           SIZE    PROTECTED  TIMESTAMP                 NAMESPACE                                                 
1247064  .mirror.non_primary.0e0c83ba-b709-4bf9-832d-013ca194b00e.fff33483-93fa-4923-8240-6ca810a68744  21 GiB             Fri Jan 10 16:54:01 2025  mirror (non-primary peer_uuids:[] a946fac3-067c-47c1-a80c-05187eb77a30:431652 copied)
----------------------------------------
Image: vm-103-disk-1
Snapshots:
SNAPID   NAME                                                                                           SIZE   PROTECTED  TIMESTAMP                 NAMESPACE                                                   
1247059  .mirror.non_primary.ab2549b1-78bc-48f5-be58-85cdc470b3ae.3bb6f0ff-b22d-4d6b-b210-2f30edbbc011  5 GiB             Fri Jan 10 16:51:00 2025  mirror (non-primary peer_uuids:[] a946fac3-067c-47c1-a80c-05187eb77a30:431647 copied)
1247065  .mirror.non_primary.ab2549b1-78bc-48f5-be58-85cdc470b3ae.2f65824c-f6bc-49f9-9982-c4804fa9a0e1  5 GiB             Fri Jan 10 16:54:01 2025  mirror (non-primary peer_uuids:[] a946fac3-067c-47c1-a80c-05187eb77a30:431653 88% copied)
----------------------------------------
Image: vm-104-disk-0
Snapshots:
SNAPID   NAME                                                                                           SIZE    PROTECTED  TIMESTAMP                 NAMESPACE                                                 
1247066  .mirror.non_primary.04c18313-1f08-4343-89dd-b236a6e3937a.97d6f893-aa52-4d16-a951-cd9c3410711b  21 GiB             Fri Jan 10 16:54:01 2025  mirror (non-primary peer_uuids:[] a946fac3-067c-47c1-a80c-05187eb77a30:431654 copied)

This command hangs indefinitely:
Code:
rbd mirror image promote ceph-pool/vm-103-disk-1 --force

I tried to pin down where it hangs; interrupting my wrapper script with Ctrl+C gives the following traceback:

Code:
 subprocess.run(promote_command, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
  File "/usr/lib/python3.11/subprocess.py", line 550, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1199, in communicate
    self.wait()
  File "/usr/lib/python3.11/subprocess.py", line 1262, in wait
    return self._wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1997, in _wait
    (pid, sts) = self._try_wait(0)
                 ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1955, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
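As a defensive workaround in my wrapper script (it does not fix the underlying hang), a timeout can be passed to subprocess.run so the script fails instead of blocking forever. A minimal sketch, with a hypothetical hard-coded command in place of how my script actually builds promote_command:

Code:
import subprocess

# Hypothetical example; my script builds promote_command dynamically.
promote_command = ["rbd", "mirror", "image", "promote",
                   "ceph-pool/vm-103-disk-1", "--force"]

try:
    # Give up after 60 seconds instead of blocking forever on a stuck promotion.
    subprocess.run(promote_command, check=True, timeout=60,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
except subprocess.TimeoutExpired:
    # subprocess.run kills the child process before raising this exception.
    print("promotion timed out; the image is probably still syncing")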

This is a very critical situation: if a disaster hits site-a and communication between site-a and site-b is down, I cannot promote site-b.
I hope I’ve been as clear as possible.

Thank you
bye