Replication jobs for some containers or virtual machines fail, seemingly at random, and once a job fails it usually keeps failing.
Even deleting and recreating the jobs, along with any existing volumes, does not reliably fix this. Sometimes it just doesn't work.
What's even weirder:
If I delete the failing replication job and delete the existing volumes, I can migrate the container or VM just fine, and the migration does a replication in the background.
(I can even create a new replication job back to the original node afterwards, and that job then works fine as well.)
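For reference, here is a rough sketch of that workaround as a Python dry run. It only builds and prints the commands and executes nothing. The guest ID `100`, the job ID `100-0`, and the node names `node1`/`node2` are made-up placeholders, and the function name `workaround_commands` is hypothetical. Removing the leftover volumes on the target is left as a manual step because it depends on the storage layout.

```python
def workaround_commands(guest_id, kind, job_id, target_node, source_node):
    """Return the workaround's CLI commands in order, without running them.

    All arguments are placeholders chosen for illustration:
    kind is "ct" for a container or "vm" for a virtual machine.
    """
    if kind not in ("ct", "vm"):
        raise ValueError("kind must be 'ct' or 'vm'")

    # Containers are migrated with pct, virtual machines with qm.
    migrate_tool = "pct" if kind == "ct" else "qm"

    return [
        # 1. Remove the failing replication job (run on the source node).
        ["pvesr", "delete", job_id],
        # 2. Manual step, not listed here: delete the stale replicated
        #    volumes on the target node.
        # 3. Migrate the guest; a running guest may need extra options.
        [migrate_tool, "migrate", str(guest_id), target_node],
        # 4. On the new node, create a replication job back to the
        #    original node.
        ["pvesr", "create-local-job", job_id, source_node],
    ]


if __name__ == "__main__":
    for cmd in workaround_commands(100, "ct", "100-0", "node2", "node1"):
        print(" ".join(cmd))
```

The commands are kept as argument lists so they could be handed to `subprocess.run` later, but I would check each one by hand first.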
This workaround is fine for small workloads.
But it's impractical for workloads with lots of data, where a replication from scratch takes hours.
Does anyone have an idea what might cause this or how to fix it?