Replication failure after miscellaneous snapshots

frankz

Renowned Member
Nov 16, 2020
Hello everybody. I want to report anomalous replication behavior for a VM on a cluster. The VM is replicated to 3 nodes on a ZFS storage named "zshare". I noticed that if I take snapshots of that VM and roll back to a snapshot shortly afterwards, the cluster reports a replication error, as if the disk had I/O problems. That cannot be the cause, though, because this VM has been moved to other nodes several times. The error occurs only when snapshots are taken. Here is the error:

May 02 19:34:03 pve4 pvesr[20526]: send/receive failed, cleaning up snapshot(s)..
May 02 19:34:03 pve4 pvesr[20526]: 103-1: got unexpected replication job error - command 'set -o pipefail && pvesm export zshare:vm-103-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_103-1_1619976840__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve3' root@192.168.2.37 -- pvesm import zshare:vm-103-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
May 02 19:34:03 pve4 systemd[1]: pvesr.service: Succeeded.
May 02 19:34:03 pve4 systemd[1]: Started Proxmox VE replication runner.
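For anyone reading the log above: the snapshot name in the failing command appears to follow the pattern `__replicate_<jobid>_<epoch>__`. That pattern is inferred from this one log line, so treat it as an assumption. A minimal sketch for pulling the job ID and the timestamp out of such a name, assuming a POSIX shell and GNU `date` (as on a Debian-based node):

```shell
#!/bin/sh
# Decode a replication snapshot name of the assumed form
# __replicate_<jobid>_<epoch>__ (pattern inferred from the log line above).
snap='__replicate_103-1_1619976840__'

body=${snap#__replicate_}   # strip the leading prefix -> 103-1_1619976840__
body=${body%__}             # strip the trailing "__"  -> 103-1_1619976840
jobid=${body%_*}            # text before the last "_" -> 103-1
epoch=${body##*_}           # text after the last "_"  -> 1619976840

echo "job=$jobid epoch=$epoch"

# GNU date can render the epoch value in UTC; other date implementations
# may not accept -d "@<seconds>".
date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC'
```

For the name in the log this gives job `103-1` and 2021-05-02 17:34:00 UTC, which lines up with the 19:34 log entries if the node's clock is two hours ahead of UTC.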
Furthermore, the snapshot deletion bug still persists: snapshots are not deleted on the replication targets. That is, if I delete a snapshot on the VM and that snapshot has already been replicated, it is not deleted on the other nodes. See the following bug

If I delete the replication job, delete the replicated images, and create the job again from scratch, everything starts working again.
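The workaround described above could look roughly like the sketch below. It only prints the commands rather than running them, since `zfs destroy` is destructive. The job ID `103-1` and target `pve3` are taken from the log; the dataset path `POOL/vm-103-disk-0`, the schedule `*/15`, and the assumption that `pve3` resolves over SSH are placeholders, not details from this setup. Depending on how the job is removed, the manual `zfs destroy` step may not be needed.

```shell
#!/bin/sh
# Dry-run sketch of the workaround: print the commands instead of running them.
# dataset and schedule are hypothetical placeholders; adjust before use.
plan_recreate() {
    job=$1       # replication job ID, e.g. 103-1 (from the log)
    target=$2    # target node, e.g. pve3 (from the log)
    dataset=$3   # ZFS dataset behind the storage on the target (placeholder)
    schedule=$4  # replication schedule (placeholder)

    # 1. Remove the replication job on the source node.
    echo "pvesr delete $job"
    # 2. Remove the stale replicated disk, including its snapshots, on the target.
    echo "ssh root@$target zfs destroy -r $dataset"
    # 3. Recreate the job so the next run starts with a full sync.
    echo "pvesr create-local-job $job $target --schedule '$schedule'"
    # 4. Check the state of the replication jobs afterwards.
    echo "pvesr status"
}

plan_recreate 103-1 pve3 'POOL/vm-103-disk-0' '*/15'
```

Review the printed commands and run them by hand, one at a time, once the dataset path has been confirmed with `zfs list` on the target node.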
 

Attachments

  • a.png (43.4 KB)
  • b.png (28 KB)
Hello everyone. I was wondering whether the snapshot replication bug is being looked at. I think it is important, because in a cluster setup you cannot take snapshots of replicated VMs without the system going into an error state, which forces the administrator to delete the replication jobs and create them again. If the VM has disks larger than 200 GB, this becomes a problem. Thank you all.
 
Hi,
yes, we are working on it; the proposed patches are here and here.
 