The VM shows a replication error (how can I solve it?)

Jul 21, 2022
Replication Log

2022-07-21 09:33:00 100-0: start replication job
2022-07-21 09:33:00 100-0: guest => VM 100, running => 3114591
2022-07-21 09:33:00 100-0: volumes => SCME-REP-DS01:vm-100-disk-0
2022-07-21 09:33:01 100-0: freeze guest filesystem
2022-07-21 09:33:04 100-0: create snapshot '__replicate_100-0_1658385180__' on SCME-REP-DS01:vm-100-disk-0
2022-07-21 09:33:04 100-0: thaw guest filesystem
2022-07-21 09:33:05 100-0: using secure transmission, rate limit: none
2022-07-21 09:33:05 100-0: full sync 'SCME-REP-DS01:vm-100-disk-0' (__replicate_100-0_1658385180__)
2022-07-21 09:33:06 100-0: full send of SCME-REP-DS01/vm-100-disk-0@__replicate_100-0_1655900100__ estimated size is 84.0G
2022-07-21 09:33:06 100-0: send from @__replicate_100-0_1655900100__ to SCME-REP-DS01/vm-100-disk-0@__replicate_100-0_1658385180__ estimated size is 17.1G
2022-07-21 09:33:06 100-0: total estimated size is 101G
2022-07-21 09:33:06 100-0: volume 'SCME-REP-DS01/vm-100-disk-0' already exists
2022-07-21 09:33:06 100-0: warning: cannot send 'SCME-REP-DS01/vm-100-disk-0@__replicate_100-0_1655900100__': signal received
2022-07-21 09:33:06 100-0: warning: cannot send 'SCME-REP-DS01/vm-100-disk-0@__replicate_100-0_1658385180__': Broken pipe
2022-07-21 09:33:06 100-0: cannot send 'SCME-REP-DS01/vm-100-disk-0': I/O error
2022-07-21 09:33:06 100-0: command 'zfs send -Rpv -- SCME-REP-DS01/vm-100-disk-0@__replicate_100-0_1658385180__' failed: exit code 1
2022-07-21 09:33:06 100-0: delete previous replication snapshot '__replicate_100-0_1658385180__' on SCME-REP-DS01:vm-100-disk-0
2022-07-21 09:33:06 100-0: end replication job with error: command 'set -o pipefail && pvesm export SCME-REP-DS01:vm-100-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_100-0_1658385180__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.10.154.2 -- pvesm import SCME-REP-DS01:vm-100-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_100-0_1658385180__ -allow-rename 0' failed: exit code 255
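
The decisive message in the log is "volume 'SCME-REP-DS01/vm-100-disk-0' already exists": a full sync was attempted (so source and target apparently no longer share a common replication snapshot), but the volume is already present on the target, and a full receive cannot overwrite an existing dataset. A quick way to confirm the mismatch (a sketch, reusing the dataset name and target address from the log above):

# on the source node: which replication snapshots does the disk have?
zfs list -rt snapshot -o name SCME-REP-DS01/vm-100-disk-0

# on the target node (pve02): does the dataset exist there, and with which snapshots?
ssh root@10.10.154.2 zfs list -rt snapshot -o name SCME-REP-DS01/vm-100-disk-0

If the second command shows the dataset but no __replicate_100-0_* snapshot that also exists on the source, neither an incremental nor a full send can go through.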
 
Please check whether there were any storage or pool problems on the nodes at the time, e.g. with 'zpool status', 'dmesg', etc.
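
For example (a sketch; the pool name and time window are taken from the log above, run on both nodes):

# pool health, including read/write/checksum error counters
zpool status -v SCME-REP-DS01

# recent kernel warnings and errors (I/O problems, ZFS messages)
dmesg --level=err,warn | tail -n 50

# system journal around the time the job failed
journalctl --since "2022-07-21 09:30" --until "2022-07-21 09:40"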
 
2022-07-21 09:33:06 100-0: delete previous replication snapshot '__replicate_100-0_1658385180__' on SCME-REP-DS01:vm-100-disk-0
Was there some issue at an earlier point? Is there more than one replication job configured for this VM, for example two jobs to two different nodes?
In my experience, the snapshots can stop matching after HA has recovered such a VM. In that case, removing the failing replication job and recreating it should help. The first run is a full sync, so it will take some time again.
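
A minimal sketch of that recovery path on the CLI (assuming the job ID 100-0 and target node pve02 from the log; the same steps are available in the GUI on the VM's Replication tab, and the schedule below is only an example):

# list all configured replication jobs and their last result
pvesr status
# (the jobs are also listed in /etc/pve/replication.cfg if you want to check for duplicates)

# remove the failing job; this normally also schedules removal of the replicated data on the target
pvesr delete 100-0

# if a stale copy of the disk remains on the target, remove it there by hand
# WARNING: run this only against the replica on the target node (pve02), never against the source disk
ssh root@10.10.154.2 zfs destroy -r SCME-REP-DS01/vm-100-disk-0

# recreate the job; the first run will be a full sync again
pvesr create-local-job 100-0 pve02 --schedule '*/15'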
 
