Replication + Migration Error

yena

Hello,
in a 3-node Proxmox VE 5 cluster with ZFS replication, if I migrate a container (LXC) from the master to a slave, replication stops working on the new node:

2017-11-15 11:47:00 100-2: start replication job
2017-11-15 11:47:00 100-2: guest => CT 100, running => 1
2017-11-15 11:47:00 100-2: volumes => SSDstorage:subvol-100-disk-1
2017-11-15 11:47:01 100-2: freeze guest filesystem
2017-11-15 11:47:01 100-2: create snapshot '__replicate_100-2_1510742820__' on SSDstorage:subvol-100-disk-1
2017-11-15 11:47:01 100-2: thaw guest filesystem
2017-11-15 11:47:01 100-2: full sync 'SSDstorage:subvol-100-disk-1' (__replicate_100-2_1510742820__)
2017-11-15 11:47:02 100-2: full send of SSDstorage/subvol-100-disk-1@__replicate_100-1_1510742732__ estimated size is 8.01G
2017-11-15 11:47:02 100-2: send from @__replicate_100-1_1510742732__ to SSDstorage/subvol-100-disk-1@__replicate_100-2_1510742820__ estimated size is 1.41M
2017-11-15 11:47:02 100-2: total estimated size is 8.01G
2017-11-15 11:47:02 100-2: TIME SENT SNAPSHOT
2017-11-15 11:47:02 100-2: SSDstorage/subvol-100-disk-1 name SSDstorage/subvol-100-disk-1 -
2017-11-15 11:47:02 100-2: volume 'SSDstorage/subvol-100-disk-1' already exists
2017-11-15 11:47:02 100-2: warning: cannot send 'SSDstorage/subvol-100-disk-1@__replicate_100-1_1510742732__': signal received
2017-11-15 11:47:02 100-2: TIME SENT SNAPSHOT
2017-11-15 11:47:02 100-2: warning: cannot send 'SSDstorage/subvol-100-disk-1@__replicate_100-2_1510742820__': Broken pipe
2017-11-15 11:47:02 100-2: cannot send 'SSDstorage/subvol-100-disk-1': I/O error
2017-11-15 11:47:02 100-2: command 'zfs send -Rpv -- SSDstorage/subvol-100-disk-1@__replicate_100-2_1510742820__' failed: exit code 1
2017-11-15 11:47:02 100-2: delete previous replication snapshot '__replicate_100-2_1510742820__' on SSDstorage:subvol-100-disk-1
2017-11-15 11:47:02 100-2: end replication job with error: command 'set -o pipefail && pvesm export SSDstorage:subvol-100-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_100-2_1510742820__ | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=storage' root@192.168.100.15 -- pvesm import SSDstorage:subvol-100-disk-1 zfs - -with-snapshots 1' failed: exit code 255
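The relevant line is "volume 'SSDstorage/subvol-100-disk-1' already exists": after the migration the replication direction is reversed, but the node that is now the target apparently still holds its own copy of the dataset, so the full send cannot create it. Before deleting anything it is worth checking what is actually left over there. A minimal check, assuming the pool and dataset names from the log above (adjust to your setup):

# run on the target node of the failing job (192.168.100.15 in the log)
zfs list -r -t filesystem,snapshot SSDstorage | grep subvol-100-disk-1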

-----------------------------------------------------------------------------------

Workaround (to avoid the risk of deleting the correct volume by mistake):

On the new slave (see the command sketch after this list):

- zfs rename SSDstorage/subvol-100-disk-1 SSDstorage/subvol-999-disk-1
- resync the replication job
- test that everything works
- delete the old volume: zfs destroy SSDstorage/subvol-999-disk-1
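For reference, the whole sequence as a command sketch, assuming the same storage and CT IDs as above and that 999 is not used by any other guest; run the zfs commands on the node that reports "already exists":

# 1) move the conflicting dataset out of the way instead of destroying it right away
zfs rename SSDstorage/subvol-100-disk-1 SSDstorage/subvol-999-disk-1
# 2) run the replication job again (Replication -> "Schedule now" in the GUI, or: pvesr schedule-now 100-2) so a fresh full sync is sent
# 3) check the container and its data on both nodes
# 4) only then delete the renamed copy; -r also removes any snapshots it still carries
zfs destroy -r SSDstorage/subvol-999-disk-1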

In practice, every time I migrate from master to slave and then back from slave to master with active replication, I get this error:

2017-11-15 11:58:56 ERROR: switch replication job target failed - replication job for guest '100' to target 'local/nodo1' already exists
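If a stale job is what blocks the switch, one possible manual cleanup (only a sketch, not verified here; the job ID 100-2 is taken from the log above and 'nodo1' from the error, check pvesr list for the real values):

# on the node that currently owns CT 100
pvesr list
# delete the job that already points to the target node, then recreate it after the migration
pvesr delete 100-2
pvesr create-local-job 100-2 nodo1 --schedule '*/15'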

Hi,

I can't reproduce it here.
Please send your <vmid>.config and the output of pveversion -v.
