Bug? After online migration, replication job fails

saphirblanc

Hi,

I've migrated a VM from one node to another using the CLI with --online --with-local-disks, and it worked perfectly! However, when the replication job to the third node ran next, it failed.
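For context, the migration was started with something like the command below (a sketch; the target node name is a placeholder, the VM ID is taken from the log):

qm migrate 145 <target-node> --online --with-local-disks

Here is the failing replication log: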

2018-09-10 06:56:00 145-0: start replication job
2018-09-10 06:56:00 145-0: guest => VM 145, running => 38278
2018-09-10 06:56:00 145-0: volumes => local-zfs:vm-145-disk-1
2018-09-10 06:56:01 145-0: freeze guest filesystem
2018-09-10 06:56:01 145-0: create snapshot '__replicate_145-0_1536555360__' on local-zfs:vm-145-disk-1
2018-09-10 06:56:01 145-0: thaw guest filesystem
2018-09-10 06:56:01 145-0: full sync 'local-zfs:vm-145-disk-1' (__replicate_145-0_1536555360__)
2018-09-10 06:56:02 145-0: full send of rpool/data/vm-145-disk-1@__replicate_145-0_1536555360__ estimated size is 2.57G
2018-09-10 06:56:02 145-0: total estimated size is 2.57G
2018-09-10 06:56:02 145-0: TIME SENT SNAPSHOT
2018-09-10 06:56:02 145-0: rpool/data/vm-145-disk-1 name rpool/data/vm-145-disk-1 -
2018-09-10 06:56:02 145-0: volume 'rpool/data/vm-145-disk-1' already exists
2018-09-10 06:56:02 145-0: warning: cannot send 'rpool/data/vm-145-disk-1@__replicate_145-0_1536555360__': signal received
2018-09-10 06:56:02 145-0: cannot send 'rpool/data/vm-145-disk-1': I/O error
2018-09-10 06:56:02 145-0: command 'zfs send -Rpv -- rpool/data/vm-145-disk-1@__replicate_145-0_1536555360__' failed: exit code 1
2018-09-10 06:56:02 145-0: delete previous replication snapshot '__replicate_145-0_1536555360__' on local-zfs:vm-145-disk-1
2018-09-10 06:56:02 145-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-145-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_145-0_1536555360__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=XXX -- pvesm import local-zfs:vm-145-disk-1 zfs - -with-snapshots 1' failed: exit code 255

I tried to launch it manually, same result. After that, I deleted the replication job and created it again to the same target node, and it's working again as before.
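In case it's useful, the manual run and the delete/recreate were done with the standard pvesr commands, roughly like this (job ID taken from the log, the target node name and schedule are placeholders):

pvesr run --id 145-0 --verbose                                # run the job by hand and watch the output
pvesr delete 145-0                                            # drop the broken replication job
pvesr create-local-job 145-0 <third-node> --schedule '*/15'   # recreate it to the same target node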

Is this a bug or did I do something wrong?

Thanks,
 
If you migrate with '--online --with-local-disks', we cannot take the replication snapshots and the replication state with us, so the next replication run after the migration is expected to fail.
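That matches the 'volume already exists' line in the log: without the replication state, the next run attempts a full send, but the old copy of the disk is still present on the replication target. If you want to verify this, the leftover dataset should be visible on the target node with plain ZFS commands (dataset name taken from the log above):

zfs list rpool/data/vm-145-disk-1                    # stale copy left over from the old replication
zfs list -t snapshot -r rpool/data/vm-145-disk-1     # no __replicate_* snapshot to resume from

Deleting the job and creating it again, as you did, gets things back in sync: the stale copy on the target is removed and a fresh full sync can run.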
 
