Bug? After online migration, replication job fails

saphirblanc

Hi,

I've migrated a VM from one node to another using the CLI with --online --with-local-disks, and it worked perfectly! However, when the replication job to the third node ran next, it failed.
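For context, the migration was started with something like the command below (a sketch; the target node name is a placeholder, the VM ID is taken from the log):

qm migrate 145 <target-node> --online --with-local-disks

Here is the failing replication log: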

2018-09-10 06:56:00 145-0: start replication job
2018-09-10 06:56:00 145-0: guest => VM 145, running => 38278
2018-09-10 06:56:00 145-0: volumes => local-zfs:vm-145-disk-1
2018-09-10 06:56:01 145-0: freeze guest filesystem
2018-09-10 06:56:01 145-0: create snapshot '__replicate_145-0_1536555360__' on local-zfs:vm-145-disk-1
2018-09-10 06:56:01 145-0: thaw guest filesystem
2018-09-10 06:56:01 145-0: full sync 'local-zfs:vm-145-disk-1' (__replicate_145-0_1536555360__)
2018-09-10 06:56:02 145-0: full send of rpool/data/vm-145-disk-1@__replicate_145-0_1536555360__ estimated size is 2.57G
2018-09-10 06:56:02 145-0: total estimated size is 2.57G
2018-09-10 06:56:02 145-0: TIME SENT SNAPSHOT
2018-09-10 06:56:02 145-0: rpool/data/vm-145-disk-1 name rpool/data/vm-145-disk-1 -
2018-09-10 06:56:02 145-0: volume 'rpool/data/vm-145-disk-1' already exists
2018-09-10 06:56:02 145-0: warning: cannot send 'rpool/data/vm-145-disk-1@__replicate_145-0_1536555360__': signal received
2018-09-10 06:56:02 145-0: cannot send 'rpool/data/vm-145-disk-1': I/O error
2018-09-10 06:56:02 145-0: command 'zfs send -Rpv -- rpool/data/vm-145-disk-1@__replicate_145-0_1536555360__' failed: exit code 1
2018-09-10 06:56:02 145-0: delete previous replication snapshot '__replicate_145-0_1536555360__' on local-zfs:vm-145-disk-1
2018-09-10 06:56:02 145-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-145-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_145-0_1536555360__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=XXX -- pvesm import local-zfs:vm-145-disk-1 zfs - -with-snapshots 1' failed: exit code 255

I tried to launch it manually, same result. After that, I deleted the replication job and created it again to the same target node, and it's working again as before.
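In case it's useful, the manual run and the delete/recreate were done with the standard pvesr commands, roughly like this (job ID taken from the log, the target node name and schedule are placeholders):

pvesr run --id 145-0 --verbose                                # run the job by hand and watch the output
pvesr delete 145-0                                            # drop the broken replication job
pvesr create-local-job 145-0 <third-node> --schedule '*/15'   # recreate it to the same target node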

Is this a bug or did I do something wrong?

Thanks,
 
If you migrate with '--online --with-local-disks', we cannot take the replication snapshots and the replication state with us, so the next replication run after the migration is expected to fail.
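That matches the 'volume already exists' line in the log: without the replication state, the next run attempts a full send, but the old copy of the disk is still present on the replication target. If you want to verify this, the leftover dataset should be visible on the target node with plain ZFS commands (dataset name taken from the log above):

zfs list rpool/data/vm-145-disk-1                    # stale copy left over from the old replication
zfs list -t snapshot -r rpool/data/vm-145-disk-1     # no __replicate_* snapshot to resume from

Deleting the job and creating it again, as you did, gets things back in sync: the stale copy on the target is removed and a fresh full sync can run.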
 
