Tried to migrate some guests off a node ahead of a hardware intervention on the server.
The migration failed, and it was yet another nightmare to understand why.
Replication appeared to be working, yet the migration of several machines still failed.
Now, with the servers down, I have to work out which replica to restore and how... I'm not sure which node holds the best copy...
task started by HA resource agent
2017-12-20 21:53:58 starting migration of CT 104 to node 'p4' (10.0.0.4)
2017-12-20 21:53:58 found local volume 'ZfsStorage:subvol-104-disk-2' (in current VM config)
2017-12-20 21:53:58 start replication job
2017-12-20 21:53:58 guest => CT 104, running => 0
2017-12-20 21:53:58 volumes => ZfsStorage:subvol-104-disk-2
2017-12-20 21:53:58 create snapshot '__replicate_104-2_1513803238__' on ZfsStorage:subvol-104-disk-2
2017-12-20 21:53:58 full sync 'ZfsStorage:subvol-104-disk-2' (__replicate_104-2_1513803238__)
2017-12-20 21:53:59 full send of vmpool/subvol-104-disk-2@__replicate_104-0_1513802705__ estimated size is 4.19G
2017-12-20 21:53:59 send from @__replicate_104-0_1513802705__ to vmpool/subvol-104-disk-2@__replicate_104-2_1513803238__ estimated size is 624B
2017-12-20 21:53:59 total estimated size is 4.19G
2017-12-20 21:53:59 TIME SENT SNAPSHOT
2017-12-20 21:53:59 vmpool/subvol-104-disk-2 name vmpool/subvol-104-disk-2 -
2017-12-20 21:53:59 volume 'vmpool/subvol-104-disk-2' already exists
2017-12-20 21:53:59 command 'zfs send -Rpv -- vmpool/subvol-104-disk-2@__replicate_104-2_1513803238__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
2017-12-20 21:53:59 delete previous replication snapshot '__replicate_104-2_1513803238__' on ZfsStorage:subvol-104-disk-2
2017-12-20 21:53:59 end replication job with error: command 'set -o pipefail && pvesm export ZfsStorage:subvol-104-disk-2 zfs - -with-snapshots 1 -snapshot __replicate_104-2_1513803238__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=p4' root@10.0.0.4 -- pvesm import ZfsStorage:subvol-104-disk-2 zfs - -with-snapshots 1' failed: exit code 255
2017-12-20 21:53:59 ERROR: command 'set -o pipefail && pvesm export ZfsStorage:subvol-104-disk-2 zfs - -with-snapshots 1 -snapshot __replicate_104-2_1513803238__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=p4' root@10.0.0.4 -- pvesm import ZfsStorage:subvol-104-disk-2 zfs - -with-snapshots 1' failed: exit code 255
2017-12-20 21:53:59 aborting phase 1 - cleanup resources
2017-12-20 21:53:59 start final cleanup
2017-12-20 21:53:59 ERROR: migration aborted (duration 00:00:02): command 'set -o pipefail && pvesm export ZfsStorage:subvol-104-disk-2 zfs - -with-snapshots 1 -snapshot __replicate_104-2_1513803238__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=p4' root@10.0.0.4 -- pvesm import ZfsStorage:subvol-104-disk-2 zfs - -with-snapshots 1' failed: exit code 255
TASK ERROR: migration aborted
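One clue for deciding which node holds the freshest copy: Proxmox replication snapshot names like '__replicate_104-2_1513803238__' embed a Unix timestamp as their last field. A rough approach (a sketch, not a supported procedure) is to list the snapshots of the volume on each node with `zfs list` and compare those timestamps; `snap_ts` below is a hypothetical helper for pulling the timestamp out of a snapshot name.

```shell
#!/bin/sh
# Hypothetical helper: extract the Unix timestamp embedded in a Proxmox
# replication snapshot name such as '__replicate_104-2_1513803238__'.
snap_ts() {
  name="$1"
  name="${name%__}"     # strip the trailing '__'
  echo "${name##*_}"    # keep everything after the last remaining '_'
}

# On each node, list the volume's snapshots oldest-to-newest (run manually):
#   zfs list -t snapshot -o name,creation -s creation vmpool/subvol-104-disk-2
# Then compare the newest replication snapshot's timestamp per node:
snap_ts "__replicate_104-2_1513803238__"   # -> 1513803238
```

The node whose newest replication snapshot has the largest timestamp would have the most recent replica; converting it with `date -d @1513803238` shows when that sync ran. Verify against the actual data before destroying anything.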