Both nodes are now upgraded and I've started moving some VMs over to the most recently upgraded node. One of them appears to be stuck with a migration lock. I *think* I saw a brief hiccup in the quorum connection when the migration started, and the task now reports as aborted. Looking at the log (below), it fails to open a temporary config file, and I can confirm from the CLI that the file doesn't exist. How can I recover from this?
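This is roughly how I confirmed that on the source node (commands reconstructed from memory, so the exact invocations may be slightly off):

Code:
# on pvn1, as root
ls -l /etc/pve/nodes/pvn1/qemu-server/106.conf.tmp.29700   # the temp file from the log: No such file or directory
ls -l /etc/pve/nodes/pvn1/qemu-server/106.conf             # the actual VM config is still there
qm config 106                                              # still shows the migration lock
pvecm status                                               # quorum looks healthy again now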
Do I have to build a new VM and attach the existing qcow somehow? If so, is there a document outlining the right way?
Code:
2020-08-06 14:22:53 starting migration of VM 106 to node 'pvn2' (10.11.12.252)
2020-08-06 14:22:53 found local, replicated disk 'local-zfs:vm-106-disk-1' (in current VM config)
2020-08-06 14:22:53 replicating disk images
2020-08-06 14:22:53 start replication job
2020-08-06 14:22:53 guest => VM 106, running => 0
2020-08-06 14:22:53 volumes => local-zfs:vm-106-disk-1
2020-08-06 14:22:56 create snapshot '__replicate_106-1_1596738173__' on local-zfs:vm-106-disk-1
2020-08-06 14:22:56 using secure transmission, rate limit: none
2020-08-06 14:22:56 incremental sync 'local-zfs:vm-106-disk-1' (__replicate_106-1_1596717257__ => __replicate_106-1_1596738173__)
2020-08-06 14:22:59 send from @__replicate_106-1_1596717257__ to rpool/data/vm-106-disk-1@__replicate_106-1_1596738173__ estimated size is 71.9M
2020-08-06 14:22:59 total estimated size is 71.9M
2020-08-06 14:22:59 TIME SENT SNAPSHOT rpool/data/vm-106-disk-1@__replicate_106-1_1596738173__
2020-08-06 14:23:00 rpool/data/vm-106-disk-1@__replicate_106-1_1596717257__ name rpool/data/vm-106-disk-1@__replicate_106-1_1596717257__ -
2020-08-06 14:23:00 14:23:00 2.10M rpool/data/vm-106-disk-1@__replicate_106-1_1596738173__
2020-08-06 14:23:01 14:23:01 27.2M rpool/data/vm-106-disk-1@__replicate_106-1_1596738173__
2020-08-06 14:23:02 14:23:02 63.1M rpool/data/vm-106-disk-1@__replicate_106-1_1596738173__
2020-08-06 14:23:03 successfully imported 'local-zfs:vm-106-disk-1'
2020-08-06 14:23:03 delete previous replication snapshot '__replicate_106-1_1596717257__' on local-zfs:vm-106-disk-1
2020-08-06 14:23:04 (remote_finalize_local_job) delete stale replication snapshot '__replicate_106-1_1596717257__' on local-zfs:vm-106-disk-1
2020-08-06 14:23:05 end replication job
2020-08-06 14:23:05 copying local disk images
2020-08-06 14:23:05 ERROR: unable to open file '/etc/pve/nodes/pvn1/qemu-server/106.conf.tmp.29700' - Device or resource busy
2020-08-06 14:23:05 aborting phase 1 - cleanup resources
2020-08-06 14:23:05 ERROR: unable to open file '/etc/pve/nodes/pvn1/qemu-server/106.conf.tmp.29700' - Device or resource busy
2020-08-06 14:23:05 ERROR: migration aborted (duration 00:00:12): unable to open file '/etc/pve/nodes/pvn1/qemu-server/106.conf.tmp.29700' - Device or resource busy
TASK ERROR: migration aborted
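To be concrete about the rebuild route I'm imagining, it would be something along these lines, where the new VM ID (206) and the hardware options are just placeholders and I'm not at all sure that attaching the existing replicated volume this way is the supported method:

Code:
# create an empty replacement VM (placeholder ID and options)
qm create 206 --name vm106-recovered --memory 4096 --cores 2 --net0 virtio,bridge=vmbr0
# attach the disk that already exists on local-zfs
# (per the log it's a ZFS zvol, local-zfs:vm-106-disk-1, rather than a qcow2)
qm set 206 --scsi0 local-zfs:vm-106-disk-1 --bootdisk scsi0

I'd rather not go down that path if there's a cleaner way to recover the original VM 106.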