[SOLVED] Aborted migration stuck in GUI with permission errors

dtiKev
Well-Known Member · Joined Apr 7, 2018
After losing one node and adding a new one, I was moving some guests to the new member node. One of the migrations failed and is now stuck on the GUI side. Is there a manual way to reset the VM to the state it was in before the migration started? I'm not sure why there would be permission problems, as several other guests migrated without issue. The output is shown below. If it matters, I do have a recent backup of this VM on the source node.

Code:
2021-02-10 08:31:17 starting migration of VM 108 to node 'pvnOne' (x.y.z.244)
2021-02-10 08:31:18 found local disk 'TwoT:108/vm-108-disk-0.qcow2' (in current VM config)
2021-02-10 08:31:18 copying local disk images
2021-02-10 08:31:21 Formatting '/storage/2t/images/108/vm-108-disk-0.qcow2', fmt=qcow2 cluster_size=65536 preallocation=metadata compression_type=zlib size=536870912000 lazy_refcounts=off refcount_bits=16
2021-02-10 09:52:54 131092064+0 records in
2021-02-10 09:52:54 131092064+0 records out
2021-02-10 09:52:54 536953094144 bytes (537 GB, 500 GiB) copied, 4894.18 s, 110 MB/s
2021-02-10 09:52:54 12594+31614398 records in
2021-02-10 09:52:54 12594+31614398 records out
2021-02-10 09:52:54 536953094144 bytes (537 GB, 500 GiB) copied, 4627.14 s, 116 MB/s
2021-02-10 09:52:54 successfully imported 'TwoT:108/vm-108-disk-0.qcow2'
2021-02-10 09:52:54 volume 'TwoT:108/vm-108-disk-0.qcow2' is 'TwoT:108/vm-108-disk-0.qcow2' on the target
2021-02-10 09:52:54 ERROR: unable to open file '/etc/pve/nodes/pvn2/qemu-server/108.conf.tmp.1064' - Permission denied
2021-02-10 09:52:54 aborting phase 1 - cleanup resources
2021-02-10 09:52:54 ERROR: unable to open file '/etc/pve/nodes/pvn2/qemu-server/108.conf.tmp.1064' - Permission denied
2021-02-10 09:52:54 ERROR: found stale volume copy 'TwoT:108/vm-108-disk-0.qcow2' on node 'pvnOne'
2021-02-10 09:52:54 ERROR: migration aborted (duration 01:21:38): unable to open file '/etc/pve/nodes/pvn2/qemu-server/108.conf.tmp.1064' - Permission denied
TASK ERROR: migration aborted
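Since /etc/pve is the pmxcfs cluster filesystem, my first guess is that it went read-only (which it does whenever the cluster loses quorum) right as the migration tried to write the config. A quick sanity check, assuming that's what happened:

Code:
# confirm the cluster is quorate; pmxcfs goes read-only without quorum
pvecm status

# verify /etc/pve is actually writable right now (as root)
touch /etc/pve/test && rm /etc/pve/test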
 
Code:
~# qm config 108
bootdisk: ide0
cores: 4
ide0: TwoT:108/vm-108-disk-0.qcow2,size=500G
ide2: none,media=cdrom
lock: migrate
memory: 24576
name: scanDev
net0: e1000=D6:6B:5C:F2:C9:15,bridge=vmbr0,firewall=1
numa: 0
ostype: win8
scsihw: virtio-scsi-pci
smbios1: uuid=3b70356a-cc26-428a-8a3f-cdf8e329e181
sockets: 2
vmgenid: dfa667a7-f8ff-439d-afc3-f4aa2b55aa0d
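Two things stand out in the config: the lock: migrate line left over from the aborted task, and the .conf.tmp file the failed write left behind. Assuming the path from the error message above, this is where I'd look for leftovers:

Code:
# list VM configs on the source node; any stale 108.conf.tmp.* here is debris from the failed run
ls -la /etc/pve/nodes/pvn2/qemu-server/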

Code:
# systemctl status pve-cluster corosync
● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-02-10 15:14:23 EST; 1 day 17h ago
  Process: 1919 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
 Main PID: 1932 (pmxcfs)
    Tasks: 7 (limit: 4915)
   Memory: 65.3M
   CGroup: /system.slice/pve-cluster.service
           └─1932 /usr/bin/pmxcfs

Feb 12 07:46:50 pvn2 pmxcfs[1932]: [dcdb] notice: data verification successful
Feb 12 07:57:07 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:12:07 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:16:39 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:16:39 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:16:39 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:17:07 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:17:07 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:17:14 pvn2 pmxcfs[1932]: [status] notice: received log
Feb 12 08:17:25 pvn2 pmxcfs[1932]: [status] notice: received log

● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-02-10 15:14:24 EST; 1 day 17h ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 2048 (corosync)
    Tasks: 9 (limit: 4915)
   Memory: 157.8M
   CGroup: /system.slice/corosync.service
           └─2048 /usr/sbin/corosync -f

Feb 10 16:57:32 pvn2 corosync[2048]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb 10 16:57:32 pvn2 corosync[2048]:   [KNET  ] host: host: 1 has no active links
Feb 10 16:57:34 pvn2 corosync[2048]:   [KNET  ] rx: host: 1 link: 0 is up
Feb 10 16:57:34 pvn2 corosync[2048]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb 10 16:58:48 pvn2 corosync[2048]:   [TOTEM ] Token has not been received in 2737 ms
Feb 12 08:18:25 pvn2 corosync[2048]:   [KNET  ] link: host: 1 link: 0 is down
Feb 12 08:18:25 pvn2 corosync[2048]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb 12 08:18:25 pvn2 corosync[2048]:   [KNET  ] host: host: 1 has no active links
Feb 12 08:18:27 pvn2 corosync[2048]:   [KNET  ] rx: host: 1 link: 0 is up
Feb 12 08:18:27 pvn2 corosync[2048]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
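Those KNET lines show link 0 to host 1 flapping, which would explain intermittent quorum loss (and with it a temporarily read-only /etc/pve). A minimal way to check the corosync links, assuming a default single-ring setup:

Code:
# show this node's knet link status for each configured ring
corosync-cfgtool -s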

Code:
# pveversion -v

proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.13.13-2-pve: 4.13.13-33
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-4
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.8-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-4
pve-cluster: 6.2-1
pve-container: 3.3-3
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-4
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
 
qm start 108 results in:

Code:
VM is locked (migrate)
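In case anyone lands here with the same lock: assuming the migration task is truly dead (check the task log or look for a running migration worker first), the stale lock can be cleared manually:

Code:
# remove the leftover 'migrate' lock so the VM can be started/managed again
qm unlock 108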

I feel like this has happened before. Looking at your link now.
 
And bingo!! That did it. I stopped the migration, moved the disk to local/zfs, cold-migrated the VM to the new server, and it starts up.
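For anyone searching later, here's the rough sequence that worked for me, as a sketch. Storage names are examples: 'local-zfs' stands in for whatever your local ZFS storage is called, and the stale copy sits on the new node (pvnOne in my case).

Code:
# 1. on the source node, clear the stale migrate lock
qm unlock 108

# 2. on the target node, free the half-copied stale volume the log complained about
pvesm free TwoT:108/vm-108-disk-0.qcow2

# 3. move the disk onto local ZFS storage ('local-zfs' is an example name; adjust to yours)
qm move_disk 108 ide0 local-zfs --delete

# 4. with the VM powered off, migrate it offline to the new node
qm migrate 108 pvnOne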
 