Migration and Live Migration not working anymore

niklas_g

New Member
Aug 17, 2020
Hey,

I recently noticed that I'm unable to migrate VMs to different nodes in my Proxmox cluster. It doesn't matter whether it is a live or an offline migration; neither works. I tried different storage types and tested the migration process from every node to every node.
The syslog says the task is starting, but the machine is never moved. I waited a couple of hours and it still hadn't migrated.

Does someone have an idea?

Syslog:
pvedaemon[989407]: <root@pam> starting task UPID:trick:00203D7D:083FAA9B:5F4F664B:qmigrate:501:root@pam:

Node1:
root@tick:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-5
pve-kernel-helper: 6.2-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph: 14.2.10-pve1
ceph-fuse: 14.2.10-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve2
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-1
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-10
pve-cluster: 6.1-8
pve-container: 3.1-13
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-2
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-13
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1

Node2:
root@trick:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-5
pve-kernel-helper: 6.2-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph: 14.2.10-pve1
ceph-fuse: 14.2.10-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve2
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-1
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-10
pve-cluster: 6.1-8
pve-container: 3.1-13
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-2
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-13
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1


Node3:

root@track:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.3.18-2-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-5
pve-kernel-helper: 6.2-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph: 14.2.10-pve1
ceph-fuse: 14.2.10-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve2
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-1
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-10
pve-cluster: 6.1-8
pve-container: 3.1-13
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-2
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-13
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1

Node4:
root@donald:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.3.18-2-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-5
pve-kernel-helper: 6.2-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph: 14.2.10-pve1
ceph-fuse: 14.2.10-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve2
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-1
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-10
pve-cluster: 6.1-8
pve-container: 3.1-13
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-2
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-13
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1
 
Hi,

moved to the English forum.

The problem might be that you haven't rebooted some machines after the kernel upgrade (you can see in the pveversion output that the running kernel version doesn't match the latest installed one).
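A quick way to compare the running kernel with the newest one installed on disk, for example (standard commands, just a sketch):

# kernel currently running
uname -r
# pve kernel packages installed on disk
dpkg -l 'pve-kernel-*' | grep '^ii'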
 
Oh, I didn't realize that there were different ones. My bad.

I thought the same, but two nodes have the same kernel version, and I tried moving VMs between those two nodes and it still didn't work.
Do you think it's still a problem?
 
Even if they match right now, the installed kernel version is different from the one that is running, so you will need to update & reboot all of them anyway.
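A minimal sketch of that update-and-reboot cycle on each node (standard Debian/PVE commands; reboot one node at a time so the cluster keeps quorum):

apt update && apt full-upgrade    # pull in the latest pve-kernel and PVE packages
reboot                            # boot into the newly installed kernel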
 
I was able to restart them tonight, and they all have the same kernel version now. It's still not working.
 
Okay, it was just a guess to rule that out :)

Is your cluster connected properly? The pvecm status output would be useful to see.

Do you find anything in the logs and the journal?
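For example, the journal on the source node can be followed while retrying the migration (rough sketch using standard systemd tooling):

# follow the PVE daemon and cluster logs while the migration task runs
journalctl -f -u pvedaemon -u pveproxy -u pve-cluster
# or search everything since the last boot for the migration task
journalctl -b | grep -i qmigrate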
 
It's not a bad thing to have the servers up to date; I just didn't have the chance to reboot them before :D

The only thing I can find in the logs is:
Offline migration:
Sep 3 12:33:10 donald pvedaemon[167276]: <root@pam> starting task UPID:donald:00039A1E:00171843:5F50C666:qmigrate:514:root@pam:

Live migration:
Sep 3 12:33:50 donald pvedaemon[164363]: <root@pam> starting task UPID:donald:00039B57:001727EC:5F50C68E:qmigrate:516:root@pam:
Sep 3 12:34:00 donald systemd[1]: Starting Proxmox VE replication runner...
Sep 3 12:34:01 donald systemd[1]: pvesr.service: Succeeded.
Sep 3 12:34:01 donald systemd[1]: Started Proxmox VE replication runner.

Nothing else happens. The only other things appearing in my logs are Check_MK checks.

Node 1
root@tick:~# pvecm status
Cluster information
-------------------
Name:             entenhausen
Config Version:   4
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Sep 3 12:27:02 2020
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          1.a6c
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.83.75.11 (local)
0x00000002          1 10.83.75.12
0x00000003          1 10.83.75.13
0x00000004          1 10.83.75.14

Node 2
root@trick:~# pvecm status
Cluster information
-------------------
Name:             entenhausen
Config Version:   4
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Sep 3 12:27:01 2020
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000002
Ring ID:          1.a6c
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.83.75.11
0x00000002          1 10.83.75.12 (local)
0x00000003          1 10.83.75.13
0x00000004          1 10.83.75.14

Node 3
root@track:~# pvecm status
Cluster information
-------------------
Name:             entenhausen
Config Version:   4
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Sep 3 12:27:06 2020
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000003
Ring ID:          1.a6c
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.83.75.11
0x00000002          1 10.83.75.12
0x00000003          1 10.83.75.13 (local)
0x00000004          1 10.83.75.14

Node 4
root@donald:~# pvecm status
Cluster information
-------------------
Name:             entenhausen
Config Version:   4
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Sep 3 12:27:14 2020
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000004
Ring ID:          1.a6c
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.83.75.11
0x00000002          1 10.83.75.12
0x00000003          1 10.83.75.13
0x00000004          1 10.83.75.14 (local)
 
When you go to the task log while the migration process is running, do you see anything? (In the GUI, at the bottom, double-clicking on the migration task should show its status.)

Also, could you post the configuration of the VM or VMs where this problem occurs? qm config VMID

Can you also check whether you can connect to/from the nodes in the cluster via SSH? E.g. from node1 to node2 and so on (it should log in automatically without any interaction).
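As a quick check, something like the following from each node should log in without a password prompt and print the peer's hostname (IPs taken from the pvecm status output above):

ssh root@10.83.75.11 hostname
ssh root@10.83.75.12 hostname
ssh root@10.83.75.13 hostname
ssh root@10.83.75.14 hostname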
 
It's the same with every VM on every node. It doesn't matter which storage it is on: NFS, ZFS over iSCSI, NFS on another machine....

SSH works fine from every node to every node, both by name and by IP.

Task Output:
2020-09-03 12:33:50 starting migration of VM 516 to node 'tick' (10.83.75.11)

Task duration:
Duration: 3h 40m 10s


Config:
root@donald:~# qm config 516
boot: dcn
bootdisk: virtio0
cores: 2
ide2: none,media=cdrom
lock: migrate
memory: 4096
name: 7of9
net0: virtio=BE:0D:66:71:DA:F4,bridge=vmbr0,tag=2
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=99fe2324-01a9-4de8-bb38-0e925cd16579
sockets: 2
startup: order=3
virtio0: geldspeicher:516/vm-516-disk-0.qcow2,size=32G
vmgenid: 1eb56c6f-b25e-4174-8ce9-106487d130ec
 
Please tell me if your problem was solved.
The fact is that I also have a migration task hanging and nothing happens.
 
Hi,
What is the output of pveversion -v and qm config <ID> with the ID of the VM? What is the output of pvesm status?
 
Hi,

pveversion -v
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-15 (running version: 6.2-15/48bd51b6)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.44-1-pve: 5.4.44-1
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-29-pve: 4.15.18-57
pve-kernel-4.15.18-26-pve: 4.15.18-54
pve-kernel-4.15.18-24-pve: 4.15.18-52
pve-kernel-4.15.18-12-pve: 4.15.18-36
ceph-fuse: 12.2.13-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-4
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-9
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.0-1
proxmox-backup-client: 0.9.7-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.3-8
pve-cluster: 6.2-1
pve-container: 3.2-2
pve-docs: 6.2-6
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-6
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-19
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve2

qm config 225
boot: order=scsi0;ide2;net0
cores: 1
ide2: none,media=cdrom
memory: 2048
name: TEST250G
net0: virtio=FA:EF:CB:ED:4F:DF,bridge=vmbr0
numa: 0
ostype: l26
scsi0: Synology_2.114-PVE:225/vm-225-disk-0.qcow2,iothread=1,size=250G
scsihw: virtio-scsi-pci
smbios1: uuid=2f24c253-5134-4db2-ae1c-facb927fb21f
sockets: 1
vmgenid: 3731bcbe-5f81-4b43-beab-3a53f593ce7e

pvesm status
file /etc/pve/storage.cfg line 97 (section 'Synology_SA_2.118-ssd') - unable to parse value of 'prune-backups': invalid format - format error
keep-all: property is not defined in schema and the schema does not allow additional properties

file /etc/pve/storage.cfg line 103 (section 'local14') - unable to parse value of 'prune-backups': invalid format - format error
keep-all: property is not defined in schema and the schema does not allow additional properties

file /etc/pve/storage.cfg line 112 (section 'Disk_pve11') - unable to parse value of 'prune-backups': invalid format - format error
keep-all: property is not defined in schema and the schema does not allow additional properties

got timeout
Name Type Status Total Used Available %
Disk_pve11 nfs active 5812038656 0 5519053824 0.00%
ISO_pve11 nfs active 5518474240 161935360 5078348800 2.93%
Synology_0.2 nfs active 56494255104 29356973568 27137281536 51.96%
Synology_2.113 nfs active 149963132416 109709588864 40253543552 73.16%
Synology_2.114-PVE nfs active 85537629184 65197928832 20339700352 76.22%
Synology_2.117 nfs active 74968274048 44279066112 30689207936 59.06%
Synology_SA_2.118-hdd nfs active 84352829056 45889389312 38463439744 54.40%
Synology_SA_2.118-ssd nfs active 10788238848 1482504320 9305734528 13.74%
Synology_SA_2.119 nfs active 84352829056 60027412864 24325416192 71.16%
Synology_SA_2.119-ssd nfs active 10788238848 3005874048 7782364800 27.86%
ceph rbd inactive 0 0 0 0.00%
local dir active 283820576 5469504 265995944 1.93%
local14 dir disabled 0 0 0 N/A
storage_fast_fc lvm disabled 0 0 0 N/A
storage_fc lvm disabled 0 0 0 N/A
vmdata lvmthin active 5772619776 3529379731 2243240044 61.14%
vmdata_PVE02 lvmthin disabled 0 0 0 N/A


Target migrate node:
proxmox-ve: 7.3-1 (running kernel: 5.19.17-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-6
pve-kernel-5.15: 7.3-2
pve-kernel-5.19.17-1-pve: 5.19.17-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-6
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
openvswitch-switch: 2.15.0+ds1-2+deb11u2.1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20221111-1
pve-firewall: 4.2-7
pve-firmware: 3.6-3
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1
 
Do you only get the single line starting migration of VM...etc. in the log output? From that log output, I'd guess that the task might hang when trying to scan for VM images on the storages. I'd try disabling the inactive ceph storage. And for the active storages, I'd try running pvesm list <storage> and see if any of them hang.
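One way to check the active storages in one go, aborting any listing that hangs (a rough sketch using standard tools; the 30-second timeout is an arbitrary choice):

for s in $(pvesm status 2>/dev/null | awk '$3 == "active" {print $1}'); do
    echo "== $s"
    timeout 30 pvesm list "$s" || echo "   (timed out or failed)"
done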
 

The problem was solved by sequentially updating the node to version 6.4.15 and then to version 7.3.6.
 
