[SOLVED] Unable to online migrate VM

ufm

I upgraded one PVE node in the cluster to the latest version and changed its network setup to OVS.
Now I can online-migrate VMs to this node, but I cannot online-migrate from this node to any other node (offline migration works).
Code:
task started by HA resource agent
2020-11-03 11:53:25 starting migration of VM 111 to node 'pve06' (10.5.44.16)
2020-11-03 11:53:25 starting VM 111 on remote node 'pve06'
2020-11-03 11:53:27 start remote tunnel
2020-11-03 11:53:28 ssh tunnel ver 1
2020-11-03 11:53:28 starting online/live migration on unix:/run/qemu-server/111.migrate
2020-11-03 11:53:28 set migration_caps
2020-11-03 11:53:28 migration speed limit: 8589934592 B/s
2020-11-03 11:53:28 migration downtime limit: 100 ms
2020-11-03 11:53:28 migration cachesize: 1073741824 B
2020-11-03 11:53:28 set migration parameters
2020-11-03 11:53:28 spice client_migrate_info
2020-11-03 11:53:28 start migrate command to unix:/run/qemu-server/111.migrate
2020-11-03 11:53:29 migration status error: failed
2020-11-03 11:53:29 ERROR: online migrate failure - aborting
2020-11-03 11:53:29 aborting phase 2 - cleanup resources
2020-11-03 11:53:29 migrate_cancel
2020-11-03 11:53:31 ERROR: migration finished with problems (duration 00:00:06)
TASK ERROR: migration problems
 
Code:
root@pve05:/var/log/pve# pveversion -v
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-15 (running version: 6.2-15/48bd51b6)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-4.15: 5.4-8
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 14.2.11-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-9
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.0-1
proxmox-backup-client: 0.9.4-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.3-6
pve-cluster: 6.2-1
pve-container: 3.2-2
pve-docs: 6.2-6
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-4
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-18
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve2
Code:
root@pve05:/var/log/pve# pvecm status
Cluster information
-------------------
Name:             vcluster
Config Version:   10
Transport:        default
Secure auth:      on

Quorum information
------------------
Date:             Tue Nov  3 12:24:58 2020
Quorum provider:  corosync_votequorum
Nodes:            6
Node ID:          0x00000004
Ring ID:          1.a21d
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   6
Highest expected: 6
Total votes:      6
Quorum:           4
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.5.44.11
0x00000002          1 10.5.44.16
0x00000003          1 10.5.44.17
0x00000004          1 10.5.44.15 (local)
0x00000005          1 10.5.44.14
0x00000006          1 10.5.44.18

No VM can be online-migrated from this node.
Code:
root@pve05:/var/log/pve# qm config 111
agent: 1
bootdisk: scsi0
cores: 4
description: Redmine server
memory: 8192
name: dcvolia.red
net0: virtio=16:F4:D3:E8:0A:15,bridge=vmbr0,tag=507
numa: 0
onboot: 1
ostype: l26
scsi0: ceph-hdd:vm-111-disk-0,cache=writeback,discard=on,size=32G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=255b4a19-ea6d-4fbb-856d-bf3c0634e29f
sockets: 1
vga: qxl
vmgenid: 3a8fb5f3-d1c0-4bd5-8a47-efbd75103579
 
Can you please post the syslog from the target node and from this node? What is the pveversion -v output of the target node?
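For example, something like this on both nodes should cover the relevant window (adjust the timestamps to the time of your failed migration attempt; these are just taken from the log above):
Code:
journalctl --since "2020-11-03 11:53:00" --until "2020-11-03 11:55:00"
pveversion -v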
 
On the target node:
Code:
Nov  3 12:39:53 pve06 systemd[1]: Started User Manager for UID 0.
Nov  3 12:39:53 pve06 systemd[1]: Started Session 2355 of user root.
Nov  3 12:40:00 pve06 systemd[1]: Starting Proxmox VE replication runner...
Nov  3 12:40:01 pve06 systemd[1]: pvesr.service: Succeeded.
Nov  3 12:40:01 pve06 systemd[1]: Started Proxmox VE replication runner.
Nov  3 12:40:11 pve06 pmxcfs[3088]: [status] notice: received log
Nov  3 12:40:11 pve06 systemd[1]: Started Session 2357 of user root.
Nov  3 12:40:11 pve06 systemd[1]: session-2357.scope: Succeeded.
Nov  3 12:40:17 pve06 systemd[1]: Started Session 2358 of user root.
Nov  3 12:40:18 pve06 qm[16764]: <root@pam> starting task UPID:pve06:000041C6:0F91C3DE:5FA13392:qmstart:111:root@pam:
Nov  3 12:40:18 pve06 qm[16838]: start VM 111: UPID:pve06:000041C6:0F91C3DE:5FA13392:qmstart:111:root@pam:
Nov  3 12:40:18 pve06 systemd[1]: Started 111.scope.
Nov  3 12:40:18 pve06 systemd-udevd[16843]: Using default interface naming scheme 'v240'.
Nov  3 12:40:18 pve06 systemd-udevd[16843]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov  3 12:40:18 pve06 systemd-udevd[16843]: Could not generate persistent MAC address for tap111i0: No such file or directory
Nov  3 12:40:18 pve06 kernel: [2612173.793795] device tap111i0 entered promiscuous mode
Nov  3 12:40:18 pve06 kernel: [2612173.807342] vmbr0: port 10(tap111i0) entered blocking state
Nov  3 12:40:18 pve06 kernel: [2612173.807345] vmbr0: port 10(tap111i0) entered disabled state
Nov  3 12:40:18 pve06 kernel: [2612173.808084] vmbr0: port 10(tap111i0) entered blocking state
Nov  3 12:40:18 pve06 kernel: [2612173.808086] vmbr0: port 10(tap111i0) entered forwarding state
Nov  3 12:40:19 pve06 qm[16764]: <root@pam> end task UPID:pve06:000041C6:0F91C3DE:5FA13392:qmstart:111:root@pam: OK
Nov  3 12:40:19 pve06 systemd[1]: session-2358.scope: Succeeded.
Nov  3 12:40:19 pve06 systemd[1]: Started Session 2359 of user root.
Nov  3 12:40:19 pve06 QEMU[16853]: kvm: Unknown savevm section or instance 'pbs-state' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
Nov  3 12:40:19 pve06 QEMU[16853]: kvm: load of migration failed: Invalid argument
Nov  3 12:40:20 pve06 kernel: [2612175.161790] vmbr0: port 10(tap111i0) entered disabled state
Nov  3 12:40:20 pve06 systemd[1]: 111.scope: Succeeded.
Nov  3 12:40:20 pve06 systemd[1]: Started Session 2360 of user root.
Nov  3 12:40:21 pve06 qm[16971]: <root@pam> starting task UPID:pve06:000042CC:0F91C54F:5FA13395:qmstop:111:root@pam:
Nov  3 12:40:21 pve06 qm[17100]: stop VM 111: UPID:pve06:000042CC:0F91C54F:5FA13395:qmstop:111:root@pam:
Nov  3 12:40:21 pve06 qm[16971]: <root@pam> end task UPID:pve06:000042CC:0F91C54F:5FA13395:qmstop:111:root@pam: OK
Nov  3 12:40:21 pve06 systemd[1]: session-2360.scope: Succeeded.
Nov  3 12:40:21 pve06 systemd[1]: session-2359.scope: Succeeded.
Nov  3 12:40:22 pve06 systemd[1]: Started Session 2361 of user root.
Nov  3 12:40:22 pve06 systemd[1]: session-2361.scope: Succeeded.
Nov  3 12:40:22 pve06 pmxcfs[3088]: [status] notice: received log
Nov  3 12:40:46 pve06 pmxcfs[3088]: [status] notice: received log
Nov  3 12:41:00 pve06 systemd[1]: Starting Proxmox VE replication runner...

On the source node:
Code:
Nov  3 12:40:09 pve05 pveproxy[2837]: worker 23277 started
Nov  3 12:40:11 pve05 pvedaemon[10253]: <root@pam> starting task UPID:pve05:00005AEF:001851AF:5FA1338B:qmigrate:111:root@pam:
Nov  3 12:40:18 pve05 pmxcfs[2460]: [status] notice: received log
Nov  3 12:40:19 pve05 pmxcfs[2460]: [status] notice: received log
Nov  3 12:40:21 pve05 pmxcfs[2460]: [status] notice: received log
Nov  3 12:40:21 pve05 pmxcfs[2460]: [status] notice: received log
Nov  3 12:40:22 pve05 pvedaemon[23279]: migration problems
Nov  3 12:40:22 pve05 pvedaemon[10253]: <root@pam> end task UPID:pve05:00005AEF:001851AF:5FA1338B:qmigrate:111:root@pam: migration problems
Nov  3 12:40:46 pve05 pmxcfs[2460]: [status] notice: received log

pveversion -v on the target node:
Code:
root@pve06:~# pveversion -v
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-12 (running version: 6.2-12/b287dd27)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-4.15: 5.4-8
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 14.2.11-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 0.9.0-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-12
pve-cluster: 6.1-8
pve-container: 3.2-2
pve-docs: 6.2-6
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-1
pve-qemu-kvm: 5.1.0-2
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1
 
Could you update the target node and try to live-migrate again?
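For example, assuming the standard Proxmox repositories are configured on pve06, something like:
Code:
apt update
apt full-upgrade
and then retry the migration, either from the GUI or with qm migrate 111 pve06 --online.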
 
Great! (Just for the record: the pve-qemu-kvm version needs to be 5.1.0-4 on both nodes.)
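If you want to double-check the QEMU versions quickly, something like this on each node should be enough:
Code:
pveversion -v | grep pve-qemu-kvm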

You can mark the thread as [SOLVED] by editing the thread prefix.