[SOLVED] problem with the last update and a live migration

sidereus

Member
Jul 25, 2019
After tonight's update, many live migrations failed, which had never happened before. I rebooted all nodes after installing the updates, but it didn't help.
The installation log:
Code:
Preparing to unpack .../0-libgstreamer-plugins-base1.0-0_1.14.4-2+deb10u1_amd64.deb ...
Unpacking libgstreamer-plugins-base1.0-0:amd64 (1.14.4-2+deb10u1) over (1.14.4-2) ...
Preparing to unpack .../1-proxmox-backup-client_1.1.3-1_amd64.deb ...
Unpacking proxmox-backup-client (1.1.3-1) over (1.1.1-1) ...
Preparing to unpack .../2-proxmox-widget-toolkit_2.5-2_all.deb ...
Unpacking proxmox-widget-toolkit (2.5-2) over (2.5-1) ...
Preparing to unpack .../3-pve-container_3.3-5_all.deb ...
Unpacking pve-container (3.3-5) over (3.3-4) ...
Preparing to unpack .../4-pve-manager_6.3-7_amd64.deb ...
Unpacking pve-manager (6.3-7) over (6.3-6) ...
Preparing to unpack .../5-pve-qemu-kvm_5.2.0-6_amd64.deb ...
Unpacking pve-qemu-kvm (5.2.0-6) over (5.2.0-5) ...
Setting up pve-container (3.3-5) ...
Setting up proxmox-widget-toolkit (2.5-2) ...
Setting up pve-qemu-kvm (5.2.0-6) ...
Setting up libgstreamer-plugins-base1.0-0:amd64 (1.14.4-2+deb10u1) ...
Setting up pve-manager (6.3-7) ...
Setting up proxmox-backup-client (1.1.3-1) ...
Processing triggers for mime-support (3.62) ...
Processing triggers for libc-bin (2.28-10) ...
Processing triggers for systemd (241-7~deb10u7) ...
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for pve-ha-manager (3.1-1) ...
The log from one of the failed live migrations:
Code:
2021-04-27 02:46:56 use dedicated network address for sending migration traffic (192.168.122.5)
2021-04-27 02:46:56 starting migration of VM 301 to node 'asr5' (192.168.122.5)
2021-04-27 02:46:57 starting VM 301 on remote node 'asr5'
2021-04-27 02:46:58 start remote tunnel
2021-04-27 02:46:59 ssh tunnel ver 1
2021-04-27 02:46:59 starting online/live migration on tcp:192.168.122.5:60000
2021-04-27 02:46:59 set migration_caps
2021-04-27 02:46:59 migration speed limit: 8589934592 B/s
2021-04-27 02:46:59 migration downtime limit: 100 ms
2021-04-27 02:46:59 migration cachesize: 2147483648 B
2021-04-27 02:46:59 set migration parameters
2021-04-27 02:46:59 start migrate command to tcp:192.168.122.5:60000
2021-04-27 02:47:00 migration status: active (transferred 663433381, remaining 13931319296), total 17197539328)
2021-04-27 02:47:00 migration xbzrle cachesize: 2147483648 transferred 0 pages 0 cachemiss 0 overflow 0
2021-04-27 02:47:01 migration status: active (transferred 672836408, remaining 9651982336), total 17197539328)
2021-04-27 02:47:01 migration xbzrle cachesize: 2147483648 transferred 0 pages 0 cachemiss 0 overflow 0
2021-04-27 02:47:02 migration status: active (transferred 1393323902, remaining 7556857856), total 17197539328)
2021-04-27 02:47:02 migration xbzrle cachesize: 2147483648 transferred 0 pages 0 cachemiss 0 overflow 0
2021-04-27 02:47:03 migration status: active (transferred 1924478457, remaining 4483584000), total 17197539328)
2021-04-27 02:47:03 migration xbzrle cachesize: 2147483648 transferred 0 pages 0 cachemiss 0 overflow 0
2021-04-27 02:47:04 migration status: active (transferred 2142633219, remaining 104054784), total 17197539328)
2021-04-27 02:47:04 migration xbzrle cachesize: 2147483648 transferred 0 pages 0 cachemiss 9088 overflow 0
2021-04-27 02:47:04 migration speed: 3276.80 MB/s - downtime 38 ms
2021-04-27 02:47:04 migration status: completed
2021-04-27 02:47:04 ERROR: tunnel replied 'ERR: resume failed - VM 301 not running' to command 'resume 301'
2021-04-27 02:47:12 ERROR: migration finished with problems (duration 00:00:16)
TASK ERROR: migration problems
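For anyone hitting the same "resume failed - VM not running" error: it means the QEMU process on the target node died right after starting, so the reason is usually found on the target, not the source. A rough sketch of the checks I would run there (VM ID 301 taken from the log above; the grep pattern is just an example):

```shell
# On the target node: is the QEMU process for VM 301 actually running?
qm status 301

# Search the system log of the current boot for the reason the process exited
journalctl -b --no-pager | grep -iE 'kvm|qemu|301' | tail -n 20
```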
 
Replying to myself. The problem was not related to the latest Proxmox update. I had added a new node to the cluster but forgot to enable nested virtualization on it, as described in the guide. Without this option, migrations of Windows Server 2012 and Ubuntu 21.04 guests to that node failed. After enabling nested virtualization on all nodes the problem was gone and the VMs migrated successfully. Here is the config of one of these VMs:
Code:
agent: 1,fstrim_cloned_disks=1
balloon: 1024
boot: order=scsi0
cores: 2
cpu: host
hotplug: disk,network,usb
machine: pc-q35-5.2
memory: 16384
name: i-1-win
net0: virtio=72:62:5E:5A:9A:58,bridge=vmbr0,firewall=1,tag=128
numa: 1
ostype: win8
scsi0: ceph_pool:vm-301-disk-0,cache=writeback,discard=on,size=300G
scsihw: virtio-scsi-pci
smbios1: uuid=201008dd-fb18-40c8-aeb4-4209f8dff003
sockets: 2
vmgenid: 273cf21d-2273-4582-ac16-3b163d19e273
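For reference, nested virtualization is controlled by a KVM module option on each node. A minimal sketch for an Intel host (on AMD hosts substitute kvm-amd and /sys/module/kvm_amd; paths per the standard kernel/modprobe conventions):

```shell
# Check whether nesting is currently enabled ("Y" or "1" means on)
cat /sys/module/kvm_intel/parameters/nested

# Make the setting persistent across reboots
echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf

# Reload the module to apply immediately (no VMs may be running on the node)
modprobe -r kvm_intel
modprobe kvm_intel
```

After reloading (or simply rebooting the node), the first command should report "Y". With `cpu: host`, the guest sees the host's virtualization flags, which is why a target node without nesting enabled could not resume these VMs.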
 
Thanks for taking the time to share the issue and its solution! This will certainly help others who run into the same problem while upgrading a cluster.

If possible, it would be great if you could edit such threads (the 'Edit Thread' button at the top of your first post) and select the 'SOLVED' prefix next time - this time I'll mark it as 'SOLVED'.

Thanks again!
 
