Migration failures I don't understand

LordRatner

Member
Jun 20, 2022
Hi. I'm guessing this is user error, but I still don't understand how to fix it.

I have HA set up and it functions as expected. However, when a node comes back online and PVE tries to migrate back to the higher-priority node, the migrations fail. In fact, I think all of my VMs fail to migrate if they are running. Here are a couple of examples:

Nextcloud VM:

nextcloud.jpg

Code:
task started by HA resource agent
2023-01-07 17:41:16 starting migration of VM 112 to node 'node2' (192.168.10.21)
2023-01-07 17:41:16 found local, replicated disk 'local-zfs:vm-112-disk-0' (in current VM config)
2023-01-07 17:41:16 found local, replicated disk 'local-zfs:vm-112-state-pre_install' (referenced by snapshot(s))
2023-01-07 17:41:16 can't migrate local disk 'local-zfs:vm-112-disk-0': online storage migration not possible if snapshot exists
2023-01-07 17:41:16 ERROR: Problem found while scanning volumes - can't migrate VM - check log
2023-01-07 17:41:16 aborting phase 1 - cleanup resources
2023-01-07 17:41:16 ERROR: migration aborted (duration 00:00:00): Problem found while scanning volumes - can't migrate VM - check log
TASK ERROR: migration aborted

And LTSP VM:

LTSP.jpg

Code:
task started by HA resource agent
2023-01-07 17:43:46 starting migration of VM 120 to node 'node2' (192.168.10.21)
2023-01-07 17:43:46 found local disk 'local-zfs:vm-120-disk-0' (in current VM config)
2023-01-07 17:43:46 found local disk 'local-zfs:vm-120-state-Amd' (referenced by snapshot(s))
2023-01-07 17:43:46 found local disk 'local-zfs:vm-120-state-vdi_trouble' (referenced by snapshot(s))
2023-01-07 17:43:46 copying local disk images
Use of uninitialized value $target_storeid in string eq at /usr/share/perl5/PVE/Storage.pm line 778.
Use of uninitialized value $targetsid in concatenation (.) or string at /usr/share/perl5/PVE/QemuMigrate.pm line 678.
2023-01-07 17:43:46 ERROR: storage migration for 'local-zfs:vm-120-disk-0' to storage '' failed - no storage ID specified
2023-01-07 17:43:46 aborting phase 1 - cleanup resources
2023-01-07 17:43:46 ERROR: migration aborted (duration 00:00:00): storage migration for 'local-zfs:vm-120-disk-0' to storage '' failed - no storage ID specified
TASK ERROR: migration aborted
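
To make the difference between the two VMs easier to see, here is a rough Python sketch that summarizes the "found local disk" lines from both logs. The function names and the regular expression are my own (nothing from PVE), and it only understands the line format shown in the logs above, with timestamps trimmed.

```python
import re

# Log lines copied from the two migration tasks above (timestamps trimmed).
LOG = """\
found local, replicated disk 'local-zfs:vm-112-disk-0' (in current VM config)
found local, replicated disk 'local-zfs:vm-112-state-pre_install' (referenced by snapshot(s))
found local disk 'local-zfs:vm-120-disk-0' (in current VM config)
found local disk 'local-zfs:vm-120-state-Amd' (referenced by snapshot(s))
found local disk 'local-zfs:vm-120-state-vdi_trouble' (referenced by snapshot(s))
"""

# Matches "found local[, replicated] disk '<storage>:<volume>' (<reason>)".
DISK_RE = re.compile(
    r"found local(?P<repl>, replicated)? disk "
    r"'(?P<storage>[^:']+):(?P<volume>[^']+)' \((?P<reason>.+)\)$"
)


def parse_disks(log_text):
    """Return one dict per 'found local disk' line in the log text."""
    disks = []
    for line in log_text.splitlines():
        m = DISK_RE.search(line)
        if m:
            disks.append({
                "storage": m.group("storage"),
                "volume": m.group("volume"),
                "replicated": m.group("repl") is not None,
                "reason": m.group("reason"),
            })
    return disks


def snapshot_state_volumes(disks):
    """Volumes the log reports as referenced by a snapshot."""
    return [d["volume"] for d in disks if "snapshot" in d["reason"]]


if __name__ == "__main__":
    for d in parse_disks(LOG):
        print(d)
```

What stands out to me from this summary is that the disks of VM 112 are reported as replicated while the disks of VM 120 are not, and that both VMs have volumes referenced by snapshots.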

Now, with the first VM (nextcloud), using the GUI shutdown button does nothing (a problem I have with several VMs). However, if I go into the console and run "shutdown now", the VM shuts down and the migration then succeeds immediately.

The LTSP VM is different. When HA tries to start the VM on the secondary node (after the primary is shut down), it fails to start. Then, when the primary node comes back online, it fails to migrate back. In fact, I can't do anything with it unless I destroy it and restore from a backup.

Ideas?
Thanks,
Seth
 
