Recently upgraded from 6.2 (no problems) to 6.3 using the enterprise repo and am seeing some live migrates fail without reason following it trying to copy the EFI disk across.
Some points:
-It seems to work fine for a new windows VM deployed from the same win 2019 template the problematic VM's were deployed from.
-No issues with unix VM's with EFI disks.
-Offline migrates work fine.
-No issues on proxmox 6.2, only seen issues since upgrading to 6.3
-Local storage with LVM on top
-2 node cluster with local storage on each node.
-If it migrated the normal disk first it will get to 100% and then fail after attempting the EFI disk.
Output from VM that worked once out of many attempts, this being one of the failed attempts:
```
2021-04-22 15:15:29 starting migration of VM 105 to node 'redacted' (redacted)
2021-04-22 15:15:30 found local disk 'raid1:105/vm-105-disk-0.qcow2' (in current VM config)
2021-04-22 15:15:30 found local disk 'raid1:105/vm-105-disk-1.qcow2' (in current VM config)
2021-04-22 15:15:30 copying local disk images
2021-04-22 15:15:30 starting VM 105 on remote node 'vmh01'
2021-04-22 15:15:32 start remote tunnel
2021-04-22 15:15:33 ssh tunnel ver 1
2021-04-22 15:15:33 starting storage migration
2021-04-22 15:15:33 efidisk0: start migration to nbd:unix:/run/qemu-server/105_nbd.migrate:exportname=drive-efidisk0
drive mirror is starting for drive-efidisk0
drive-efidisk0: Cancelling block job
drive-efidisk0: Done.
2021-04-22 15:15:33 ERROR: online migrate failure - mirroring error: drive-efidisk0: mirroring has been cancelled
2021-04-22 15:15:33 aborting phase 2 - cleanup resources
2021-04-22 15:15:33 migrate_cancel
2021-04-22 15:15:36 ERROR: migration finished with problems (duration 00:00:07)
TASK ERROR: migration problems
```
Config file for VM:
```
agent: 1
balloon: 0
bios: ovmf
bootdisk: scsi0
cores: 2
efidisk0: raid1:105/vm-105-disk-0.qcow2,format=qcow2,size=128K
ide2: none,media=cdrom
memory: 6144
name: redacted
net0: virtio=redacted,bridge=vmbr00,tag=redacted
net1: virtio=redacted,bridge=vmbr00,tag=redacted
numa: 0
onboot: 1
ostype: win10
scsi0: raid1:105/vm-105-disk-1.qcow2,cache=writeback,format=qcow2,size=80G
scsihw: virtio-scsi-pci
smbios1: uuid=redacted
sockets: 1
vmgenid: redacted
```
Is there somewhere I can get greater output as to why it's failing? I have tried all the usual log files.
I am not sure if this is related to this - https://bugzilla.proxmox.com/show_bug.cgi?id=3227 - as I am seeing VM's with EFI disks of varying size both succeed and fail.
Some points:
-It seems to work fine for a new windows VM deployed from the same win 2019 template the problematic VM's were deployed from.
-No issues with unix VM's with EFI disks.
-Offline migrates work fine.
-No issues on proxmox 6.2, only seen issues since upgrading to 6.3
-Local storage with LVM on top
-2 node cluster with local storage on each node.
-If it migrated the normal disk first it will get to 100% and then fail after attempting the EFI disk.
Output from VM that worked once out of many attempts, this being one of the failed attempts:
```
2021-04-22 15:15:29 starting migration of VM 105 to node 'redacted' (redacted)
2021-04-22 15:15:30 found local disk 'raid1:105/vm-105-disk-0.qcow2' (in current VM config)
2021-04-22 15:15:30 found local disk 'raid1:105/vm-105-disk-1.qcow2' (in current VM config)
2021-04-22 15:15:30 copying local disk images
2021-04-22 15:15:30 starting VM 105 on remote node 'vmh01'
2021-04-22 15:15:32 start remote tunnel
2021-04-22 15:15:33 ssh tunnel ver 1
2021-04-22 15:15:33 starting storage migration
2021-04-22 15:15:33 efidisk0: start migration to nbd:unix:/run/qemu-server/105_nbd.migrate:exportname=drive-efidisk0
drive mirror is starting for drive-efidisk0
drive-efidisk0: Cancelling block job
drive-efidisk0: Done.
2021-04-22 15:15:33 ERROR: online migrate failure - mirroring error: drive-efidisk0: mirroring has been cancelled
2021-04-22 15:15:33 aborting phase 2 - cleanup resources
2021-04-22 15:15:33 migrate_cancel
2021-04-22 15:15:36 ERROR: migration finished with problems (duration 00:00:07)
TASK ERROR: migration problems
```
Config file for VM:
```
agent: 1
balloon: 0
bios: ovmf
bootdisk: scsi0
cores: 2
efidisk0: raid1:105/vm-105-disk-0.qcow2,format=qcow2,size=128K
ide2: none,media=cdrom
memory: 6144
name: redacted
net0: virtio=redacted,bridge=vmbr00,tag=redacted
net1: virtio=redacted,bridge=vmbr00,tag=redacted
numa: 0
onboot: 1
ostype: win10
scsi0: raid1:105/vm-105-disk-1.qcow2,cache=writeback,format=qcow2,size=80G
scsihw: virtio-scsi-pci
smbios1: uuid=redacted
sockets: 1
vmgenid: redacted
```
Is there somewhere I can get greater output as to why it's failing? I have tried all the usual log files.
I am not sure if this is related to this - https://bugzilla.proxmox.com/show_bug.cgi?id=3227 - as I am seeing VM's with EFI disks of varying size both succeed and fail.