[SOLVED] [ERROR] Live Migrate a Windows VM with TPM

Voyaller

Member
Nov 15, 2020
I'm getting the following error when trying to migrate a VM that has a TPM. I've read that on Hyper-V you need to transfer a VM certificate to the other node.
Maybe something similar applies to a Proxmox cluster?

Code:
[REDACTED] starting migration of VM [REDACTED] to node '[REDACTED]' ([REDACTED])
[REDACTED] found local disk '[REDACTED]' (in current VM config)
[REDACTED] found local disk '[REDACTED]' (in current VM config)
[REDACTED] found generated disk '[REDACTED]' (in current VM config)
[REDACTED] copying local disk images
[REDACTED] 4+1 records in
[REDACTED] 4+1 records out
[REDACTED] 16576 bytes (17 kB, 16 KiB) copied, 5.0007e-05 s, 331 MB/s
[REDACTED] Formatting '[REDACTED]', fmt=raw size=16384 preallocation=off
[REDACTED] 0+1 records in
[REDACTED] 0+1 records out
[REDACTED] 16576 bytes (17 kB, 16 KiB) copied, 0.000120642 s, 137 MB/s
[REDACTED] successfully imported '[REDACTED]'
[REDACTED] volume '[REDACTED]' is '[REDACTED]' on the target
[REDACTED] starting VM [REDACTED] on remote node '[REDACTED]'
[REDACTED] [REDACTED] kvm: tpm-emulator: TPM result for CMD_INIT: 0x9 operation failed
[REDACTED] [REDACTED] start failed: QEMU exited with code 1
[REDACTED] ERROR: online migrate failure - remote command failed with exit code 255
[REDACTED] aborting phase 2 - cleanup resources
[REDACTED] migrate_cancel
[REDACTED] ERROR: migration finished with problems (duration 00:00:08)
TASK ERROR: migration problems

VM Config:

Code:
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 2
efidisk0: [REDACTED],efitype=4m,pre-enrolled-keys=1,size=528K
ide2: none,media=cdrom
machine: pc-q35-6.0
memory: 4096
name: [REDACTED]
net0: virtio=[REDACTED],bridge=[REDACTED],firewall=1,tag=[REDACTED]
numa: 0
onboot: 1
ostype: win10
protection: 1
scsi0: [REDACTED]w,cache=writeback,discard=on,size=[REDACTED]
scsihw: virtio-scsi-pci
smbios1: uuid=[REDACTED]
sockets: 1
tpmstate0: [REDACTED],size=4M,version=v2.0
vmgenid: [REDACTED]
 
Hi!
[REDACTED] [REDACTED] kvm: tpm-emulator: TPM result for CMD_INIT: 0x9 operation failed
Please upgrade to the latest 7.1 release; a bug that triggered this error when copying over local TPM state, as you do here, was fixed and released in the qemu-server package version 7.1-3.
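For reference, on a standard Proxmox VE 7.x install the upgrade and version check would look roughly like this (a sketch using the usual Debian/Proxmox tooling; run it on every node involved in the migration):

```shell
# Refresh the package index and pull in the fixed qemu-server build
apt update
apt full-upgrade
# Verify the installed version is at least 7.1-3
pveversion -v | grep qemu-server
```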
 
Hi!

Please upgrade to the latest 7.1 release; a bug that triggered this error when copying over local TPM state, as you do here, was fixed and released in the qemu-server package version 7.1-3.

Hello, upgrading the packages did in fact fix the error. However, while I can migrate VMs back and forth without TPM just fine, VMs with TPM give me the following error:

Code:
[REDACTED] starting migration of VM [REDACTED] to node '[REDACTED]' ([REDACTED])
[REDACTED] found local disk '[REDACTED]' (in current VM config)
[REDACTED] found local disk '[REDACTED]' (in current VM config)
[REDACTED] found generated disk '[REDACTED]' (in current VM config)
[REDACTED] copying local disk images
[REDACTED] 4+1 records in
[REDACTED] 4+1 records out
[REDACTED] 16576 bytes (17 kB, 16 KiB) copied, 4.2483e-05 s, 390 MB/s
[REDACTED] Formatting '[REDACTED]', fmt=raw size=16384 preallocation=off
[REDACTED] 0+1 records in
[REDACTED] 0+1 records out
[REDACTED] 16576 bytes (17 kB, 16 KiB) copied, 8.949e-05 s, 185 MB/s
[REDACTED] successfully imported '[REDACTED]'
[REDACTED] volume '[REDACTED]' is '[REDACTED]' on the target
[REDACTED] starting VM [REDACTED] on remote node '[REDACTED]'
[REDACTED] [REDACTED] storage '[REDACTED]' is not available on node '[REDACTED]'
[REDACTED] ERROR: online migrate failure - remote command failed with exit code 255
[REDACTED] aborting phase 2 - cleanup resources
[REDACTED] migrate_cancel
[REDACTED] ERROR: migration finished with problems (duration 00:00:07)
TASK ERROR: migration problems
 
that's a bit much redaction ;) please at least give each entity (storage, node, ...) a separate replacement string so we can tell what's going on. also, please post /etc/pve/storage.cfg (and apply identical redactions!)
 
Hello, basically I have two nodes, each with two hardware RAID arrays:
  • 2x SSDs in RAID1 for the OS
  • 4x SSDs in RAID5 for VM disks and hot backups (these are the /mnt directories below)

The migration happens between the /mnt directories on the two different nodes. It is not shared storage.
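For context, a live migration between such node-local directory storages is typically started like this (a sketch; the VM ID and storage name mirror the placeholders used later in this thread, adjust them to your setup):

```shell
# Live-migrate VM 100 to node pve1, copying its local disks along,
# and map them onto a storage that actually exists on the target node.
qm migrate 100 pve1 --online --with-local-disks --targetstorage storage_pve1
```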

Code:
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

dir: [REDACTED]
        path /mnt/pve/[REDACTED]
        content backup,iso,rootdir,snippets,images,vztmpl
        is_mountpoint 1
        nodes [REDACTED]
        prune-backups keep-last=3
        shared 0

pbs: [REDACTED]
        datastore [REDACTED]
        server [REDACTED]
        content backup
        fingerprint [REDACTED]
        prune-backups keep-all=1
        username [REDACTED]

dir: [REDACTED]
        path /mnt/pve/[REDACTED]
        content vztmpl,rootdir,snippets,images,backup,iso
        is_mountpoint 1
        nodes [REDACTED]
        prune-backups keep-last=3
        shared 0
 
you need to redact your configs and logs in a way that keeps them readable - you can change the names/IPs/etc., but you need to use a different redacted value for each distinct source value, otherwise it's not possible to understand what's going on..
 
Logs with different values:


Code:
starting migration of VM 100 to node 'pve1' (10.10.10.10)
found local disk 'storage_pve2:100/vm-100-disk-0.raw' (in current VM config)
found local disk 'storage_pve2:100/vm-100-disk-1.raw' (in current VM config)
found generated disk 'storage_pve2:100/vm-100-disk-2.raw' (in current VM config)
copying local disk images
4+1 records in
4+1 records out
16576 bytes (17 kB, 16 KiB) copied, 4.3136e-05 s, 384 MB/s
Formatting '/mnt/pve/storage_pve1/images/100/vm-100-disk-2.raw', fmt=raw size=16384 preallocation=off
0+1 records in
0+1 records out
16576 bytes (17 kB, 16 KiB) copied, 7.1251e-05 s, 233 MB/s
successfully imported 'storage_pve1:100/vm-100-disk-2.raw'
volume 'storage_pve2:100/vm-100-disk-2.raw' is 'storage_pve1:100/vm-100-disk-2.raw' on the target
starting VM 100 on remote node 'pve1'
[pve1] storage 'storage_pve2' is not available on node 'pve1'
ERROR: online migrate failure - remote command failed with exit code 255
aborting phase 2 - cleanup resources
migrate_cancel
ERROR: migration finished with problems (duration 00:00:07)
TASK ERROR: migration problems

Storage.cfg from pve1

Code:
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

dir: storage_pve1
        path /mnt/pve/storage_pve1
        content backup,iso,rootdir,snippets,images,vztmpl
        is_mountpoint 1
        nodes pve1
        prune-backups keep-last=3
        shared 0

pbs: pbs
        datastore [REDACTED]
        server [REDACTED]
        content backup
        fingerprint [REDACTED]
        prune-backups keep-all=1
        username [REDACTED]

dir: storage_pve2
        path /mnt/pve/storage_pve2
        content vztmpl,rootdir,snippets,images,backup,iso
        is_mountpoint 1
        nodes pve2
        prune-backups keep-last=3
        shared 0

Storage.cfg from pve2

Code:
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

dir: storage_pve1
        path /mnt/pve/storage_pve1
        content backup,iso,rootdir,snippets,images,vztmpl
        is_mountpoint 1
        nodes pve1
        prune-backups keep-last=3
        shared 0

pbs: [REDACTED]
        datastore [REDACTED]
        server [REDACTED]
        content backup
        fingerprint [REDACTED]
        prune-backups keep-all=1
        username [REDACTED]

dir: storage_pve2
        path /mnt/pve/storage_pve2
        content vztmpl,rootdir,snippets,images,backup,iso
        is_mountpoint 1
        nodes pve2
        prune-backups keep-last=3
        shared 0
 
okay, thanks. it seems we forgot to update the volid for the TPM state volume, I'll take a look!
 
first I have to write the patch, then yes ;) I'll try to update this thread once a patch is available.
 
and the fix is contained in qemu-server >= 7.1-4 - you need to update both the source and the target node for the migration to work.
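A quick way to confirm both nodes run the fixed package before retrying (a sketch, assuming SSH access between cluster nodes; 'pve1' is the placeholder hostname from the logs above):

```shell
# Check the installed qemu-server version on the local node...
pveversion -v | grep qemu-server
# ...and on the remote node (replace 'pve1' with your target node's name).
ssh pve1 "pveversion -v | grep qemu-server"
# Both should report qemu-server >= 7.1-4 before retrying the migration.
```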