Error during VM migration

Loic92

- PVE 9.1.1
- Kernel Linux 6.17.2-1-pve (2025-10-21T11:55Z)
- I have a production subscription.

Hello,

I had an error during a VM migration. Can you please help?

Code:
2025-11-29 21:31:59 migration status: completed
all 'mirror' jobs are ready
mirror-efidisk0: Completing block job...
mirror-efidisk0: Completed successfully.
mirror-scsi0: Completing block job...
mirror-scsi0: Completed successfully.
mirror-efidisk0: mirror-job finished
mirror-scsi0: mirror-job finished
2025-11-29 21:32:01 ERROR: Can't locate object method "update_volume_ids" via package "PVE::QemuConfig\0\0K^ñ\x{a5}X\0\0001\0\0\0\0\0\0\0X\x{b2}\x{9d}õ\x{a5}X\0\0\x{90}\x{a6}\x{9f}õ\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0"..."â\6\0\0\37\x{ad}\0\0\x{89}v\x{b4}é\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0W\0\0\0\0\0(\\_ñ\x{a5}X\0\0(\\_ñ\x{a5}X\0\0Pª"... (perhaps you forgot to load "PVE::QemuConfig\0\0K^ñ\x{a5}X\0\0001\0\0\0\0\0\0\0X\x{b2}\x{9d}õ\x{a5}X\0\0\x{90}\x{a6}\x{9f}õ\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0"..."â\6\0\0\37\x{ad}\0\0\x{89}v\x{b4}é\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0W\0\0\0\0\0(\\_ñ\x{a5}X\0\0(\\_ñ\x{a5}X\0\0Pª"...?) at /usr/share/perl5/PVE/QemuMigrate.pm line 1648.
2025-11-29 21:32:01 ERROR: migration finished with problems (duration 00:00:15)
TASK ERROR: migration problems

Code:
root@nab93:~# dpkg -l | egrep 'qemu-server|pve-manager|libpve-guest-common-perl'
ii  libpve-guest-common-perl             6.0.2                                all          Proxmox VE common guest-related modules
ii  pve-manager                          9.1.1                                all          Proxmox Virtual Environment Management Tools
ii  qemu-server                          9.1.0                                amd64        Qemu Server Tools

The VM/HA resource was left in a bad, intermediate migration state; I used ChatGPT to clean up the situation so that I could start the VM again.
ChatGPT says that libpve-guest-common-perl is too old.

Thanks.
 
Hi!

ChatGPT says that libpve-guest-common-perl is too old.
LLMs are inherently incapable of maintaining any semantic or logical relationships and will interpolate beyond facts. libpve-guest-common-perl 6.0.2 is the correct version for Proxmox VE 9.1.

Has the source file /usr/share/perl5/PVE/QemuMigrate.pm been changed in any way? It seems like it has been corrupted a bit. If possible, you could try either reinstalling the qemu-server package or upgrading it altogether, since qemu-server 9.1.1 is out already. Does that fix the problem for you?
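A rough sketch of how that check and reinstall could look (standard Debian tooling; debsums is optional and may need to be installed first):

Code:
# compare installed files against the package's recorded md5sums
dpkg --verify qemu-server

# alternatively, if debsums is available:
debsums -s qemu-server

# reinstall the currently installed version from the repository
apt install --reinstall qemu-server

# or pull in the newer qemu-server 9.1.1
apt update && apt install qemu-server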
 
In fact, I rebooted the PVE node and now everything is working well.
So nothing was corrupted, and reinstalling a package was not needed.
How do you explain that?
Thanks.

Note: I do like Proxmox VE as an alternative to vSphere.
I purchased the Community subscription for both PVE and PBS for personal usage (a lab with a cluster of 3 nodes, but I treat it as production).
I did a clean install of PVE from scratch with the 9.1-1 ISO (the same for PBS). I followed all the possible best practices and documentation, but I still have instabilities and bugs very often…
Did I miss something?
I'm a little bit disappointed by the solution. I don't want to point fingers, and perhaps comparing PVE and vSphere is not comparing apples with apples,
but vSphere is 200% stable when you follow the best practices and documentation.
Now with PVE and PBS I very often have issues ☹ and resolving them by just rebooting, without knowing the root cause, is not acceptable to me ☹
I have 25 years of experience in IT (I am an engineer), so I don't think I'm using the solution badly.
What is your advice?
 
So nothing was corrupted, and reinstalling a package was not needed.
How do you explain that?
From the error message, it seems like a random byte sequence was introduced... When you restart the machine or the pveproxy/pvedaemon services, the Perl files are recompiled. I have never seen anything similar before and can only guess that there must have been some corruption going on, either on disk or in memory.

To properly investigate here, what is the hardware those nodes are running on? Are there any special BIOS settings set? Does a syslog from that boot indicate any direct errors that might have caused the migration error in the first place? Does a longer memtest show any failures?
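If it helps, a minimal sketch of how to gather that information (plain systemd/util-linux tooling; the boot offset below is an example, since the node has been rebooted since the failure):

Code:
# errors from the previous boot, i.e. the one where the migration failed
journalctl -b -1 -p err

# kernel messages hinting at machine checks or memory problems
dmesg --level=err,warn | grep -iE 'mce|ecc|memory'

# quick in-place RAM test (1 GiB, 1 pass); a full MemTest86 run is still more thorough
memtester 1024M 1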

I followed all the possible best practices and documentation, but I still have instabilities and bugs very often…
Did I miss something?
You need to be a bit clearer about this to get a helpful answer. What best practices and documentation did you follow? Which bugs or instabilities do you experience?
 
I followed the official PVE documentation.
I'm using a cluster of 3x Minisforum NAB9; all the BIOS settings are at their defaults except Secure Boot, which I disabled.
I have 64 GB of RAM (2x 32 GB Kingston FURY Impact), plus one external NVMe disk (256 GB Lexar NM620 SSD) for the boot disk (ext4) via a USB 3.2 port and one internal NVMe disk (2 TB Samsung 990 EVO Plus) for the VMs (ZFS). No Ceph, only local ZFS storage for the VMs; I'm using replication between the nodes plus HA in case of a node failure. I'm using PBS for backups each night, plus offsite backups to AWS S3.
I'm currently running MemTest86 on the node from which the migration failed.
I will try to provide logs later.
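In the meantime, here is roughly what I'm checking on each node for pool, disk and replication health (device names below are examples, not my exact layout):

Code:
# ZFS pool state and any accumulated read/write/checksum errors
zpool status -v

# SMART health of the boot and VM NVMe disks (device names are examples)
smartctl -a /dev/nvme0n1
smartctl -a /dev/nvme1n1

# state of the storage replication jobs on this node
pvesr status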
 
I'm using a cluster of 3x Minisforum NAB9; all the BIOS settings are at their defaults except Secure Boot, which I disabled.
That kind of hardware is known to not be very stable, and that is likely the source of your issues. Use proper server-grade hardware if you want proper server-grade performance and stability.
 
That kind of hardware is known to not be very stable, and that is likely the source of your issues. Use proper server-grade hardware if you want proper server-grade performance and stability.
What kind of hardware do you advise using? Brand? Model? NIC? etc.
I don't think this is documented anywhere?
Do you have a compatibility matrix like the one VMware provides?
Thanks.
 
If I follow this documentation, the NAB9 is OK.
I selected this unit specifically because it has 2x 2.5 Gbps network ports.
I created dedicated VLANs for Corosync, the management, the replication and the backup traffic; the last two are isolated on one of the 2.5 Gbps ports.
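For context, that separation looks roughly like this in /etc/network/interfaces (interface names, VLAN tags and addresses below are examples, not my exact configuration):

Code:
auto eno1
iface eno1 inet manual

# Corosync on its own VLAN on the first 2.5 Gbps port
auto eno1.50
iface eno1.50 inet static
    address 10.10.50.11/24

# replication and backup traffic isolated on the second 2.5 Gbps port
auto eno2.60
iface eno2.60 inet static
    address 10.10.60.11/24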
When you say "that kind of hardware is known to not be very stable", what are your sources, please?
Can you please provide factual information?
Thanks.
 
We recommend using high quality server hardware when running Proxmox VE in production.

A mini PC most definitely does not fit the definition of "high quality server hardware"!

When you say "that kind of hardware is known to not be very stable", what are your sources, please?

Our forum and bug tracker, where we see endless threads with hardware like this that simply doesn't run stable (or only with a lot of tweaking).

Your initial post is about as clear-cut an example of a hardware problem as there can be: random corruption of the kind you see when the disk, CPU or memory is broken or unstable.
 
A mini PC most definitely does not fit the definition of "high quality server hardware"!

Our forum and bug tracker, where we see endless threads with hardware like this that simply doesn't run stable (or only with a lot of tweaking).

Your initial post is about as clear-cut an example of a hardware problem as there can be: random corruption of the kind you see when the disk, CPU or memory is broken or unstable.
I did a search on the forum with the keyword "nab9" and found 10 posts, of which 4 are mine :) I would not say there are a lot of issues reported for the NAB9 :) Can you please be more factual when you say "we see endless threads with hardware like this that simply doesn't run stable (or only with a lot of tweaking)"? Please provide links, benchmarks, etc. Which "tweaking" are you talking about?
I also don't understand the term "hardware like this". Just because one specific Chinese mini PC brand doesn't work well doesn't mean that every mini PC on the market is too unstable to run PVE :)
Thanks.
 
https://forum.proxmox.com/threads/minisforum-nab9.176741/ (first hit ;))

and for other vendors: https://forum.proxmox.com/threads/vm-freezes-irregularly.111494

But even if your hardware were server grade, the symptoms clearly point at this particular piece of hardware having an issue!
Concerning this topic (my topic :)):


I found the root cause: I was using Remote Desktop, and the Windows 11 RDP layer is not compatible with PVE; it was causing a lot of instability and crashes. Generally, I would say it's not good practice to deploy Windows 11 on PVE.
That's why I switched to Windows Server 2025: its RDP layer is completely different from the one used inside W11, and everything is fine now :)
I also have Debian Linux VMs, and with that OS I've never had stability issues.
I'm also currently running massive load tests on my NAB9 units, using MemTest86, stress-ng, memtester and fio both on the PVE hypervisor itself and inside a VM running on PVE, and so far everything is stable :) No errors at all.
So for the moment I would give my NAB9 units a green light to run "prod" services, even if it's a lab.
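For the record, the invocations look something like this (sizes, runtimes and the target file path are examples, adjusted to the 64 GB nodes):

Code:
# stress all CPUs plus 4 memory workers using 75% of RAM, verifying results, for 4 hours
stress-ng --cpu 0 --vm 4 --vm-bytes 75% --verify --timeout 4h

# lock and repeatedly test 8 GiB of RAM, 3 passes
memtester 8G 3

# mixed 4k random I/O against the ZFS VM storage (the file path is an example)
fio --name=randrw --filename=/rpool/data/fio-test --size=8G --rw=randrw \
    --bs=4k --ioengine=libaio --iodepth=16 --runtime=600 --time_based \
    --group_reporting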
 
2025-11-29 21:32:01 ERROR: Can't locate object method "update_volume_ids" via package "PVE::QemuConfig\0\0K^ñ\x{a5}X\0\0001\0\0\0\0\0\0\0X\x{b2}\x{9d}õ\x{a5}X\0\0\x{90}\x{a6}\x{9f}õ\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0"..."â\6\0\0\37\x{ad}\0\0\x{89}v\x{b4}é\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0W\0\0\0\0\0(\\_ñ\x{a5}X\0\0(\\_ñ\x{a5}X\0\0Pª"... (perhaps you forgot to load "PVE::QemuConfig\0\0K^ñ\x{a5}X\0\0001\0\0\0\0\0\0\0X\x{b2}\x{9d}õ\x{a5}X\0\0\x{90}\x{a6}\x{9f}õ\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0"..."â\6\0\0\37\x{ad}\0\0\x{89}v\x{b4}é\x{a5}X\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0W\0\0\0\0\0(\\_ñ\x{a5}X\0\0(\\_ñ\x{a5}X\0\0Pª"...?) at /usr/share/perl5/PVE/QemuMigrate.pm line 1648.

I've said it a few times, but I can say it once more ;) This error is almost certainly a sign of hardware-level corruption or instability. Nothing inside a guest should be able to trigger it unless there is a bigger problem.

Concerning this topic (my topic :)):

missed that, sorry :)