[SOLVED] Intra-Cluster Online Migration failed

SRU

Hello,
I just did an online migration of one of our DB servers.
The underlying storage is Ceph.
The migration failed *after* the transfer had completed:

Code:
2024-08-26 09:29:59 migration status: completed
2024-08-26 09:29:59 ERROR: tunnel replied 'ERR: resume failed - VM 1153 qmp command 'query-status' failed - client closed connection' to command 'resume 1153'
VM quit/powerdown failed - terminating now with SIGTERM
2024-08-26 09:30:12 ERROR: migration finished with problems (duration 00:03:24)

What does that tell me?
Is my cluster network unhealthy?
Is there too much load on the DB Server for it to "resume"?

Thanks, Stefan
 
Root Cause:
The VM's CPU type was set to 'host', but the source and destination nodes have different CPUs:

root@pve-5-2-rz:~ # head /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 17
model name : AMD EPYC 9454P 48-Core Processor
stepping : 1
microcode : 0xa101144

root@pve-5-3-rz:~ # head /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7502P 32-Core Processor
stepping : 0
microcode : 0x830107a
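
For anyone who wants to check this before migrating: with CPU type 'host' the guest sees the CPU of the node it was started on, so a mismatch between nodes can break the resume on the target. Below is a minimal Python sketch that compares the two `head /proc/cpuinfo` outputs from above. The helper names (`parse_cpuinfo`, `cpu_differences`) and the choice of compared keys are my own assumptions for illustration, not part of any Proxmox tooling.

```python
# Sketch: compare the identifying /proc/cpuinfo fields of two nodes.
# Helper names and IDENTITY_KEYS are hypothetical, chosen for this example.

IDENTITY_KEYS = ("vendor_id", "cpu family", "model", "model name")


def parse_cpuinfo(text):
    """Return the fields of the first processor entry as a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def cpu_differences(a, b, keys=IDENTITY_KEYS):
    """Map each differing key to a (node_a, node_b) pair of values."""
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Output of `head /proc/cpuinfo` on pve-5-2-rz, as posted above
NODE_A = """processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 17
model name : AMD EPYC 9454P 48-Core Processor
stepping : 1
microcode : 0xa101144
"""

# Output of `head /proc/cpuinfo` on pve-5-3-rz, as posted above
NODE_B = """processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7502P 32-Core Processor
stepping : 0
microcode : 0x830107a
"""

diff = cpu_differences(parse_cpuinfo(NODE_A), parse_cpuinfo(NODE_B))
for key, (a, b) in diff.items():
    print(f"{key}: {a} != {b}")
```

Any difference in these fields suggests that 'host' is a risky CPU type for live migration between the two nodes. This only compares the identifying fields; it does not compare the actual CPU feature flags.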
 