Hi,
I'm running a VE 3.2 cluster with Ceph Dumpling. Everything works very well except for a few hiccups. One of the major issues is with online migration of VMs between hosts with identical hardware and software (exactly the same CPU).
Interestingly, online migration works perfectly if the VM was recently rebooted, but after a certain uptime (can be several hours) the migration fails and the VM freezes at 100% CPU. It no longer responds to pings, and the console connects but only shows the frozen VM.
I managed to get an strace of the qemu process, but I'm not sure if there's anything useful in it.
If anyone has ANY idea where I should start debugging, that'd be awesome.
Possibly this part:
Code:
read(5, "\2\0\0\0\0\0\0\0", 16) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
recvmsg(50, {msg_name(0)=NULL, msg_iov(1)=[{"}", 1}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 1
ioctl(11, KVM_CHECK_EXTENSION, 0x10) = 1
write(50, "{\"return\": {\"actual\": 8589934592"..., 79) = 79
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
Attaching the rest of the strace: strace.txt.zip
I've also noticed that KSM is on, but that shouldn't affect online migration, right?