Live migration fails in KVM 1.3

bazzi

I already reported it below the announcement of the new Proxmox version, but I think a new thread is better.
old thread (http://forum.proxmox.com/threads/12237-Updates-for-Proxmox-VE-2-2-including-QEMU-1-3)

I am looking at a problem with the new version: VMs larger than 8 GB will not finish live migration. It hangs around:
Code:
Dec 18 16:43:12 migration status: active (transferred 1887637973, remaining 1518489600), total 8598913024)

The VM will be unresponsive and is not running on the "new" node. The only way to get the VM running again is to cancel the migration AND unlock the VM through qm.

old node:
Code:
root@timo:~# pveversion -v
pve-manager: 2.2-32 (pve-manager/2.2/3089a616)
running kernel: 2.6.32-17-pve
proxmox-ve-2.6.32: 2.2-83
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-17-pve: 2.6.32-83
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-34
qemu-server: 2.0-71
pve-firmware: 1.0-21
libpve-common-perl: 1.0-40
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.3-7
ksm-control-daemon: 1.1-1

new node:
Code:
pve-manager: 2.2-32 (pve-manager/2.2/3089a616)
running kernel: 2.6.32-17-pve
proxmox-ve-2.6.32: 2.2-83
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-17-pve: 2.6.32-83
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-34
qemu-server: 2.0-71
pve-firmware: 1.0-21
libpve-common-perl: 1.0-40
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.3-7
ksm-control-daemon: 1.1-1

THIS IS INCORRECT:
OK, narrowed the problem down: if the disk is LVM through virtio, it hangs at the end. LVM through IDE works without problems.
On our production VMs, changing virtio to IDE does not help. It still hangs at the same point.

I have already disabled HA on the VM, but it still fails at approximately the last 1.5 GB of RAM that needs to be transferred.

I also noticed that the "new" node had the process for VM 201 running at the beginning, but when the transfer stalls, the process is gone on the "new" node.
Code:
/usr/bin/kvm -id 201 -chardev socket,id=qmp,path=/var/run/qemu-server/201.qmp,server,nowait -mon chardev=qmp,mode=control -vnc ... (it is a large line ;-))
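One way to check whether the target node still has the KVM process for a given VMID (as described above for VM 201) is to scan /proc for a kvm process started with `-id 201`. This is an illustrative helper written for this thread, not Proxmox code:

```python
import os


def find_kvm_pid(vmid):
    """Return the PID of the 'kvm -id <vmid>' process, or None if absent.

    Scans /proc/<pid>/cmdline the same way one would eyeball `ps aux`;
    purely an illustrative sketch for checking the target node.
    """
    if not os.path.isdir("/proc"):
        return None
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join("/proc", entry, "cmdline"), "rb") as f:
                argv = f.read().split(b"\0")
        except OSError:
            continue  # process vanished or is not readable
        if argv and argv[0].endswith(b"kvm") and b"-id" in argv:
            idx = argv.index(b"-id")
            if idx + 1 < len(argv) and argv[idx + 1] == str(vmid).encode():
                return int(entry)
    return None
```

If this returns None on the "new" node while the migration tunnel is still open, the receiving kvm process has indeed died.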
 
VMID.conf:
Code:
bootdisk: virtio0
cores: 8
ide2: none,media=cdrom
memory: 8192
name: CPANEL2
net0: virtio=0A:50:12:91:18:35,bridge=vmbr0
net1: virtio=8E:CC:BB:30:23:C8,bridge=vmbr1
onboot: 1
ostype: l26
sockets: 2
virtio0: LVM-VM-Disks:vm-201-disk-1
It is CentOS 6.3 64bit, running cPanel.
 
A freshly installed CentOS 6.3 amd64 can be live-migrated here on our IMS, but I do not have the same load there.

Do other systems migrate without problems? And make sure that the VM was booted with QEMU 1.3 before you live-migrate it to a QEMU 1.3 node.
 
Code:
Dec 20 09:00:53 migration status: active (transferred 3506597511, remaining 1577824256), total 8598913024)
No, same problem. It looks like it can't handle a large load / a lot of allocated RAM. If I copy the VM and migrate it with no load, and therefore almost no RAM allocated, it transfers without a problem.
 
Can you describe a way (step by step) to reproduce the behavior in an easy way?
 
Ok, I can send you the VM image.
But here is the step by step:
1. Disable HA for the VM
2. Offline-migrate the VM from QEMU 1.2 to QEMU 1.3
3. Start the VM
4. When the VM is running, live-migrate from the QEMU 1.3 old host to the QEMU 1.3 new host
5. STATUS:
Code:
Dec 20 08:59:36 starting migration of VM 200 to node 'timo' (10.221.184.62)
Dec 20 08:59:36 copying disk images
Dec 20 08:59:36 starting VM 200 on remote node 'timo'
Dec 20 08:59:38 starting migration tunnel
Dec 20 08:59:39 starting online/live migration on port 60000
Dec 20 08:59:41 migration status: active (transferred 74345107, remaining 7876104192), total 8598913024)
Dec 20 08:59:43 migration status: active (transferred 147185019, remaining 6391275520), total 8598913024)
Dec 20 08:59:45 migration status: active (transferred 153943192, remaining 3282419712), total 8598913024)
Dec 20 08:59:47 migration status: active (transferred 222281832, remaining 3210084352), total 8598913024)
Dec 20 08:59:49 migration status: active (transferred 322183870, remaining 3107733504), total 8598913024)
Dec 20 08:59:51 migration status: active (transferred 422257372, remaining 3007537152), total 8598913024)
Dec 20 08:59:53 migration status: active (transferred 523207505, remaining 2906107904), total 8598913024)
Dec 20 08:59:55 migration status: active (transferred 622286855, remaining 2802089984), total 8598913024)
Dec 20 08:59:57 migration status: active (transferred 721439280, remaining 2700673024), total 8598913024)
Dec 20 08:59:59 migration status: active (transferred 819940003, remaining 2601701376), total 8598913024)
Dec 20 09:00:01 migration status: active (transferred 920042189, remaining 2501427200), total 8598913024)
Dec 20 09:00:03 migration status: active (transferred 1014520586, remaining 2406699008), total 8598913024)
Dec 20 09:00:05 migration status: active (transferred 1110334879, remaining 2308177920), total 8598913024)
Dec 20 09:00:07 migration status: active (transferred 1210207709, remaining 2208051200), total 8598913024)
Dec 20 09:00:09 migration status: active (transferred 1308249611, remaining 2109820928), total 8598913024)
Dec 20 09:00:11 migration status: active (transferred 1410084423, remaining 2007740416), total 8598913024)
Dec 20 09:00:13 migration status: active (transferred 1512181442, remaining 1905139712), total 8598913024)
Dec 20 09:00:15 migration status: active (transferred 1612185530, remaining 1804120064), total 8598913024)
Dec 20 09:00:17 migration status: active (transferred 1710194678, remaining 1705865216), total 8598913024)
Dec 20 09:00:19 migration status: active (transferred 1810399254, remaining 2701668352), total 8598913024)
Dec 20 09:00:21 migration status: active (transferred 1911316606, remaining 2600325120), total 8598913024)
Dec 20 09:00:23 migration status: active (transferred 2010927504, remaining 2499592192), total 8598913024)
Dec 20 09:00:25 migration status: active (transferred 2110476691, remaining 2400030720), total 8598913024)
Dec 20 09:00:27 migration status: active (transferred 2211332499, remaining 2299174912), total 8598913024)
Dec 20 09:00:29 migration status: active (transferred 2311598485, remaining 2198900736), total 8598913024)
Dec 20 09:00:31 migration status: active (transferred 2413957540, remaining 2096480256), total 8598913024)
Dec 20 09:00:33 migration status: active (transferred 2513244580, remaining 1997193216), total 8598913024)
Dec 20 09:00:35 migration status: active (transferred 2613019044, remaining 1897418752), total 8598913024)
Dec 20 09:00:37 migration status: active (transferred 2712105654, remaining 1797210112), total 8598913024)
Dec 20 09:00:39 migration status: active (transferred 2811614656, remaining 2066083840), total 8598913024)
Dec 20 09:00:41 migration status: active (transferred 2911429732, remaining 1950920704), total 8598913024)
Dec 20 09:00:43 migration status: active (transferred 3009282381, remaining 1772441600), total 8598913024)
Dec 20 09:00:45 migration status: active (transferred 3108357970, remaining 1667055616), total 8598913024)
Dec 20 09:00:47 migration status: active (transferred 3206109009, remaining 1764212736), total 8598913024)
Dec 20 09:00:49 migration status: active (transferred 3305450200, remaining 1661173760), total 8598913024)
Dec 20 09:00:51 migration status: active (transferred 3406921117, remaining 1634742272), total 8598913024)
Dec 20 09:00:53 migration status: active (transferred 3506597511, remaining 1577824256), total 8598913024)
6. At this point the VM becomes unresponsive. On the QEMU 1.3 old host the VM is still running, but on the QEMU 1.3 new host the VM is gone. It looks like the VM on the new host has crashed or been killed, and the tunnel is therefore broken.
7. After some time (10 minutes of an unresponsive VM and a stalled transfer):
Code:
QEMU1.3 OLD HOST# qm unlock 200
8. Stop VM through panel.
9. Start VM through panel.
10. VM is up again.

I have tried to make a test VM (a copy of the VM), but that VM is not being used, and then it transfers fine.
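For what it's worth, the status lines above can be parsed mechanically; note that the "remaining" counter jumps back up at 09:00:19 and again at 09:00:39, which suggests the loaded guest is dirtying memory faster than it can be sent. A rough, illustrative parser (written for this thread, not Proxmox code):

```python
import re
from datetime import datetime

# Matches the "migration status" lines from the task log above.
STATUS_RE = re.compile(
    r"(\w+ \d+ \d\d:\d\d:\d\d) migration status: active "
    r"\(transferred (\d+), remaining (\d+)\)"
)


def analyze(log_text, year=2012):
    """Return (avg bytes/s, timestamps where 'remaining' grew again).

    A growing 'remaining' counter between samples means the guest
    dirtied pages faster than the migration transferred them.
    """
    samples = []
    for m in STATUS_RE.finditer(log_text):
        ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        samples.append((ts, int(m.group(2)), int(m.group(3))))
    regressions = [
        cur[0] for prev, cur in zip(samples, samples[1:]) if cur[2] > prev[2]
    ]
    secs = (samples[-1][0] - samples[0][0]).total_seconds()
    rate = (samples[-1][1] - samples[0][1]) / secs  # average bytes/s
    return rate, regressions
```

Run over the full log above, this flags the two regressions and gives an average throughput of roughly 47 MB/s over the 72 seconds before the stall.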
 
Hi,
how fast is your network-connection between the nodes?
Test with iperf in both directions please.

And try the same test during a migration-test!

Perhaps they changed something (a driver?) so you don't get the full speed?!
Is the network connection (heavily) used for other things like nfs/iSCSI/DRBD?

Udo
 
It is a gigabit network inside the IMS.

The iperf without transfer:
Code:
------------------------------------------------------------
Client connecting to 10.221.184.62, TCP port 5001
TCP window size: 23.8 KByte (default)
------------------------------------------------------------
[  3] local 10.221.184.60 port 58103 connected with 10.221.184.62 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes    943 Mbits/sec

The iperf during a transfer:
Code:
------------------------------------------------------------
Client connecting to 10.221.184.62, TCP port 5001
TCP window size: 23.8 KByte (default)
------------------------------------------------------------
[  3] local 10.221.184.60 port 58292 connected with 10.221.184.62 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes    943 Mbits/sec

And we don't use that switch for anything like NFS/iSCSI/DRBD.
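A quick sanity check on those numbers: at the measured 943 Mbit/s, pushing the full 8598913024 bytes of guest RAM once should take roughly 73 seconds, which is about the window seen before the stall, so raw link bandwidth alone does not look like the bottleneck. A back-of-the-envelope helper (illustrative only):

```python
def seconds_to_transfer(total_bytes, link_mbit):
    """Time to push total_bytes once over an otherwise idle link of
    link_mbit Mbit/s. Ignores protocol overhead and, crucially,
    re-sent dirty pages, so it is a lower bound for live migration."""
    return total_bytes * 8 / (link_mbit * 1e6)


# Total RAM from the migration log, bandwidth from the iperf run above.
t = seconds_to_transfer(8598913024, 943)  # roughly 73 seconds
```

Because dirty pages must be re-sent, a busy 8 GB guest can take arbitrarily longer than this lower bound, or never converge at all.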
 
OK, with the latest QEMU version on both nodes (pve-qemu-kvm: 1.3-10), the test VM also stalls during migration.
 
Ok, I can send you the VM image.
...

Our network is not fast enough to just download multiple gigabytes in a short time.
 
OK, what is the best method then?

That is exactly what I asked you. If there is a bug, you need to find a way to reproduce the issue.
 
OK, but I can reproduce it; I have told you the steps.

 
OK, but I can reproduce it; I have told you the steps.

As I do not have your VM, I cannot reproduce this. A fresh install works here, so we need to find a way to see the issue here as well.
 
I can see if I can shrink the image so you can download it, because I think that is the only way for you to see it.
 
Code:
Dec 20 17:38:37 timo pvedaemon[2067]: worker 62327 finished
Dec 20 17:38:37 timo pvedaemon[2067]: starting 1 worker(s)
Dec 20 17:38:37 timo pvedaemon[2067]: worker 65033 started
Dec 20 17:38:49 timo pvedaemon[64588]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:49 timo pvedaemon[65033]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:50 timo pvedaemon[64588]: <root@pam> successful auth for user 'root@pam'
Dec 20 17:38:50 timo pvedaemon[64588]: <root@pam> successful auth for user 'root@pam'
Dec 20 17:38:50 timo pvedaemon[64588]: <root@pam> successful auth for user 'root@pam'
Dec 20 17:38:52 timo pvedaemon[64066]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:53 timo pvestatd[2897]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:53 timo pvedaemon[65033]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:56 timo pvedaemon[64066]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:56 timo pvedaemon[64952]: WARNING: interrupted by signal
Dec 20 17:38:56 timo pvedaemon[64952]: VM 108 qmp command failed - VM 108 qmp command 'query-migrate' failed - interrupted by signal
Dec 20 17:38:56 timo pvedaemon[64952]: WARNING: query migrate failed: VM 108 qmp command 'query-migrate' failed - interrupted by signal#012
Dec 20 17:38:56 timo pvedaemon[64588]: WARNING: unable to connect to VM 108 socket - timeout after 31 retries
Dec 20 17:38:59 timo pmxcfs[1635]: [status] notice: received log
Dec 20 17:38:59 timo pmxcfs[1635]: [status] notice: received log
Dec 20 17:38:59 timo pvedaemon[64952]: migration problems
Dec 20 17:38:59 timo pvedaemon[64066]: <root@pam> end task UPID:timo:0000FDB8:002021CB:50D33EF6:qmigrate:108:root@pam: migration problems

This is what I find in the syslog.
 
If I transfer from Rembo to Timo, the transfer fails as well.

The syslogs:

Syslog Rembo:
Code:
Dec 20 20:19:58 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:19:58 rembo pvedaemon[2701]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:02 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:05 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:08 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:08 rembo pvedaemon[2701]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:12 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:14 rembo pmxcfs[1668]: [status] notice: received log
Dec 20 20:20:15 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:15 rembo ntpd[1603]: Listen normally on 24 tap108i0 fe80::24b3:6cff:fe01:3241 UDP 123
Dec 20 20:20:15 rembo ntpd[1603]: Listen normally on 25 tap108i1 fe80::6caf:3fff:feb8:81a9 UDP 123
Dec 20 20:20:15 rembo ntpd[1603]: Deleting interface #23 tap108i1, fe80::1088:1ff:fe32:ff97#123, interface stats: received=0, sent=0, dropped=0, active_time=300 secs
Dec 20 20:20:15 rembo ntpd[1603]: Deleting interface #22 tap108i0, fe80::98de:f9ff:fe7b:8e59#123, interface stats: received=0, sent=0, dropped=0, active_time=300 secs
Dec 20 20:20:18 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:18 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:22 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:25 rembo pvedaemon[2701]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:28 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:28 rembo pvedaemon[2701]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:20:32 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
.........................


Dec 20 20:22:09 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:12 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:15 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:18 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:19 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:22 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:25 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:28 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:29 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:32 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:36 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:37 rembo pvedaemon[18911]: VM 305 qmp command failed - interrupted by signal
Dec 20 20:22:37 rembo pvedaemon[18911]: WARNING: query migrate failed: interrupted by signal#012
Dec 20 20:22:38 rembo pvestatd[2907]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:39 rembo pvedaemon[2701]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:42 rembo pvedaemon[2701]: <root@pam> end task UPID:rembo:000049DF:00032F31:50D364D0:qmigrate:305:root@pam: unexpected status
Dec 20 20:22:42 rembo pvedaemon[2699]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries
Dec 20 20:22:46 rembo pvedaemon[2698]: WARNING: unable to connect to VM 305 socket - timeout after 31 retries

Syslog Timo:
Code:
Dec 20 20:19:04 timo pvedaemon[2067]: worker 92719 finished
Dec 20 20:19:04 timo pvedaemon[2067]: starting 1 worker(s)
Dec 20 20:19:04 timo pvedaemon[2067]: worker 95083 started
Dec 20 20:19:31 timo pmxcfs[1635]: [status] notice: received log
Dec 20 20:19:31 timo pmxcfs[1635]: [status] notice: received log
Dec 20 20:19:31 timo kernel: vmbr0: port 4(tap108i0) entering disabled state
Dec 20 20:19:31 timo kernel: vmbr0: port 4(tap108i0) entering disabled state
Dec 20 20:19:32 timo kernel: vmbr1: port 2(tap108i1) entering disabled state
Dec 20 20:19:32 timo kernel: vmbr1: port 2(tap108i1) entering disabled state
Dec 20 20:19:34 timo pvedaemon[94842]: <root@pam> end task UPID:timo:000172FE:002ECD06:50D36483:qmigrate:108:root@pam: OK
Dec 20 20:19:44 timo pmxcfs[1635]: [status] notice: received log
Dec 20 20:19:45 timo qm[95200]: <root@pam> starting task UPID:timo:000173E1:002EEB4C:50D364D1:qmstart:305:root@pam:
Dec 20 20:19:45 timo qm[95201]: start VM 305: UPID:timo:000173E1:002EEB4C:50D364D1:qmstart:305:root@pam:
Dec 20 20:19:45 timo kernel: device tap305i0 entered promiscuous mode
Dec 20 20:19:45 timo kernel: vmbr0: port 4(tap305i0) entering forwarding state
Dec 20 20:19:45 timo kernel: device tap305i1 entered promiscuous mode
Dec 20 20:19:45 timo kernel: vmbr1: port 2(tap305i1) entering forwarding state
Dec 20 20:19:46 timo qm[95201]: VM 305 qmp command failed - unable to find configuration file for VM 305 - no such machine
Dec 20 20:19:46 timo qm[95201]: VM 305 qmp command failed - unable to find configuration file for VM 305 - no such machine
Dec 20 20:19:46 timo qm[95200]: <root@pam> end task UPID:timo:000173E1:002EEB4C:50D364D1:qmstart:305:root@pam: OK
Dec 20 20:19:55 timo kernel: tap305i0: no IPv6 routers present
Dec 20 20:19:56 timo kernel: tap305i1: no IPv6 routers present
Dec 20 20:20:14 timo pmxcfs[1635]: [status] notice: received log

So the big question is: why does
WARNING: unable to connect to VM 305 socket - timeout after 31 retries
happen? Are there more logs I can check?
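That warning apparently means pvedaemon could not connect to the VM's QMP monitor socket (the /var/run/qemu-server/&lt;vmid&gt;.qmp path visible in the kvm command line earlier in the thread). A rough sketch of such a probe with the same 31-retry idea, purely illustrative and not the actual Perl code:

```python
import socket
import time


def probe_qmp_socket(vmid, retries=31, delay=0.1,
                     path_tmpl="/var/run/qemu-server/{}.qmp"):
    """Try to connect to the VM's QMP monitor socket, retrying a few
    times the way pvedaemon's warning suggests. Returns True if
    something is listening (the kvm process is alive), else False.

    The path template matches the -chardev argument from the kvm
    command line quoted earlier; the retry/delay values are guesses.
    """
    path = path_tmpl.format(vmid)
    for _ in range(retries):
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            s.connect(path)
            return True
        except OSError:
            time.sleep(delay)  # socket missing or refusing: kvm likely gone
        finally:
            s.close()
    return False
```

If this returns False on the destination node while the migration task is still "active", the receiving kvm process has died and the query-migrate loop on the source is talking to nothing, which would match the symptoms described above.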
 
