OK, just uploaded a patch to the git repository:
https://git.proxmox.com/?p=qemu-ser...;hpb=e95fe75f86e81e9f9d597e1d43cd757b928813eb
Can you please test it?
I like the patch and it does seem to work.
Out of curiosity I added one line to the code to see if that loop is making a difference.
Code:
if ($timeout) {
    for (my $i = 0; $i < $timeout; $i++) {
        # Line added for debugging:
        $self->log('info', "Checking if tunnel exists\n");
        return if !PVE::ProcFSTools::check_process_running($cpid);
        sleep(1);
    }
}
Nearly every time I do a live migration the check happens twice:
Code:
Jan 17 12:12:51 starting migration of VM 100 to node 'vm5' (192.168.8.5)
Jan 17 12:12:51 copying disk images
Jan 17 12:12:51 starting VM 100 on remote node 'vm5'
Jan 17 12:12:51 starting migration tunnel
Jan 17 12:12:51 starting online/live migration on port 60000
Jan 17 12:12:53 migration status: active (transferred 210559KB, remaining 2169920KB), total 4211136KB)
Jan 17 12:12:56 migration status: active (transferred 480429KB, remaining 284664KB), total 4211136KB)
Jan 17 12:12:58 migration status: completed
Jan 17 12:12:58 migration speed: 585.14 MB/s
Jan 17 12:12:59 Checking if tunnel exists
Jan 17 12:13:00 Checking if tunnel exists
Jan 17 12:13:00 migration finished successfuly (duration 00:00:09)
TASK OK
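The two checks in the log suggest the tunnel process needs about a second to exit after the migration completes, which is the window the wait loop covers. For anyone who wants to play with the same poll-until-exit pattern outside of Proxmox, here is a rough standalone sketch. It is not the patched code: `wait_for_child_exit` is a hypothetical helper name, and it uses core Perl `waitpid` with `WNOHANG` in place of `PVE::ProcFSTools::check_process_running`.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Hypothetical helper (not part of Proxmox): poll once per second until
# the child exits or $timeout seconds have passed.
# Returns 1 if the child has exited, 0 if it is still running.
sub wait_for_child_exit {
    my ($cpid, $timeout) = @_;

    for (my $i = 0; $i < $timeout; $i++) {
        # With WNOHANG, waitpid returns the PID once the child has exited
        # (and reaps it), 0 while it is still running, -1 if no such child.
        my $res = waitpid($cpid, WNOHANG);
        return 1 if $res == $cpid || $res == -1;
        sleep(1);
    }
    return 0;
}

# Usage: a child that lives for about two seconds stands in for the tunnel.
my $cpid = fork();
die "fork failed: $!\n" if !defined($cpid);
if ($cpid == 0) {
    sleep(2);
    exit(0);
}
print wait_for_child_exit($cpid, 10) ? "child exited\n" : "still running\n";
```

With a child that needs a second or two to go away, the loop runs a couple of times before returning, much like the two "Checking if tunnel exists" lines above.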
Your patch fixes the problem: no more migration failures from the tunnel being killed prematurely.
Thanks for the patch, looking forward to seeing it in the next update!