[SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

spirit, good points. I will test them soon. Two of your assumptions I had already planned to check in today's experiments.
Concerning clock sync: when I enabled the '-d qemu_error' debug option, there was indeed a warning in qemu.log about a small TSC difference, if I remember correctly.
 
I have done a test on Proxmox 4, commenting out the qm resume, and the target VM stays paused after migration.
Can you do the same test to be sure?
This is interesting. Maybe there is a difference between the community and commercial versions. What exactly have you commented out?
I commented out the whole resume section in /usr/share/perl5/PVE/QemuMigrate.pm on PVE 4:
Code:
#    if ($self->{livemigration}) {
#       # now that config file is move, we can resume vm on target if livemigrate
#       my $cmd = [@{$self->{rem_ssh}}, 'qm', 'resume', $vmid, '--skiplock', '--nocheck'];
#       eval{ PVE::Tools::run_command($cmd, outfunc => sub {},
#               errfunc => sub {
#                   my $line = shift;
#                   $self->log('err', $line);
#               });
#       };
#       if (my $err = $@) {
#           $self->log('err', $err);
#           $self->{errors} = 1;
#       }
#    }
and the VM becomes 'running' as usual.
When I do the same in PVE 5, the VM stays paused forever after migration; it resumes only after a manual command.

Other tests (old kernel in PVE 5 and changing 'pvecm' to 'qm') did not bring any noticeable changes.

Edit: in my previous post the debug command was actually "args: -d guest_errors -D /tmp/qemu200.log" in /etc/pve/qemu-server/200.conf. With this option pve-qemu-kvm_2.9.0-2 shows "kvm: warning: TSC frequency mismatch between VM (2933459 kHz) and host (2933455 kHz), and TSC scaling unavailable" even on PVE 4 with the upgraded pve-qemu-kvm package. But this warning does not affect the migration behavior concerning the subject issue.
 
The issue was introduced with this commit, and it is indirectly related to local storage migration.
Code:
Author: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Date:   Wed Apr 19 15:19:47 2017 +0200

    start: keep VM paused when migrating via unix socket
  
    this (correctly!) errored out with Qemu 2.9 when live-migrating
    local disks, because the NBD server blocks the VM from being
    resumed. was probably missed when migrating via unix domains
    was originally introduced..
  
    Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 6865d60..64ecd8d 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -4666,7 +4666,7 @@ sub vm_start {
 
        print "migration listens on $migrate_uri\n" if $migrate_uri;
 
-       if ($statefile && $statefile ne 'tcp')  {
+       if ($statefile && $statefile ne 'tcp' && $statefile ne 'unix')  {
            eval { vm_mon_cmd_nocheck($vmid, "cont"); };
            warn $@ if $@;
        }
By reverting this change, I can make live migration fast again.
I understand that this is not a real fix and something else in the PVE 'qemu-server' package needs to be reworked, and maybe the real problem hides somewhere else, but at least we now know where to begin.

Edit: of course, reverting this change does not affect '--migration_type insecure'; "$statefile ne 'tcp'" is responsible for that case.
 

to give a little background: manually resuming the target VM in phase 3 of our migration code was the original implementation (since 2012). when we added live migration over unix sockets (in June 2016), this was overlooked (leading to a minor speedup when live migrating over unix sockets instead of tcp, but at the cost of switching over the VM before logically moving the config file). with live-migration with local storage, we not only need to migrate the RAM content, but also mirror the disks. so now it is actually vital to only resume the target VM after both have happened, otherwise you risk inconsistencies. the missing check for unix socket state files was added when I noticed this while testing Qemu 2.9, which actually enforces this (you can only have one single writer per drive, so you need to stop the nbd server over which the storage migration takes place before resuming the target VM).
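to make that ordering concrete, here is a rough sketch of the required sequence (the sub names are placeholders standing in for the real qemu-server internals, not actual functions):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder steps only; the real implementation lives in QemuMigrate.pm / QemuServer.pm.
sub wait_for_ram_migration_completed { print "RAM migration completed\n"; }
sub wait_for_drive_mirror_synced     { print "drive-mirror jobs synced\n"; }
sub stop_nbd_server_on_target        { print "NBD server stopped (releases the single-writer restriction Qemu 2.9 enforces)\n"; }
sub move_config_to_target_node       { print "VM config file moved\n"; }
sub resume_target_vm                 { print "target VM resumed ('cont')\n"; }

# With local storage migration the resume has to come last:
wait_for_ram_migration_completed();
wait_for_drive_mirror_synced();
stop_nbd_server_on_target();
move_config_to_target_node();
resume_target_vm();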

I think the actual delay is caused by our polling interval of the migration status (which is 2 seconds). if the target VM is in migration mode but not suspended, it will be switched to independently from our code as soon as the RAM migration is completed. if the target VM is in migration mode but suspended (as originally intended, in PVE 4 for tcp and in PVE 5 for both modes), it will be switched to as soon as we ssh to the target node and call resume. but it takes up to 2 seconds in phase 2 to even notice that the migration was completed, and there is a bit of overhead for moving the config file and sshing to the other node as well (so an actual delay of somewhere between 0 and 5 seconds seems in the right ball park).

I'll move this discussion to the pve-devel mailing list, maybe we can get the faster behaviour without inconsistencies for the live migration with shared storage code path.
 
Fabian, do I understand you correctly that manual resuming in phase 3 will remain in the future? I'd prefer to have an option (e.g. 'dangerous migration without downtime').
And in that case the migration log should output the real downtime, i.e. the time difference from stopping the VM on the source host to the finish of the resume command on the destination host. Right now the log misleadingly shows the qemu migration stat of approximately 12-40 ms.
 

we'll discuss how to tackle this on pve-devel; depending on the associated risk, moving to the immediate resume when no local storage migration is happening might be possible. I agree that printing only the downtime of the RAM migration is misleading, although I am not sure how to accurately calculate the real one in this scenario.
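one rough source-side approximation would be to add the qemu-reported downtime to the wall-clock gap between seeing 'migration status: completed' and the resume returning; a sketch only, assuming Time::HiRes, ignoring clock skew between the hosts, and with made-up values:
Code:
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval sleep);

my $qemu_downtime_ms = 39;          # example value, as reported in the migration log

my $t_completed = [gettimeofday];   # taken when 'migration status: completed' is seen
sleep(3);                           # stand-in for: move config, ssh to target, run the resume
my $t_resumed = [gettimeofday];     # taken when the resume command has returned

my $approx_ms = $qemu_downtime_ms + tv_interval($t_completed, $t_resumed) * 1000;
printf "approximate real downtime: %.0f ms\n", $approx_ms;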
 
Fabian has given a good explanation on the pve-devel mailing list.

in QemuMigrate.pm,

can you try to reduce
my $usleep = 2000000;

and
$usleep = 300000 if $avglstat && $rem < $avglstat;


to a smaller value, like 10000 for example? (10 ms)
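For context, both lines sit in the migration status polling loop of phase2. A minimal sketch of that loop's shape, where only the two usleep assignments above are taken from the real code and query_migrate_status() is a stub:
Code:
use strict;
use warnings;
use Time::HiRes qw(usleep);

sub query_migrate_status { return 'completed' }   # stub; the real code asks qemu via QMP 'query-migrate'

my $usleep = 2000000;        # default polling interval: 2 s
my ($avglstat, $rem);        # average transfer rate and remaining RAM, maintained by the real loop

while (1) {
    my $status = query_migrate_status();
    last if $status eq 'completed' || $status eq 'failed';

    # near the end of the RAM transfer, poll more often
    $usleep = 300000 if $avglstat && $rem < $avglstat;
    usleep($usleep);
}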
 
I'm seeing that your migration log is quite short:

Jul 24 18:12:37 start migrate command to unix:/run/qemu-server/100.migrate
Jul 24 18:12:39 migration speed: 256.00 MB/s - downtime 39 ms

I wonder whether the usleep=300000 branch is ever reached.

Have you tried with a VM with more memory and a longer migration time?
 
I'm trying to subscribe to pve-devel for the second time, and it looks like my subscription request was ignored again. I can read the archive, but I do not think my messages to pve-devel at pve.proxmox.com would be processed without a subscription.

Concerning usleep: as I said, I'm using nested PVE clusters for experiments. To monitor the state of the VM, I'm using this command on the target host:
Code:
while :; do echo -n `date`' -- '; echo -e  '{ "execute": "qmp_capabilities" } \n {"execute":"human-monitor-command","arguments":{"command-line":"info status"}}' | socat - UNIX-CONNECT:/var/run/qemu-server/300.qmp| grep -Eo 'VM status: [^\]+'; sleep 0.5;done

Here is the source host output with the default usleep values:
Code:
root@pve01:~# /usr/sbin/qm migrate 300 pve02 --online
2017-07-27 18:25:21 starting migration of VM 300 to node 'pve02' (10.10.10.202)
2017-07-27 18:25:22 copying disk images
2017-07-27 18:25:22 starting VM 300 on remote node 'pve02'
2017-07-27 18:25:24 start remote tunnel
2017-07-27 18:25:25 starting online/live migration on unix:/run/qemu-server/300.migrate
2017-07-27 18:25:25 migrate_set_speed: 8589934592
2017-07-27 18:25:25 migrate_set_downtime: 0.1
2017-07-27 18:25:25 set migration_caps
2017-07-27 18:25:25 set cachesize: 53687091
2017-07-27 18:25:25 start migrate command to unix:/run/qemu-server/300.migrate
2017-07-27 18:25:27 migration speed: 256.00 MB/s - downtime 6 ms
2017-07-27 18:25:27 migration status: completed
2017-07-27 18:25:28 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.10.10.202 pvesr set-state 300 \''{}'\'
2017-07-27 18:25:32 migration finished successfully (duration 00:00:11)

And the corresponding output on the target host:
Code:
Thu Jul 27 18:25:22 MSK 2017 -- 2017/07/27 18:25:22 socat[2719] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:25:22 MSK 2017 -- 2017/07/27 18:25:22 socat[2738] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:25:23 MSK 2017 -- 2017/07/27 18:25:23 socat[2746] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:25:23 MSK 2017 -- Thu Jul 27 18:25:24 MSK 2017 -- VM status: paused (inmigrate)
Thu Jul 27 18:25:25 MSK 2017 -- VM status: paused (inmigrate)
Thu Jul 27 18:25:25 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:26 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:26 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:27 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:27 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:28 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:28 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:29 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:29 MSK 2017 -- VM status: paused
Thu Jul 27 18:25:30 MSK 2017 -- VM status: running
Thu Jul 27 18:25:30 MSK 2017 -- VM status: running
Thu Jul 27 18:25:31 MSK 2017 -- VM status: running
Thu Jul 27 18:25:31 MSK 2017 -- VM status: running

And the same scenario with the modified usleep:
Code:
root@pve01:~# /usr/sbin/qm migrate 300 pve02 --online
2017-07-27 18:27:13 starting migration of VM 300 to node 'pve02' (10.10.10.202)
2017-07-27 18:27:14 copying disk images
2017-07-27 18:27:14 starting VM 300 on remote node 'pve02'
2017-07-27 18:27:17 start remote tunnel
2017-07-27 18:27:18 starting online/live migration on unix:/run/qemu-server/300.migrate
2017-07-27 18:27:18 migrate_set_speed: 8589934592
2017-07-27 18:27:18 migrate_set_downtime: 0.1
2017-07-27 18:27:18 set migration_caps
2017-07-27 18:27:18 set cachesize: 53687091
2017-07-27 18:27:18 start migrate command to unix:/run/qemu-server/300.migrate
2017-07-27 18:27:18 migration status: active (transferred 211255, remaining 529178624), total 554508288)
2017-07-27 18:27:18 migration xbzrle cachesize: 33554432 transferred 0 pages 0 cachemiss 0 overflow 0
2017-07-27 18:27:18 migration status: active (transferred 487834, remaining 403304448), total 554508288)
2017-07-27 18:27:18 migration xbzrle cachesize: 33554432 transferred 0 pages 0 cachemiss 0 overflow 0
2017-07-27 18:27:18 migration status: active (transferred 750760, remaining 283643904), total 554508288)
2017-07-27 18:27:18 migration xbzrle cachesize: 33554432 transferred 0 pages 0 cachemiss 0 overflow 0
2017-07-27 18:27:18 migration status: active (transferred 1049146, remaining 147845120), total 554508288)
2017-07-27 18:27:18 migration xbzrle cachesize: 33554432 transferred 0 pages 0 cachemiss 0 overflow 0
2017-07-27 18:27:18 migration status: active (transferred 1589999, remaining 19120128), total 554508288)
2017-07-27 18:27:18 migration xbzrle cachesize: 33554432 transferred 0 pages 0 cachemiss 0 overflow 0
2017-07-27 18:27:18 migration status: active (transferred 3178826, remaining 512000), total 554508288)
2017-07-27 18:27:18 migration xbzrle cachesize: 33554432 transferred 0 pages 0 cachemiss 0 overflow 0
2017-07-27 18:27:18 migration status: completed
2017-07-27 18:27:19 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.10.10.202 pvesr set-state 300 \''{}'\'
2017-07-27 18:27:22 migration finished successfully (duration 00:00:09)

Code:
Thu Jul 27 18:27:13 MSK 2017 -- 2017/07/27 18:27:13 socat[3293] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:27:14 MSK 2017 -- 2017/07/27 18:27:14 socat[3298] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:27:14 MSK 2017 -- 2017/07/27 18:27:14 socat[3320] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:27:15 MSK 2017 -- 2017/07/27 18:27:15 socat[3329] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:27:15 MSK 2017 -- Thu Jul 27 18:27:16 MSK 2017 -- Thu Jul 27 18:27:17 MSK 2017 -- VM status: paused (inmigrate)
Thu Jul 27 18:27:18 MSK 2017 -- VM status: paused
Thu Jul 27 18:27:18 MSK 2017 -- VM status: paused
Thu Jul 27 18:27:19 MSK 2017 -- VM status: paused
Thu Jul 27 18:27:19 MSK 2017 -- VM status: paused
Thu Jul 27 18:27:20 MSK 2017 -- VM status: paused
Thu Jul 27 18:27:20 MSK 2017 -- VM status: paused
Thu Jul 27 18:27:21 MSK 2017 -- VM status: running
Thu Jul 27 18:27:21 MSK 2017 -- VM status: running

And now with the default usleep and the changed vm_start function:
Code:
root@pve01:~# /usr/sbin/qm migrate 300 pve02 --online
2017-07-27 18:43:45 starting migration of VM 300 to node 'pve02' (10.10.10.202)
2017-07-27 18:43:46 copying disk images
2017-07-27 18:43:46 starting VM 300 on remote node 'pve02'
2017-07-27 18:43:48 start remote tunnel
2017-07-27 18:43:48 starting online/live migration on unix:/run/qemu-server/300.migrate
2017-07-27 18:43:48 migrate_set_speed: 8589934592
2017-07-27 18:43:48 migrate_set_downtime: 0.1
2017-07-27 18:43:48 set migration_caps
2017-07-27 18:43:48 set cachesize: 53687091
2017-07-27 18:43:48 start migrate command to unix:/run/qemu-server/300.migrate
2017-07-27 18:43:50 migration speed: 256.00 MB/s - downtime 5 ms
2017-07-27 18:43:50 migration status: completed
2017-07-27 18:43:52 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.10.10.202 pvesr set-state 300 \''{}'\'
2017-07-27 18:43:56 migration finished successfully (duration 00:00:11)

Code:
Thu Jul 27 18:43:45 MSK 2017 -- 2017/07/27 18:43:45 socat[5029] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:43:46 MSK 2017 -- 2017/07/27 18:43:46 socat[5055] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:43:46 MSK 2017 -- 2017/07/27 18:43:46 socat[5089] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:43:47 MSK 2017 -- 2017/07/27 18:43:47 socat[5094] E connect(5, AF=1 "/var/run/qemu-server/300.qmp", 30): No such file or directory
Thu Jul 27 18:43:47 MSK 2017 -- Thu Jul 27 18:43:48 MSK 2017 -- VM status: paused (inmigrate)
Thu Jul 27 18:43:49 MSK 2017 -- VM status: running
Thu Jul 27 18:43:49 MSK 2017 -- VM status: running
Thu Jul 27 18:43:50 MSK 2017 -- VM status: running
BTW, anyone can verify this easily.

Have you tried with a VM with more memory and a longer migration time?

Yes, with big VMs (8-32 GB RAM) the downtime is mostly the same. With local storage migration the downtime is about 10 seconds longer.
 
Yes, with big VMs (8-32 GB RAM) the downtime is mostly the same.
The downtime will always be the same, because it is computed by qemu itself.
But I wonder whether lowering the usleep will make the Proxmox loop exit faster.
The default usleep is 2 s, lowered to 300 ms after some iterations. But as your previous log was really fast,
I think the default usleep of 2 s still applies.


With local storage migration the downtime is about 10 seconds longer.
With local storage migration, that is to be expected.
The VM shouldn't be running before the storage migration is finished, or you'll get corruption.

The discussion on pve-devel was about a "regression", which applies the "cont" in vm_start on the target VM when a unix socket is used (introduced in June 2016).
The "normal" behaviour was (should be) to resume the VM manually, after moving the config file and, with storage migration, after the drive sync has finished.
If you use "migrate_insecure", the "cont" should not be applied in Proxmox 4.

Now it would be great to find out what is slow between the end of the qemu migration and the resume.


2017-07-27 18:27:18 migration status: completed
-> maybe usleep can help here
2017-07-27 18:27:19 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=pve02' root@10.10.10.202 pvesr set-state 300 \''{}'\'
--> there seem to be about 3 s here
2017-07-27 18:27:22 migration finished successfully (duration 00:00:09)


BTW, do you use the new Proxmox 5 storage replication in this log?




Edit:

I have done a ping test on Proxmox 4 with migrate_insecure (so "cont" is not applied), and I see around 50-60 ms of packet loss.
I'll try to do the test on Proxmox 5 tomorrow.
 
The discussion on pve-devel was about a "regression", which applies the "cont" in vm_start on the target VM when a unix socket is used (introduced in June 2016).
The "normal" behaviour was (should be) to resume the VM manually, after moving the config file and, with storage migration, after the drive sync has finished.
If you use "migrate_insecure", the "cont" should not be applied in Proxmox 4.
IMO, in most cases the normal behaviour is to resume as fast as you can without breaking the VM, especially if you have applications sensitive to time shifts or a high SLA. And when the underlying technology (Qemu) allows you to lose only 20 ms, waiting 3 s just to copy a config, or 10 s to tear down NBD devices or check something, is a very questionable decision.
I do not know where in the code the 'cont' should be, but I'm sure the optimal solution is to activate the VM before the Qemu migration process finishes, or within a few milliseconds after it. Otherwise it is a waste of time.

BTW, do you use the new Proxmox 5 storage replication in this log?
No, I do not use it.
 
IMO, in most cases the normal behaviour is to resume as fast as you can without breaking the VM, especially if you have applications sensitive to time shifts or a high SLA. And when the underlying technology (Qemu) allows you to lose only 20 ms, waiting 3 s just to copy a config, or 10 s to tear down NBD devices or check something, is a very questionable decision.

waiting for the disks to finish syncing and stopping the nbd exports is necessary when doing storage migration, otherwise it's just a question of time before you corrupt a VM disk. IMHO, this is a bug in the current PVE 4 code, which only got exposed and fixed in PVE 5 because of Qemu 2.9 (which actively prevents concurrent rw access to the same disk by two "clients").

for the RAM-only migration we are discussing whether moving to the immediate automatic switch-over is feasible.
 
I have added some timers in phase3:

2017-07-28 10:00:10 transfer_replication_state: 1.436832
2017-07-28 10:00:10 move config: 0.001174
2017-07-28 10:00:10 switch_replication_job_target: 0.003125
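(For reference, the timers were added roughly like this inside phase3, assuming Time::HiRes; this is a fragment meant for QemuMigrate.pm, not a standalone script.)
Code:
use Time::HiRes qw(gettimeofday tv_interval);

my $t0 = [gettimeofday];
$self->transfer_replication_state();
$self->log('info', sprintf("transfer_replication_state: %f", tv_interval($t0)));

# ... the config file move happens here and can be timed the same way ...

$t0 = [gettimeofday];
$self->switch_replication_job_target();
$self->log('info', sprintf("switch_replication_job_target: %f", tv_interval($t0)));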


Testing with the following commented out:
# $self->transfer_replication_state();
# $self->switch_replication_job_target();

Code:
while :; do echo -n `date +"%Y-%m-%d %H:%M:%S,%3N"`' -- '; echo -e  '{ "execute": "qmp_capabilities" } \n {"execute":"human-monitor-command","arguments":{"command-line":"info status"}}' | socat - UNIX-CONNECT:/var/run/qemu-server/244.qmp| grep -Eo 'VM status: [^\]+'; sleep 0.01;done

2017-07-28 10:24:45,184 -- VM status: paused (inmigrate)
2017-07-28 10:24:45,208 -- VM status: running


Testing with both calls enabled:
$self->transfer_replication_state();
$self->switch_replication_job_target();

Code:
while :; do echo -n `date +"%Y-%m-%d %H:%M:%S,%3N"`' -- '; echo -e  '{ "execute": "qmp_capabilities" } \n {"execute":"human-monitor-command","arguments":{"command-line":"info status"}}' | socat - UNIX-CONNECT:/var/run/qemu-server/244.qmp| grep -Eo 'VM status: [^\]+'; sleep 0.01;done

2017-07-28 10:38:28,711 -- VM status: paused (inmigrate)
2017-07-28 10:38:28,745 -- VM status: paused
2017-07-28 10:38:28,799 -- VM status: paused
2017-07-28 10:38:28,818 -- VM status: paused
2017-07-28 10:38:28,837 -- VM status: paused
2017-07-28 10:38:28,856 -- VM status: paused
2017-07-28 10:38:28,876 -- VM status: paused
2017-07-28 10:38:28,898 -- VM status: paused
2017-07-28 10:38:28,918 -- VM status: paused
2017-07-28 10:38:28,937 -- VM status: paused
2017-07-28 10:38:28,955 -- VM status: paused
2017-07-28 10:38:28,975 -- VM status: paused
2017-07-28 10:38:28,994 -- VM status: paused
2017-07-28 10:38:29,014 -- VM status: paused
2017-07-28 10:38:29,033 -- VM status: paused
2017-07-28 10:38:29,052 -- VM status: paused
2017-07-28 10:38:29,071 -- VM status: paused
2017-07-28 10:38:29,091 -- VM status: paused
2017-07-28 10:38:29,116 -- VM status: paused
2017-07-28 10:38:29,136 -- VM status: paused
2017-07-28 10:38:29,156 -- VM status: paused
2017-07-28 10:38:29,175 -- VM status: paused
2017-07-28 10:38:29,195 -- VM status: paused
2017-07-28 10:38:29,213 -- VM status: paused
2017-07-28 10:38:29,232 -- VM status: paused
2017-07-28 10:38:29,251 -- VM status: paused
2017-07-28 10:38:29,270 -- VM status: paused
2017-07-28 10:38:29,289 -- VM status: paused
2017-07-28 10:38:29,307 -- VM status: paused
2017-07-28 10:38:29,327 -- VM status: paused
2017-07-28 10:38:29,346 -- VM status: paused
2017-07-28 10:38:29,364 -- VM status: paused
2017-07-28 10:38:29,384 -- VM status: paused
2017-07-28 10:38:29,404 -- VM status: paused
2017-07-28 10:38:29,424 -- VM status: paused
2017-07-28 10:38:29,443 -- VM status: paused
2017-07-28 10:38:29,461 -- VM status: paused
2017-07-28 10:38:29,481 -- VM status: paused
2017-07-28 10:38:29,499 -- VM status: paused
2017-07-28 10:38:29,519 -- VM status: paused
2017-07-28 10:38:29,538 -- VM status: paused
2017-07-28 10:38:29,557 -- VM status: paused
2017-07-28 10:38:29,578 -- VM status: paused
2017-07-28 10:38:29,597 -- VM status: paused
2017-07-28 10:38:29,616 -- VM status: paused
2017-07-28 10:38:29,636 -- VM status: paused
2017-07-28 10:38:29,655 -- VM status: paused
2017-07-28 10:38:29,673 -- VM status: paused
2017-07-28 10:38:29,696 -- VM status: paused
2017-07-28 10:38:29,715 -- VM status: paused
2017-07-28 10:38:29,734 -- VM status: paused
2017-07-28 10:38:29,752 -- VM status: paused
2017-07-28 10:38:29,772 -- VM status: paused
2017-07-28 10:38:29,792 -- VM status: paused
2017-07-28 10:38:29,811 -- VM status: paused
2017-07-28 10:38:29,830 -- VM status: paused
2017-07-28 10:38:29,849 -- VM status: paused
2017-07-28 10:38:29,868 -- VM status: paused
2017-07-28 10:38:29,886 -- VM status: paused
2017-07-28 10:38:29,908 -- VM status: paused
2017-07-28 10:38:29,928 -- VM status: paused
2017-07-28 10:38:29,947 -- VM status: paused
2017-07-28 10:38:29,967 -- VM status: paused
2017-07-28 10:38:29,986 -- VM status: paused
2017-07-28 10:38:30,005 -- VM status: paused
2017-07-28 10:38:30,024 -- VM status: paused
2017-07-28 10:38:30,044 -- VM status: paused
2017-07-28 10:38:30,063 -- VM status: paused
2017-07-28 10:38:30,082 -- VM status: paused
2017-07-28 10:38:30,101 -- VM status: paused
2017-07-28 10:38:30,121 -- VM status: paused
2017-07-28 10:38:30,145 -- VM status: paused
2017-07-28 10:38:30,165 -- VM status: paused
2017-07-28 10:38:30,184 -- VM status: paused
2017-07-28 10:38:30,206 -- VM status: paused
2017-07-28 10:38:30,225 -- VM status: paused
2017-07-28 10:38:30,244 -- VM status: paused
2017-07-28 10:38:30,270 -- VM status: paused
2017-07-28 10:38:30,305 -- VM status: paused
2017-07-28 10:38:30,328 -- VM status: paused
2017-07-28 10:38:30,349 -- VM status: paused
2017-07-28 10:38:30,378 -- VM status: paused
2017-07-28 10:38:30,407 -- VM status: paused
2017-07-28 10:38:30,428 -- VM status: paused
2017-07-28 10:38:30,448 -- VM status: paused
2017-07-28 10:38:30,468 -- VM status: paused
2017-07-28 10:38:30,492 -- VM status: paused
2017-07-28 10:38:30,520 -- VM status: paused
2017-07-28 10:38:30,552 -- VM status: paused
2017-07-28 10:38:30,577 -- VM status: paused
2017-07-28 10:38:30,620 -- VM status: paused
2017-07-28 10:38:30,651 -- VM status: paused
2017-07-28 10:38:30,676 -- VM status: paused
2017-07-28 10:38:30,700 -- VM status: paused
2017-07-28 10:38:30,724 -- VM status: paused
2017-07-28 10:38:30,747 -- VM status: paused
2017-07-28 10:38:30,772 -- VM status: paused
2017-07-28 10:38:30,796 -- VM status: paused
2017-07-28 10:38:30,820 -- VM status: paused
2017-07-28 10:38:30,843 -- VM status: paused
2017-07-28 10:38:30,867 -- VM status: paused
2017-07-28 10:38:30,902 -- VM status: paused
2017-07-28 10:38:30,929 -- VM status: paused
2017-07-28 10:38:30,954 -- VM status: paused
2017-07-28 10:38:30,978 -- VM status: paused
2017-07-28 10:38:31,003 -- VM status: paused
2017-07-28 10:38:31,026 -- VM status: paused
2017-07-28 10:38:31,054 -- VM status: paused
2017-07-28 10:38:31,084 -- VM status: paused
2017-07-28 10:38:31,108 -- VM status: paused
2017-07-28 10:38:31,131 -- VM status: paused
2017-07-28 10:38:31,156 -- VM status: paused
2017-07-28 10:38:31,180 -- VM status: paused
2017-07-28 10:38:31,203 -- VM status: paused
2017-07-28 10:38:31,262 -- VM status: paused
2017-07-28 10:38:31,323 -- VM status: paused
2017-07-28 10:38:31,367 -- VM status: paused
2017-07-28 10:38:31,403 -- VM status: paused
2017-07-28 10:38:31,430 -- VM status: paused
2017-07-28 10:38:31,457 -- VM status: paused
2017-07-28 10:38:31,487 -- VM status: paused
2017-07-28 10:38:31,511 -- VM status: paused
2017-07-28 10:38:31,534 -- VM status: paused
2017-07-28 10:38:31,557 -- VM status: paused
2017-07-28 10:38:31,600 -- VM status: paused
2017-07-28 10:38:31,641 -- VM status: paused
2017-07-28 10:38:31,671 -- VM status: paused
2017-07-28 10:38:31,715 -- VM status: paused
2017-07-28 10:38:31,747 -- VM status: paused
2017-07-28 10:38:31,797 -- VM status: paused
2017-07-28 10:38:31,829 -- VM status: paused
2017-07-28 10:38:31,857 -- VM status: paused
2017-07-28 10:38:31,895 -- VM status: paused
2017-07-28 10:38:31,932 -- VM status: paused
2017-07-28 10:38:31,957 -- VM status: paused
2017-07-28 10:38:31,984 -- VM status: paused
2017-07-28 10:38:32,014 -- VM status: paused
2017-07-28 10:38:32,038 -- VM status: paused
2017-07-28 10:38:32,058 -- VM status: paused
2017-07-28 10:38:32,080 -- VM status: paused
2017-07-28 10:38:32,099 -- VM status: paused
2017-07-28 10:38:32,127 -- VM status: paused
2017-07-28 10:38:32,152 -- VM status: paused
2017-07-28 10:38:32,184 -- VM status: paused
2017-07-28 10:38:32,204 -- VM status: paused
2017-07-28 10:38:32,223 -- VM status: paused
2017-07-28 10:38:32,246 -- VM status: paused
2017-07-28 10:38:32,270 -- VM status: paused
2017-07-28 10:38:32,293 -- VM status: paused
2017-07-28 10:38:32,316 -- VM status: paused
2017-07-28 10:38:32,340 -- VM status: paused
2017-07-28 10:38:32,364 -- VM status: paused
2017-07-28 10:38:32,387 -- VM status: paused
2017-07-28 10:38:32,410 -- VM status: paused
2017-07-28 10:38:32,435 -- VM status: paused
2017-07-28 10:38:32,458 -- VM status: paused
2017-07-28 10:38:32,481 -- VM status: paused
2017-07-28 10:38:32,504 -- VM status: paused
2017-07-28 10:38:32,527 -- VM status: paused
2017-07-28 10:38:32,551 -- VM status: paused
2017-07-28 10:38:32,574 -- VM status: paused
2017-07-28 10:38:32,613 -- VM status: paused
2017-07-28 10:38:32,654 -- VM status: paused
2017-07-28 10:38:32,701 -- VM status: paused
2017-07-28 10:38:32,744 -- VM status: paused
2017-07-28 10:38:32,772 -- VM status: paused
2017-07-28 10:38:32,805 -- VM status: paused
2017-07-28 10:38:32,836 -- VM status: paused
2017-07-28 10:38:32,870 -- VM status: paused
2017-07-28 10:38:32,903 -- VM status: paused
2017-07-28 10:38:32,940 -- VM status: paused
2017-07-28 10:38:32,968 -- VM status: paused
2017-07-28 10:38:32,995 -- VM status: paused
2017-07-28 10:38:33,027 -- VM status: paused
2017-07-28 10:38:33,060 -- VM status: paused
2017-07-28 10:38:33,095 -- VM status: paused
2017-07-28 10:38:33,124 -- VM status: paused
2017-07-28 10:38:33,157 -- VM status: paused
2017-07-28 10:38:33,189 -- VM status: paused
2017-07-28 10:38:33,220 -- VM status: paused
2017-07-28 10:38:33,251 -- VM status: paused
2017-07-28 10:38:33,286 -- VM status: paused
2017-07-28 10:38:33,321 -- VM status: paused
2017-07-28 10:38:33,361 -- VM status: paused
2017-07-28 10:38:33,400 -- VM status: paused
2017-07-28 10:38:33,437 -- VM status: paused
2017-07-28 10:38:33,471 -- VM status: paused
2017-07-28 10:38:33,495 -- VM status: paused
2017-07-28 10:38:33,518 -- VM status: paused
2017-07-28 10:38:33,541 -- VM status: paused
2017-07-28 10:38:33,562 -- VM status: paused
2017-07-28 10:38:33,585 -- VM status: paused
2017-07-28 10:38:33,608 -- VM status: paused
2017-07-28 10:38:33,633 -- VM status: paused
2017-07-28 10:38:33,657 -- VM status: paused
2017-07-28 10:38:33,681 -- VM status: paused
2017-07-28 10:38:33,712 -- VM status: paused
2017-07-28 10:38:33,741 -- VM status: paused
2017-07-28 10:38:33,765 -- VM status: paused
2017-07-28 10:38:33,789 -- VM status: paused
2017-07-28 10:38:33,812 -- VM status: paused
2017-07-28 10:38:33,837 -- VM status: paused
2017-07-28 10:38:33,861 -- VM status: paused
2017-07-28 10:38:33,884 -- VM status: paused
2017-07-28 10:38:33,912 -- VM status: running

So yes, calling that command takes some time (I think we should at least bypass it if pvesr is not used).
I have not lowered the usleep for the main loop; maybe that could help to reduce the time further (maybe a usleep of 1000 would be enough).
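A very rough sketch of what such a bypass could look like in phase3; vmid_has_replication_jobs() is purely hypothetical, the real code would have to consult the replication configuration:
Code:
# Hypothetical guard, not existing PVE code: skip the replication handover
# when the VM has no replication jobs configured.
my $has_repl = vmid_has_replication_jobs($vmid);

$self->transfer_replication_state() if $has_repl;
# ... move the config file as before ...
$self->switch_replication_job_target() if $has_repl;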

Can you try to reproduce the same test?
 
Can you try to reproduce the same test?
As I already said before (quoting myself):
Adding a 'qm resume' command to QemuMigrate.pm right after migration in "phase 2" helps, but not completely. The 'qm' tool is not fast enough, and a 0.5-1 second freeze remains.
That test bypassed both "$self->transfer_replication_state();" and "$self->switch_replication_job_target();".

I think that in your test with the commented-out commands the VM was already running, because on an ordinary system it is impossible to resume a machine on a remote host in less than 24 ms with the qm command. And this is why:
Code:
root@pve01:~# time ssh 10.10.10.201 :

real    0m0.146s
user    0m0.012s
sys    0m0.004s
root@pve01:~# time ssh 10.10.10.201 qm resume 300 --skiplock --nocheck

real    0m0.893s
user    0m0.012s
sys    0m0.004s


BTW, for consistency I reproduced the same issue on PVE 4 with the "--migration_type insecure" option.
Code:
root@pve06:~# pveversion 
pve-manager/4.4-1/eb2d6f1e (running kernel: 4.4.35-1-pve)
root@pve06:~# qm migrate 200 pve05 --online --migration_type insecure
Jul 28 12:29:10 starting migration of VM 200 to node 'pve05' (10.10.10.205)
Jul 28 12:29:10 copying disk images
Jul 28 12:29:11 starting VM 200 on remote node 'pve05'
Jul 28 12:29:13 starting online/live migration on tcp:10.10.10.205:60000
Jul 28 12:29:13 migrate_set_speed: 8589934592
Jul 28 12:29:13 migrate_set_downtime: 0.1
Jul 28 12:29:13 set migration_caps
Jul 28 12:29:13 set cachesize: 53687091
Jul 28 12:29:13 start migrate command to tcp:10.10.10.205:60000
Jul 28 12:29:15 migration speed: 256.00 MB/s - downtime 6 ms
Jul 28 12:29:15 migration status: completed
Jul 28 12:29:18 migration finished successfully (duration 00:00:08)

Code:
Fri Jul 28 12:29:10 MSK 2017 -- 2017/07/28 12:29:10 socat[14129] E connect(5, AF=1 "/var/run/qemu-server/200.qmp", 30): No such file or directory
Fri Jul 28 12:29:11 MSK 2017 -- 2017/07/28 12:29:11 socat[14137] E connect(5, AF=1 "/var/run/qemu-server/200.qmp", 30): No such file or directory
Fri Jul 28 12:29:11 MSK 2017 -- 2017/07/28 12:29:11 socat[14146] E connect(5, AF=1 "/var/run/qemu-server/200.qmp", 30): No such file or directory
Fri Jul 28 12:29:12 MSK 2017 -- Fri Jul 28 12:29:13 MSK 2017 -- VM status: paused (inmigrate)
Fri Jul 28 12:29:14 MSK 2017 -- VM status: paused (inmigrate)
Fri Jul 28 12:29:14 MSK 2017 -- VM status: paused
Fri Jul 28 12:29:15 MSK 2017 -- VM status: paused
Fri Jul 28 12:29:15 MSK 2017 -- VM status: paused
Fri Jul 28 12:29:16 MSK 2017 -- VM status: paused
Fri Jul 28 12:29:16 MSK 2017 -- VM status: running
Fri Jul 28 12:29:17 MSK 2017 -- VM status: running
 
Sorry, indeed, I have made a mistake.

"qm resume" is indeed slow on my test system (an old Xeon 1.6 GHz).
It takes around 1.4 s.

# time qm
real 0m1.482s

On my production server (Xeon v3 3.1 GHz):
real 0m0.402s

Using socat through ssh gives me only a few ms.

(How fast is your CPU?)


Edit: I hadn't seen your previous comment; so yes, the qm command is slow on your system too.

I'm going on holiday tonight for 2 weeks, so I don't have time to help more for now.

But I think we should try to:

1) reduce the usleep in the loop;
2) bypass the pvesr call if it is not needed;
3) replace qm with something else; an HTTP call to the remote pveproxy gives me around 60 ms (vs 1.4 s). See the sketch below.
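For (3), anything that talks to the QMP socket directly avoids the qm startup cost. A minimal sketch, to be run on the target node, with 300 as an example VMID; note that this bypasses the locking and permission checks that 'qm resume' performs, so it is only meant for measuring:
Code:
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::UNIX;
use Socket qw(SOCK_STREAM);

my $vmid = 300;   # example VMID
my $sock = IO::Socket::UNIX->new(
    Type => SOCK_STREAM,
    Peer => "/var/run/qemu-server/$vmid.qmp",
) or die "cannot connect to QMP socket: $!";

# QMP requires capability negotiation before any other command.
print $sock qq({ "execute": "qmp_capabilities" }\n);
print $sock qq({ "execute": "cont" }\n);

# read the greeting and the two replies, then exit
for (1 .. 3) {
    my $line = <$sock>;
    last unless defined $line;
    print $line;
}
close $sock;

This is essentially what the socat one-liner earlier in the thread does, just without loading the full qm Perl stack.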
 
there have been some improvements between 5.0 and 5.1.
 
migration with local disk will always have more downtime than just RAM migration, because it needs to finalize copying the disks during the downtime. but it should benefit from the newer tunnel mechanism as well. if you think it is too slow, post a complete log and output of "pveversion -v" and we can take a look.
 

Hi Fabian,
thanks for your reply!

I see; that was also the outcome of the thread I opened when 5.0 was released. The reason it was fast in PVE 4.0 was that both nodes could write to the same disk for a short period of time, which was fixed in the qemu version PVE 5.0 shipped.

That was my question: are there any plans for future improvements when using the --with-local-disks option, compared to the almost instant switch-over on 4.0?

Currently I'm using:
Code:
pveversion -v
proxmox-ve: 5.1-26 (running kernel: 4.13.4-1-pve)
pve-manager: 5.1-36 (running version: 5.1-36/131401db)
pve-kernel-4.13.4-1-pve: 4.13.4-26
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1

Kind Regards
 
