Search results

  1. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    spirit, good points. I will test them soon. I had already planned to test two of your assumptions in today's experiments. Concerning clock sync: when I enabled the '-d qemu_error' debug option, there was indeed a warning in qemu.log about a small TSC difference, if I remember correctly.
  2. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    Yes, but this process takes some time, about half a second. As I said, in PVE 4 the VM becomes running before this command is executed. And it is impossible to get ~20 ms downtime if you are waiting for such commands. This is why I think it is an obvious bug. Now, to emulate the behavior of PVE 4, I send 'cont'...
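The 'cont' workaround described in this post can be sketched as a raw QMP exchange over the VM's monitor socket. This is a minimal illustration, not the poster's actual change: the socket path follows the /var/run/qemu-server convention mentioned elsewhere in these results, and the `qmp_resume` helper name is invented for the example.

```python
import json
import socket

def qmp_cmd(name):
    """Serialize an argument-less QMP command, CRLF-terminated."""
    return json.dumps({"execute": name}) + "\r\n"

def qmp_resume(vmid, sock_dir="/var/run/qemu-server"):
    """Connect to the VM's QMP socket and issue 'cont' to leave the paused state.

    Path and naming are assumptions based on the PVE socket layout.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(f"{sock_dir}/{vmid}.qmp")
    f = s.makefile("rw")
    f.readline()                           # server greeting banner
    f.write(qmp_cmd("qmp_capabilities"))   # mandatory capabilities negotiation
    f.flush()
    f.readline()                           # {"return": {}}
    f.write(qmp_cmd("cont"))               # resume the paused (inmigrate) VM
    f.flush()
    return json.loads(f.readline())
```

Doing this directly over the socket avoids the fork/exec overhead of the 'qm' CLI that a later post complains about.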
  3. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    I also thought it would be that easy =). But I have already tried commenting out the storage checks and have already upgraded pve-qemu-kvm to 2.9 on PVE 4. No result. Adding a 'qm resume' command to QemuMigrate.pm right after migration in "phase 2" helps, but not completely. The 'qm' tool is not fast enough...
  4. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    spirit, I had to get my hands dirty with the nice PVE Perl code and the not-so-nice QEMU QMP socket protocol, and mostly found the cause of the problem. On the target host the VM starts in a paused state. Then, in version 4, after the migration process finishes, the VM state transitions from "VM status: paused (inmigrate)" to...
  5. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    I've already tried setting the insecure option in the migration parameters and did not find a noticeable difference.
  6. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    Gerhard, unfortunately this is not just a network problem: the VM is really frozen for several seconds instead of a few dozen milliseconds. Ping is just a simple way to demonstrate the bug.
  7. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    You have 2 packets lost; in my cluster I have 4 pings lost, but 2 is too much anyway. If you try the same experiment on PVE 4, there will be at most 1 ping lost, and most of the time no loss at all. I now have 2 nested clusters with identical configuration but different versions. These are the stats in PVE 4...
  8. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    Gerhard, can you please check the local time or NTP offset inside the VM before and after migration, or how many ICMP pings to the VM were lost during migration?
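Counting lost ICMP probes during a migration, as requested here, can be scripted. The helper below is a hypothetical sketch: it runs a fixed-count ping against the VM and derives the loss from ping's summary line (function names and parameters are invented for the example).

```python
import re
import subprocess

def packet_loss(ping_output):
    """Return transmitted-minus-received, parsed from ping's summary line.

    Accepts both Linux ("28 received") and BSD ("28 packets received") styles.
    """
    m = re.search(r"(\d+) packets transmitted, (\d+) (?:packets )?received", ping_output)
    tx, rx = int(m.group(1)), int(m.group(2))
    return tx - rx

def probe(host, count=30, interval=0.2):
    """Ping the VM while a migration runs and return the number of lost probes."""
    out = subprocess.run(
        ["ping", "-c", str(count), "-i", str(interval), host],
        capture_output=True, text=True,
    ).stdout
    return packet_loss(out)
```

With an interval of 0.2 s, each lost probe corresponds to roughly 200 ms of unreachability, which makes the 2-4 second freezes discussed in this thread easy to quantify.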
  9. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    Did it: Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)
  10. [SOLVED] PVE 5 Live migration downtime degradation (2-4 sec)

    The problem was found in the thread https://forum.proxmox.com/threads/slow-livemigration-performance-since-5-0.35522/, but that thread discusses live migration with local disks. To not hijack it, I am opening a new one. In PVE 5.0, after a live migration with shared storage, the VM hangs for 2-4 seconds. It can be...
  11. Slow livemigration performance since 5.0

    Confirming clock skew on the migrated VM even on shared storage. Migrating VM 9004 10.0.7.149.
    p@dev:[~]0$ pssh -l root -H 10.0.7.149 -H somehost01 -i date
    [1] 11:46:09 [SUCCESS] somehost01 Thu Jul 20 11:46:09 MSK 2017
    [2] 11:46:09 [SUCCESS] 10.0.7.149 Thu Jul 20 11:46:09 MSK 2017
    p@dev:[~]0$ ssh...
  12. Slow livemigration performance since 5.0

    Wolfgang, I send broadcast ARP requests, which go to all ports. I register all requests on the destination host's NIC, so I'm sure the frames reach the host. I see the ARP request on the NIC and no ARP reply on the same NIC; there is no switch involved except the virtual one on Proxmox. And after the delay the replies appeared.
  13. Slow livemigration performance since 5.0

    Adding some clarification. The presence of a virtual interface in my previous post may be irrelevant to the physical network, so I did some network diagnostics. I migrate VM 9006 (1.1.1.111) from 1.1.1.100 to 1.1.1.102 and send arping requests from 1.1.1.101. The IP and MAC addresses are not real. Here is the migration...
  14. Slow livemigration performance since 5.0

    This is the migration log of VM 9006:
    2017-07-19 11:10:59 migration speed: 1024.00 MB/s - downtime 12 ms
    2017-07-19 11:10:59 migration status: completed
    2017-07-19 11:11:00 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=somehost' root@1.1.1.1 pvesr set-state 9006 \''{}'\'
    2017-07-19 11:11:04...
  15. Slow livemigration performance since 5.0

    Yes, it is possible. This is why I mentioned that I am pinging from the same switch (L2, no routes, a really good fast 10G switch, little network traffic and no congestion), and that in PVE 5.0 beta2 there were no delays.
  16. Slow livemigration performance since 5.0

    I can confirm the problem. The bug appeared in 5.0 with the introduction of the storage replication feature (and pvesr). I guess it was the update of pve-manager from 5.0-10 to 5.0-23. In PVE 5.0 beta2 live migration was as fast as in PVE 4, with at most 1 ping packet lost. The problem is reproduced on shared storage too...
  17. [SOLVED] How to verify the IP address of vm

    Instead of dumping traffic or scanning the network, one can install the QEMU Guest Agent: https://pve.proxmox.com/wiki/Qemu-guest-agent http://wiki.qemu.org/Features/GuestAgent. I guess it's a more reliable and correct way. From the PVE host the agent can be queried through a socket in /var/run/qemu-server. For...
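Querying the guest agent over that socket can be sketched as below. The guest-network-get-interfaces command is a standard guest-agent command; the .qga socket suffix and the helper names are assumptions made for this example.

```python
import json
import socket

def guest_ipv4(vmid, sock_dir="/var/run/qemu-server"):
    """Ask the guest agent for its interfaces and collect IPv4 addresses.

    The '<vmid>.qga' path is an assumed PVE naming convention.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(f"{sock_dir}/{vmid}.qga")
    s.sendall(json.dumps({"execute": "guest-network-get-interfaces"}).encode() + b"\n")
    reply = json.loads(s.makefile().readline())
    return extract_ipv4(reply)

def extract_ipv4(reply):
    """Flatten the guest-agent reply into a list of non-loopback IPv4 addresses."""
    addrs = []
    for iface in reply.get("return", []):
        for a in iface.get("ip-addresses", []):
            if a["ip-address-type"] == "ipv4" and not a["ip-address"].startswith("127."):
                addrs.append(a["ip-address"])
    return addrs
```

This only works when the guest-agent package is running inside the VM and the agent option is enabled for it.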
  18. Proxmox VE 5.0 beta2 released!

    After a clean install with "pveceph init" it didn't work, because the admin caps were:
    client.admin
        key: c29tZWtleXNvbWVrZXlzb21la2V5c29tZWtleQo=
        auid: 0
        caps: [mds] allow
        caps: [mon] allow *
        caps: [osd] allow *
    But in Ceph 12 "caps: [mgr] allow *" is also needed. Maybe in your...
  19. Proxmox VE 5.0 beta2 released!

    It is easy to reproduce. Just deploy PVE 5.0 and Ceph server 12.03 following the official Proxmox tutorial or video tutorial and run "ceph pg dump". You will receive the error "Error EACCES: access denied". Did it: Bug 1430
  20. Proxmox VE 5.0 beta2 released!

    Just installed a new PVE 5.0 beta2 cluster with Ceph storage. Works well. You may want to add 'caps mgr = "allow *"' to the client.admin keyring in the pveceph deploy scripts to adapt it to Ceph Luminous, as stated in this bug report: tracker.ceph.com/issues/20296. This permits using the 'ceph pg' commands. And...
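For reference, the client.admin keyring these last posts are converging on would look roughly like this (the key is copied from the earlier post in these results; the mgr line is the addition required by Ceph Luminous):

```
[client.admin]
        key = c29tZWtleXNvbWVrZXlzb21la2V5c29tZWtleQo=
        caps mds = "allow"
        caps mgr = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
```

Without the mgr cap, manager-backed commands such as 'ceph pg dump' fail with "Error EACCES: access denied", as described in result 19.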