> 3000 mSec Ping and packet drops with VirtIO under load

Discussion in 'Proxmox VE: Installation and configuration' started by Andreas Piening, Sep 2, 2017.

  1. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    Has anybody tried running kernel 4.4 from Proxmox 4 on a Proxmox 5 installation?

    It would be great to know whether it's a kernel problem or not.
     
  2. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    Ok, so this should only be relevant in combination with iothread as aderumier explained.
    I will check this out anyways as soon as I can afford the time.

    I think if we can find the one thing that differs between your setup and one of our installations that suffers from this issue, we might get closer to finding the cause of all this.
    But your setup is quite different from my install: I'm using local ZFS storage instead of NFS, so it is hard to compare.
    I think @micro was using a SAN when he was first hit by this issue. So it is probably not limited to local storage, but it does not affect everyone.
    It puzzles me that I have this issue, because I stuck exactly to the install guide for Windows Server 2016 on PVE and did nothing experimental. I assume ZFS on local storage is very common. In fact, I have been using it on other PVE hosts (< 5.0) without a single glitch for quite a while now.
     
  3. micro

    micro Member
    Proxmox Subscriber

    Joined:
    Nov 28, 2014
    Messages:
    58
    Likes Received:
    12
    Correct, I'm using SAN and LVM (raw) for the shared storage of the VMs.
     
  4. Phinitris

    Phinitris Member

    Joined:
    Jun 1, 2014
    Messages:
    83
    Likes Received:
    11
    Hello,
    unfortunately it seems I have the same issue. I recently migrated two Proxmox VE 3 and 4 nodes into one Proxmox VE 5 node, and networking is unstable, resulting in lag in applications like TeamSpeak and SSH. IO delay is between 6% and 10% with multiple ZFS thin pools with SSD L2ARC and ZIL.

    VM Configurations:
    - HDD(s): SCSI, IOThread=0, Discard=1
    - Controller: virtio-scsi
    - Network: 10GE VirtIO Linuxbridge or OVS

    The issue even occurs on VMs that are on a different pool with no IO at all.
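For anyone comparing setups, here is a rough sketch for pulling the controller, disk and NIC lines out of a VM's configuration so they can be pasted into the thread or the bug report. The helper name disk_and_net_lines and the VMID 100 are made up for illustration; qm config is the stock Proxmox command for printing a VM's settings.

```shell
#!/bin/sh
# disk_and_net_lines (hypothetical helper): read `qm config` style output on
# stdin and keep only the storage controller, disk and network lines.
disk_and_net_lines() {
    grep -E '^(scsihw|scsi[0-9]+|virtio[0-9]+|ide[0-9]+|sata[0-9]+|net[0-9]+):'
}
```

On the host it would be used as: qm config 100 | disk_and_net_lines (replace 100 with the real VMID).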
     
  5. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    @Phinitris Could you please add your setup and the details from your post to this bug report as well? https://bugzilla.proxmox.com/show_bug.cgi?id=1494
    Hopefully it helps us locate the issue. We still don't have any confirmation from the Proxmox team on whether anyone was able to reproduce it.
     
  6. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    @Phinitris: if you have upgraded from Proxmox 4, can you try booting its kernel (4.4) to see whether it's a kernel bug or not?
     
  7. Phinitris

    Phinitris Member

    Joined:
    Jun 1, 2014
    Messages:
    83
    Likes Received:
    11
    @Andreas Piening I'm pretty sure the Proxmox team is aware of the issue, but they don't have any clue how to fix it. From my understanding, the issue is present when SCSI with the VirtIO controller is used.

    @aderumier I did not upgrade from Proxmox 4. It was a clean Proxmox 5 install (with Proxmox ISO) and I just restored my VM backups made earlier.
     
  8. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    Andreas Piening likes this.
  9. Phinitris

    Phinitris Member

    Joined:
    Jun 1, 2014
    Messages:
    83
    Likes Received:
    11
    @aderumier unfortunately the Proxmox server is currently running in my production cluster and I don't want to reboot the node now.
     
  10. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    Yes, you were right. I switched back to IDE on my test system after a host reboot but couldn't get the Windows VM back up and running. It kept rebooting over and over again. Something was wrong with my guest system.
    I removed the disks, added new ones, and then restored the VM, and everything was back to normal.

    I tried your settings, but they had no impact on this issue in my case. At least the result was the same as before: a lot of latency and dropped packets / connections.
    I will try your qemu-kvm that you provided as a next step.
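To make "a lot of latency and dropped packets" comparable between test runs, a small sketch like the one below can reduce a ping run to a single packet-loss number. The helper name loss_percent and the address 192.0.2.10 are placeholders, and it assumes the usual Linux ping summary line ("... X% packet loss ...").

```shell
#!/bin/sh
# loss_percent (hypothetical helper): read `ping` output on stdin and print
# the packet-loss percentage taken from the summary line.
loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

While the guest is under disk load, it could be run from another machine as: ping -c 100 -i 0.2 192.0.2.10 | loss_percent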
     
  11. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    Hi @aderumier, I just installed the .deb you provided and rebooted the host to make sure everything was started with the new version.

    But I get an error when I try to start the VM:
    Code:
    kvm: symbol lookup error: kvm: undefined symbol: rbd_aio_writev
    command 'kvm -version' failed: exit code 127
    TASK ERROR: detected old qemu-kvm binary (unknown)
     
  12. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    micro likes this.
  13. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18

    OK, great! That means it should be a qemu regression. We can focus on it.
     
    Andreas Piening likes this.
  14. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    Hmm, that's strange. rbd_aio_writev is a new feature in Ceph's librbd to improve performance. It's not related to our problem, but maybe the official Proxmox packages are built against an old librbd.

    What is your current librbd version? (dpkg -l|grep librbd)

    Maybe it's the one from the Debian repo.

    You can try adding this line to /etc/apt/sources.list.d/ceph.list:
    deb http://download.proxmox.com/debian/ceph-luminous stretch main

    Then run apt-get update && apt-get dist-upgrade; it should bump the librbd version.
     
    Andreas Piening likes this.
  15. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    Andreas Piening likes this.
  16. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    @aderumier Thank you.
    I have installed your version of qemu-kvm:
    Code:
    # dpkg -i pve-qemu-kvm_2.9.1-1_amd64.deb
    (Reading database ... 60826 files and directories currently installed.)
    Preparing to unpack pve-qemu-kvm_2.9.1-1_amd64.deb ...
    Unpacking pve-qemu-kvm (2.9.1-1) over (2.9.1-1) ...
    Setting up pve-qemu-kvm (2.9.1-1) ...
    Processing triggers for man-db (2.7.6.1-2) ...
    I did a reboot of my host and my KVM machine started this time without the error I got before.

    However the issue remains the same.

    Can you post an MD5 for one of the binaries of your package so that I can compare it with my installed version, just to make 100% sure the files have been correctly replaced by dpkg?
    However it looks to me that this is not the solution yet.
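The comparison asked for here can be scripted. This is only a sketch: check_md5 is a made-up helper name, and the expected checksum has to come from whoever built the package.

```shell
#!/bin/sh
# check_md5 FILE EXPECTED (hypothetical helper): succeed only if the MD5
# checksum of FILE equals EXPECTED.
check_md5() {
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}
```

Usage would be something like: check_md5 /usr/bin/kvm <checksum posted by the packager> && echo "binary matches"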
     
  17. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    #md5sum /usr/bin/kvm
    be19f6834b8486d138f5eb9d90d2477b /usr/bin/kvm

    OK, so it's not the same problem as @hansm's.

    Can you try installing pve-qemu 2.7 from Proxmox 4 on your Proxmox 5 install?

    wget http://download.proxmox.com/debian/...n/binary-amd64/pve-qemu-kvm_2.7.1-4_amd64.deb
    wget http://download.proxmox.com/debian/...ion/binary-amd64/libiscsi4_1.15.0-1_amd64.deb
    dpkg -i *.deb

    (the old libiscsi is needed as a dependency)

    :(
     
    Andreas Piening likes this.
  18. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    Same result here.

    I already thought about installing PVE 4.4 on my test system to check whether I get the same issue there. However, I would lose my 5.0 test system and would not be able to run tests with PVE 5.0 until I reinstalled everything, which is quite time-consuming.
     
  19. aderumier

    aderumier Member

    Joined:
    May 14, 2013
    Messages:
    203
    Likes Received:
    18
    I didn't ask you to install the full PVE 4.4, only the pve-qemu-kvm package from Proxmox 4 on Proxmox 5.
     
  20. Andreas Piening

    Joined:
    Mar 11, 2017
    Messages:
    58
    Likes Received:
    7
    No, you didn't.
    Those were just my thoughts, because I really want to find a real solution for this issue, even though I can live with IDE at the moment.

    So you suggested installing the pve-qemu-kvm package from PVE 4? I somehow missed that.
    That's a good idea; I think I will try that.
    I don't think it will be easy, because there might be a lot of dependencies, but we'll see.
     