> 3000 mSec Ping and packet drops with VirtIO under load

Has anybody tried running kernel 4.4 from Proxmox 4 on a Proxmox 5 installation?

It would be great to know whether it's a kernel problem or not.
 
My understanding is that it is one controller per disk.
Ok, so this should only be relevant in combination with iothread as aderumier explained.
I will check this out anyways as soon as I can afford the time.

I think if we can find the one thing that differs between your setup and one of our installations that suffers from this issue, we might get close to finding the cause of all this.
But your setup is quite different from my install: I'm using local ZFS storage instead of NFS. So this is hard to compare.
I think @micro was using a SAN when he was hit by this issue in the first place. So it is probably not limited to local storage, but it does not affect everyone.
It puzzles me that I have this issue, because I stuck exactly to the install guide for Windows Server 2016 on PVE and did nothing experimental. I assume ZFS on local storage is very common. In fact, I have been using it on other PVE hosts (< 5.0) without a single glitch for quite a while now.
 
Hello,
unfortunately it seems like I have the same issue. I recently migrated two Proxmox VE 3 and 4 nodes into one Proxmox VE 5 node, and networking is unstable, resulting in lag in applications like Teamspeak and SSH. IO delay is between 6-10% with multiple ZFS thin pools with SSD L2ARC and ZIL.

VM Configurations:
- HDD(s): SCSI, IOThread=0, Discard=1
- Controller: virtio-scsi
- Network: 10GE VirtIO Linuxbridge or OVS

The issue even occurs on VMs that are on a different pool with no IO at all.
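
To put numbers on the lag when comparing setups, here is a minimal sketch that pulls packet loss and average round-trip time out of a ping summary. The function names are made up for illustration, and the parsing assumes the summary format printed by the Linux iputils `ping`; other ping implementations may print something different.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Proxmox: parse the summary that
# iputils ping prints at the end of a run.

# Print the packet loss percentage found on stdin.
ping_loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Print the average round-trip time (in ms) found on stdin.
ping_avg_rtt() {
    awk -F'/' '/^rtt / { print $5 }'
}

# Example usage against a guest (the address is a placeholder):
#   ping -c 100 192.0.2.10 | tee ping.log
#   ping_loss_percent < ping.log
#   ping_avg_rtt < ping.log
```

Running the same measurement once while the host is idle and once under disk load should make it easier to compare results between different installations.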
 
@Andreas Piening I'm pretty sure the Proxmox team is aware of the issue, but they don't have any clue how to fix it yet. From my understanding, the issue is present when SCSI disks are used with the VirtIO controller.

@aderumier I did not upgrade from Proxmox 4. It was a clean Proxmox 5 install (with Proxmox ISO) and I just restored my VM backups made earlier.
 
It's 100% unrelated. Note that if you change a disk from ide->scsi or scsi->ide, you need to change the boot drive each time in the VM options.
Yes, you were right. I switched back to IDE on my test system after a host reboot but couldn't get the Windows VM running again. It kept rebooting over and over. There was something going on with my guest system.
I removed the disks and added new ones, then restored the VM, and everything was back to normal.
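
For anyone else switching the bus type, here is a rough sketch of a check for the boot-drive pitfall mentioned above. It reads the `key: value` output of `qm config` and reports whether the `bootdisk` entry still points at a disk that exists. The function name and the VM ID 100 are only examples.

```shell
#!/bin/sh
# Hypothetical helper: read `qm config <vmid>` output on stdin and
# succeed only if the bootdisk entry names a disk that is defined.
check_bootdisk() {
    awk -F': ' '
        $1 == "bootdisk" { boot = $2 }   # remember the configured boot disk
        { keys[$1] = 1 }                 # remember every config key seen
        END {
            if (boot == "") { print "no bootdisk set"; exit 1 }
            if (!(boot in keys)) { print "bootdisk " boot " is not attached"; exit 1 }
            print "bootdisk " boot " ok"
        }'
}

# Example usage on the host (VM ID 100 is a placeholder):
#   qm config 100 | check_bootdisk
# If it reports a stale entry after moving the disk to SCSI:
#   qm set 100 --bootdisk scsi0
```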

I tried your settings, but they did not have any impact on this issue in my case. At least the result was the same as before: a lot of latency and dropped packets / connections.
I will try your qemu-kvm that you provided as a next step.
 
BTW, I have built the latest pve-qemu-kvm with the patch for @hansm's bug (which is virtio related, so maybe it could improve performance too).

http://odisoweb1.odiso.net/pve-qemu-kvm_2.9.1-1_amd64.deb
Hi @aderumier, I just installed the .deb you provided and rebooted the host to make sure everything was started with the new version.

But I get an error when I try to start the VM:
Code:
kvm: symbol lookup error: kvm: undefined symbol: rbd_aio_writev
command 'kvm -version' failed: exit code 127
TASK ERROR: detected old qemu-kvm binary (unknown)
 
Hmm, that's strange. rbd_aio_writev is a new feature in Ceph's librbd to improve performance. It's not related to our problem, but maybe the official Proxmox packages are built with an old librbd.

What is your current librbd? (dpkg -l|grep librbd)

Maybe it's the one from the Debian repo.

You can try adding this line to
/etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-luminous stretch main

Then run apt-get update && apt-get dist-upgrade; it should raise the librbd version.
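
The steps above could be scripted roughly like this. It is only a sketch: the two function names are invented here, and the repository line is the one quoted in this post.

```shell
#!/bin/sh
# Write the Ceph Luminous repository line quoted above.
# $1: target file (defaults to the path named in the post).
write_ceph_repo() {
    target="${1:-/etc/apt/sources.list.d/ceph.list}"
    printf '%s\n' 'deb http://download.proxmox.com/debian/ceph-luminous stretch main' > "$target"
}

# Print name and version of the installed librbd packages,
# reading `dpkg -l` output on stdin.
librbd_versions() {
    awk '$1 == "ii" && $2 ~ /^librbd/ { print $2, $3 }'
}

# On the host, as root:
#   dpkg -l | librbd_versions        # version before
#   write_ceph_repo
#   apt-get update && apt-get dist-upgrade
#   dpkg -l | librbd_versions        # version after
```

Comparing the version before and after shows whether the upgrade actually pulled in a newer librbd.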
 
Hi, I have rebuilt my package with librbd 10.2.5, like Proxmox 5, so you don't need the latest Ceph libraries to get it working and test it:
http://odisoweb1.odiso.net/pve-qemu-kvm_2.9.1-1_amd64.deb

@aderumier Thank you.
I have installed your version of qemu-kvm:
Code:
# dpkg -i pve-qemu-kvm_2.9.1-1_amd64.deb
(Reading database ... 60826 files and directories currently installed.)
Preparing to unpack pve-qemu-kvm_2.9.1-1_amd64.deb ...
Unpacking pve-qemu-kvm (2.9.1-1) over (2.9.1-1) ...
Setting up pve-qemu-kvm (2.9.1-1) ...
Processing triggers for man-db (2.7.6.1-2) ...
I did a reboot of my host and my KVM machine started this time without the error I got before.

However the issue remains the same.

Can you post an MD5 for one of the binaries in your package, so that I can compare it with my installed version and make 100% sure the files have been replaced correctly by dpkg?
However, it looks to me like this is not the solution yet.
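
Since dpkg reported unpacking 2.9.1-1 over 2.9.1-1, the version string alone cannot tell the two builds apart, so a checksum comparison is the obvious check. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Succeed if the MD5 of file $1 equals the expected hex digest $2.
md5_matches() {
    actual="$(md5sum "$1" | awk '{ print $1 }')"
    [ "$actual" = "$2" ]
}

# Example usage on the host; replace EXPECTED_MD5 with the digest
# posted by the package author:
#   md5_matches /usr/bin/kvm EXPECTED_MD5 && echo "same build"
#
# dpkg can also check installed files against the checksums shipped
# in the package itself:
#   dpkg --verify pve-qemu-kvm
```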
 
#md5sum /usr/bin/kvm
be19f6834b8486d138f5eb9d90d2477b /usr/bin/kvm

OK, so it's not related to the same problem as @hansm's.


Can you try installing pve-qemu 2.7 from Proxmox 4 on your Proxmox 5 install?

wget http://download.proxmox.com/debian/pve/dists/jessie/pve-no-subscription/binary-amd64/pve-qemu-kvm_2.7.1-4_amd64.deb
wget http://download.proxmox.com/debian/pve/dists/jessie/pve-no-subscription/binary-amd64/libiscsi4_1.15.0-1_amd64.deb
dpkg -i *.deb

(the old libiscsi is needed as a dependency)
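
A slightly more defensive variant of the commands above, as a sketch: `dpkg -i *.deb` installs every .deb in the current directory, so if the 2.9.1 test package is still lying around it would be installed as well. The version below downloads into a scratch directory and names the two files explicitly. The scratch path, the `run` wrapper and the `DRY_RUN` switch are assumptions made for this example.

```shell
#!/bin/sh
# Package location and file names are the ones given in this post.
BASE_URL='http://download.proxmox.com/debian/pve/dists/jessie/pve-no-subscription/binary-amd64'
PACKAGES='libiscsi4_1.15.0-1_amd64.deb pve-qemu-kvm_2.7.1-4_amd64.deb'

# With DRY_RUN=1 a command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: scratch directory (the default is a hypothetical path).
install_old_qemu() {
    workdir="${1:-/root/qemu-2.7-test}"
    run mkdir -p "$workdir" || return 1
    for pkg in $PACKAGES; do
        run wget -P "$workdir" "$BASE_URL/$pkg" || return 1
    done
    # Name the files explicitly so a stray .deb is never picked up.
    run dpkg -i "$workdir/libiscsi4_1.15.0-1_amd64.deb" \
                "$workdir/pve-qemu-kvm_2.7.1-4_amd64.deb"
}

# Preview first, then run for real as root:
#   DRY_RUN=1; install_old_qemu
#   DRY_RUN=0; install_old_qemu
```

Keep in mind that a later apt-get dist-upgrade would most likely pull the newer pve-qemu-kvm back in; `apt-mark hold pve-qemu-kvm` should prevent that for the duration of the test.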

:(
 
Same result here.

I already thought about installing PVE 4.4 on my test system to check whether I get the same issue there. However, I would lose my 5.0 test system and would not be able to run tests with PVE 5.0 until I reinstalled everything, which is quite time-consuming.
 
I didn't ask you to install the full PVE 4.4, only the pve-qemu-kvm package from Proxmox 4 on Proxmox 5.
 
No you haven't.
Those were just my thoughts, because I really want a real solution for this issue, even though I can live with IDE at the moment.

So you suggested installing the pve-qemu-kvm package from PVE 4? I somehow missed that.
That's a good idea, I think I will try that.
But I don't think it will be easy, because there might be a lot of dependencies. We'll see.
 
