> 3000 mSec Ping and packet drops with VirtIO under load

Andreas Piening · Sep 2, 2017

I'm running PVE 5.0-30 with two KVM machines with Windows 2016 Servers installed.
I did everything as computerized William explained in this video: https://www.proxmox.com/de/training/video-tutorials/item/install-windows-2016-server-on-proxmox-ve.
So I use ZFS and VirtIO for storage (SCSI) and network. The ISO with the drivers is "virtio-win-0.1.126.iso", which was the most recent one when I downloaded it about 8 weeks ago.

Both systems run stably. However, if I put load on the network, like a full backup from inside the VM over the network or a Datev DB check (which basically accesses the DB from a network share), my network gets unstable: during normal usage I see ping response times of 20-35 mSec on my local bridge, while under network load they rise to 600-4,000 mSec, with packets sometimes dropped for several seconds. My RDP connection gets dropped and reconnection attempts fail.
After the network load is over, everything is fine and stable again. No network hiccups and smooth RDP. Even copying several GBs over the network with SMB is no problem.

I have never seen this behavior before, and the only differences from my other installations are the newer PVE (5.0 instead of 4.4) and Windows Server 2016 instead of 2012, but I don't believe this is a general issue.
I'm speculating: could the VirtIO drivers I use for networking be the issue? Is anyone running Windows Server 2016 with a stable network even under load? Which VirtIO driver version are you using? My first idea is to replace the network drivers.

All other suggestions are welcome.
Is it a good idea to disable IPv6 completely when it is an IPv4-only network? Or to disable the QoS and topology protocol layers in the network card configuration?

Kind regards

Andreas

Andreas Piening · Sep 3, 2017

I thought about it and should add a few details about my network setup:
The NICs of both Windows Server 2016 VMs are connected to a bridge:

Code:

brctl show vmbr1
bridge name    bridge id        STP enabled    interfaces
vmbr1        8000.aace816c169a    no        tap0
                                            tap100i0
                                            tap101i0

tap0 is an OpenVPN server tap device and the other two are the VMs.
So there is no physical device attached to this particular bridge. The NICs show up as 10 GBit devices inside the VMs. So could it just be that the VMs can send faster than the bridge can handle the traffic?
Is it a good idea to use the "Rate Limit" on the NICs in the PVE configuration? Since the "outside" is connected via WAN / OpenVPN, traffic would probably never exceed 300 MBit/s, and this would still be enough for backups and everything else.

Does anyone have experience with this? Am I thinking in the wrong direction?
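
For reference, the arithmetic behind such a limit can be sketched in Python. This assumes the PVE "Rate Limit" field is expressed in megabytes per second, which is worth verifying against the documentation for your version; the function name is purely illustrative.

```python
def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a link speed in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


# The 300 MBit/s WAN ceiling mentioned above, expressed in MB/s.
wan_ceiling_mbit = 300
print(mbit_to_mbyte_per_s(wan_ceiling_mbit))  # 37.5
```

So a limit somewhere below roughly 37 MB/s would correspond to that ceiling, under the stated assumption about units.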

aderumier · Sep 3, 2017

>>While normal usage I have ping response times between 20-35 mSec on my local bridge
Is it really from host bridge to VM?
I'm around 0.1 ms from host bridge to guest VM.

More than 1 ms locally is really abnormal.

micro · Sep 3, 2017

Not exactly the same setup (Linux guests here), but I too notice huge network hiccups in PVE 5 with virtio networks under virtio disk load. IO wait induces huge network latency (1000-2000 ms) and packet reordering.

Andreas Piening · Sep 3, 2017

aderumier said:
>>While normal usage I have ping response times between 20-35 mSec on my local bridge
Is it really from host bridge to VM?
I'm around 0.1 ms from host bridge to guest VM.

More than 1 ms locally is really abnormal.

No, you are right: same here. My ping values were measured from my local DSL line through OpenVPN to the bridge. I get 0.13 mSec response time on the local bridge from the PVE host.

Andreas Piening · Sep 3, 2017

micro said:
Not exactly the same setup (Linux guests here), but I too notice huge network hiccups in PVE 5 with virtio networks under virtio disk load. IO wait induces huge network latency (1000-2000 ms) and packet reordering.

Oh, this makes sense: it happens especially when I do a backup, which puts high IO and network load on the system at the same time.
Is this an "official" issue? Has a bug been opened for it?
I wonder which component introduces the issue: the KVM version?
Are there any known workarounds that make it less bad? Have you tried throttling IO?
It would be quite painful for me to switch back to PVE 4.4, because I only noticed the problem right after going into production with the system.

I tried throttling the network to 30 MB/s, but I did not notice much of a difference. My RDP sessions still get dropped while I'm doing a backup or other IO-intensive things.

aderumier · Sep 4, 2017

Note that qemu shares a single thread by default for disk, NIC, ...

Maybe you can try enabling iothread on the disk? (virtio, or scsi + virtio-scsi-single controller.)
(Note that it's not yet compatible with Proxmox backup.)
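
As a rough illustration of what that setting looks like in practice, the following Python sketch scans the text of a VM config for virtio/scsi disks that lack `iothread=1`. The sample config, the storage and disk names, and the helper name are all assumptions made up for illustration; on a PVE host the real file would typically live under /etc/pve/qemu-server/.

```python
import re
from typing import List

# Hypothetical VM config excerpt; storage and disk names are invented.
SAMPLE_CONF = """\
scsihw: virtio-scsi-single
scsi0: local-zfs:vm-100-disk-1,iothread=1,size=100G
scsi1: local-zfs:vm-100-disk-2,size=500G
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr1
"""

# Only virtioN and scsiN keys describe disks that can take an iothread.
DISK_KEY = re.compile(r"^(virtio|scsi)\d+$")


def disks_without_iothread(conf_text: str) -> List[str]:
    """Return the disk keys whose option list does not contain iothread=1."""
    missing = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not DISK_KEY.match(key):
            continue
        # First comma-separated field is the volume; the rest are options.
        options = value.strip().split(",")[1:]
        if "iothread=1" not in options:
            missing.append(key)
    return missing


print(disks_without_iothread(SAMPLE_CONF))  # ['scsi1']
```

This only checks the per-disk option; for scsi disks the controller type (`scsihw`) would also need to be virtio-scsi-single, as noted above.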

micro · Sep 4, 2017

Andreas Piening said:
Is this an "official" issue? Has a bug been opened for it?

This is exactly what I'm wondering too. I was forced to move elasticsearch off the guest machine in an emergency, because every 10-20 seconds there were network ping hiccups caused by elasticsearch writing its shards to disk, and these were not really big IO writes. Everything (disk/network) is virtio on the guests and the storage is a SAN with about 600 fsync/s. I tried different cache modes (no cache, directsync, writethrough) with no difference.

micro · Sep 4, 2017

Some iostat and ping logs to show what is happening during the hiccups (this VM is a router, the ping is to hosts behind it):

64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15980 ttl=245 time=3.61 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15981 ttl=245 time=4.32 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15982 ttl=245 time=4.17 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15983 ttl=245 time=4.43 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15984 ttl=245 time=4.23 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15988 ttl=245 time=1022 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15987 ttl=245 time=1239 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15989 ttl=245 time=830 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15993 ttl=245 time=16.5 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15992 ttl=245 time=220 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15986 ttl=245 time=1449 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15990 ttl=245 time=629 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15985 ttl=245 time=1657 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15991 ttl=245 time=433 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15994 ttl=245 time=3.82 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15995 ttl=245 time=4.32 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15996 ttl=245 time=4.17 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15997 ttl=245 time=4.08 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15998 ttl=245 time=3.83 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15999 ttl=245 time=3.26 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=16000 ttl=245 time=5.12 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=16001 ttl=245 time=4.47 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=16002 ttl=245 time=4.16 ms
64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=16003 ttl=245 time=4.67 ms

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.16 0.00 0.00 0.00 0.00 16.00

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 12.00 0.00 1.00 0.00 4.00 8.00 1.81 1012.00 0.00 1012.00 1000.00 100.00

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 1.00 0.00 5.00 0.00 400.00 160.00 1.27 446.40 0.00 446.40 184.80 92.40

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
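
To quantify a log like the ping output above, a small Python sketch can flag latency spikes and out-of-order replies. The 100 ms threshold and the function name are arbitrary choices for illustration, and the regular expression assumes the Linux ping output format shown in this post.

```python
import re

# Matches lines like "... icmp_seq=15980 ttl=245 time=3.61 ms".
PING_LINE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def analyze_ping(lines, spike_ms=100.0):
    """Return (spike_seqs, reordered_seqs) for replies given in arrival order."""
    spikes, reordered = [], []
    highest_seen = -1
    for line in lines:
        m = PING_LINE.search(line)
        if not m:
            continue
        seq, rtt = int(m.group(1)), float(m.group(2))
        if rtt >= spike_ms:
            spikes.append(seq)
        # A reply is out of order if a later sequence number already arrived.
        if seq < highest_seen:
            reordered.append(seq)
        highest_seen = max(highest_seen, seq)
    return spikes, reordered


sample = [
    "64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15984 ttl=245 time=4.23 ms",
    "64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15988 ttl=245 time=1022 ms",
    "64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15987 ttl=245 time=1239 ms",
    "64 bytes from somehost.com (xx.xx.xx.xx): icmp_seq=15989 ttl=245 time=830 ms",
]
spikes, reordered = analyze_ping(sample)
print(spikes)     # [15988, 15987, 15989]
print(reordered)  # [15987]
```

Running this over a longer capture alongside timestamps from iostat would make it easier to check whether the spikes line up with the disk stalls.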

Andreas Piening · Sep 4, 2017

This looks similar to my ping tests, but I even get over 8,000 mSec and dropped packets while running a backup job from inside the VM (not PVE backup).
I can't try the iothread option at the moment because the system is in use during working hours, but I will try it tonight.

@micro Have you tried this option already?

micro · Sep 4, 2017

I didn't. I'm not sure it would help in my case. I have iostat running on the cluster nodes. On all of them, during this surge, the SAN is at 100% utilization for 2-3 seconds. I don't know why this happens: I don't have any big writes that could push SAN utilization to 100%.

But what I'm wondering right now is why CPU iowait in the guest VM (caused by the host's 100% SAN utilization spikes) delays the routing/forwarding/shaping services provided by the guest. Is this behavior normal? I hope somebody from the Proxmox staff can answer this question.

Andreas Piening · Sep 4, 2017

aderumier said:
Note that qemu shares a single thread by default for disk, NIC, ...

Maybe you can try enabling iothread on the disk? (virtio, or scsi + virtio-scsi-single controller.)
(Note that it's not yet compatible with Proxmox backup.)

Good point, and I really crossed my fingers for this to help, but it did not. Same issue with iothread enabled for both virtual disks.
This time I ran a ping test while booting the machine and noticed dropped packets and ping response times over 4,000 mSec. Even starting applications on the terminal server causes ping times to rise to multiple seconds. It is even worse than I thought at first.
I don't know what to do next, since I can't downgrade PVE and I don't have a second server to migrate to.

Andreas Piening · Sep 4, 2017

I have just created a bug report, since this is a serious issue and I have no ideas left to try: https://bugzilla.proxmox.com/show_bug.cgi?id=1494

mac.linux.free · Sep 4, 2017

did you try ovs on your host?

Andreas Piening · Sep 4, 2017

mac.linux.free said:
did you try ovs on your host?

No. Just a simple bridge setup. I don't know much about Open vSwitch but I just want my network to be reliable under load.

mac.linux.free · Sep 4, 2017

Andreas Piening said:
No. Just a simple bridge setup. I don't know much about Open vSwitch but I just want my network to be reliable under load.

it is really simple to try and to switch back if it's not working for you...for me it is working really well on all my pve hosts

https://pve.proxmox.com/wiki/Open_vSwitch

Andreas Piening · Sep 4, 2017

mac.linux.free said:
it is really simple to try and to switch back if it's not working for you...for me it is working really well on all my pve hosts

https://pve.proxmox.com/wiki/Open_vSwitch

Sounds interesting. However, I don't think it is related to my problem: there is nothing wrong with a bridge setup if I don't need additional switching features. I have this setup running on a few other PVE installs with 4.4 and earlier versions and never had a problem with it.
My network is completely unusable when I have IO load in my VMs, to the point that I can't even ping the VM directly from the bridge anymore. I can't see how an additional switching topology would make this better. And since my system is connected to two sites via OpenVPN, it would be too much effort to change the whole network setup.
Thank you anyway.

mac.linux.free · Sep 4, 2017

Andreas Piening said:
Sounds interesting. However, I don't think it is related to my problem: there is nothing wrong with a bridge setup if I don't need additional switching features. I have this setup running on a few other PVE installs with 4.4 and earlier versions and never had a problem with it.
My network is completely unusable when I have IO load in my VMs, to the point that I can't even ping the VM directly from the bridge anymore. I can't see how an additional switching topology would make this better. And since my system is connected to two sites via OpenVPN, it would be too much effort to change the whole network setup.
Thank you anyway.

I see. Where are you from? I'm from Stuttgart. Perhaps I can help you.

Andreas Piening · Sep 4, 2017

mac.linux.free said:
I see. Where are you from? I'm from Stuttgart. Perhaps I can help you.

I'm from Hamburg. We probably both speak German, right? However, for a personal conversation let's use PM or something else; I want to stay on topic in this thread.

micro · Sep 5, 2017

mac.linux.free said:
did you try ovs on your host?

I'm using OVS. Still have the issue.
