problem - network slow kvm

cvandeplas

I'm facing a difficult networking problem for the second time.
Here's the situation:
- proxmox with mostly openvz machines, but also a kvm machine (playing the role of router/firewall)
- I have 'slow reaction' on the network to my kvm machine. But bandwidth is not a problem. pings vary from 2ms to 1000ms (see below)
- Slow reaction is not present when connecting to openvz machines (ping stable from 1 to 4ms)
- problem present on both IPv4 and IPv6

Sat 06 Jun 12:03 $ ping vm01 => from laptop to hypervisor = OK
PING vm01.home.xxxxx (10.107.2.240): 56 data bytes
64 bytes from 10.107.2.240: icmp_seq=0 ttl=64 time=4.267 ms
64 bytes from 10.107.2.240: icmp_seq=1 ttl=64 time=0.947 ms
64 bytes from 10.107.2.240: icmp_seq=2 ttl=64 time=0.933 ms

Sat 06 Jun 12:05 $ ping nas01 => from laptop to openvz virtual machine = OK
PING nas01.home.xxxxx (10.107.2.8): 56 data bytes
64 bytes from 10.107.2.8: icmp_seq=0 ttl=64 time=3.919 ms
64 bytes from 10.107.2.8: icmp_seq=1 ttl=64 time=0.952 ms
64 bytes from 10.107.2.8: icmp_seq=2 ttl=64 time=0.844 ms

Sat 06 Jun 12:05 $ ping gw01 => from laptop to KVM firewall = NOK
PING gw01.home.xxxxx (10.107.2.254): 56 data bytes
64 bytes from 10.107.2.254: icmp_seq=0 ttl=64 time=1000.727 ms
64 bytes from 10.107.2.254: icmp_seq=1 ttl=64 time=1000.851 ms
64 bytes from 10.107.2.254: icmp_seq=2 ttl=64 time=1.910 ms
64 bytes from 10.107.2.254: icmp_seq=3 ttl=64 time=1001.805 ms
64 bytes from 10.107.2.254: icmp_seq=4 ttl=64 time=999.146 ms
64 bytes from 10.107.2.254: icmp_seq=5 ttl=64 time=82.865 ms
64 bytes from 10.107.2.254: icmp_seq=6 ttl=64 time=15.473 ms

From a cabling perspective, everything passes over the same switches and cables.

I did some network sniffing on my laptop, on the bridge of the hypervisor/vm01 and on the gw01 (kvm machine).
The results are:
- my laptop sends a packet on the network at T
- my laptop sees reply at T + ~1000ms (long delay)
- vm01 sees packet arrive on vmbr2 at T
- vm01 sees reply at T + ~1000ms (long delay)

- gw01 sees packet arrive on machine at T + ~900ms
- gw01 sends reply immediately (no delay)

=> Conclusion: Delay is between the hypervisor and the KVM virtual machine.
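For reference, the captures behind those timings look roughly like this (a sketch; the laptop interface name is illustrative, vmbr2 and the gw01 address are taken from above):

Code:
# on the laptop (interface name is illustrative)
tcpdump -ni eth0 icmp and host 10.107.2.254

# on the hypervisor vm01, on the bridge the guest is attached to
tcpdump -ni vmbr2 icmp and host 10.107.2.254

# inside gw01 (the KVM guest)
tcpdump -ni eth0 icmp and host 10.107.2.254
Comparing the echo-request timestamps at the three capture points shows where the ~1000ms gap appears.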

A month ago I had exactly the same behavior. I fixed it by reinstalling my VM from scratch (I was performing a network migration at that time).
For 1 or 2 weeks there were no problems (or at least none that I noticed), but now it's starting again...


Something about the gw01 configuration:
- 5 virtio network interfaces (see the driver check sketched below)
- virtio disk
- Ubuntu Linux 9.04, kernel 2.6.28-11-server, up to date
- the slow behavior is present on every interface of the machine (from and to every DMZ)
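To double-check that the guest really is using the virtio driver for those NICs, something like this inside gw01 (eth0 is just an example interface):

Code:
# list virtio PCI devices seen by the guest
lspci | grep -i virtio
# show which driver a given NIC is bound to ("driver: virtio_net" for a virtio NIC)
ethtool -i eth0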

The hypervisor/vm01:
- standard proxmox install, no extra repo, up2date


I'm reaching the limit of my knowledge and don't know where to continue debugging...
I could recreate the VM from scratch, but that's again only a temporary fix and still doesn't explain the cause.


Thanks for your help.
Any ideas for possible debugging are welcome
 
Deleting the virtio interfaces and replacing them with rtl8139 ones seems to be a valid workaround (for now).

But it still doesn't explain what the cause is... :'(
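To make the workaround concrete, this is roughly what changes on the VM's network line (key format as in the config quoted later in this thread; the MAC address is illustrative):

Code:
# before: NIC presented to the guest as virtio
vlan0: virtio=DE:AD:BE:EF:00:01
# after: NIC presented to the guest as rtl8139
vlan0: rtl8139=DE:AD:BE:EF:00:01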

Unfortunately, with the workaround in place, there's no way left to troubleshoot the original problem.

thanks
 
I've been trying to figure out what causes this problem for a long time now.

I'm not 100% sure yet (it usually takes a few hours up to several days until the problem happens), but the cause seems to be the "fairsched.diff" patch that is applied to the vanilla KVM sources.

I compiled KVM without this patch and I haven't observed any slowness for 2 days now with the virtio network drivers. If the network doesn't slow down for a week, I'll post an update here.
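For anyone who wants to repeat that experiment, a rough sketch of building a vanilla KVM userspace (i.e. without Proxmox patches such as fairsched.diff) for comparison; the release number and install prefix are illustrative, and wiring the resulting binary into qemu-server is left out here:

Code:
# unpack a vanilla kvm userspace release from the KVM project (version is illustrative)
tar xzf kvm-88.tar.gz
cd kvm-88
./configure --prefix=/usr/local/kvm-vanilla
make
make install
# the unpatched qemu-system-x86_64 under /usr/local/kvm-vanilla can then be used for testing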
 
Mangoo, did the slowness return, or did removing the patch fix it?
 
No, the "slowness" for guests using virtio_net is not solved in Proxmox VE 1.3 (and the problem is still there in the latest beta of 1.4, too).

I got the same problem with Proxmox VE 1.4.
The machine is a Dell 2950 III, and the NIC is a Broadcom 5708.
 
Same problem here with pve 1.7:
pveversion -v
pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-30
pve-kernel-2.6.32-4-pve: 2.6.32-30
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4

The extremely high latency with virtio makes this unusable.
 
I have exactly the same trouble. A VM with 2 vCPUs and virtio-net has extremely bad network performance:
Code:
root@server1:~# iperf -i 10 -m -t 120 -c server2.tobru.ch
------------------------------------------------------------
Client connecting to james.tobru.ch, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.10 port 36767 connected with 10.0.0.4 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  47.4 MBytes  39.7 Mbits/sec
[  3] 10.0-20.0 sec  46.3 MBytes  38.9 Mbits/sec
[  3] 20.0-30.0 sec    164 MBytes    138 Mbits/sec
[  3] 30.0-40.0 sec  98.4 MBytes  82.5 Mbits/sec
[  3] 40.0-50.0 sec    102 MBytes  85.4 Mbits/sec
[  3] 50.0-60.0 sec    115 MBytes  96.3 Mbits/sec
[  3] 60.0-70.0 sec  86.2 MBytes  72.3 Mbits/sec
[  3] 70.0-80.0 sec    112 MBytes  93.7 Mbits/sec
[  3] 80.0-90.0 sec    127 MBytes    107 Mbits/sec
[  3] 90.0-100.0 sec  22.2 MBytes  18.6 Mbits/sec                                                                                                   
[  3] 100.0-110.0 sec  13.2 MBytes  11.0 Mbits/sec                                                                                                  
[  3] 110.0-120.0 sec  57.3 MBytes  48.1 Mbits/sec                                                                                                  
[  3]  0.0-120.0 sec    990 MBytes  69.2 Mbits/sec                                                                                                  
[  3] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
After reconfiguring the VM to use only 1 vCPU, the network performance is much better:
Code:
root@server1:~# iperf -i 10 -m -t 120 -c server2.tobru.ch
------------------------------------------------------------
Client connecting to james.tobru.ch, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.10 port 56966 connected with 10.0.0.4 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec    510 MBytes    428 Mbits/sec
[  3] 10.0-20.0 sec    911 MBytes    765 Mbits/sec
[  3] 20.0-30.0 sec    921 MBytes    773 Mbits/sec
[  3] 30.0-40.0 sec    709 MBytes    594 Mbits/sec
[  3] 40.0-50.0 sec    654 MBytes    548 Mbits/sec
[  3] 50.0-60.0 sec    920 MBytes    772 Mbits/sec
[  3] 60.0-70.0 sec    886 MBytes    744 Mbits/sec
[  3] 70.0-80.0 sec    977 MBytes    819 Mbits/sec
[  3] 80.0-90.0 sec  1016 MBytes    852 Mbits/sec
[  3] 90.0-100.0 sec    737 MBytes    618 Mbits/sec
[  3] 100.0-110.0 sec    890 MBytes    747 Mbits/sec
[  3] 110.0-120.0 sec    913 MBytes    766 Mbits/sec
[  3]  0.0-120.0 sec  9.81 GBytes    702 Mbits/sec
[  3] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
That's too bad, as I'd like to give the VM 2 vCPUs.
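For reference, dropping a guest to a single vCPU is just a config change; a sketch (the VMID is illustrative, and the qm syntax / config path are assumptions based on the config format quoted later in this thread):

Code:
# via qm (101 = your VMID)
qm set 101 -sockets 1 -cores 1
# or edit /etc/qemu-server/101.conf directly and restart the VM:
#   sockets: 1
#   cores: 1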

Code:
pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.7-9
pve-kernel-2.6.35-1-pve: 2.6.35-9
pve-kernel-2.6.24-5-pve: 2.6.24-6
pve-kernel-2.6.24-1-pve: 2.6.24-4
pve-kernel-2.6.24-2-pve: 2.6.24-5
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4
 
Also post your VMID.conf (for both) and let us know which OS and which drivers you use inside. Windows?
 
hello,

It happens on Linux and Windows. The Linux guest is Ubuntu 10.04.1 x64 Server with a default installation. The Windows guest is 2003 x86 Standard Edition with the driver from "virtio-win-1.1.16.iso".

We didn't run any throughput tests, because a latency of 1 s (1000 ms) was already unusable for our software.

For example, the configuration of one of the Linux guests:
name: <removed>
ide2: local:iso/ubuntu-10.04.1-server-amd64.iso,media=cdrom
vlan0: virtio=4A:7F:B3:6A:29:6D
bootdisk: virtio0
virtio0: local:10044/vm-10044-disk-1.raw,cache=writeback
ostype: l26
memory: 3072
sockets: 1
onboot: 0
description: <removed>
cores: 2

On one Windows guest (2003 with Active Directory), a few services (Net Logon, Computer Browser, DFS) even failed to start on reboot with the virtio network driver.

esco
 
regarding your iperf setup: what is server1 and what is server2? (host or guest, running on which host)
 
Sorry for providing so little information =(
* server1 is a physical server
* server2 is a kvm-vm running on an IBM x3400 Server (NOT server1!)
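So the measurement setup is basically (assuming default iperf options on the server side):

Code:
# on server2 (the KVM guest): run iperf in server mode
iperf -s
# on server1 (the physical box): run the client, as in the outputs above
iperf -i 10 -m -t 120 -c server2.tobru.ch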
 
I did some testing here and got similar results using two servers (about 3 years old) with Intel entry-level server boards and Xeon X3210 CPUs.

But I also ran these tests on a more modern Intel Modular Server - no performance loss there when using more than one vCPU in the guest - I even tested a KVM guest with 2 x 4 (8) vCPUs.
 
My IBM server is also not the newest model; it's 3 years old. The processor is an Intel Quad-Core Xeon E5320 at 1.86 GHz (8 MB L2 cache) and I have 2 x 250 GB SATA disks (mirrored). So perhaps I have to upgrade my hardware for more performance =(
 
