Guest losing connectivity under load

tomte76

Member
Mar 6, 2015
Hi,

I'm using Proxmox 3.4-11 with ZFS storage in a 4-node cluster. I repeatedly lose connectivity to certain VMs under significant network load. In the most recent case the VM has an NFS mount and was performing a lot of IO. Suddenly the VM becomes unreachable: it is not pingable from the external network any more, and it is also not possible to reach the external network from inside the VM.

I did a tcpdump on the VM's tap device, but I cannot see the ICMP packets the VM generates, and I also cannot find pings from the external network to the VM on the tap device. I can still see ARP requests for the VM's IP address on the tap device and on the vmbr bridge it is connected to, AND I can also see the ARP requests inside the VM on the eth0 device, but the VM does not answer these requests any more. I can also see valid ARP entries for hosts on the external network in the VM's ARP table, yet I cannot reach those systems. If I send a ping from the disconnected VM to a system with a valid ARP entry, I only see the ARP requests from the ping target for the VM's IP address on the vmbr, the tap device and the VM's eth0, but nothing else.
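
For reference, the captures described above looked roughly like this (a sketch; tap101i0 and vmbr0 are example names, the real tap interface depends on the VMID and NIC index):

# On the Proxmox host: watch ARP and ICMP on the VM's tap interface
tcpdump -eni tap101i0 'arp or icmp'

# The same capture on the bridge the tap device is plugged into
tcpdump -eni vmbr0 'arp or icmp'

# Inside the guest: ARP requests arrive on eth0, but no replies leave
tcpdump -eni eth0 'arp or icmp'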

I tried ifdown/ifup on the interface inside the disconnected VM without success. The only way to resolve it at the moment is to reboot the VM. Other VMs on the same vmbr bridge are not affected. The guest OS is Debian Jessie amd64 or Debian Wheezy amd64; it has happened on both versions. The network device type is virtio; I'll now try switching to e1000. There is no related information in the host's or the guest's dmesg, kern.log, syslog, messages or anywhere else.
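
Switching the NIC model can be done in the GUI or on the CLI, roughly like this (VMID 101 and bridge vmbr0 are just examples; note that redefining net0 without macaddr= generates a new MAC address):

# Change the first NIC of VM 101 to e1000; takes effect after a full stop/start
qm set 101 -net0 e1000,bridge=vmbr0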

Any other ideas what I can do to track down or solve the problem? Thank you.
 
> In the actual case the VM has a NFS mount and was performing a lot of IO.

Do you mean the VM disk image is hosted on NFS?

Could you reproduce the problem with the e1000 driver?
 
Hi Manu,

No, the VMs use local ZVOLs as storage. The problem seems to be the same as in

http://forum.proxmox.com/threads/24189-linux-VMs-using-virtio_net-lose-network-connectivity

Unfortunately I cannot reproduce it at all. We tried hard, but without success. It just happens sometimes, rendering the VM unusable until reboot. At the moment we are using the e1000 driver instead of virtio and have had no further incidents, but the performance of e1000 is significantly lower than virtio, especially when VMs on the same host communicate with each other. Am I right that if we move to the 3.10 kernel on Proxmox 3.x we will lose OpenVZ support?

Thank you.
 
Hmm, last time I compared virtio vs e1000 (guest and host with a 4.2 kernel, using iperf), bandwidth was similar, but that was between a guest and an outside server; I got something around 700 Mbit/s for both on a Gigabit link IIRC.
Do you have some numbers for guest to guest communication ?

Yes, if you go with the 3.10 kernel, you will lose OpenVZ containers. But if you go to PVE 4.0, you have a supported upgrade path from OpenVZ to LXC containers:
https://pve.proxmox.com/wiki/Convert_OpenVZ_to_LXC
 
This is a customer-specific performance test using dd to copy files from an NFS mount to /dev/null, with a block size the customer considers the best fit for their workload. Before running the dd commands, the buffer caches of the NFS server and the NFS client, as well as the host's ZFS ARC, are flushed.
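
For reference, a sketch of how such a flush can be done (shrinking zfs_arc_max temporarily is one common way to drop the ARC on ZFS on Linux; exporting and re-importing the pool is another):

# On the NFS server and the NFS client (the VMs): drop the Linux page/buffer caches
sync; echo 3 > /proc/sys/vm/drop_caches

# On the Proxmox host: temporarily shrink the ARC so it evicts its cached data,
# drop the page cache, then restore the default limit (0 = default)
echo 1073741824 > /sys/module/zfs/parameters/zfs_arc_max
sync; echo 3 > /proc/sys/vm/drop_caches
echo 0 > /sys/module/zfs/parameters/zfs_arc_max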

Using virtio:
-----------------

root@target:/serv/temp# for i in `ls -1`; do dd if=$i of=/dev/null bs=1M; done
167+1 records in
167+1 records out
175307283 bytes (175 MB) copied, 2.41433 s, 72.6 MB/s
38+1 records in
38+1 records out
40262315 bytes (40 MB) copied, 0.699333 s, 57.6 MB/s
207+1 records in
207+1 records out
217512589 bytes (218 MB) copied, 4.01664 s, 54.2 MB/s
71+1 records in
71+1 records out
75169566 bytes (75 MB) copied, 3.05221 s, 24.6 MB/s
97+1 records in
97+1 records out
102647454 bytes (103 MB) copied, 1.07482 s, 95.5 MB/s
191+1 records in
191+1 records out
201034279 bytes (201 MB) copied, 3.22942 s, 62.3 MB/s
38+1 records in
38+1 records out
40347351 bytes (40 MB) copied, 0.839057 s, 48.1 MB/s
220+1 records in
220+1 records out
231110337 bytes (231 MB) copied, 2.38687 s, 96.8 MB/s
214+1 records in
214+1 records out
225300432 bytes (225 MB) copied, 2.52321 s, 89.3 MB/s
233+1 records in
233+1 records out
244673594 bytes (245 MB) copied, 2.56708 s, 95.3 MB/s
246+1 records in
246+1 records out
258095453 bytes (258 MB) copied, 1.67509 s, 154 MB/s
215+1 records in
215+1 records out
225469663 bytes (225 MB) copied, 5.27471 s, 42.7 MB/s
239+1 records in
239+1 records out
251095329 bytes (251 MB) copied, 2.84699 s, 88.2 MB/s
238+1 records in
238+1 records out
250124036 bytes (250 MB) copied, 5.21174 s, 48.0 MB/s
22+1 records in
22+1 records out
23365610 bytes (23 MB) copied, 0.120749 s, 194 MB/s
229+1 records in
229+1 records out
240892401 bytes (241 MB) copied, 2.34313 s, 103 MB/s
232+1 records in
232+1 records out
244197338 bytes (244 MB) copied, 3.24915 s, 75.2 MB/s
214+1 records in
214+1 records out
225034821 bytes (225 MB) copied, 2.95121 s, 76.3 MB/s
261+1 records in
261+1 records out
273693189 bytes (274 MB) copied, 4.93417 s, 55.5 MB/s
76+1 records in
76+1 records out
80410992 bytes (80 MB) copied, 0.31111 s, 258 MB/s
7+1 records in
7+1 records out
7664076 bytes (7.7 MB) copied, 0.370532 s, 20.7 MB/s
1+1 records in
1+1 records out
1427098 bytes (1.4 MB) copied, 0.0658811 s, 21.7 MB/s
38+1 records in
38+1 records out
40522736 bytes (41 MB) copied, 0.276031 s, 147 MB/s
88+1 records in
88+1 records out
92377637 bytes (92 MB) copied, 0.770892 s, 120 MB/s
86+1 records in
86+1 records out
90224950 bytes (90 MB) copied, 0.958323 s, 94.1 MB/s
98+1 records in
98+1 records out
103141991 bytes (103 MB) copied, 0.765951 s, 135 MB/s
4+1 records in
4+1 records out
4396403 bytes (4.4 MB) copied, 0.205789 s, 21.4 MB/s
680+1 records in
680+1 records out
713609988 bytes (714 MB) copied, 10.1116 s, 70.6 MB/s
1253+1 records in
1253+1 records out
1314611039 bytes (1.3 GB) copied, 13.8992 s, 94.6 MB/s
355+1 records in
355+1 records out
372911576 bytes (373 MB) copied, 4.82249 s, 77.3 MB/s
458+1 records in
458+1 records out
481119806 bytes (481 MB) copied, 6.33826 s, 75.9 MB/s
791+1 records in
791+1 records out
830134608 bytes (830 MB) copied, 10.6749 s, 77.8 MB/s
447+1 records in
447+1 records out
468840293 bytes (469 MB) copied, 5.39854 s, 86.8 MB/s
1066+1 records in
1066+1 records out
1118208838 bytes (1.1 GB) copied, 12.0214 s, 93.0 MB/s
312+1 records in
312+1 records out
327913937 bytes (328 MB) copied, 4.09502 s, 80.1 MB/s
0+1 records in
0+1 records out
4644 bytes (4.6 kB) copied, 0.00135879 s, 3.4 MB/s
0+1 records in
0+1 records out
3748 bytes (3.7 kB) copied, 0.00115056 s, 3.3 MB/s
433+1 records in
433+1 records out
454523128 bytes (455 MB) copied, 5.49896 s, 82.7 MB/s
788+1 records in
788+1 records out
827080704 bytes (827 MB) copied, 11.246 s, 73.5 MB/s

Using e1000:
-------------------

root@target:/serv/temp# for i in `ls -1`; do dd if=$i of=/dev/null bs=1M; done
167+1 records in
167+1 records out
175307283 bytes (175 MB) copied, 4.17678 s, 42.0 MB/s
38+1 records in
38+1 records out
40262315 bytes (40 MB) copied, 2.03667 s, 19.8 MB/s
207+1 records in
207+1 records out
217512589 bytes (218 MB) copied, 4.21675 s, 51.6 MB/s
71+1 records in
71+1 records out
75169566 bytes (75 MB) copied, 2.57497 s, 29.2 MB/s
97+1 records in
97+1 records out
102647454 bytes (103 MB) copied, 2.4049 s, 42.7 MB/s
191+1 records in
191+1 records out
201034279 bytes (201 MB) copied, 4.09252 s, 49.1 MB/s
38+1 records in
38+1 records out
40347351 bytes (40 MB) copied, 0.924242 s, 43.7 MB/s
220+1 records in
220+1 records out
231110337 bytes (231 MB) copied, 7.48229 s, 30.9 MB/s
214+1 records in
214+1 records out
225300432 bytes (225 MB) copied, 5.04808 s, 44.6 MB/s
233+1 records in
233+1 records out
244673594 bytes (245 MB) copied, 5.16226 s, 47.4 MB/s
246+1 records in
246+1 records out
258095453 bytes (258 MB) copied, 4.84071 s, 53.3 MB/s
215+1 records in
215+1 records out
225469663 bytes (225 MB) copied, 4.82048 s, 46.8 MB/s
239+1 records in
239+1 records out
251095329 bytes (251 MB) copied, 6.68411 s, 37.6 MB/s
238+1 records in
238+1 records out
250124036 bytes (250 MB) copied, 5.74995 s, 43.5 MB/s
22+1 records in
22+1 records out
23365610 bytes (23 MB) copied, 0.359905 s, 64.9 MB/s
229+1 records in
229+1 records out
240892401 bytes (241 MB) copied, 5.32606 s, 45.2 MB/s
232+1 records in
232+1 records out
244197338 bytes (244 MB) copied, 4.20975 s, 58.0 MB/s
214+1 records in
214+1 records out
225034821 bytes (225 MB) copied, 5.7598 s, 39.1 MB/s
261+1 records in
261+1 records out
273693189 bytes (274 MB) copied, 7.15302 s, 38.3 MB/s
76+1 records in
76+1 records out
80410992 bytes (80 MB) copied, 1.08091 s, 74.4 MB/s
7+1 records in
7+1 records out
7664076 bytes (7.7 MB) copied, 0.209762 s, 36.5 MB/s
1+1 records in
1+1 records out
1427098 bytes (1.4 MB) copied, 0.0387635 s, 36.8 MB/s
38+1 records in
38+1 records out
40522736 bytes (41 MB) copied, 0.768741 s, 52.7 MB/s
88+1 records in
88+1 records out
92377637 bytes (92 MB) copied, 2.5119 s, 36.8 MB/s
86+1 records in
86+1 records out
90224950 bytes (90 MB) copied, 1.8387 s, 49.1 MB/s
98+1 records in
98+1 records out
103141991 bytes (103 MB) copied, 4.00174 s, 25.8 MB/s
4+1 records in
4+1 records out
4396403 bytes (4.4 MB) copied, 0.272675 s, 16.1 MB/s
680+1 records in
680+1 records out
713609988 bytes (714 MB) copied, 13.436 s, 53.1 MB/s
1253+1 records in
1253+1 records out
1314611039 bytes (1.3 GB) copied, 27.7915 s, 47.3 MB/s
355+1 records in
355+1 records out
372911576 bytes (373 MB) copied, 8.14154 s, 45.8 MB/s
458+1 records in
458+1 records out
481119806 bytes (481 MB) copied, 9.42542 s, 51.0 MB/s
791+1 records in
791+1 records out
830134608 bytes (830 MB) copied, 17.72 s, 46.8 MB/s
447+1 records in
447+1 records out
468840293 bytes (469 MB) copied, 11.7229 s, 40.0 MB/s
1066+1 records in
1066+1 records out
1118208838 bytes (1.1 GB) copied, 24.3314 s, 46.0 MB/s
312+1 records in
312+1 records out
327913937 bytes (328 MB) copied, 8.12247 s, 40.4 MB/s
0+1 records in
0+1 records out
4644 bytes (4.6 kB) copied, 0.00243778 s, 1.9 MB/s
0+1 records in
0+1 records out
3748 bytes (3.7 kB) copied, 0.00318773 s, 1.2 MB/s
433+1 records in
433+1 records out
454523128 bytes (455 MB) copied, 9.71649 s, 46.8 MB/s
788+1 records in
788+1 records out
827080704 bytes (827 MB) copied, 16.9927 s, 48.7 MB/s
 
As a clarification: the root cause of the whole investigation was very bad disk IO performance with KVM on ZFS. It turned out that the customer has 4 TB of data in randomly sized files which are exported from a VM using NFS. The performance was below 20 MB/s in most cases, as only a small part of the files can stay in the ARC while access is highly random. The off-the-disk performance without NFS was below 60 MB/s in all cases with flushed ARC and buffer caches, or on files not previously accessed. It turned out that the default ZVOL block size of 8k significantly reduced the performance. We changed the ZVOL block size to 128k and gained 3-4 times better performance inside the VMs in our test lab; that's where the above data is from. Because the lab was already set up, we decided to also look at the network layer. At the moment we are migrating the customer's 4 TB of storage to a 128k ZVOL and expect much better performance on the live system afterwards. If you are interested I can keep you informed.
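
For reference, volblocksize can only be set when a ZVOL is created, so the migration boils down to creating a new ZVOL and copying the data over, roughly like this (pool and volume names are made up):

# Create a new ZVOL with a 128k volblocksize (this property cannot be changed afterwards)
zfs create -V 4T -o volblocksize=128k rpool/vm-101-disk-128k

# With the VM shut down, copy the old 8k ZVOL onto it block by block
dd if=/dev/zvol/rpool/vm-101-disk-1 of=/dev/zvol/rpool/vm-101-disk-128k bs=1M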
 
> The off-the-disk performance without NFS was below 60 MB/s in all cases with flushed ARC and buffer caches, or on files not previously accessed. It turned out that the default ZVOL block size of 8k significantly reduced the performance. We changed the ZVOL block size to 128k and gained 3-4 times better performance inside the VMs in our test lab

You're talking here about disk throughput inside the VM, am I right, everything being local? You could use bonnie++ instead of dd for testing that; have a look at https://calomel.org/zfs_raid_speed_capacity.html, it will give you some figures to compare against.
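
For example, something along these lines (a sketch; pick a size of at least twice the guest's RAM so caching doesn't skew the result):

# Sequential read/write test in the directory under test (size in MB), skipping the small-file tests
bonnie++ -d /serv/temp -s 16384 -n 0 -u root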

Going back to your initial problem (guest losing connectivity under load with virtio), can you test (not over the network, of course :)) whether unloading and reloading the virtio kernel module re-establishes the ability to make TCP connections? It's just to narrow things down.
In a Debian Jessie guest in your testlab:

ifdown eth0
rmmod virtio_pci
modprobe virtio_pci
ifup eth0
 
Thank you. I'm aware that benchmarking with dd is not ideal, but in this case I tried to reproduce the measurements of the customer's R&D department in detail. In this scenario it turned out that a ZVOL with a 128k or higher block size performed much better than one with an 8k block size. Also, the native ZFS mount on the Proxmox server always performed quite well; it was just the ZVOL, mapped into the KVM VM as a virtio disk, that performed badly.

To the initial problem: it just happened again last night. Nothing in the logs on the host or the guest OS; the only trace is the stale NFS mount in the guest VM after losing network connectivity. Unloading virtio-pci was a bad idea, as it also took down the virtio disk and everything broke. I had to reset the machine and I hope fsck will fix things. Unfortunately that also means I cannot tell you whether unloading and reloading the module would have solved the problem. What I can tell you is that rebooting the VM solves it, that an ifdown/ifup does not, and that I can still see ARP inside the VM on eth0 while ICMP, TCP and all other traffic does not get through.
 
Hi
Sorry for the virtio-pci incident: my Debian Jessie VM had a different storage controller, so it did not show this issue.

I am suspecting here a bad combination of host / KVM version and guest virtio drivers.

Which kernel are you running on your host and guests?
If you have 2.6.32 on your host and you're not running OpenVZ containers, you should definitely upgrade to 3.10; it improves a lot of things with KVM.
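
For example, to check both sides:

# On the Proxmox host: running kernel and PVE package versions
uname -r
pveversion -v

# Inside the guest
uname -r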
 
Hi Manu,

the host is Proxmox 3.x with kernel 2.6.32-42-pve. The guest OS is Debian Jessie with Linux 3.16.0-4-amd64 x86_64. The affected nodes in the cluster are not using OpenVZ, so using the 3.10 kernel could be an option. But the cluster consists of 4 Proxmox hosts, 2 of which are using OpenVZ. Is it possible to have the 3.10 and 2.6 kernels mixed in the same cluster?

I also found these threads, which describe the problem we observed:

http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03587.html
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=669184#97

It turned out that GSO was enabled on the servers, and the VMs also have a lot of external network activity. We have disabled GSO for now and moved some VMs back to virtio to see whether things stabilize. As I understand it, the issue is fixed in 3.10, so if we could use 3.10 there would be no need to disable GSO or vhost_net any more.

Thanks.
 
Hi
Did the situation improve after disabling GSO? Did you do that by way of ethtool?
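
For example, something like this (interface names are examples; the setting is not persistent across reboots unless added to the interface configuration):

# Inside the guest
ethtool -K eth0 gso off

# On the host, on the VM's tap interface
ethtool -K tap101i0 gso off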

Hmm, I have not yet tested two different kernel versions in a cluster. I think it should work; the only thing that could fail would be a live migration of a KVM guest between a 2.6 and a 3.10 host. KVM is quite picky about live migration, and it might very well be that a different host kernel disturbs it.
 
