virtio-net crashing (stops sending traffic)

Blinkiz
Jan 23, 2010
Hi
I'm trying out Proxmox 1.5 today. Seems really nice. But I'm finding that the virtio driver crashes when there is too much load. I had this problem back in the days when I was running virtualization with kvm+libvirt on Ubuntu 8.04. Nowadays I use Ubuntu Jaunty, which has a newer kernel, kvm and libvirt, and virtio-net works great there even at high loads.

Because I already have all my virtual machines in logical volumes, it was fairly easy to migrate them into Proxmox. The only problem I'm facing is that I must use e1000 as the network driver in my virtual machines, because virtio-net crashes at high loads.

Sorry, I don't have any kernel panic logs because I can't find any. The interface just stops sending traffic. Maybe someone else is experiencing the same problem and can provide more info?

I believe this is a bug in Proxmox 1.5, because the very same virtual machines work fine in a kvm+libvirt environment with virtio-net.

I'm using Proxmox 1.5 with the 2.6.32 kernel.
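For anyone needing the same workaround, here is a sketch of switching a guest NIC from virtio to e1000 at the config level. The file layout and the `vlan0:` key are assumptions modeled on a Proxmox 1.x `/etc/qemu-server/<vmid>.conf`; the MAC is a placeholder, and the script operates on a temp copy for safety rather than a live config:

```shell
# Swap the NIC model in a qemu-server style config line (assumed format).
cfg=$(mktemp)
echo 'vlan0: virtio=DE:AD:BE:EF:00:01' > "$cfg"   # assumed original entry
sed -i 's/virtio=/e1000=/' "$cfg"                 # switch the emulated NIC model
cat "$cfg"
rm -f "$cfg"
```

On a real host you would edit the actual VM config (or use the management tooling) and then stop and start the guest so the new NIC model is picked up.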
 

I have the same problem on Proxmox 1.4 with the 2.6.24 kernel, using e1000 with the drivers from Intel's website.

When the file sharing server (Windows 2008) comes under high connection load, I see one of two reactions: either, as above, the network card stops answering and I need to shut down and restart the VM; or the VM just stops entirely. Yes, you read that right: its status shows as STOPPED.
 

Why don't you use kernel 2.6.32?

http://pve.proxmox.com/wiki/Proxmox_VE_Kernel
 
Maybe you can test with 2.6.18?
Alright, finally got the time to test it out.

2.6.18 works without virtio crashing. I downloaded with 15 threads at a speed of 10.14 MB/sec (not Mbit) for 7 minutes and 37 seconds.
To make sure my testing is accurate, I switched back to 2.6.32 and downloaded the same thing.
2.6.32 also works without virtio crashing: 15 threads at 10.84 MB/sec (not Mbit) for 7 minutes and 7 seconds.

So this means I cannot reproduce my crashes with the virtio driver. Strange. I really haven't changed anything, so I don't know why it has started working. I did an apt-get upgrade and some qemu-kvm packages were updated, but that's it. Hmmm..
 
Alright, now I've caught one of the crashes. When this happens, my eth1 stops sending data.

This is from kern.log in the virtual machine "fileserver":
Code:
Feb  2 20:18:52 fileserver kernel: [13318.016208] kswapd0: page allocation failure. order:0, mode:0x20
Feb  2 20:18:52 fileserver kernel: [13318.016213] Pid: 47, comm: kswapd0 Not tainted 2.6.31-16-server #53-Ubuntu
Feb  2 20:18:52 fileserver kernel: [13318.016216] Call Trace:
Feb  2 20:18:52 fileserver kernel: [13318.016218]  <IRQ>  [<ffffffff810e005c>] __alloc_pages_slowpath+0x4cc/0x4e0
Feb  2 20:18:52 fileserver kernel: [13318.016231]  [<ffffffff810e019c>] __alloc_pages_nodemask+0x12c/0x130
Feb  2 20:18:52 fileserver kernel: [13318.016235]  [<ffffffff8110c592>] alloc_pages_current+0x82/0xd0
Feb  2 20:18:52 fileserver kernel: [13318.016249]  [<ffffffffa0026ef6>] try_fill_recv+0x136/0x1c0 [virtio_net]
Feb  2 20:18:52 fileserver kernel: [13318.016254]  [<ffffffffa002711d>] virtnet_poll+0xfd/0x150 [virtio_net]
Feb  2 20:18:52 fileserver kernel: [13318.016258]  [<ffffffff81438dd7>] net_rx_action+0x107/0x250
Feb  2 20:18:52 fileserver kernel: [13318.016262]  [<ffffffffa0026366>] ? skb_recv_done+0x36/0x40 [virtio_net]
Feb  2 20:18:52 fileserver kernel: [13318.016266]  [<ffffffff810651fd>] __do_softirq+0xbd/0x200
Feb  2 20:18:52 fileserver kernel: [13318.016270]  [<ffffffff810131ac>] call_softirq+0x1c/0x30
Feb  2 20:18:52 fileserver kernel: [13318.016272]  [<ffffffff81014b85>] do_softirq+0x55/0x90
Feb  2 20:18:52 fileserver kernel: [13318.016274]  [<ffffffff81064f65>] irq_exit+0x85/0x90
Feb  2 20:18:52 fileserver kernel: [13318.016276]  [<ffffffff810140c0>] do_IRQ+0x70/0xe0
Feb  2 20:18:52 fileserver kernel: [13318.016279]  [<ffffffff810129d3>] ret_from_intr+0x0/0x11
Feb  2 20:18:52 fileserver kernel: [13318.016281]  <EOI>  [<ffffffff81036396>] ? __ticket_spin_lock+0x16/0x20
Feb  2 20:18:52 fileserver kernel: [13318.016320]  [<ffffffff81527139>] ? _spin_lock+0x9/0x10
Feb  2 20:18:52 fileserver kernel: [13318.020055]  [<ffffffff811444ab>] ? try_to_free_buffers+0x3b/0xb0
Feb  2 20:18:52 fileserver kernel: [13318.020084]  [<ffffffff811e9b17>] ? jbd2_journal_try_to_free_buffers+0xa7/0x150
Feb  2 20:18:52 fileserver kernel: [13318.020088]  [<ffffffff8111988f>] ? __mem_cgroup_uncharge_common+0xef/0x160
Feb  2 20:18:52 fileserver kernel: [13318.020091]  [<ffffffff811b0110>] ? ext4_releasepage+0x50/0x90
Feb  2 20:18:52 fileserver kernel: [13318.020094]  [<ffffffff810d946b>] ? try_to_release_page+0x2b/0x50
Feb  2 20:18:52 fileserver kernel: [13318.020097]  [<ffffffff810e7dd3>] ? shrink_page_list+0x4f3/0x610
Feb  2 20:18:52 fileserver kernel: [13318.020100]  [<ffffffff810e8182>] ? shrink_inactive_list+0x292/0x660
Feb  2 20:18:52 fileserver kernel: [13318.020103]  [<ffffffff811716e4>] ? proc_destroy_inode+0x14/0x20
Feb  2 20:18:52 fileserver kernel: [13318.020107]  [<ffffffff810e2035>] ? determine_dirtyable_memory+0x15/0x30
Feb  2 20:18:52 fileserver kernel: [13318.020109]  [<ffffffff810e20d2>] ? get_dirty_limits+0x22/0x2f0
Feb  2 20:18:52 fileserver kernel: [13318.020111]  [<ffffffff810e859c>] ? shrink_list+0x4c/0xe0
Feb  2 20:18:52 fileserver kernel: [13318.020114]  [<ffffffff810e8777>] ? shrink_zone+0x147/0x200
Feb  2 20:18:52 fileserver kernel: [13318.020116]  [<ffffffff810e9599>] ? balance_pgdat+0x579/0x5d0
Feb  2 20:18:52 fileserver kernel: [13318.020118]  [<ffffffff810e6810>] ? isolate_pages_global+0x0/0x60
Feb  2 20:18:52 fileserver kernel: [13318.020121]  [<ffffffff810e96ef>] ? kswapd+0xff/0x160
Feb  2 20:18:52 fileserver kernel: [13318.020124]  [<ffffffff81078620>] ? autoremove_wake_function+0x0/0x40
Feb  2 20:18:52 fileserver kernel: [13318.020127]  [<ffffffff810e95f0>] ? kswapd+0x0/0x160
Feb  2 20:18:52 fileserver kernel: [13318.020129]  [<ffffffff81078236>] ? kthread+0xa6/0xb0
Feb  2 20:18:52 fileserver kernel: [13318.020131]  [<ffffffff810130aa>] ? child_rip+0xa/0x20
Feb  2 20:18:52 fileserver kernel: [13318.020134]  [<ffffffff81078190>] ? kthread+0x0/0xb0
Feb  2 20:18:52 fileserver kernel: [13318.020135]  [<ffffffff810130a0>] ? child_rip+0x0/0x20
Feb  2 20:18:52 fileserver kernel: [13318.020137] Mem-Info:
Feb  2 20:18:52 fileserver kernel: [13318.020139] Node 0 DMA per-cpu:
Feb  2 20:18:52 fileserver kernel: [13318.020142] CPU    0: hi:    0, btch:   1 usd:   0
Feb  2 20:18:52 fileserver kernel: [13318.020144] CPU    1: hi:    0, btch:   1 usd:   0
Feb  2 20:18:52 fileserver kernel: [13318.020146] CPU    2: hi:    0, btch:   1 usd:   0
Feb  2 20:18:52 fileserver kernel: [13318.020147] CPU    3: hi:    0, btch:   1 usd:   0
Feb  2 20:18:52 fileserver kernel: [13318.020148] Node 0 DMA32 per-cpu:
Feb  2 20:18:52 fileserver kernel: [13318.020150] CPU    0: hi:  186, btch:  31 usd: 105
Feb  2 20:18:52 fileserver kernel: [13318.020152] CPU    1: hi:  186, btch:  31 usd:  37
Feb  2 20:18:52 fileserver kernel: [13318.020153] CPU    2: hi:  186, btch:  31 usd: 149
Feb  2 20:18:52 fileserver kernel: [13318.020155] CPU    3: hi:  186, btch:  31 usd: 176
Feb  2 20:18:52 fileserver kernel: [13318.020159] Active_anon:75231 active_file:16904 inactive_anon:81679
Feb  2 20:18:52 fileserver kernel: [13318.020160]  inactive_file:62572 unevictable:0 dirty:23343 writeback:800 unstable:0
Feb  2 20:18:52 fileserver kernel: [13318.020161]  free:1408 slab:7069 mapped:3902 pagetables:1468 bounce:0
Feb  2 20:18:52 fileserver kernel: [13318.020162] Node 0 DMA free:3980kB min:60kB low:72kB high:88kB active_anon:1496kB inactive_anon:1728kB active_file:112kB inactive_file:8356kB unevictable:0kB present:15352kB pages_scanned:0 all_unreclaimable? no
Feb  2 20:18:52 fileserver kernel: [13318.020167] lowmem_reserve[]: 0 994 994 994
Feb  2 20:18:52 fileserver kernel: [13318.020170] Node 0 DMA32 free:1652kB min:4000kB low:5000kB high:6000kB active_anon:299428kB inactive_anon:324988kB active_file:67504kB inactive_file:241932kB unevictable:0kB present:1018016kB pages_scanned:0 all_unreclaimable? no
Feb  2 20:18:52 fileserver kernel: [13318.020175] lowmem_reserve[]: 0 0 0 0
Feb  2 20:18:52 fileserver kernel: [13318.020178] Node 0 DMA: 1*4kB 17*8kB 10*16kB 17*32kB 13*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3980kB
Feb  2 20:18:52 fileserver kernel: [13318.020185] Node 0 DMA32: 85*4kB 0*8kB 4*16kB 1*32kB 1*64kB 1*128kB 0*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB = 1652kB
Feb  2 20:18:52 fileserver kernel: [13318.020191] 80040 total pagecache pages
Feb  2 20:18:52 fileserver kernel: [13318.020193] 469 pages in swap cache
Feb  2 20:18:52 fileserver kernel: [13318.020194] Swap cache stats: add 3120, delete 2651, find 1528/1751
Feb  2 20:18:52 fileserver kernel: [13318.020196] Free swap  = 784224kB
Feb  2 20:18:52 fileserver kernel: [13318.020197] Total swap = 786424kB
Feb  2 20:18:52 fileserver kernel: [13318.025212] 262128 pages RAM
Feb  2 20:18:52 fileserver kernel: [13318.025215] 6567 pages reserved
Feb  2 20:18:52 fileserver kernel: [13318.025217] 73662 pages shared
Feb  2 20:18:52 fileserver kernel: [13318.025218] 194329 pages non-shared
I have found nothing in the logs on the host.
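For anyone hitting this intermittently, the failure leaves a distinctive signature in the guest's kern.log: a `page allocation failure` with `mode:0x20` (an atomic allocation). A small grep sketch to spot it; here it is fed a sample line from the trace above, but on a real guest you would pipe in `/var/log/kern.log`:

```shell
# Scan kernel log text (stdin) for the atomic allocation failure
# signature seen in this thread.
printf '%s\n' \
  'Feb  2 20:18:52 fileserver kernel: [13318.016208] kswapd0: page allocation failure. order:0, mode:0x20' |
grep -q 'page allocation failure\. order:[0-9]*, mode:0x20' \
  && echo 'atomic allocation failure found'
```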
 

Your guest system is running low on memory and failed to allocate pages. Even if you still had some RAM free, the kernel was not able to allocate what it needed: mode:0x20 is an atomic (GFP_ATOMIC) allocation, which cannot wait for memory to be reclaimed.

If the traffic stops altogether when this happens, it is most likely a bug in virtio. You should report your findings to the KVM mailing list (and possibly the linux-net list, too) and ask whether this is expected behaviour.

As a workaround, try increasing the guest's RAM.
 
BTW, you may also try increasing /proc/sys/vm/min_free_kbytes on the guest.
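For reference, a sketch of inspecting and raising that knob. The 16384 value is only an assumption to experiment with, not a figure from this thread:

```shell
# Read the kernel's free-memory floor; atomic (GFP_ATOMIC) callers such
# as virtio_net's RX refill path draw on this reserve.
cur=$(cat /proc/sys/vm/min_free_kbytes)
echo "min_free_kbytes is currently ${cur} kB"

# To raise it immediately (as root):
#   sysctl -w vm.min_free_kbytes=16384
# To persist across reboots, add to /etc/sysctl.conf:
#   vm.min_free_kbytes = 16384
```

Putting the setting in /etc/sysctl.conf is more robust than an echo in rc.local, since it is reapplied by the normal boot machinery.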
Okay, thanks. I want to test this.
So I have put echo "8192" > /proc/sys/vm/min_free_kbytes into my rc.local file. Let's see if this solves the problem. I have not increased the RAM; I want to test this setting first.
I guess it's in the virtual machine you want me to do this? Well, I'm setting it there anyway..
 
No, I tried this with a higher /proc/sys/vm/min_free_kbytes value. No good. Currently I don't see any error message; the interface just stops sending data. I have also been monitoring the memory usage, and I don't think it's always related. I have 1024 MB allocated to this machine, but it crashed when only 184 MB of userspace memory was allocated, 3 minutes after a reboot while loading data over NFS from this machine at high speed.
 
If the traffic stops altogether when this happens, it is most likely a bug in virtio. You should report your findings to the KVM mailing list (and possibly the linux-net list, too) and ask whether this is expected behaviour.

When saying this, you must take into consideration that I can run this virtual machine without any problems under Ubuntu Jaunty (2.6.28) with kvm-84 and libvirt 0.6.1.
So the problem really lies in the combination of tools/apps shipped under the collective name "Proxmox 1.5".
 
Today I tested with the 2.6.18-2 kernel. Same problem.
This time I didn't even have high load on the machine. I was just browsing files from my HTPC over NFS from this machine, and the interface stopped sending data. Nothing in the logs.

Because I had this problem back when I was using Ubuntu 8.04, it must be that Proxmox is using an old version of something, like kvm. It's kvm-83-maint-snapshot-20090205 when running 2.6.18-2; that's an old one. But with the 2.6.32 kernel it's v0.11.1, so that should be good. And the kvm kernel module is the one built into 2.6.32, I guess. Hmmm..
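Since much of the sleuthing here comes down to which kernel and kvm userspace are in play, a quick way to record both for a bug report (the binary name varies by distro, so two common names are tried, with a graceful fallback):

```shell
# Record kernel and KVM userspace versions for correlating with hangs.
echo "kernel: $(uname -r)"
# 'kvm' is the usual name on Debian/Proxmox, 'qemu-system-x86_64' elsewhere.
kvm --version 2>/dev/null \
  || qemu-system-x86_64 --version 2>/dev/null \
  || echo "kvm userspace: not found"
```

Running this on both the host and inside the guest makes it easy to spot which component changed between a working and a hanging setup.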
 
Just wanted to say that I have now replaced Proxmox 1.5 with Fedora 12 JeOS with only the virtualization group installed. The problems described in this thread are gone: no more virtio-net hangs and no more "CPU stuck for xxxx seconds" messages in the logs.
 
I can confirm this problem on kernel 2.6.24-10-pve.
Debian 5 guest, virtio network driver.

Code:
Mar 24 06:10:28 nginx2-kvm kernel: [38811.988798] swapper: page allocation failure. order:0, mode:0x20
Mar 24 06:10:28 nginx2-kvm kernel: [38811.989123] Pid: 0, comm: swapper Not tainted 2.6.26-2-amd64 #1
Mar 24 06:10:28 nginx2-kvm kernel: [38811.989416]
Mar 24 06:10:28 nginx2-kvm kernel: [38811.989416] Call Trace:
Mar 24 06:10:28 nginx2-kvm kernel: [38811.989820] <IRQ> [<ffffffff80276b49>] __alloc_pages_internal+0x3a6/0x3bf
Mar 24 06:10:28 nginx2-kvm kernel: [38811.990188] [<ffffffff80295654>] kmem_getpages+0x96/0x15f
Mar 24 06:10:28 nginx2-kvm kernel: [38811.990483] [<ffffffff80295ce4>] fallback_alloc+0x16b/0x1e1
Mar 24 06:10:28 nginx2-kvm kernel: [38811.990772] [<ffffffff802958c1>] kmem_cache_alloc_node+0x105/0x138
Mar 24 06:10:28 nginx2-kvm kernel: [38811.991077] [<ffffffff803b6531>] __alloc_skb+0x64/0x12d
Mar 24 06:10:28 nginx2-kvm kernel: [38811.991352] [<ffffffff803b749f>] __netdev_alloc_skb+0x29/0x43
Mar 24 06:10:28 nginx2-kvm kernel: [38811.991643] [<ffffffffa019e187>] :virtio_net:try_fill_recv+0x32/0xf1
Mar 24 06:10:28 nginx2-kvm kernel: [38811.991955] [<ffffffffa019ec01>] :virtio_net:virtnet_poll+0x214/0x2c3
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff803bd027>] net_rx_action+0xab/0x1da
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8023930d>] __do_softirq+0x5c/0xd1
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8021c3cc>] ack_apic_level+0x53/0xd8
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020d2dc>] call_softirq+0x1c/0x28
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020f3e8>] do_softirq+0x3c/0x81
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8023926b>] irq_exit+0x3f/0x85
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020f648>] do_IRQ+0xb9/0xd9
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020b0ae>] default_idle+0x0/0x46
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020c47d>] ret_from_intr+0x0/0x19
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] <EOI> [<ffffffff8021a737>] lapic_next_event+0x0/0x13
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8021eb64>] native_safe_halt+0x2/0x3
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8021eb64>] native_safe_halt+0x2/0x3
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020b0d8>] default_idle+0x2a/0x46
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] [<ffffffff8020ad04>] cpu_idle+0x8e/0xb8
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085]
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Mem-info:
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Mem-info:
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Node 0 DMA per-cpu:
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] CPU 0: hi: 0, btch: 1 usd: 0
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Node 0 DMA32 per-cpu:
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] CPU 0: hi: 186, btch: 31 usd: 192
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Active:17548 inactive:463151 dirty:43819 writeback:2903 unstable:0
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] free:2522 slab:28745 mapped:723 pagetables:334 bounce:0
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Node 0 DMA free:8024kB min:28kB low:32kB high:40kB active:4kB inactive:1596kB present:10788kB pages_scanned:0 all_unreclaimable? no
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] lowmem_reserve[]: 0 2004 2004 2004
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Node 0 DMA32 free:2064kB min:5712kB low:7140kB high:8568kB active:70188kB inactive:1851008kB present:2052256kB pages_scanned:0 all_unreclaimable? no
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] lowmem_reserve[]: 0 0 0 0
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Node 0 DMA: 22*4kB 18*8kB 7*16kB 4*32kB 2*64kB 2*128kB 4*256kB 4*512kB 0*1024kB 0*2048kB 1*4096kB = 8024kB
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Node 0 DMA32: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2064kB
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] 476494 total pagecache pages
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Swap cache: add 153, delete 0, find 0/0
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Free swap = 908692kB
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] Total swap = 909304kB
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] 524272 pages of RAM
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] 8518 reserved pages
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] 477580 pages shared
Mar 24 06:10:28 nginx2-kvm kernel: [38811.992085] 153 pages swap cached
 
