e1000e driver update - query on recommended route?

fortechitsolutions

Hi,

I'm hoping to get a bit of guidance on the best / recommended way to achieve this goal (if possible): I have 2 Proxmox VE servers in a small cluster, in pre-deployment. The servers are identical 1U Intel boxes, single-socket quad-core, with dual on-board gigabit NICs. One NIC is an e1000 chipset, while the other is an e1000e.
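For reference, one way to confirm which on-board port uses which chipset - a quick sketch, assuming ethtool is installed (apt-get install ethtool) and the ports are eth0 / eth1 on your box:

Identify the two on-board NICs by PCI device
lspci | grep -i ethernet

Check which driver each interface is bound to
ethtool -i eth0
ethtool -i eth1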

I believe the stock e1000e driver in Debian Etch is slightly flaky (as is the case with many distros! :) ), but there is a newer version of the e1000e driver available (I believe it is also available in the "etch and a half" interim release?).

On CentOS / RHEL boxes I've managed that need a good / working e1000e, the 'simple fix' is:

- install the kernel headers RPM via yum
- ensure GCC / compiler tools are present
- build the kernel module from the fresh (Dec-2008) e1000e sources using the stock make; make install cycle (see the sketch just below) - and then you are laughing.
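In rough outline, that cycle looks like the following - a sketch from memory, so package names / versions are indicative only (e1000e-0.5.8.2 is the tarball I used):

Headers + toolchain (on RHEL/CentOS the headers package is kernel-devel)
yum install kernel-devel gcc make

Unpack and build / install the module per the driver README
tar xzf e1000e-0.5.8.2.tar.gz
cd e1000e-0.5.8.2/src
make install

Swap the running module for the new one
rmmod e1000e; modprobe e1000e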

In particular, jumbo frame support is 'good' with the latest driver, while it is problematic with the earlier e1000e, from what I've seen / found.
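As an aside, on Debian the jumbo MTU can be made persistent in /etc/network/interfaces; a minimal sketch, where the address / netmask and interface names are placeholders for your own config (and I haven't verified how the pve bridge propagates MTU to its ports):

auto vmbr0
iface vmbr0 inet static
    address 10.10.2.5
    netmask 255.255.255.0
    bridge_ports eth1
    mtu 9000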

Alas, my experience doing this on Debian is rather more limited, to put it modestly. I know the Proxmox kernel is tweaked compared to a totally stock Debian Etch install (OpenVZ support, KVM support, etc.), and I'm reluctant to break things by barging ahead.

If anyone could offer a brief pointer on how to integrate the new e1000e driver / module -- without totally messing up the proxmox system -- such help would certainly be greatly appreciated.

Many thanks,


--Tim Chipman
 
Hi,

Just a footnote to mention,

I downloaded the kernel headers as hinted; also installed make and gcc; and then had no problem building the new e1000e driver and installing the updated kernel module using the standard "make install" for this source tarball. So all very nice.

Many thanks for the help,

--Tim


For reference, a capture of the commands is given below:

Confirm that gcc and make are not present; then install them:
which gcc
which make
apt-get install make gcc

Get the sources required and prep for the install / build:
cd /opt/src
ls -la
mkdir e1000e
cd e1000e/
wget http://download.proxmox.com/debian/...4/pve-headers-2.6.24-1-pve_2.6.24-4_amd64.deb

Install the kernel headers:
dpkg -i pve-headers-2.6.24-1-pve_2.6.24-4_amd64.deb

Get the e1000e source, latest version from ~Dec-08:
wget http://internap.dl.sourceforge.net/sourceforge/e1000/e1000e-0.5.8.2.tar.gz
gzip -d e1000e-0.5.8.2.tar.gz
tar xfv e1000e-0.5.8.2.tar
cd e1000e-0.5.8.2/src/
ls -la
make install

Confirm that the new kernel module exists where indicated, with an appropriate timestamp on the file:
ls -la /lib/modules/2.6.24-1-pve/kernel/drivers/net/e1000e/e1000e.ko
date

Confirm which e1000 modules are presently in use:
lsmod | grep -i e1000

Remove the current (older) module and load the new one:
rmmod e1000e; modprobe e1000e

Verify in dmesg that the new e1000e driver was announced / is in use:
dmesg

Confirm the NIC still appears to be configured, and do a ping connectivity test:
ifconfig -a | more
ping www.slashdot.org
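One extra sanity check I'd suggest (an assumption on my part rather than something that bit me): make sure the module dependency data is fresh, and if the e1000e module happens to be included in the initramfs, regenerate it so the new version is used at boot:

Confirm which module file modprobe will pick up; refresh dependencies (harmless to re-run)
modinfo -n e1000e
depmod -a

Only needed if the driver is loaded from the initramfs at boot
update-initramfs -u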
 
Hi,

Here we go:

------paste--------


pvm2:~# modinfo e1000e
filename: /lib/modules/2.6.24-1-pve/kernel/drivers/net/e1000e/e1000e.ko
author: Intel Corporation, <linux.nics@intel.com>
description: Intel(R) PRO/1000 Network Driver
license: GPL
version: 0.5.8.2-NAPI
vermagic: 2.6.24-1-pve SMP preempt mod_unload
depends:
alias: pci:v00008086d0000105Esv*sd*bc*sc*i*
.....etc.... for ~30 more lines here...
alias: pci:v00008086d000010DFsv*sd*bc*sc*i*
srcversion: A42F8874B58F240524B9A16
parm: CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)
parm: KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm: SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm: IntMode:Interrupt Mode (array of int)
parm: InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm: RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm: RxIntDelay:Receive Interrupt Delay (array of int)
parm: TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm: TxIntDelay:Transmit Interrupt Delay (array of int)
parm: copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
pvm2:~#
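As an aside, those parm entries can be set at module load time; a hypothetical example (the InterruptThrottleRate values are purely illustrative, not a recommendation - one value per port):

modprobe e1000e InterruptThrottleRate=8000,8000

or, to make it persistent, Etch-era modprobe reads any file in /etc/modprobe.d/:

echo "options e1000e InterruptThrottleRate=8000,8000" > /etc/modprobe.d/e1000e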

note in dmesg, we see:

e1000e: Intel(R) PRO/1000 Network Driver - 0.5.8.2-NAPI
e1000e: Copyright (c) 1999-2008 Intel Corporation.
ACPI: PCI Interrupt 0000:00:19.0[A] -> GSI 20 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:19.0 to 64
0000:00:19.0: eth1: (PCI Express:2.5GB/s:Width x1) 00:15:17:8e:38:a1
0000:00:19.0: eth1: Intel(R) PRO/1000 Network Connection
0000:00:19.0: eth1: MAC: 7, PHY: 6, PBA No: 0070ff-0ff
pvm2:~#
 
Hi, yes - I had problems with NIC performance when jumbo frames (MTU 9000) were enabled. Symptoms of the problem are:

- you can SSH in and do small-packet type work OK
- but as soon as you try anything less trivial, the connection lags / hangs terribly (ie, trying to scp a 2 MB or larger file into the host, for example)
- if you drop the MTU back to 1500, the problem goes away (see the quick test below)
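For anyone wanting to reproduce the test, toggling the MTU on the fly looks like this (a sketch; assuming eth1 is the e1000e port):

Enable jumbo frames, then try a non-trivial transfer (eg scp a few-MB file in)
ifconfig eth1 mtu 9000

Drop back to the standard MTU; the lag / hang should disappear
ifconfig eth1 mtu 1500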

I've seen the same behaviour in other Linux environments (RHEL / CentOS 5) as well - ie, the stock e1000e driver included there has known bugs, and the latest version addresses those bugs, but that latest driver is not yet available in the distro by default.

I believe you can find discussion of the issues in the e1000 SourceForge forums. But if you don't feel like digging in extensively, just using the latest e1000e driver seems to do the trick.


Tim
 
Hi - sure, I'll try to do some tests this week to confirm. I had noted, after doing a clean install of ProxVE on a new system this week, that the e1000 / e1000e drivers were magically updated now - very nice :)

Once I have feedback I'll let you know,


Tim
 
Hi Dietmar, just to let you know I've applied the available ProxVE updates to both of the systems with the e1000e NICs which had required the updated driver in the past to work well. Using the latest ProxVE 1.1 they both behave well / no problems.
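For anyone following along, those updates come in via the standard apt cycle on the ProxVE host, roughly (followed by a reboot to pick up a new kernel, if one arrives):

apt-get update
apt-get dist-upgrade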

The only other thing, which has been an ongoing 'minor' issue, that maybe I should mention here: I have an OpenVZ virtual machine (CentOS 5 template) running Cacti on one of these 2 ProxVE physical hosts. This cacti host is monitoring a number of servers and a switch in this group of hardware (5 devices monitored in total via SNMP). I note in my cacti graphs there are periods of 'network discontinuity' at fairly regular intervals, ie, maybe 12+ times per day cacti thinks it can't reach other systems. These devices are all connected together directly on the same physical gig-ether switch; a very simple LAN. Also 'interestingly', the cacti host never has trouble (ie, discontinuities of any kind) when monitoring itself, nor the physical ProxVE host upon which it resides.

I did also note some messages regarding short UDP packets logged in the output of 'dmesg', as follows:

---paste---
vmbr0: no IPv6 routers present
UDP: short packet: From 10.10.2.10:0 0/219 to 10.10.2.255:0
UDP: short packet: From 10.10.2.10:0 16896/219 to 10.10.2.255:2339
UDP: short packet: From 10.10.2.10:43616 3673/219 to 10.10.2.255:80
UDP: short packet: From 10.10.2.10:55344 3732/219 to 10.10.2.255:22
UDP: short packet: From 10.10.2.10:55344 3732/219 to 10.10.2.255:22
pvm1:~#

---endpaste---

(note: in this case, 10.10.2.10 is one of the local physical servers being monitored, and it does show this 'discontinuity' in the cacti graphs..)
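If anyone wants to dig into those short-packet messages, capturing the offending broadcasts should be possible with something like the following - a sketch, where vmbr0 is the bridge on my hosts (adjust the interface to suit):

tcpdump -i vmbr0 -vv udp and src host 10.10.2.10 and dst host 10.10.2.255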



I'm not certain if this type of behaviour has been observed before / elsewhere, or if the UDP short packet errors might be related in any way. I've just changed the config in cacti to monitor 'device availability' via ping and SNMP rather than just ping, and also to try 5 ping attempts before flagging a device as inaccessible (rather than the default, where a single failure marks the device inaccessible) -- we'll see if this makes any difference to the behaviour.
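The equivalent manual checks from inside the cacti container would be something like the following - a sketch, where 'public' is a placeholder for the real SNMP community string:

Several pings, the way cacti will now retry before declaring a fail
ping -c 5 10.10.2.10

Basic SNMP reachability test (the numeric OID is sysUpTime.0)
snmpget -v 2c -c public 10.10.2.10 1.3.6.1.2.1.1.3.0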

I know that I"ve run cacti in virtual (openvz) environment at another site (not using ProxVE - just straight Openvz install on CentOS physical host) -- and there were no general problems there with 'intermittent interruptions of network' like I'm seeing here..


This isn't urgent, but I thought I should mention it in case it was a known issue / or in case it was of interest.


Tim
 
If the container does no network traffic for at least a few minutes at a time and then starts up its polling cycle, it may see a delay on the traffic it attempts within the first half-second to each host on the same LAN segment that it has to ARP for. I have definitely seen slow ARP with OpenVZ containers using venet, but it doesn't seem to be an issue at all if there's continuous traffic.

Your fix of trying more than one ping and also trying snmp sounds like it should be effective.
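One way to sanity-check the ARP theory - a sketch, noting that with venet the ARP is done host-side, so the cache lives on the physical host rather than in the container:

On the physical host, watch the neighbour cache around a polling cycle
arp -n

Hypothetical cron entry inside the container to keep entries warm (repeat per monitored host)
* * * * * ping -c 1 -W 1 10.10.2.10 >/dev/null 2>&1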
 
Hi,

Just a brief footnote to follow up and confirm that the tweak to ping multiple times before flagging a fail in cacti did indeed resolve my issue. (In case this info is of help to others in the future ..)

--Tim
 
