Lost 2 out of 4 Ethernet ports after upgrade

dalex

Renowned Member
Sep 22, 2011
Athens, Greece
Hi.

I have a Supermicro double server (SYS-6036ST-6LR) running two Proxmox nodes in a cluster.

Each server has two (2) Intel Gigabit Ethernet ports (configured bonded, and connected to a managed switch), plus two (2) internal 10 Gigabit Ethernet ports (also configured bonded) which connect the two servers together.

I decided to change the interface settings a little, and thought I would use the reboot as an opportunity to upgrade to 1.9 (I did the same upgrade on another server two days ago without problems).

I entered

apt-get update
apt-get dist-upgrade

on both machines, and after the reboot the two Gigabit Ethernet ports have vanished! Only the inter-server link is active...


I tried my old /etc/network/interfaces, but the problem remains, so it has to be the upgrade. I did 1.7 to 1.8 in the summer with no problems.

The Ethernet ports do not seem to exist. In dmesg I can see "Intel Gigabit drivers ver. 3.1.16" and "Intel 10 Gigabit drivers ver. 3.5.14-NAPI", but only one bond is up.


An additional hint: during the initial installation, there was a peculiar port numbering:

On server A, eth0 and eth1 were assigned to the 10 G ports, and eth2 and eth3 to the 1 G ports.

On server B, eth0 and eth2 were assigned to the 10 G ports, and eth1 and eth3 to the 1 G ports!

The same order exists now, but of course only for the 10 G ports.


It had been working fine for several months, with many VLANs, bridges, and a routed internal network, all over the 1 G bond, as per the instructions in the "network model" pages.



Is it a driver problem?

Must I reinstall 1.8 (or 1.7, which I have on CD)?
 
Hi,
you don't need to reinstall. Take a look at /etc/udev/rules.d/70-persistent-net.rules - I guess that, because of the new driver, the network devices were given new numbers.
Simply remove the old entries and change the new ones to the old numbers - reboot and everything should work!
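
Before editing, it can help to see which MAC address the rules file maps to which name. A minimal sketch, assuming the rules use the ATTR{address} / NAME layout that udev generates for this file; the function name list_net_rules is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print "MAC NAME" pairs from a
# 70-persistent-net.rules style file given as the first argument.
list_net_rules() {
    # Comment lines are skipped because the match must start before any '#'.
    # The two captured groups are the ATTR{address} value and the NAME value.
    sed -n 's/^[^#]*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*$/\1 \2/p' "$1"
}
```

Calling `list_net_rules /etc/udev/rules.d/70-persistent-net.rules` prints one pair per rule, which can be compared with the output of `cat /sys/class/net/*/address`. Copying the file aside before changing it keeps the old numbering recoverable.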

BTW, does bonding on 10 Gb NICs make sense?

Udo
 
Thank you very much for your quick reply Udo.

1- I found the file, dated 26 Aug. Here are its contents:


# This file was automatically generated by the /lib/udev/write_net_rules
# program run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single line.


# PCI device 0x8086:0x10c9 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:27:68:ee", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"


# PCI device 0x8086:0x10c9 (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:27:68:ef", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"


# PCI device 0x8086:0x10f7 (ixgbe)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:27:68:f1", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"


# PCI device 0x8086:0x10f7 (ixgbe)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:27:68:f0", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

- Shall I delete it and reboot (I always keep a backup of /etc, etc.)?
- Should I change something in it?
- Is there any config application I have to run?

Sorry for my ignorance...


This is an extract of my interfaces file concerning Ethernet ports and bonding:

iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual


auto bond0
iface bond0 inet manual
slaves eth0 eth1
bond_miimon 100
bond_mode 802.3ad
post-up echo 1 > /proc/sys/net/ipv4/conf/bond1/proxy_arp


auto bond1
iface bond1 inet manual
slaves eth2 eth3
bond_miimon 100
bond_mode 802.3ad
post-up echo 1 > /proc/sys/net/ipv4/conf/bond0/proxy_arp
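
A quick way to check whether every interface named on a slaves line actually exists. This is only a sketch, assuming the kernel lists present interfaces under /sys/class/net; missing_slaves is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print each interface that appears on a "slaves" line
# of the given interfaces file but has no entry in the sysfs directory.
# $1 = interfaces file, $2 = sysfs directory (defaults to /sys/class/net)
missing_slaves() {
    file=$1
    sysdir=${2:-/sys/class/net}
    # awk splits on whitespace, so indented "slaves" lines are handled too.
    awk '$1 == "slaves" { for (i = 2; i <= NF; i++) print $i }' "$file" |
    while read -r ifname; do
        [ -e "$sysdir/$ifname" ] || echo "$ifname"
    done
}
```

Run as `missing_slaves /etc/network/interfaces`; any name it prints is configured as a bond member but is not known to the kernel.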





2- The two 10 G ports were just sitting there! Since they cannot connect to anything external, I bonded them (LACP). My original idea was to host a vNAS on prox2 to provide iSCSI targets for servers on prox1, so that all the data passes through this 20 G channel...


Thanks again
 
Indeed, no sign of the missing ones (eth2, eth3):


ixgbe 0000:02:00.0: eth0: MAC: 2, PHY: 1, PBA No: FFFFFF-0FF
ixgbe 0000:02:00.0: eth0: Enabled Features: RxQ: 16 TxQ: 16 FdirHash RSS RSC
ixgbe 0000:02:00.0: eth0: Intel(R) 10 Gigabit Network Connection
ixgbe 0000:02:00.1: eth1: MAC: 2, PHY: 1, PBA No: FFFFFF-0FF
ixgbe 0000:02:00.1: eth1: Enabled Features: RxQ: 16 TxQ: 16 FdirHash RSS RSC
ixgbe 0000:02:00.1: eth1: Intel(R) 10 Gigabit Network Connection
ixgbe 0000:02:00.0: eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
bonding: bond1: enslaving eth0 as a backup interface with an up link.
ixgbe 0000:02:00.1: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
bonding: bond1: enslaving eth1 as a backup interface with an up link.
8021q: adding VLAN 0 to HW filter on device eth0
8021q: adding VLAN 0 to HW filter on device eth1

No sign of them in ifconfig -a either...
 
Hi,
that doesn't look good.

What is the output of:
Code:
modinfo igb
modprobe igb
lspci -v
Udo
 
modinfo:

filename: /lib/modules/2.6.32-6-pve/kernel/drivers/net/igb/igb.ko
version: 3.1.16
license: GPL
description: Intel(R) Gigabit Ethernet Network Driver
author: Intel Corporation, <e1000-devel@lists.sourceforge.net>
srcversion: 4414453856986C94F378A9B
alias: pci:v00008086d000010D6sv*sd*bc*sc*i*
alias: pci:v00008086d000010A9sv*sd*bc*sc*i*
alias: pci:v00008086d000010A7sv*sd*bc*sc*i*
alias: pci:v00008086d00001526sv*sd*bc*sc*i*
alias: pci:v00008086d000010E8sv*sd*bc*sc*i*
alias: pci:v00008086d0000150Dsv*sd*bc*sc*i*
alias: pci:v00008086d000010E7sv*sd*bc*sc*i*
alias: pci:v00008086d000010E6sv*sd*bc*sc*i*
alias: pci:v00008086d00001518sv*sd*bc*sc*i*
alias: pci:v00008086d0000150Asv*sd*bc*sc*i*
alias: pci:v00008086d000010C9sv*sd*bc*sc*i*
alias: pci:v00008086d00000440sv*sd*bc*sc*i*
alias: pci:v00008086d0000043Csv*sd*bc*sc*i*
alias: pci:v00008086d0000043Asv*sd*bc*sc*i*
alias: pci:v00008086d00000438sv*sd*bc*sc*i*
alias: pci:v00008086d00001527sv*sd*bc*sc*i*
alias: pci:v00008086d00001516sv*sd*bc*sc*i*
alias: pci:v00008086d00001511sv*sd*bc*sc*i*
alias: pci:v00008086d00001510sv*sd*bc*sc*i*
alias: pci:v00008086d0000150Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000150Esv*sd*bc*sc*i*
alias: pci:v00008086d00001524sv*sd*bc*sc*i*
alias: pci:v00008086d00001523sv*sd*bc*sc*i*
alias: pci:v00008086d00001522sv*sd*bc*sc*i*
alias: pci:v00008086d00001521sv*sd*bc*sc*i*
depends: dca
vermagic: 2.6.32-6-pve SMP mod_unload modversions
parm: InterruptThrottleRate:Maximum interrupts per second, per vector, (max 100000), default 3=adaptive (array of int)
parm: IntMode:Change Interrupt Mode (0=Legacy, 1=MSI, 2=MSI-X), default 2 (array of int)
parm: Node:set the starting node to allocate memory on, default -1 (array of int)
parm: LLIPort:Low Latency Interrupt TCP Port (0-65535), default 0=off (array of int)
parm: LLIPush:Low Latency Interrupt on TCP Push flag (0,1), default 0=off (array of int)
parm: LLISize:Low Latency Interrupt on Packet Size (0-1500), default 0=off (array of int)
parm: RSS:Number of Receive-Side Scaling Descriptor Queues (0-8), default 1=number of cpus (array of int)
parm: VMDQ:Number of Virtual Machine Device Queues: 0-1 = disable, 2-8 enable, default 0 (array of int)
parm: max_vfs:Number of Virtual Functions: 0 = disable, 1-7 enable, default 0 (array of int)
parm: MDD:Malicious Driver Detection (0/1), default 1 = enabled. Only available when max_vfs is greater than 0 (array of int)
parm: QueuePairs:Enable TX/RX queue pairs for interrupt handling (0,1), default 1=on (array of int)
parm: EEE:Enable/disable on parts that support the feature (array of int)
parm: DMAC:Disable or set latency for DMA Coalescing ((0=off, 1000-10000(msec), 250, 500 (usec)) (array of int)
parm: debug:Debug level (0=none, ..., 16=all) (int)



modprobe gave nothing.



lspci (I extracted only the relevant info; the full output is 17+ kB):



01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
Subsystem: Super Micro Computer Inc Device 10c9
Flags: fast devsel, IRQ 28
Memory at faee0000 (32-bit, non-prefetchable) [size=128K]
Memory at faec0000 (32-bit, non-prefetchable) [size=128K]
I/O ports at cc00
Memory at fae9c000 (32-bit, non-prefetchable) [size=16K]
Expansion ROM at faea0000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Capabilities: [50] Message Signalled Interrupts: Mask+ 64bit+ Queue=0/0 Enable-
Capabilities: [70] MSI-X: Enable- Mask- TabSize=10
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [140] Device Serial Number ee-68-27-ff-ff-90-25-00
Capabilities: [150] #0e
Capabilities: [160] #10
Kernel modules: igb


01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
Subsystem: Super Micro Computer Inc Device 10c9
Flags: fast devsel, IRQ 40
Memory at fae60000 (32-bit, non-prefetchable) [size=128K]
Memory at fae40000 (32-bit, non-prefetchable) [size=128K]
I/O ports at c880
Memory at fae98000 (32-bit, non-prefetchable) [size=16K]
Expansion ROM at fae20000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Capabilities: [50] Message Signalled Interrupts: Mask+ 64bit+ Queue=0/0 Enable-
Capabilities: [70] MSI-X: Enable- Mask- TabSize=10
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [140] Device Serial Number ee-68-27-ff-ff-90-25-00
Capabilities: [150] #0e
Capabilities: [160] #10
Kernel modules: igb


02:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit KX4 Network Connection (rev 01)
Subsystem: Super Micro Computer Inc Device 10f7
Flags: bus master, fast devsel, latency 0, IRQ 24
Memory at f8fe0000 (64-bit, prefetchable) [size=128K]
I/O ports at dc00
Memory at f8fdc000 (64-bit, prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] Message Signalled Interrupts: Mask+ 64bit+ Queue=0/0 Enable-
Capabilities: [70] MSI-X: Enable+ Mask- TabSize=64
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [140] Device Serial Number 00-00-00-ff-ff-00-00-00
Capabilities: [150] #0e
Capabilities: [160] #10
Kernel driver in use: ixgbe
Kernel modules: ixgbe


02:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit KX4 Network Connection (rev 01)
Subsystem: Super Micro Computer Inc Device 10f7
Flags: bus master, fast devsel, latency 0, IRQ 34
Memory at f8fa0000 (64-bit, prefetchable) [size=128K]
I/O ports at d880
Memory at f8fd8000 (64-bit, prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] Message Signalled Interrupts: Mask+ 64bit+ Queue=0/0 Enable-
Capabilities: [70] MSI-X: Enable+ Mask- TabSize=64
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [140] Device Serial Number 00-00-00-ff-ff-00-00-00
Capabilities: [150] #0e
Capabilities: [160] #10
Kernel driver in use: ixgbe
Kernel modules: ixgbe
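
In the listing above, the two 82599EB ports carry a "Kernel driver in use: ixgbe" line, while the two 82576 ports only show "Kernel modules: igb": the module is available but no driver is bound to the device. A small sketch that picks such devices out of lspci -v output read from standard input; unbound_devices is a name invented here:

```shell
#!/bin/sh
# Hypothetical helper: read "lspci -v" (or "lspci -k") text on stdin and
# print the bus address of every device that lists a kernel module but has
# no "Kernel driver in use" line.
unbound_devices() {
    awk '
        function report() { if (dev != "" && has_mod && !has_drv) print dev }
        # A line starting with bus:slot.func opens a new device section.
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-9a-f] / {
            report(); dev = $1; has_mod = 0; has_drv = 0; next
        }
        /Kernel driver in use:/ { has_drv = 1 }
        /Kernel modules:/       { has_mod = 1 }
        END { report() }
    '
}
```

Used as `lspci -v | unbound_devices`, it would be expected to print 01:00.0 and 01:00.1 for the output shown in this post.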
 
dmesg (this is the text from boot; after modprobe nothing was added to dmesg):


igb 0000:01:00.0: PCI INT A -> GSI 28 (level, low) -> IRQ 28
igb 0000:01:00.0: setting latency timer to 64
igb 0000:01:00.0: irq 52 for MSI/MSI-X
igb 0000:01:00.0: irq 53 for MSI/MSI-X
igb 0000:01:00.0: The NVM Checksum Is Not Valid
igb 0000:01:00.0: PCI INT A disabled
igb: probe of 0000:01:00.0 failed with error -5
igb 0000:01:00.1: PCI INT B -> GSI 40 (level, low) -> IRQ 40
igb 0000:01:00.1: setting latency timer to 64
igb 0000:01:00.1: irq 52 for MSI/MSI-X
igb 0000:01:00.1: irq 53 for MSI/MSI-X
igb 0000:01:00.1: The NVM Checksum Is Not Valid
igb 0000:01:00.1: PCI INT B disabled
igb: probe of 0000:01:00.1 failed with error -5
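
The log shows the igb module loading and then giving up on both ports: the NVM checksum message is followed by a probe failure with error -5, which is EIO on Linux. A sketch for pulling such failures out of a long kernel log read from standard input; failed_probes is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print
# "driver pci-address error" for every "probe of ... failed" line.
failed_probes() {
    awk '{
        # Locate the words "probe of"; the field before them is "driver:"
        # and the field after them is the PCI address.
        for (i = 2; i < NF; i++)
            if ($i == "probe" && $(i + 1) == "of") {
                drv = $(i - 1)
                sub(/:$/, "", drv)
                print drv, $(i + 2), $NF
            }
    }'
}
```

Used as `dmesg | failed_probes`, it prints one line per device the driver refused to bind.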


I got the driver, but it needs compilation. Does Proxmox include all the development tools?

Dimitri
 
OK, I am going home now and will come back tomorrow with a (portable) USB-to-Ethernet adapter (no connectivity now!).

If all else fails, I'll take a backup to a portable disk and try to install 1.9 from scratch to see if it works, otherwise 1.7...


Thank you again.

Dimitri
 
Thank you Dietmar.

I ran dpkg -i pve-kernel-.... and rebooted.

Everything came back. Truly commercial level support, thank you!

Dimitri
 
Hi,
I would say: much more than that!
I don't know of any commercial support where I got a new kernel (or a fix for an issue) in the first working hour on Monday.
Normally you get a ticket number and perhaps a call in the afternoon - that is my experience with closed-source companies.

Udo