[SOLVED] Troubleshooting non-working ethernet interface

topquark

New Member
Sep 25, 2018
I've got a non-functional Ethernet interface and am unable to figure out why. Symptoms: the interface is part of a bond but is down, and no lights are on at either the switch or the SFP+ port.
I've replaced the cable with a confirmed working one, but that doesn't help. I've tried multiple switch ports, but that doesn't help either. So I'm fairly sure that the problem is on the node.

The node basically has 2 bonded networks (1 for Ceph private, 1 for public), each configured with a 10GbE SFP+ port as the main interface and a 1GbE port as the fallback interface. The troublesome one is the 10GbE SFP+ port for Ceph private. As far as I can see, the config for the public network is similar to the private one, so I'm not sure the problem is in the configuration (I do hope so, though; the alternative is hardware).
I have noticed that ethtool gives different output for the two ports. It gives
Speed: Unknown!; Duplex: Unknown! (255) for eno7,
while this is 10000Mb/s and Full for eno8.
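
To compare the two ports without reading the full ethtool dump, something like the following could help. It is only a sketch: `link_summary` is a hypothetical helper name, and it simply filters the Speed, Duplex and Link detected fields out of whatever ethtool prints.

```shell
# Hypothetical helper: keep only the link-state fields from
# `ethtool <iface>` output read on stdin, with indentation stripped.
link_summary() {
    grep -E '^[[:space:]]*(Speed|Duplex|Link detected):' \
        | sed -E 's/^[[:space:]]+//'
}

# Usage on the node (as root):
#   for i in eno7 eno8; do echo "== $i"; ethtool "$i" | link_summary; done
```

Run against both interfaces, this should make it easy to see at a glance which port has no link.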

A final thing I noticed is in dmesg: eno7 and eno8 act the same, except that eno8 goes up.
[ 9.005377] ixgbe 0000:04:00.1 eno8: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 9.069430] bond1: link status definitely up for interface eno8, 10000 Mbps full duplex
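
One way to line up the kernel messages per port is to filter dmesg by interface name. The helper below is a minimal sketch (`iface_events` is a made-up name); it matches the name as a whole word, so eno7 does not also match a longer name that merely starts with it.

```shell
# Hypothetical helper: print only the kernel log lines (stdin) that
# mention the given interface name as a whole word.
# `|| true` keeps the exit status clean when nothing matches.
iface_events() {
    grep -w -- "$1" || true
}

# Usage on the node:
#   dmesg | iface_events eno7
#   dmesg | iface_events eno8
```

If eno7 produces no "NIC Link is Up" line where eno8 does, the driver never saw a link on that port.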

But I've got no idea how to interpret this, any advice?


I've added the output of ethtool for both eno7 and eno8, and the interfaces config, below.
ethtool:
Code:
root@asgard:~# ethtool eno7
Settings for eno7:
        Supported ports: [ FIBRE ]
        Supported link modes:   10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no
root@asgard:~# ethtool eno8
Settings for eno8:
        Supported ports: [ FIBRE ]
        Supported link modes:   10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

/etc/network/interfaces (other interfaces removed for readability):
Code:
iface eno4 inet manual
iface eno5 inet manual
iface eno7 inet manual
iface eno8 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno4 eno7
        bond-miimon 100
        bond-mode active-backup
        bond-primary eno7
#CephPrivate

auto bond1
iface bond1 inet manual
        bond-slaves eno5 eno8
        bond-miimon 100
        bond-mode active-backup
        bond-primary eno8
#Public

auto vmbr10
iface vmbr10 inet static
        address  192.168.10.10
        netmask  24
        gateway  192.168.10.1
        bridge-ports bond1.10
        bridge-stp off
        bridge-fd 0
#maincluster

auto vmbr2
iface vmbr2 inet static
        address  192.168.20.10
        netmask  24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#CephPrivate

auto vmbr1
iface vmbr1 inet manual
        bridge-ports bond1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#LAN
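
To see how the bonding driver itself rates each slave, the per-bond status file under /proc/net/bonding can be summarised. This is a sketch under the assumption that the file has the usual "Slave Interface:" / "MII Status:" layout; `bond_slave_status` is a hypothetical helper name.

```shell
# Hypothetical helper: read a /proc/net/bonding/<bond> dump on stdin
# and print "<slave> <mii-status>" for every slave interface.
# The bond's own "MII Status" line (before any slave) is skipped.
bond_slave_status() {
    awk '
        /^Slave Interface:/           { slave = $3 }
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    '
}

# Usage on the node:
#   bond_slave_status < /proc/net/bonding/bond0
```

With the active-backup setup above, a primary that shows as down while the 1GbE fallback is up would match the symptoms described for bond0.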
 
Hi,

I would say it is a broken NIC, given that you have checked the switches and the cable.
Are eno7 and eno8 on the same board, or is one of them an independent NIC?
What Intel NIC model is it?
 
eno7 and eno8 are both on the motherboard:
Dual LAN with 10G SFP+ from D-1500 SoC.
Xeon D-1518 in my case.
 
If the NIC is on the SoC and you have verified the cable, then the driver also works, as we know from eno8.
I would check whether the SFP+ cage is mounted properly on the mainboard.
 
Just did some more checking. The SFP+ module is plugged in all the way; it latches and sits as deep as the one in eno8.
I also inspected the motherboard. No visible traces of damage or bad solder joints.
So I guess it's indeed confirmed to be a broken NIC.
I added a 10GbE SFP+ Mellanox card that I pulled out of another computer, and the link works as expected on that.
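
As a software-side cross-check of the module seating, `ethtool -m` can dump the module EEPROM, provided the driver supports it for the port; whether it does for these onboard ports is an assumption. The sketch below only trims that dump down to the identification fields; `module_id` is a hypothetical helper, and the exact field layout may differ between ethtool versions.

```shell
# Hypothetical helper: keep the identification fields from
# `ethtool -m <iface>` output (stdin) and normalise the padding
# around the colon.
module_id() {
    grep -E '^[[:space:]]*(Identifier|Vendor name|Vendor PN|Vendor SN)[[:space:]]*:' \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]+:[[:space:]]*/: /'
}

# Usage on the node (as root):
#   ethtool -m eno7 | module_id
#   ethtool -m eno8 | module_id
```

If the working port reports a module and the failing one returns an error or nothing, the port is not even seeing the plugged-in module, which points at the cage or the NIC rather than the link partner.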
 
Interesting: according to the motherboard manual, the PHY is supposed to be the CS2447 (https://www.supermicro.com/QuickRefs/motherboard/d/QRG-1858.pdf), but that may be a typo, as I can't find any other info on that chip.
Anyway, it connects to eno8 too, so I will do some throughput testing on that port as well, since I still have some other issues in my Proxmox cluster that may have to do with network speed. I will also prepare to either RMA the board or avoid the SFP+ ports on the motherboard.
 
Hi, this is an old thread, but as we face the same issue with this motherboard, we are very interested to know whether you have found a solution.
In our setup, eno7 fails as the motherboard temperature increases and recovers at lower temperatures.
 
