Network Interface Not Connecting

SpaceFlier

Apr 9, 2024
I've got 2 servers in a Proxmox cluster, and each of them has a NIC with four 10G SFP+ ports. I've fitted a transceiver and connected it by fibre cable to a USW Enterprise 48 PoE switch. This was working fine until a while back, when both servers lost their connection over fibre. I ran an ethernet cable as a temporary measure, and it's finally time for me to troubleshoot the problem. In ethtool I get details for the connection, including link speed and duplex, but Link detected shows no:
Code:
root@prox1:~# ethtool enp10s0f1np1
Settings for enp10s0f1np1:
        Supported ports: [ FIBRE ]
        Supported link modes:   10000baseT/Full
                                1000baseX/Full
                                10000baseSR/Full
                                10000baseLR/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseT/Full
                                1000baseX/Full
                                10000baseSR/Full
                                10000baseLR/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no

If I unplug the cable, ethtool no longer shows any details and reports speed and duplex as unknown. My /etc/network/interfaces file looks like this:
Code:
auto lo
iface lo inet loopback

iface eno1 inet manual

iface enp6s0 inet manual

iface enp7s0 inet manual

iface eno2 inet manual

iface enp10s0f0np0 inet manual

iface enp10s0f1np1 inet manual

iface enp10s0f2np2 inet manual

iface enp10s0f3np3 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.2.2/24
        gateway 192.168.2.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

source /etc/network/interfaces.d/*
I have tried changing bridge-ports to enp10s0f1np1, but that made no difference. From the UniFi dashboard I can try to ping that interface, but I get no reply. dmesg shows no errors, and it registers the cable being plugged in and unplugged without any problem.
What are my next steps for figuring out what's wrong, and what can I do to solve it?
 
Are these servers next to each other, or at least close enough to be linked up directly? You could try this (at least temporarily) to verify the "server" side of things.

One other thing to check (which I doubt is the cause, since you said it worked before) is that you didn't swap/cross the connectors. In the SFP module, one fibre port is the transmit side and the other is the receive side. Putting two transmit sides on the same fibre will not only fail to link, it might even damage the device itself, as you're basically shooting a laser at something that should not be receiving a laser beam and heating things up (that's also why you should not look into a powered-on SFP connector).

Finally, in your UniFi controller, did you check whether the switch's SFP port reports any errors?
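
On the server side, the rough equivalent of that check is to read the transceiver's own diagnostics and the NIC's error counters. Below is a minimal sketch, assuming the module supports digital diagnostics (DOM) and the driver exposes it through `ethtool -m`. The exact field names vary by module and driver, so the line format matched here is an assumption, and `rx_power_dbm` is just a helper name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the receive optical power (in dBm) from
# `ethtool -m` output on stdin. Assumes a line of the form
#   "Receiver signal average optical power : 0.0001 mW / -40.00 dBm"
# Modules without DOM print no such line, in which case nothing is output.
rx_power_dbm() {
    awk -F'/' '/Receiver signal .*optical power/ {
        gsub(/dBm|[ \t]/, "", $2)   # strip the unit and surrounding blanks
        print $2
        exit
    }'
}

# Usage on the host (interface name taken from the thread):
#   ethtool -m enp10s0f1np1 | rx_power_dbm
#   ethtool -S enp10s0f1np1 | grep -iE 'err|drop|crc'
```

If a reading is available and it is extremely low, that would suggest no light is arriving at the receiver, which points at the cable, the far-end transceiver, or the Tx/Rx pairing rather than at the Proxmox configuration.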
 
Thanks for the direct server connection idea, I'll try that and get back to you with the results. As for crossing the fibres, I don't think that's likely: they use LC connectors, so I would really need to jam one into the transceiver to cross it. I don't see any errors in the UniFi controller either, so I don't think that's the problem, and I have tried using different SFP ports on the switch as well.
 
The direct server-to-server connection still resulted in no link detected. I have also tried different NICs, different cables, and different transceivers, as well as setting the speed and duplex manually instead of using auto-negotiation. Given that we can't pinpoint the problem, I'm wondering if you know of any commands or scripts which can determine exactly what is failing in the setup, or perform some diagnostic test on the various components. Any other suggestions that may help are much appreciated, thanks.
 
I take it you tried a simple "ifup enp10s0f1np1" to try and force a connection already?
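
For reference, a rough sketch of that check: `ifup` applies the stanza from /etc/network/interfaces, while `ip link set ... up` only raises the port administratively, and either way the result can be read back from ethtool. The helper name `link_detected` is made up here, and it assumes ethtool prints a `Link detected:` line as in the output posted above.

```shell
#!/bin/sh
# Hypothetical helper: print the "Link detected" value (yes/no) found in
# `ethtool <iface>` output on stdin.
link_detected() {
    awk -F': *' '/Link detected/ { print $2; exit }'
}

# Usage on the host (interface name taken from the thread):
#   ip link set dev enp10s0f1np1 up        # raise the port administratively
#   ethtool enp10s0f1np1 | link_detected   # prints "yes" or "no"
#   ip -br link show enp10s0f1np1          # brief view of the link state
```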

Other than that, could you check which kernel version you are running? If it's 6.8, try downgrading and pinning to 6.5 and see if that improves things (this needs a reboot, so it would probably have to be scheduled).
https://forum.proxmox.com/threads/o...e-8-available-on-test-no-subscription.144557/
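
A minimal sketch of how that could look, assuming the host boots via `proxmox-boot-tool` and a 6.5 kernel is still installed. The helper `is_kernel_68` is a made-up name, and the exact version string to pin has to come from the list on your own system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release string belongs
# to the 6.8 series (for example "6.8.4-2-pve").
is_kernel_68() {
    case "$1" in
        6.8.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage on the host:
#   is_kernel_68 "$(uname -r)" && echo "running a 6.8 kernel"
#   proxmox-boot-tool kernel list      # show the installed kernels
#   proxmox-boot-tool kernel pin <6.5-version-from-the-list>
#   proxmox-boot-tool kernel unpin     # undo the pin later
```

Pinning only takes effect at the next boot, so `uname -r` after the reboot is the way to confirm which kernel is actually running.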

If you are going to reboot anyway, you could also try booting either the Proxmox installer or a Linux live ISO to see whether the link is detected there (to tell whether it's a driver/configuration issue or still a hardware issue somewhere).

Also, as a side note, you may want to add a second bridge interface in a separate network range (so no overlapping ranges, as those can cause other issues) and configure either a test VM on the other host or a local PC on that same IP subnet, so you can do activity testing without disturbing your production connection.
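
As an illustration of that second bridge, here is a sketch of an extra stanza for /etc/network/interfaces. The bridge name vmbr1 and the 10.10.10.0/24 range are assumptions picked for the example, and there is deliberately no gateway line, since the default route stays on vmbr0.

```
auto vmbr1
iface vmbr1 inet static
        address 10.10.10.2/24
        bridge-ports enp10s0f1np1
        bridge-stp off
        bridge-fd 0
```

With ifupdown2 (the default on current Proxmox VE), `ifreload -a` should apply this without a reboot. The test VM or PC on the other end would then get an address in the same range, for example 10.10.10.3/24.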

Other than the above tips, I'm a bit out of suggestions, sorry. I hope the above either fixes it or points towards a solution, or that someone else can chime in with ideas.
 
Thanks for your suggestions, I woke up this morning and the port was working. I have no idea what caused the problem or the solution as nothing was changed before it went back up. Most likely it's a faulty component that I'll have to wrangle out or this might happen again. Thanks again for your help.
 
Nice that it is working, but of course also very annoying that you don't know exactly what caused it.
Let's hope it stays gone for a long time. Otherwise, you now have a few more ideas to test if it (hopefully never) happens again.
 
