One Node Down, Network Issues

stsinc

Member
Apr 15, 2021
Hi,
For some reason I still do not understand, one of the three nodes in my cluster did not come up yesterday and is still down.
After some research, I narrowed the problem down to the network.
So:
  • I have ifupdown2 installed
  • I checked the connection to the switch > OK
  • I did a dmesg | grep eth and found the name of the Ethernet card in the node to be enp1s0f0
  • I checked the contents of /etc/network/interfaces and it seems to match what I have on my other nodes:
Code:
auto lo
iface lo inet loopback
iface enp1s0f0 inet manual
        mtu 9000
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.69/24
        gateway 192.168.1.1
        bridge-ports enp1s0f0
        bridge-stp off
        bridge-fd 0
        mtu 9000

Here is what happens:
  1. I have attached a screen and a keyboard to the node: this way, I can check the console without being interrupted by network issues
  2. I reboot the node from its console
  3. I check the switch, the LED lights up
  4. I do an ip a -- the Ethernet card enp1s0f0 is DOWN
  5. as a result:
    1. the bridge vmbr0 does not show on the list
    2. and the Web interface of Proxmox cannot be displayed
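The check in step 4 can be scripted. A minimal sketch, assuming the brief output of iproute2's ip -br link show, where the first column is the interface name and the second its state; down_ifaces is a hypothetical helper name, not part of any tool:

```shell
# Print the names of interfaces whose state column reads DOWN.
# Reads `ip -br link show` style output on stdin (name, state, MAC, flags).
down_ifaces() {
    awk '$2 == "DOWN" { print $1 }'
}

# Usage on the node's console:
#   ip -br link show | down_ifaces
```

If enp1s0f0 shows up in that list while the switch LED is lit, the cable and port are probably fine and the interface is simply not being brought up.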
I attach a screenshot of the console:

IMG_20210419_092330654.png

What is happening, and how can I get out of this issue, which is blocking our work?
Best,
Stephen
 
Did the networking ever work with that docker bridge? Such a bridge does a lot of voodoo with your networking, so it wouldn't be surprising if that bridge breaks the "normal" interface.
That's the reason why docker is encouraged to be used inside a VM.
 
Oh yes, it was working yesterday morning.
You are right, there are several Docker stacks installed in this node but they are installed:
  • either in Turnkey Linux Core CTs
  • or in VMs
Do you think that it would be useful to uninstall docker from the node itself?
 
I just checked, and docker is NOT installed in the node itself: apt remove docker tells me docker is not installed.
 
I also just did a pct list to check if any of the containers/vms with docker embedded were still active > all are stopped
 
If docker is not installed, I would suggest removing the docker bridge that can be seen in your screenshot, just to rule out any influence on your networking.
 
So, I followed your advice:
1. Removed the docker0 bridge:
Code:
ip link set docker0 down
brctl delbr docker0
2. ifup -a
3. ip a : Now both the network card interface (enp1s0f0) and the Proxmox bridge (vmbr0) are active YAYYYY!!!
4. A green tick is displayed again in the cluster interface
5. But after approx. two minutes something weird happens when I do an ip a again:
  • the network card interface disappears
  • a NEW "veth" appears instead, alongside the existing one.
So, in conclusion:
  • We are definitely on the right track because the node has been fully functional for a short amount of time
  • How can I get rid of those pesky vethxx and what are they anyway???
 
UPDATE
  • Obviously, the vethxx stuff also comes from docker, they are virtual Ethernet ports
  • I tried to get rid of them both (now there are two) by ip link delete <veth_ID> but it responds: Cannot find device <veth_ID>
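One possible reason for the "Cannot find device" answer, offered as a guess since the exact name typed is not shown: ip a prints veth devices as name@peer (for example veth1a2b3c@if7, a made-up name), and only the part before the @ is the device name. A small sketch, with strip_peer_suffix as a hypothetical helper:

```shell
# Strip the "@peer" suffix that `ip a` appends to veth names,
# leaving the bare device name that `ip link delete` expects.
strip_peer_suffix() {
    printf '%s\n' "${1%%@*}"
}

# Usage (veth1a2b3c@if7 is an illustrative name, not taken from the node):
#   ip link delete "$(strip_peer_suffix 'veth1a2b3c@if7')"
```

Even if the delete succeeds, the veth may come back as long as something on the node keeps recreating it.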
 
UPDATE #2
In fact, I feel totally dumb because at one point I must have installed docker bare-metal on the node.
I was trying to apt remove docker and got an error. But docker is NOT installed under the package name "docker".
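Since the package is not called "docker", listing every installed package whose name mentions docker or containerd shows what is actually there. A sketch, assuming a Debian-based node with dpkg-query; docker_pkgs is a hypothetical helper, and the name pattern is only an assumption about which packages matter:

```shell
# Filter "package status" lines (one package per line) down to the
# installed packages whose name contains "docker" or "containerd".
docker_pkgs() {
    awk '$1 ~ /docker|containerd/ && $NF == "installed" { print $1 }'
}

# Usage:
#   dpkg-query -W -f='${Package} ${Status}\n' | docker_pkgs
```

Packages that were removed but not purged have a status ending in config-files rather than installed, so they are left out of the list.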
 
UPDATE #3
I love docker but I did not know it would behave like the Alien in the movie: it is really hard to get rid of, even when only partially installed (in my case only docker-ce was installed)!!
So I eventually removed every trace of docker from the node itself (bare metal). I also disabled every docker unit with systemctl.
Still:
  • my network card interface still does not come up
  • one veth still appears
PLEASE HELP!
 
Hm, my experience with docker network is a bit rusty, but my guess would be to remove /var/lib/docker completely, if you already uninstalled every docker package. Plus probably a reboot afterwards.
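Before deleting anything, it may be worth seeing which leftovers are still on disk. This is a sketch only: /var/lib/docker comes from the suggestion above, while /etc/docker and /var/run/docker.sock are additional locations I am assuming might exist; existing_paths is a hypothetical helper:

```shell
# Print each given path that still exists (file, directory, or socket).
existing_paths() {
    for p in "$@"; do
        [ -e "$p" ] && printf '%s\n' "$p"
    done
    return 0
}

# Usage:
#   existing_paths /var/lib/docker /etc/docker /var/run/docker.sock
```

The helper only reports; it removes nothing, so it is safe to run before deciding what to delete.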
 
