[SOLVED] Why is this veth/bridge added to my LXC container?

The background of the story is documented here.

I have an LXC container that I cannot reach from a specific subnet, but I can reach it from other hosts (e.g. the router).

Some sleuthing led me to identify the problem in the LXC container's routing setup:

Code:
ip route

Code:
default via 192.168.40.1 dev eth0
...
192.168.16.0/20 dev br-985a84259068 proto kernel scope link src 192.168.16.1
192.168.40.0/24 dev eth0 proto kernel scope link src 192.168.40.17

The problem here is br-985a84259068 and the subnet 192.168.16.0/20; I do not know where they come from. I have other containers built from the same template, in this and other subnets, that do not have this bridge.

This route is the reason why I cannot reach this host from a client in subnet 30, e.g. 192.168.30.11:
Code:
ip route get 192.168.30.11

Code:
> 192.168.30.11 dev br-985a84259068 src 192.168.16.1 uid 0
cache

This traffic ends up nowhere.
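
To double-check the overlap: 192.168.16.0/20 spans 192.168.16.0 through 192.168.31.255, so it contains all of 192.168.30.0/24, and this more specific connected route wins over the default route. ipcalc (assuming the package is available) prints the range:

Code:
apt-get install ipcalc
ipcalc 192.168.16.0/20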

If I bring this bridge down, everything works fine:
Code:
ifconfig br-985a84259068 down
ip route get 192.168.30.11

Code:
> 192.168.30.11 via 192.168.40.1 dev eth0 src 192.168.40.17 uid 0
cache
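
As a side note, ifconfig is deprecated on current systems; the iproute2 equivalent would be:

Code:
ip link set dev br-985a84259068 down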

However, if I restart the container, the bridge is added/activated again.
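
To catch the moment the bridge and its veth port get created, one can watch the kernel's netlink events in a second shell while restarting the container:

Code:
ip monitor link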

Code:
brctl show br-985a84259068

Code:
> bridge name bridge id STP enabled interfaces
> br-985a84259068 8000.02428b97932d no vethcd0643c
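
For reference, the same port membership can be listed with iproute2 instead of the deprecated brctl:

Code:
bridge link show
ip -o link show master br-985a84259068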

I do not know when this bridge is added. dmesg | grep vethcd0643c only shows:
Code:
> [17790.634737] br-985a84259068: port 1(vethcd0643c) entered blocking state
> [17790.634739] br-985a84259068: port 1(vethcd0643c) entered disabled state
> [17790.634841] device vethcd0643c entered promiscuous mode
> [17790.634932] br-985a84259068: port 1(vethcd0643c) entered blocking state
> [17790.634934] br-985a84259068: port 1(vethcd0643c) entered forwarding state
> [17790.635068] br-985a84259068: port 1(vethcd0643c) entered disabled state
> [17791.758316] IPv6: ADDRCONF(NETDEV_CHANGE): vethcd0643c: link becomes ready
> [17791.758347] br-985a84259068: port 1(vethcd0643c) entered blocking state
> [17791.758348] br-985a84259068: port 1(vethcd0643c) entered forwarding state
> [52723.576986] br-985a84259068: port 1(vethcd0643c) entered disabled state
> [53746.670014] device vethcd0643c left promiscuous mode
> [53746.670020] br-985a84259068: port 1(vethcd0643c) entered disabled state
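
The kernel timestamps are seconds since boot; to correlate them with container or service restarts, util-linux dmesg can print wall-clock timestamps:

Code:
dmesg -T | grep vethcd0643c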

There are two lines in the LXC startup debug log regarding veth, but they only concern veth102i0 (the host-side peer of the container's eth0, see lxc.net.0.veth.pair in the config below), not vethcd0643c:
Code:
cat /tmp/lxc-ID.log | grep veth

Code:
> lxc-start 102 20210421035111.653 DEBUG    network - network.c:instantiate_veth:449 - Instantiated veth tunnel "veth102i0 <--> vethHRwYY5"
> lxc-start 102 20210421035151.122 INFO     network - network.c:lxc_delete_network_priv:3232 - Removed interface "veth102i0" from ""


This is the LXC conf:

Code:
cat /var/lib/lxc/102/config

Code:
> lxc.cgroup.relative = 0
> lxc.cgroup.dir.monitor = lxc.monitor/102
> lxc.cgroup.dir.container = lxc/102
> lxc.cgroup.dir.container.inner = ns
> lxc.arch = amd64
> lxc.include = /usr/share/lxc/config/debian.common.conf
> lxc.include = /usr/share/lxc/config/debian.userns.conf
> lxc.apparmor.profile = generated
> lxc.apparmor.allow_nesting = 1
> lxc.mount.auto = sys:mixed
> lxc.monitor.unshare = 1
> lxc.tty.max = 2
> lxc.environment = TERM=linux
> lxc.uts.name = docker
> lxc.cgroup.memory.limit_in_bytes = 10586423296
> lxc.cgroup.memory.memsw.limit_in_bytes = 21172846592
> lxc.cgroup.cpu.shares = 1024
> lxc.rootfs.path = /var/lib/lxc/102/rootfs
> lxc.net.0.type = veth
> lxc.net.0.veth.pair = veth102i0
> lxc.net.0.hwaddr = 8E:09:62:C87:E9
> lxc.net.0.name = eth0
> lxc.net.0.script.up = /usr/share/lxc/lxcnetaddbr
> lxc.idmap = u 0 100000 33
> lxc.idmap = g 0 100000 33
> lxc.idmap = u 33 1005 1
> lxc.idmap = g 33 1005 1
> lxc.idmap = u 34 100034 65502
> lxc.idmap = g 34 100034 65502
> lxc.cgroup.cpuset.cpus = 0,2,5-6
 
What is strange is that the bridge loops back from the host to itself:

Code:
hostname -I

Code:
> 192.168.40.17 172.28.0.1 172.23.0.1 192.168.16.1 172.17.0.1

- 192.168.40.17 is the correct IP of the host
- 192.168.16.1 is the IP where the internal bridge points to
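
To see which interface actually holds each of these addresses, the brief output of iproute2 is handy:

Code:
ip -br addr show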

To inspect who 192.168.16.1 is, I verified the SSH host key fingerprints:

Code:
apt-get install nmap
nmap 192.168.16.1 --script ssh-hostkey

Code:
Starting Nmap 7.70 ( https://nmap.org ) at 2021-04-21 07:01 UTC
Nmap scan report for 192.168.16.1
Host is up (0.0000070s latency).
Not shown: 996 closed ports
PORT      STATE    SERVICE
22/tcp    open     ssh
| ssh-hostkey:
|   2048 37:18:5f:65:09:98:db:ac:50:bf:35:dc:93:d0:c6:05 (RSA)
|   256 b3:3e:2f:cf:d8:7a:7d:17:12:5e:a2:1a:0e:34:2e:d7 (ECDSA)
|_  256 b0:ea:3b:7b:a3:1d:fe:00:1d:65:7e:b0:fe:d3:39:e1 (ED25519)
80/tcp    open     http
443/tcp   open     https
49153/tcp filtered unknown

The fingerprints are the same as for 192.168.40.17:

Code:
nmap 192.168.40.17 --script ssh-hostkey

Code:
Starting Nmap 7.70 ( https://nmap.org ) at 2021-04-21 07:03 UTC
Nmap scan report for 192.168.40.17
Host is up (0.0000070s latency).
Not shown: 996 closed ports
PORT      STATE    SERVICE
22/tcp    open     ssh
| ssh-hostkey:
|   2048 37:18:5f:65:09:98:db:ac:50:bf:35:dc:93:d0:c6:05 (RSA)
|   256 b3:3e:2f:cf:d8:7a:7d:17:12:5e:a2:1a:0e:34:2e:d7 (ECDSA)
|_  256 b0:ea:3b:7b:a3:1d:fe:00:1d:65:7e:b0:fe:d3:39:e1 (ED25519)
80/tcp    open     http
443/tcp   open     https
49153/tcp filtered unknown

Nmap done: 1 IP address (1 host up) scanned in 2.93 seconds
 
I solved my problem: it was Docker, running inside that LXC container, that created a default Docker network in the 192.168.0.0/16 range - it somehow missed that part of this range was already in use. See moby issue #37823, where it is described how to change Docker's default network ranges. After I stopped all containers, changed the daemon.json, and restarted the Docker daemon, the issue was gone.
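
For reference, a /etc/docker/daemon.json along these lines does it; the address pool below is only an example, pick a base range that is unused in your environment:

Code:
{
  "default-address-pools": [
    { "base": "10.210.0.0/16", "size": 24 }
  ]
}

Code:
# stop running containers, then restart the daemon
docker stop $(docker ps -q)
systemctl restart docker

Note that existing Docker networks keep their subnet until they are removed and recreated; only networks created after the restart draw from the new pool.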

Docker is convenient in many situations, but all that automation in the background has its caveats... this issue kept me busy for a couple of days.
 