Communication issue between containers

Hi all, I have a network-related issue.

On Proxmox VE 6.1-8 I have several containers, all based on Debian 10.3.
All containers are connected via two bridges:

- vmbr0, connected to the physical Ethernet port enp0s8 for outside connectivity, with address 192.168.1.210/24;
- vmbr1, internal to all containers; all addresses are in 192.168.10.0/24.

CT100 is a mail server with a MariaDB instance for user credentials and a web server with Roundcube; MariaDB has been configured to listen on ALL addresses, without SSL (just for container-to-container communication). Running nmap locally on CT100 gives these results:

Code:
# nmap -sT 192.168.1.150
Starting Nmap 7.70 ( https://nmap.org ) at 2020-03-21 08:58 CET
Nmap scan report for mail.mydomain.ch (192.168.1.150)
Host is up (0.00014s latency).
Not shown: 992 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
25/tcp   open  smtp
80/tcp   open  http
143/tcp  open  imap
443/tcp  open  https
587/tcp  open  submission
993/tcp  open  imaps
3306/tcp open  mysql

# nmap -sT 192.168.10.13
Starting Nmap 7.70 ( https://nmap.org ) at 2020-03-21 08:59 CET
Nmap scan report for 192.168.10.13
Host is up (0.00014s latency).
Not shown: 992 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
25/tcp   open  smtp
80/tcp   open  http
143/tcp  open  imap
443/tcp  open  https
587/tcp  open  submission
993/tcp  open  imaps
3306/tcp open  mysql

So the MariaDB port is listening on ALL interfaces (of course also on 127.0.0.1, so Dovecot and Postfix can query the user credentials).
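For completeness, the only MariaDB change from the stock setup is the listening address; the file path below is an assumption (the usual Debian 10 location), but the value is what I configured:

Code:
# grep bind-address /etc/mysql/mariadb.conf.d/50-server.cnf
bind-address            = 0.0.0.0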

CT105 is a "tools" container, used for internal administration (e.g. phpMyAdmin), with the following interfaces:

- eth0 : 192.168.1.152/24
- eth1 : 192.168.10.12/24

Well, the issue: when running nmap from this container against CT100, here is the result:

Code:
# nmap -sT 192.168.1.150
Starting Nmap 7.80 ( https://nmap.org ) at 2020-03-21 09:04 CET
Nmap scan report for 192.168.1.150
Host is up (0.00018s latency).
Not shown: 993 closed ports
PORT    STATE SERVICE
22/tcp  open  ssh
25/tcp  open  smtp
80/tcp  open  http
143/tcp open  imap
443/tcp open  https
587/tcp open  submission
993/tcp open  imaps
MAC Address: 96:75:B8:D9:37:7B (Unknown)

# nmap -sT 192.168.10.13
Starting Nmap 7.80 ( https://nmap.org ) at 2020-03-21 09:04 CET
Nmap scan report for 192.168.10.13
Host is up (0.00015s latency).
Not shown: 993 closed ports
PORT    STATE SERVICE
22/tcp  open  ssh
25/tcp  open  smtp
80/tcp  open  http
143/tcp open  imap
443/tcp open  https
587/tcp open  submission
993/tcp open  imaps
MAC Address: 42:8C:59:BB:7B:49 (Unknown)

As you can see, port 3306 appears to be closed!

As a cross-check I created a third container, CT107, called "test", where I installed another MariaDB instance configured exactly like the one on CT100; there, port 3306 is correctly open and reachable from CT105.

If port 3306 also appeared to be closed when running nmap from CT100 itself, I would suspect something wrong with the MariaDB instance; but 3306 is open locally and appears to be closed only when queried from other containers! So I'm thinking of something related to Proxmox.

One more thing: the firewall has been disabled on the containers, all containers are unprivileged, and I tried to clone the CT100 container without any change in behaviour (port 3306 open when queried locally, closed when queried from other containers). I'd like to avoid rebuilding CT100 from scratch due to the long configuration time, and I prefer to understand what's happening before simply redoing the job, with the risk of ending up with the same issue.
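In case it is useful, the network definitions of the two containers can be compared on the host with pct (just a sketch: 100 and 107 are the container IDs mentioned above, and I'm omitting the output here):

Code:
# pct config 100 | grep ^net
# pct config 107 | grep ^net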

Any help in understanding what's happening is greatly appreciated.

Francesco
 
Hello,

"Firewall has been disabled on the containers " - how about firewall on the host (hypervisor) itself? If you run packet capture on CT100, do you see packets coming to tcp port 3306?
 
what about the firewall on the host (hypervisor) itself? If you run a packet capture on CT100, do you see packets coming in on TCP port 3306?

Previously the firewall was enabled but the rule table was empty. Now I have completely disabled the firewall.

I tried an `nmap` from the host (the host IP address is 192.168.1.210, so the same subnet as the containers) to the CT100 container, but port 3306 is closed.

However, an `nmap` from the host to the `test` container (CT107) revealed that port 3306 is open... so the issue is only between CT100 and the rest of the world...

A side note: the CT100 container originally had a single network card (on vmbr0) when I installed MariaDB! Then I added a second card on vmbr1. On CT107 I added the second eth card, bridged to vmbr1, and only then installed MariaDB... I don't know if this may be the difference!
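If it helps, this is how I plan to double-check from inside CT100 which addresses mysqld is actually bound to (just a sketch, assuming ss is available in the container):

Bash:
# ss -lntp | grep 3306

If that shows 0.0.0.0:3306 the bind covers all addresses; if it shows only 127.0.0.1:3306, then only loopback is served.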
 
Can you run "ip route list" on the problematic container? Also, if you run tcpdump on it, do you see traffic arriving?
 
Can you run "ip route list" on the problematic container? Also, if you run tcpdump on it, do you see traffic arriving?

Todor, great news! Thank you in the meantime! We have traffic.

As you suggested:

CT100
Bash:
# ip route list
default via 192.168.1.1 dev eth0 onlink
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.150
192.168.10.0/24 dev eth1 proto kernel scope link src 192.168.10.13

CT105
Bash:
# ip route list
default via 192.168.1.1 dev eth0 proto static
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.152
192.168.10.0/24 dev eth1 proto kernel scope link src 192.168.10.12

With tcpdump:
Traffic from CT105 to CT100
Bash:
# tcpdump -i eth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
22:07:36.931771 IP 192.168.10.12 > 192.168.10.13: ICMP echo request, id 2823, seq 1, length 64
22:07:36.931822 IP 192.168.10.13 > 192.168.10.12: ICMP echo reply, id 2823, seq 1, length 64
22:07:37.959145 IP 192.168.10.12 > 192.168.10.13: ICMP echo request, id 2823, seq 2, length 64
22:07:37.959188 IP 192.168.10.13 > 192.168.10.12: ICMP echo reply, id 2823, seq 2, length 64
22:07:38.983169 IP 192.168.10.12 > 192.168.10.13: ICMP echo request, id 2823, seq 3, length 64
22:07:38.983213 IP 192.168.10.13 > 192.168.10.12: ICMP echo reply, id 2823, seq 3, length 64
22:07:40.007126 IP 192.168.10.12 > 192.168.10.13: ICMP echo request, id 2823, seq 4, length 64
22:07:40.007159 IP 192.168.10.13 > 192.168.10.12: ICMP echo reply, id 2823, seq 4, length 64
22:07:42.151115 ARP, Request who-has 192.168.10.12 tell 192.168.10.13, length 28
22:07:42.151200 ARP, Request who-has 192.168.10.13 tell 192.168.10.12, length 28
22:07:42.151205 ARP, Reply 192.168.10.13 is-at 42:8c:59:bb:7b:49 (oui Unknown), length 28
22:07:42.151208 ARP, Reply 192.168.10.12 is-at 2e:6e:0e:d2:f4:17 (oui Unknown), length 28


22:09:19.410961 IP 192.168.10.12.53224 > 192.168.10.13.mysql: Flags [S], seq 3687049004, win 64240, options [mss 1460,sackOK,TS val 526845707 ecr 0,nop,wscale 7], length 0
22:09:19.411004 IP 192.168.10.13.mysql > 192.168.10.12.53224: Flags [R.], seq 0, ack 3687049005, win 0, length 0
22:09:24.551094 ARP, Request who-has 192.168.10.12 tell 192.168.10.13, length 28
22:09:24.551162 ARP, Request who-has 192.168.10.13 tell 192.168.10.12, length 28
22:09:24.551166 ARP, Reply 192.168.10.13 is-at 42:8c:59:bb:7b:49 (oui Unknown), length 28
22:09:24.551168 ARP, Reply 192.168.10.12 is-at 2e:6e:0e:d2:f4:17 (oui Unknown), length 28
^C
18 packets captured
18 packets received by filter
0 packets dropped by kernel

I ran tcpdump on CT100: in the first segment I started a ping from CT105 to CT100, and the packets flow.
In the second segment I started phpMyAdmin on CT105 and tried to log in, so I can see the packets of the login attempt arriving on CT100 (the SYN to port 3306, which CT100 answers with a RST).
So we definitely have traffic flowing between the containers... but I still don't understand why, from outside CT100, port 3306 is closed.

Francesco
 
So, you have two networks, 192.168.1/24 on eth0 and 192.168.10/24 on eth1. You have a machine on the 192.168.1 network and you try to access 192.168.10.13 from it on tcp/3306, is this correct?
If yes, then you may need to have a look at your gateway. The packet leaves the machine on 192.168.1.x, goes to the default gateway, and gets routed to the container's eth1; the container then returns the reply via eth0, so the MAC addresses in the reply differ and the source machine can get confused.
Try to run tcpdump on the source machine, on the problematic container (two tcpdumps, one on eth0 and one on eth1), and also capture packets on the gateway - and see where it fails.
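For example, something like this (interface names are placeholders, adjust them to your setup):

Bash:
# on the source machine
tcpdump -nni eth0 tcp port 3306
# on the problematic container, one capture per NIC
tcpdump -nni eth0 tcp port 3306
tcpdump -nni eth1 tcp port 3306
# on the gateway, if you can capture there
tcpdump -nni any tcp port 3306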
 
So, you have two networks, 192.168.1/24 on eth0 and 192.168.10/24 on eth1. You have a machine on the 192.168.1 network and you try to access 192.168.10.13 from it on tcp/3306, is this correct?

Not exactly. Both containers have a second vNIC on the 192.168.10.0/24 network... I suppose (and frankly speaking I hope!) that when querying from the test container to the MariaDB container, both use the 192.168.10.0/24 network, which doesn't need routing.
But just to be clear, MariaDB on its container is listening on ALL NICs! The listening address parameter is configured as 0.0.0.0, and when running nmap locally I see port 3306 open on ALL NICs - unfortunately ONLY from the MariaDB container itself!

When I try nmap from another container (or from the host itself) to the MariaDB container, port 3306 is closed.

Just to be complete, on this container I also have a Dovecot server (it's a mail server container...) and in the initial configuration the POP3 and secure POP3 ports were open too! Same issue there: POP3 and POP3S were open locally but appeared to be closed from other containers. But this didn't pose much of a problem for me, because I don't want to support POP3 anymore, so I disabled Dovecot on those ports.

This is to say: maybe there is some NIC-configuration-related issue?
 
Just curious: Are all containers on two networks?

Yes! Both containers have 192.168.1.0/24 and 192.168.10.0/24 on two NICs: eth0 and eth1, bridged to vmbr0 and vmbr1 respectively.

vmbr0 is bridged to enp0s8 and then to the WAN; vmbr1 is local only, to allow communication between containers.
 
So, you have two networks, 192.168.1/24 on eth0 and 192.168.10/24 on eth1. You have a machine on the 192.168.1 network and you try to access 192.168.10.13 from it on tcp/3306, is this correct?

Another idea I had is related to a different issue I ran into on a Proxmox-based infrastructure with a VM running Ubuntu 18.04: after an update, the networking was down.

After writing to the Contabo support (which hosted those VMs and also uses Proxmox), they reported a "more common Ubuntu 18 netplan misconfiguration that occurs after an update".

But as far as I know, Debian (which I currently use) doesn't have netplan... right?
 
Yes! Both containers have 192.168.1.0/24 and 192.168.10.0/24 on two NICs: eth0 and eth1, bridged to vmbr0 and vmbr1 respectively.

vmbr0 is bridged to enp0s8 and then to the WAN; vmbr1 is local only, to allow communication between containers.

Why don't you just use your local network (no other bridge)? WAN is not involved because the packets never leave your NIC.
 
Why don't you just use your local network (no other bridge)? WAN is not involved because the packets never leave your NIC.

As you correctly noted, the packets don't leave the NIC and the machine is behind a firewall; but philosophically it seems to me a sound choice to separate a "container only" network from the network attached to the WAN... After all, vNICs are free of charge... :-D
 
A lot of IFs here, but:
If you have two containers, say CT1 with eth0 and eth1 and CT2 with eth0 and eth1, and the eth0 NICs are in network0 and the eth1 NICs are in network1, then when you run a scan from CT1 to CT2 in network0 it will, by default, use eth0, and so on.
Can you run the following test:
CT1: IP1_0 on eth0 and IP1_1 on eth1
CT2: IP2_0 on eth0 and IP2_1 on eth1
On CT1 - run nmap / a MySQL connect / netcat / whatever against the problematic container on tcp/3306, and at the same time run tcpdump on CT1 with the '-e' flag (it shows the Ethernet header / MAC addresses): tcpdump -e -nni eth1 host IP1_1
On CT2 - run "tcpdump -e -nni eth1 host IP1_1"

Compare both outputs; see if the packets arrive on CT2, if it sends a reply to CT1, and if CT1 receives the reply.
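With the addresses from this thread that would be something like this (a sketch; I'm assuming CT105 is the source and CT100 the problematic container):

Bash:
# on CT105 (IP1_1 = 192.168.10.12)
tcpdump -e -nni eth1 host 192.168.10.12
# on CT100 (the problematic container)
tcpdump -e -nni eth1 host 192.168.10.12
# then, from CT105, trigger the connection
nmap -sT -p 3306 192.168.10.13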

Do you know if both containers are on the same physical host?
 
Do you know if both containers are on the same physical host?

Yes, both containers are on the same host (it's my server). Well, we know that we have traffic between the two containers, also when trying to access MariaDB; now the question is: why does MariaDB appear to be closed from outside, BUT the port is open when running nmap from the same container against its own vNIC addresses?
 
