Hello,
I encountered a weird problem with my proxmox server. When trying to ssh into one of the LXC containers, I sometimes get "connection refused". With a retry, the connection works fine.
After capturing packets at different spots, I found out that the problem occurs in the following scenario:
- PVE host is on IP 2001:db8::1
- There are 2 containers:
  - 101 on 2001:db8::101 (the pve firewall accepts ssh connections)
  - 102 on 2001:db8::102 (the pve firewall rejects ssh connections)
- Both containers are bridged to vmbr0.
- For each container, pve creates an internal interface:
  - veth101i0 represents container 101 on vmbr0
  - veth102i0 represents container 102 on vmbr0
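To see this layout on the host, the bridge ports and the bridge's mac cache can be inspected with iproute2 (interface and bridge names as in my setup above; the output will of course differ on other hosts):

```shell
# List the ports attached to vmbr0 (should show veth101i0 and veth102i0)
ip link show master vmbr0

# Show which mac addresses the bridge has already learned, and on which port
bridge fdb show br vmbr0
```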
The following steps trigger the problem:
1. I want to ssh from the pve host to container 101.
2. The pve host initiates a tcp connection from 2001:db8::1 to 2001:db8::101.
3. The pve host looks up 2001:db8::101 in its ndp cache and finds the mac address 01:23:45:67:89:ab.
4. A tcp packet with the SYN flag is created, with destination ip 2001:db8::101 and destination mac 01:23:45:67:89:ab. This packet is sent to vmbr0.
5. Because the bridge does not have a bridge port for mac address 01:23:45:67:89:ab in its cache, the packet is flooded to all bridge ports.
6. Now there are two copies of the same packet: one on veth101i0, the other on veth102i0.
7. The copy on veth102i0 is processed first. It hits the ip6tables reject rule, which is configured with reject-with tcp-reset, so a tcp response packet with the RST flag is generated.
8. Now the copy on veth101i0 is processed. It reaches the container, which sends the SYN+ACK response.
9. The pve host sees the RST generated in step 7 first. The connection is aborted and a "Connection refused" error is shown by the ssh client.
10. The pve host then sees the SYN+ACK generated in step 8, but it doesn't match any pending tcp connection, so the packet is ignored.
11. Subsequent connection attempts work fine because vmbr0 has now cached the bridge port where 01:23:45:67:89:ab can be reached. Interface veth102i0 no longer receives the initial packet, so no RST response is created.
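One way to reproduce the capture (this assumes tcpdump is installed on the host and the fdb entry for the container mac has aged out, which by default happens after a few minutes of silence): watch the port of the *other* container while connecting.

```shell
# Terminal 1: watch container 102's port for the flooded SYN
# and the RST answer generated by the firewall
tcpdump -ni veth102i0 'ip6 and tcp port 22'

# Terminal 2: verify the mac is currently not cached by the bridge,
# then trigger the race
bridge fdb show br vmbr0 | grep -i 01:23:45:67:89:ab
ssh root@2001:db8::101
```

If the mac shows up in the fdb output, the first attempt will succeed; only when the entry is absent does veth102i0 see the SYN and answer with a RST.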
To avoid this issue, you can use DROP instead of REJECT as the policy, or ensure that the vmbr0 bridge cache always knows where to reach your containers (e.g. by constantly generating traffic). Any better ideas?
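For reference, this is roughly what the two workarounds look like. The policy option and config path are from the pve firewall setup; the static fdb entry is my own experiment, and note that it has to be redone whenever the veth device is recreated (e.g. on container restart) and whenever the container's mac changes.

```shell
# Workaround 1: use DROP instead of REJECT as the input policy of
# container 102, in /etc/pve/firewall/102.fw:
#   [OPTIONS]
#   policy_in: DROP

# Workaround 2: pin the container mac to its bridge port,
# so the initial SYN is never flooded to veth102i0
bridge fdb add 01:23:45:67:89:ab dev veth101i0 master static
```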
Code:
root@pve ~ # pveversion -v
proxmox-ve: 5.0-21 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-32 (running version: 5.0-32/2560e073)
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.15-1-pve: 4.10.15-15
pve-kernel-4.10.17-3-pve: 4.10.17-21
pve-kernel-4.10.17-1-pve: 4.10.17-18
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve3
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-12
qemu-server: 5.0-15
pve-firmware: 2.0-2
libpve-common-perl: 5.0-18
libpve-guest-common-perl: 2.0-11
libpve-access-control: 5.0-6
libpve-storage-perl: 5.0-15
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-9
pve-qemu-kvm: 2.9.1-1
pve-container: 2.0-16
pve-firewall: 3.0-3
pve-ha-manager: 2.0-2
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-1
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.6.5.11-pve17~bpo90
Edit:
I opened a bug for it because there has been no reply in this thread.