Gateway inside VM

NStorm · Jul 15, 2015

I've run into some strange behaviour that I haven't been able to resolve yet. I'll probably figure it out soon, but I'm posting it here in case someone can answer faster than I can work it out myself. I'll simplify things a bit to keep it easy to understand.
I have a node running a recent Proxmox, call it pve-node1. eth0 is connected to the LAN, eth1 to the WAN (Internet). Neither has an IP assigned; they are bridged to vmbr0 and vmbr1 respectively. vmbr0 has the LAN IP 192.168.1.100. vmbr1 has no IP on the WAN.
I've transferred a physical gateway server to a VM container and attached both vmbr0 and vmbr1 to it. Inside the VM they appear as eth0 and eth1, with the IPs 192.168.1.1 and X.X.X.X (an IP on the Internet).
The same configuration worked perfectly on a dedicated physical server and served Internet connections through NAT for the whole LAN, including pve-node1. But once it was converted to a VM, the whole LAN can access the Internet without any issues as before, except for pve-node1. I can ping the Internet from there, and even large packets work. But TCP sessions fail to establish. I can see the first incoming SYN,ACK packet and that's all: no further incoming packets, and the session just hangs until it times out.
I've seen such things when the ingress and egress routes differ. But it's not my firewall or routing setup here, because everything works for the other hosts. It has to be related to KVM virtio-net / bridges, but I can't figure out how yet, because I don't see why it shouldn't work.

Richard · Jul 20, 2015

NStorm said:
I've run into some strange behaviour that I haven't been able to resolve yet. I'll probably figure it out soon, but I'm posting it here in case someone can answer faster than I can work it out myself. I'll simplify things a bit to keep it easy to understand.
I have a node running a recent Proxmox, call it pve-node1. eth0 is connected to the LAN, eth1 to the WAN (Internet). Neither has an IP assigned; they are bridged to vmbr0 and vmbr1 respectively. vmbr0 has the LAN IP 192.168.1.100. vmbr1 has no IP on the WAN.
I've transferred a physical gateway server to a VM container and attached both vmbr0 and vmbr1 to it.

Container? It sounds more like you have a KVM guest.

NStorm said:
Inside the VM they appear as eth0 and eth1, with the IPs 192.168.1.1 and X.X.X.X (an IP on the Internet).
The same configuration worked perfectly on a dedicated physical server and served Internet connections through NAT for the whole LAN, including pve-node1. But once it was converted to a VM, the whole LAN can access the Internet without any issues as before, except for pve-node1. I can ping the Internet from there, and even large packets work. But TCP sessions fail to establish. I can see the first incoming SYN,ACK packet and that's all: no further incoming packets, and the session just hangs until it times out.
I've seen such things when the ingress and egress routes differ. But it's not my firewall or routing setup here, because everything works for the other hosts. It has to be related to KVM virtio-net / bridges, but I can't figure out how yet, because I don't see why it shouldn't work.

Please post /etc/network/interfaces and the details of the configuration (routes, iptables) inside the VM.

NStorm · Jul 21, 2015

Yes, I meant KVM. My mistake for mixing them up; I use both OpenVZ and KVM. But this is about KVM.
My configuration is quite complicated, which is why I didn't post it at first. I decided to keep it simple in case someone had already faced a similar issue. I've also tried flushing the firewall and routes and setting up a few simple rules to test, without any success.
But I've finally managed to overcome the problem. While debugging the traffic with tcpdump, I noticed that the 4th packet (after the 3 packets of the handshake) was marked as having an incorrect checksum. Right after that there were no reply packets. An incorrect checksum in a capture is usually fine with hardware checksum offloading. But it seems to be an issue here, because no actual hardware NIC is involved: the packets travel only through virtual networking and a bridge. Once I disabled tx checksumming on vmbr0 (ethtool -K vmbr0 tx off), everything started to work.
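
The tcpdump check described above can be sketched roughly as follows. When run verbosely, tcpdump prints a verdict such as "cksum 0x... (incorrect -> 0x...)" or "(correct)" for each TCP packet. The helper name count_bad_cksum is hypothetical, and the interface and port are simply the ones from this post. Keep in mind that packets generated locally with offloading enabled are expected to show as incorrect when captured on the sending host.

```shell
# Count packets that tcpdump flags as having a bad checksum.
# Reads verbose tcpdump text on stdin and prints the number of
# lines containing "(incorrect". Helper name is hypothetical.
count_bad_cksum() {
    # grep -c exits non-zero when the count is 0, so mask that.
    grep -c '(incorrect' || true
}

# Usage, as root on the host (vmbr0 and port 80 are from the post):
#   tcpdump -i vmbr0 -nn -vv -c 20 tcp port 80 2>/dev/null | count_bad_cksum
```

Running the same capture inside the VM on its LAN interface would show whether the bad checksums actually arrive at the guest.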

EDIT: I need to investigate the problem further: what offloading actually means for bridge devices (as they are not hardware), and whether it affects performance and the system. Switching tx offloading on the bridge doesn't affect the attached physical interfaces; they still have offloading enabled. But at least I've localized the problem:

Code:

root@node1:~# telnet google.com 80
Trying 85.112.121.108...
Connected to google.com.
Escape character is '^]'.
GET /

^C^]
telnet> quit
Connection closed.
root@node1:~# ethtool -K vmbr0 tx off
Actual changes:
tx-checksumming: off
    tx-checksum-ip-generic: off
tcp-segmentation-offload: off
    tx-tcp-segmentation: off
    tx-tcp-ecn-segmentation: off
    tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
root@node1:~# telnet google.com 80
Trying 85.112.121.104...
Connected to google.com.
Escape character is '^]'.
GET /
HTTP/1.0 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.ru/?gfe_rd=cr&ei=JUWuVbXLDY3AsAHgwoGABw
Content-Length: 258
Date: Tue, 21 Jul 2015 13:12:05 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.ru/?gfe_rd=cr&amp;ei=JUWuVbXLDY3AsAHgwoGABw">here</A>.
</BODY></HTML>
Connection closed by foreign host.
root@node1:~# 
root@node1:~# ethtool -K vmbr0 tx on
Actual changes:
tx-checksumming: on
    tx-checksum-ip-generic: on
tcp-segmentation-offload: on
    tx-tcp-segmentation: on
    tx-tcp-ecn-segmentation: on
    tx-tcp6-segmentation: on
udp-fragmentation-offload: on [fixed]
root@node1:~# telnet google.com 80
Trying 85.112.121.103...
Connected to google.com.
Escape character is '^]'.
GET /

^C
^]
telnet> quit
Connection closed.

So far I assume that offloading on a bridge does nothing more than disable the software checksumming to reduce CPU load, on the assumption that the packet will leave through a physical NIC whose hardware will calculate and fill in the checksum. But if the packet goes through the bridge to a tap device (tap${VMID}i0), there is no hardware NIC to handle the checksumming, so the checksum remains incorrect and the packets are dropped on the VM side (wrong checksum).

EDIT2: Now I wonder why the checksums of the first SYN and SYN,ACK packets are correct with offloading enabled on the bridge...
EDIT3: In case it matters, the guest network type is set to virtio-net.
EDIT4: Lol, too many edits. But it seems I've got why it behaves as I mentioned in EDIT2: the problem most likely arises in TCP offloading, not just the generic tx checksum offload, and because I have NAT enabled on the VM as described in the 1st post (since TCP connections node <-> VM work fine).
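
To compare the offload state before and after toggling, `ethtool -k` (lowercase k) lists the current features in the same "name: state" form as the "Actual changes" output above. The following is only a sketch: feature_state is a hypothetical helper that picks one feature out of that listing.

```shell
# Print the state ("on", "off", "off [fixed]", ...) of one offload
# feature. Reads `ethtool -k <dev>` output on stdin.
# $1 = feature name, e.g. tx-checksum-ip-generic.
feature_state() {
    awk -v f="$1" -F': *' '
        { sub(/^[ \t]+/, "", $1) }   # strip the indent of sub-features
        $1 == f { print $2 }
    '
}

# Usage (vmbr0 and the tap device name are taken from the post):
#   ethtool -k vmbr0 | feature_state tx-checksum-ip-generic
#   ethtool -k tap${VMID}i0 | feature_state tcp-segmentation-offload
```

Checking the bridge, the tap device and the physical NIC separately makes it easier to see which device a given `ethtool -K` call actually changed.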

NStorm · Aug 28, 2015

Any comments from the Proxmox devs on this?

NStorm · Sep 22, 2015

After upgrading to pve-kernel-2.6.32-41-pve I can no longer set bridge offloading options:

Code:

# ethtool -K vmbr0 tx off
Could not change any device features

manu · Oct 6, 2015

Hi
Can you post the content of /etc/network/interfaces on your host?

NStorm · Oct 8, 2015

Hello.

Bug #733 is related to this issue. It seems there are different pve-kernel-2.6.32-41-pve builds in the repositories, with different release dates, sizes and hash sums.

/etc/network/interfaces:

Code:

# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

iface eth1 inet manual

auto eth2
iface eth2 inet manual

auto vlan252
iface vlan252 inet manual
    vlan_raw_device eth0

auto vlan301
iface vlan301 inet manual
    vlan_raw_device eth0

auto vlan302
iface vlan302 inet manual
    vlan_raw_device eth0

auto vmbr0
iface vmbr0 inet static
    address  192.168.9.231
    netmask  255.255.255.0
    gateway  192.168.9.246
    bridge_ports eth0
    bridge_stp on
    bridge_fd 0
    up ip route add 192.168.0.0/16 via 192.168.9.1 && ip route add 10.0.0.0/8 via 192.168.9.9 && ip route add 192.168.252.0/24 via 192.168.9.234 && ethtool -K vmbr0 tx off
    down ip route del 192.168.0.0/16 via 192.168.9.1 && ip route del 10.0.0.0/8 via 192.168.9.9 && ip route del 192.168.252.0/24 via 192.168.9.234

auto vmbr1
iface vmbr1 inet manual
    bridge_ports vlan301
    bridge_stp off
    bridge_fd 0

auto vmbr2
iface vmbr2 inet static
    address  192.168.248.1
    netmask  255.255.255.248
    bridge_ports eth2
    bridge_stp off
    bridge_fd 0

auto vmbr302
iface vmbr302 inet manual
    bridge_ports vlan302
    bridge_stp off
    bridge_fd 0

This one works perfectly fine with pve-kernel-2.6.32-40-pve.

manu · Oct 12, 2015

Hi
Neither the pve-kernel-3.10.0-12-pve package from PVE 3.4 nor pve-kernel-4.2.1-1-pve from Proxmox VE 4.0 exhibits this behaviour.
It would make sense to upgrade your system, as 2.6.32 is old in kernel terms.

NStorm · Oct 12, 2015

Proxmox 4.0 was officially released only a few days ago. And the 3.10 kernel for PVE 3.4 isn't supported and is considered experimental.

tom · Oct 12, 2015

NStorm said:
Proxmox 4.0 was officially released only a few days ago. And the 3.10 kernel for PVE 3.4 isn't supported and is considered experimental.

3.10 on 3.4 is known to work well and is supported.

The default kernel on Proxmox VE is still 2.6.32, as 3.10 does NOT support OpenVZ.

NStorm · Oct 13, 2015

Unfortunately I do use OpenVZ extensively. I'm not ready to migrate to LXC, and besides, 2.6.32 worked just fine before those recent updates.

manu · Oct 29, 2015

I just noticed that although it is not possible to disable checksumming globally for the bridge, you can do it *per protocol*, and this works with all PVE 3.4 kernels.

ethtool -K vmbr0 tx-checksum-ip-generic off is not working
but ethtool -K vmbr0 tx-checksum-ipv4 off
and ethtool -K vmbr0 tx-checksum-ipv6 off do

Since I assume the specific overrides the general, does that fix your networking issue?
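
The fallback suggested here could be scripted roughly like this. It is a sketch only: disable_tx_csum is a hypothetical helper, it assumes ethtool exits non-zero when it cannot change a feature (as in the "Could not change any device features" case above), and the thread does not confirm that the per-protocol flags resolve the original NAT problem.

```shell
# Try the generic "tx off" first, then fall back to the per-protocol
# checksum flags. Prints which variant was accepted.
# $1 = device name, e.g. vmbr0. Helper name is hypothetical.
disable_tx_csum() {
    if ethtool -K "$1" tx off >/dev/null 2>&1; then
        echo "generic"
    elif ethtool -K "$1" tx-checksum-ipv4 off tx-checksum-ipv6 off >/dev/null 2>&1; then
        echo "per-protocol"
    else
        echo "failed"
        return 1
    fi
}

# Usage, as root on the host:
#   disable_tx_csum vmbr0
```

If it helps, the same call could replace the `ethtool -K vmbr0 tx off` part of the `up` line in /etc/network/interfaces so the setting survives a reboot.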
