Strange connectivity issues with Hetzner vSwitch Cloud Network

TheJKM

New Member
Oct 12, 2023
Hello,

I have been using Proxmox in my homelab since version 4.4 and recently set up my first Proxmox instance on a Hetzner Root Server. The machine has a single public IPv4 address and the usual /64 IPv6 subnet. While most of the setup is working, there is one issue where I need some more ideas.

For IPv6, the host and every VM get an address from the /64 subnet. This is the most straightforward configuration and works flawlessly.
For IPv4, the public address is assigned to the host. For the VMs, I use a routed setup with masquerading on the host. This also works flawlessly.
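
Since it is not part of the interfaces file below: IPv4 forwarding is enabled on the host for the routed setup, roughly like this (the sysctl file name is just an example):

Code:
# example file name; allows the host to route/masquerade VM traffic
echo "net.ipv4.ip_forward = 1" > /etc/sysctl.d/99-ip-forward.conf
sysctl --system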

Additionally, the machine and the VMs have to be reachable from Hetzner cloud servers within a private network. To make this work, I connected a vSwitch to the cloud network and to the server. Hetzner uses VLAN tag 4000 for vSwitch packets, so I created an additional interface for this VLAN and an additional VM bridge. The bridge has an IP address from the vSwitch's private subnet, and all VMs connected to this bridge also get an address from that subnet. For the VMs this works as it should: the VMs can reach the cloud servers and vice versa.
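
For completeness, this is how the VLAN interface and its bridge membership can be checked on the host (interface names as in the config below):

Code:
# VLAN id (should be 4000) and MTU of the vSwitch interface
ip -d link show enp27s0.4000
# confirm the VLAN interface is a port of the vSwitch bridge
bridge link show dev enp27s0.4000
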
On the host, however, I see a strange issue. Right after a reboot, the connection works in both directions. After a few minutes it stops working and the host becomes unreachable from all cloud servers. In the other direction, the cloud servers are at first also unreachable from the host, but after a few packets the connection comes back alive and works for a few minutes in both directions, until the issue starts again.
When I "fix" the connection with a ping, it takes 3-4 lost packets until the connection works.
When I "fix" it with a traceroute to a cloud server, it takes 14 to 16 hops until the vSwitch answers.
The issue only affects the host; on the VMs it always works, regardless of the current state on the host.
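
If a packet capture helps, the traffic on the VLAN interface can be watched while this happens, for example:

Code:
# show ARP and ICMP on the vSwitch VLAN, including MAC addresses
tcpdump -eni enp27s0.4000 arp or icmp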

What could be the issue? Honestly, I'm running out of ideas, and I couldn't find a similar case on the web. I'll append my /etc/network/interfaces configuration with the IPs masked.

One note: I really don't want to have more than one public IPv4 address. There will be quite a few VMs, and I only need public IPv4 for API providers who think it's fine not to offer IPv6 for their APIs. All incoming traffic goes through a load balancer and the private network. With IPv4 addresses in short supply, I don't want to waste them just for IPv4 internet connectivity.

In case you need any more information, feel free to ask. Looking forward to any ideas.

TheJKM

Code:
source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

iface lo inet6 loopback

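# Public uplink: single public IPv4 (/32 with point-to-point routing to the Hetzner gateway) and the host's IPv6 address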
auto enp27s0
iface enp27s0 inet static
    address <PUBLIC_IP>/32
    gateway <PUBLIC_IP_GATEWAY>
    pointopoint <PUBLIC_IP_GATEWAY>
    up route add -net <PUBLIC_IP_NET_BASE> netmask 255.255.255.192 gw <PUBLIC_IP_GATEWAY> dev enp27s0

iface enp27s0 inet6 static
    address <PUBLIC_IP_V6>/128
    gateway fe80::1

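# VLAN 4000: Hetzner tags all vSwitch traffic with this ID; MTU 1400 for the vSwitch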
auto enp27s0.4000
iface enp27s0.4000 inet manual
    mtu 1400

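# vmbr0: internal VM bridge; IPv4 is masqueraded behind the host's public address, IPv6 is routed from the /64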
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.1/24
    bridge-ports none
    bridge-stp off
    bridge-fd 0
    post-up   iptables -t nat -A POSTROUTING -s '192.168.0.0/24' -o enp27s0 -j MASQUERADE
    post-down iptables -t nat -D POSTROUTING -s '192.168.0.0/24' -o enp27s0 -j MASQUERADE

iface vmbr0 inet6 manual
    address <ANOTHER_PUBLIC_IP_V6>/64
    up ip -6 route add <PUBLIC_IP_V6_PREFIX>/64 dev vmbr0

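# vmbr4000: bridge to the Hetzner vSwitch; host and VMs get addresses from the cloud network's private subnet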
auto vmbr4000
iface vmbr4000 inet static
    address 10.0.30.10/24
    bridge-ports enp27s0.4000
    bridge-stp off
    bridge-fd 0
    mtu 1400
    up ip route add 10.0.0.0/16 via 10.0.30.1 dev vmbr4000
    down ip route del 10.0.0.0/16 via 10.0.30.1 dev vmbr4000
 
Make sure you set an MTU of 1400 on the VLAN interface and the VM bridge. Also set the VMs inside this network to an MTU of 1 (inherit the MTU from the bridge).
If you use OPNsense as a router, you might also want to configure the MTU on its interfaces.
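
For reference, the MTU can be set directly on a VM's network device; a sketch (the VM ID and MAC are placeholders, and mtu=1 works for VirtIO NICs only):

Code:
# keep the existing MAC (virtio=<MAC>), otherwise Proxmox generates a new one
qm set 101 --net0 virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr4000,mtu=1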

WARNING: The VMs might go down until you reboot them!

I have seen ping working while cross-node data transfers had issues; that is the classic 'randomness' of an MTU problem.
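
A quick way to check for an MTU problem is a ping with the don't-fragment flag set (the address is a placeholder; 1372 bytes of payload plus 28 bytes of IP/ICMP headers make exactly 1400):

Code:
# should succeed at -s 1372 and fail with "message too long" at 1373 or more
ping -M do -s 1372 10.0.0.2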
 