As noted above, the Proxmox SDN functionality is not yet sufficient for my use case, so here is my solution.
I have two hosts at Hetzner in a Proxmox cluster. For the internal network I have a vSwitch with VLAN tag 4000. Each host has one public IPv4 address and a /64 IPv6 prefix. What I want is that all internal traffic stays on the internal LAN, that the VMs use the host they are running on as their default gateway, and that those that should have IPv6 get an address from the range of their respective host.
First I set up a bridge on the physical interface with the public IPv4 and IPv6 addresses, then I add a tagged interface for the VLAN. I'm not sure it was really necessary to have the public IPv4 address on a bridge, since I don't have any public IPv4 addresses for the VMs, but when I initially did the setup (with one host) I couldn't get the tagged interface up without the other bridge. Never mind, it works like this, and I have the flexibility to request more IPv4 addresses from Hetzner if I ever need one directly on a VM. I also put my IPv6 /64 prefix on the internal network, except for the first address, which is kept for the host.
To route incoming IPv4 traffic to the VMs I use iptables port forwarding and an nginx reverse proxy; a sketch of the forwarding rules follows after the config below.
My /etc/network/interfaces hence looks like this:
Code:
auto lo
iface lo inet loopback
iface lo inet6 loopback
auto lo:1
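# y.y.y.254 is the default gateway address handed out to the VMs (see the dhcpd setup further down)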
iface lo:1 inet static
address y.y.y.254/32
iface enp0s31f6 inet manual
auto vmbr0
iface vmbr0 inet static
address x.x.x.x/26
gateway x.x.x.x
bridge-ports enp0s31f6
bridge-stp off
bridge-fd 1
bridge-vlan-aware yes
bridge-vids 4000
bridge_hw enp0s31f6
bridge-hello 2
bridge-maxage 12
up route add -net x.x.x.x netmask 255.255.255.192 gw x.x.x.x dev enp0s31f6
#post-up and post-down for iptables port forwarding removed
iface vmbr0 inet6 static
address z:z:z:z::2/128
gateway fe80::1
auto vmbr0.4000
iface vmbr0.4000 inet static
address y.y.y.1/24
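# enable forwarding and masquerade internal traffic leaving via vmbr0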
post-up echo 1 > /proc/sys/net/ipv4/ip_forward
post-up iptables -t nat -A POSTROUTING -s 'y.y.y.0/24' ! -d 'y.y.y.0/24' -o vmbr0 -j MASQUERADE
post-down iptables -t nat -D POSTROUTING -s 'y.y.y.0/24' ! -d 'y.y.y.0/24' -o vmbr0 -j MASQUERADE
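# keep traffic arriving on the VM firewall bridges (fwbr*) in a separate conntrack zone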
post-up iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
iface vmbr0.4000 inet6 static
address z:z:z:z::3/64
up ip -6 route add z:z:z:z::3/64 dev vmbr0.4000
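The post-up/post-down port-forwarding rules are stripped from the listing above; as a rough sketch, one such rule pair could look like this (the VM address y.y.y.10 and port 443 are just placeholders):
Code:
post-up iptables -t nat -A PREROUTING -i vmbr0 -p tcp --dport 443 -j DNAT --to-destination y.y.y.10:443
post-down iptables -t nat -D PREROUTING -i vmbr0 -p tcp --dport 443 -j DNAT --to-destination y.y.y.10:443
The y.y.y.254 alias from lo:1 shows up on the host like this: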
Code:
root@h1 /home/sverker # ip addr show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet y.y.y.254/32 scope global lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
I then have ISC dhcpd in a failover configuration on both hosts (the other host has IP y.y.y.2 on the internal network), setting y.y.y.254 as the default gateway for the VMs.
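A rough sketch of the dhcpd.conf failover declaration on the primary; the timers and the dynamic range are placeholders rather than my actual values:
Code:
failover peer "internal" {
    primary;
    address y.y.y.1;
    port 647;
    peer address y.y.y.2;
    peer port 647;
    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;
    mclt 1800;
    split 128;
}

subnet y.y.y.0 netmask 255.255.255.0 {
    # shared gateway alias present on both hosts' loopback
    option routers y.y.y.254;
    pool {
        failover peer "internal";
        range y.y.y.100 y.y.y.200;
    }
}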
This works as intended, and I should be able to use dynamic DNS updates to steer ingress traffic to the right host. I don't do SNAT on the port forwarding, so the client gets a direct return, which means external traffic needs to come in via the host the VM is running on.
Getting the desired functionality for IPv6 was a bit more complicated. My VMs get their IPv6 information via router advertisements. As a side note, it doesn't seem to matter what is set for the IP config in cloud-init when the VM runs a newer distribution that uses NetworkManager or Netplan to manage network settings. Anyway, what I needed to do was block router advertisements from the wrong host from reaching the VMs.
Router advertisements are ICMPv6 messages and part of the Neighbour Discovery Protocol. There is a checkbox under Options for the VM firewall to allow NDP, and if that is selected it also enables RA. Hence the solution was to uncheck that box and instead create two security groups, one for each host. Those security groups contain rules that allow all ICMPv6 messages of NDP, but for RA only those whose sender address is the host the VM is running on. Then I assign each VM the correct security group. A side effect is that for VMs that shouldn't have an IPv6 address I leave NDP unchecked in the firewall and don't add the security group, which means they'll never get any RA to find out what public IPv6 address they would have.
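A rough sketch of one of the security groups in /etc/pve/firewall/cluster.fw; the group name and h1's link-local address are placeholders, the exact icmp-type names may differ from what the GUI offers, and rule order matters since the catch-all RA drop must come after the accept:
Code:
[group ndp-h1]

IN ACCEPT -p icmpv6 -icmp-type router-advertisement -source fe80::1234:56ff:fe78:9abc
IN DROP -p icmpv6 -icmp-type router-advertisement
IN ACCEPT -p icmpv6 -icmp-type router-solicitation
IN ACCEPT -p icmpv6 -icmp-type neighbor-solicitation
IN ACCEPT -p icmpv6 -icmp-type neighbor-advertisement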
OK, the IPv6 address doesn't get updated automatically on a migration, but I only migrate VMs manually anyway, and in the meantime IPv6 still works; it is just that the traffic goes over the internal network via the other host.
This works like a charm on AlmaLinux and Ubuntu VMs, but for some reason Debian 12 VMs still get RA from both hosts, whether or not I add the security group. Everything looks right, even when I check with ip6tables -L: the NDP packets should be dropped, and everything else I've tested that should be dropped is. It doesn't matter much, since I only had one test VM with Debian and all the others run either AlmaLinux or Ubuntu and work as intended, but it puzzles me.
One thing I have left is a WireGuard VPN between the datacenter and my office, terminating on one of the hosts. For now I just push static routes via DHCP, but it should be possible to have a VPN to both hosts and sort out the routing via BGP.
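A rough sketch of the route push in dhcpd.conf; the office subnet 10.10.10.0/24 is a placeholder, the next hop is whichever host terminates the tunnel, and the clients need to request DHCP option 121 for it to take effect:
Code:
# define DHCP option 121 (classless static routes, RFC 3442)
option rfc3442-classless-static-routes code 121 = array of unsigned integer 8;
# 10.10.10.0/24 via y.y.y.1 (the host terminating the WireGuard tunnel)
option rfc3442-classless-static-routes 24, 10,10,10, y,y,y,1;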
The end result is the intended functionality: I have an internal network through which the VMs can communicate with each other, IPv4 egress traffic goes via the host the VM is running on, and IPv6 addresses get assigned from the prefix of each host.