I want to start off by saying I understand very little about how IPv6 works under the hood, and I'm not sure my description of the issue is correct, but I'll try my best.
I had noticed that one of my LXC containers running GitLab would occasionally just stop being accessible from the WAN (I'm behind CGNAT, so it's only reachable via IPv6). Restarting the services did nothing and they showed no errors in their logs, but restarting the container always fixed it.

Cutting the troubleshooting short, because it took forever: the root cause was that ip -6 route show default would show nothing, as the default route had expired and not been renewed (or was somehow removed?).

My understanding, after some back and forth with an AI, is that Proxmox should honour the accept_ra = 2 flag on the host and pass router advertisements on to the LXCs, so the LXCs can just use ip6=auto (SLAAC) without needing extra configuration. For now I just re-added a static route, and that solved the problem for this container.

I had another container that this never happened to, even with 73 days of uptime. I'll try to provide as much info as I can here about the issue, and see whether it's something wrong with my setup or whether I should be reporting this as a bug somewhere.
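For reference, the manual workaround looked roughly like this. The gateway address here is an assumption (a link-local router address in the style of the one visible in the working container's route table below); substitute the one your router actually advertises:

```shell
# Assumed link-local address of the upstream router; find yours with
# something like: ip -6 route show (on a still-working host/container)
GW="fe80::de2c:6eff:fe61:f4a8"

# Re-add the missing default route inside the affected container
ip -6 route add default via "$GW" dev eth0

# Verify it is back
ip -6 route show default
```

Note this is a static route, so it won't carry the "expires" timer that an RA-learned route has, which is exactly why it papers over the problem.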
Code:
# pveversion
pve-manager/9.1.6/71482d1833ded40a (running kernel: 6.17.4-1-pve)
# pveversion --verbose
proxmox-ve: 9.1.0 (running kernel: 6.17.4-1-pve)
pve-manager: 9.1.6 (running version: 9.1.6/71482d1833ded40a)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17: 6.17.13-1
proxmox-kernel-6.17.13-1-pve-signed: 6.17.13-1
proxmox-kernel-6.17.9-1-pve-signed: 6.17.9-1
proxmox-kernel-6.17.4-2-pve-signed: 6.17.4-2
proxmox-kernel-6.17.4-1-pve-signed: 6.17.4-1
proxmox-kernel-6.8: 6.8.12-17
proxmox-kernel-6.8.12-17-pve-signed: 6.8.12-17
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 19.2.3-pve4
corosync: 3.1.10-pve1
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx12
intel-microcode: 3.20251111.1~deb13u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.2
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.5
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.7
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.5
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-4
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.4-1
proxmox-backup-file-restore: 4.1.4-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.8
pve-cluster: 9.0.7
pve-container: 6.1.2
pve-docs: 9.1.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.18-1
pve-ha-manager: 5.1.1
pve-i18n: 3.6.6
pve-qemu-kvm: 10.1.2-7
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.4
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.4.0-pve1
Host network config:
Code:
iface vmbr0 inet6 static
address fe80::11/64
gateway fe80::
accept_ra 2
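A quick sanity check on the host side that I found useful (vmbr0 is the bridge name from the config above; the sysctl paths are the standard Linux ones):

```shell
# accept_ra = 2 means: accept router advertisements even when
# forwarding is enabled on the interface
sysctl net.ipv6.conf.vmbr0.accept_ra

# The host's RA-learned default route should show an 'expires' timer
# that keeps getting refreshed as new RAs arrive
ip -6 route show default
```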
LXC network configs:
Code:
# Container 204 (affected)
net0: name=eth0,bridge=vmbr0,gw=192.168.88.1,hwaddr=F2:FB:E0:DA:B5:73,ip=192.168.2.4/16,ip6=auto,type=veth
# Container 206 (not yet affected, identical config)
net0: name=eth0,bridge=vmbr0,gw=192.168.88.1,hwaddr=FE:7E:A8:5A:47:81,ip=192.168.2.6/16,ip6=auto,type=veth
State inside container 204 during outage:
Code:
# ip -6 route show default
(empty)
# ip -6 route show
2a02:xxxx::/64 dev eth0 proto ra metric 1024 expires 2495210sec mtu 1500 hoplimit 64 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
# ip -6 addr show eth0
inet6 2a02:xxxx/64 scope global dynamic mngtmpaddr noprefixroute (still present)
# sysctl net.ipv6.conf.eth0.accept_ra
net.ipv6.conf.eth0.accept_ra = 0
# sysctl net.ipv6.conf.all.forwarding
net.ipv6.conf.all.forwarding = 0
# nstat -az TcpExtListenDrops
TcpExtListenDrops 83334 (incrementing -- kernel dropping inbound SYNs it can't respond to)
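Since the address stays up but the default route silently disappears, a crude watchdog run from cron inside the container is one way to limp along until the real cause is found. This is only a sketch under my assumptions (the gateway address is hypothetical, taken in the style of the working container's route below):

```shell
#!/bin/sh
# Hypothetical watchdog: restore the IPv6 default route if it has expired.
# GW is an assumption -- replace with your router's link-local address.
GW="fe80::de2c:6eff:fe61:f4a8"

if ! ip -6 route show default | grep -q '^default'; then
    logger "ipv6-watchdog: default route missing, re-adding via $GW"
    ip -6 route add default via "$GW" dev eth0
fi
```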
State inside container 206 at the same time (up 73 days, still working):
Code:
# ip -6 route show default
default via fe80::de2c:6eff:fe61:f4a8 dev eth0 proto ra metric 1024 expires 1626sec mtu 1500 hoplimit 64 pref medium
# sysctl net.ipv6.conf.eth0.accept_ra
net.ipv6.conf.eth0.accept_ra = 0
Minor detail: the problematic container did have firewall=1 before I removed it while attempting to recover, but the cluster firewall was always disabled, so I don't think that changed anything.