I've got a rather complicated setup for home use, so I'll try to give enough detail up front.
The problem is with vm105:
100 (cloud) -- runs caddy as a reverse proxy in Docker, and proxies traffic to other Docker containers on itself and to other VMs and LXC containers across vlan15 and vlan20
105 (haos) -- Home Assistant; it never responds to HTTP requests from caddy, although from vm100 I can ping it and I see multicast and ARP traffic from vm105
All VMs sit on vmbr0 with all VLANs enabled on the interface, and the VLAN tag terminates on each VM's Proxmox network device.
I also have vmbr1 which is used for some VMs to communicate without relying on a physical network device.
PVE VM configs:
vm100:
root@bedrock:~# cat /etc/pve/qemu-server/100.conf
name: cloud
net0: virtio=BC:24:11:3A:C7:69,bridge=vmbr0,tag=20
net1: virtio=BC:24:11:32:00:07,bridge=vmbr0,tag=15
vm105:
root@bedrock:~# cat /etc/pve/qemu-server/105.conf
name: haos
net0: virtio=02:8C:44:C6:3F:9C,bridge=vmbr0,tag=20
PVE network config:
root@bedrock:~# cat /etc/network/interfaces
...
iface vmbr0 inet manual
bridge-ports enp3s0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 10 15 20 25 30 50
#VLAN Bridge for Guests
auto vmbr1
iface vmbr1 inet static
address 192.168.1.1/24
bridge-ports none
bridge-stp off
bridge-fd 0
PVE bridge setup:
root@bedrock:~# brctl show
bridge name     bridge id               STP enabled     interfaces
...
vmbr0           8000.408d5c780643       no              enp3s0
                                                        fwpr101p0
                                                        fwpr103p0
                                                        fwpr104p0
                                                        fwpr104p1
                                                        fwpr200p0
                                                        fwpr200p1
                                                        fwpr200p2
                                                        fwpr201p0
                                                        fwpr202p0
                                                        fwpr204p0
                                                        tap100i0
                                                        tap100i1
                                                        tap105i0
...
PVE vlans:
root@bedrock:~# bridge vlan show
port              vlan-id
enp3s0            1 PVID Egress Untagged
                  10
                  15
                  20
                  25
                  30
                  50
vmbr0             1 PVID Egress Untagged
vmbr1             1 PVID Egress Untagged
...
tap100i0 20 PVID Egress Untagged
tap100i1 15 PVID Egress Untagged
...
tap105i0 20 PVID Egress Untagged
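To make the VLAN comparison explicit, here is a minimal sketch that pulls the PVID for two ports out of `bridge vlan show`-style text and compares them. The helper name `port_pvid` is made up for illustration, and the sample rows are just the three tap lines from the output above; on the host you would pipe in the real `bridge vlan show` output instead.

```shell
# Hypothetical helper: print the PVID of one bridge port.
# Reads `bridge vlan show`-style text on stdin; $1 is the port name.
port_pvid() {
    awk -v port="$1" '$1 == port && /PVID/ { print $2 }'
}

# Sample rows copied from the output above (not a live query).
sample='tap100i0 20 PVID Egress Untagged
tap100i1 15 PVID Egress Untagged
tap105i0 20 PVID Egress Untagged'

a=$(printf '%s\n' "$sample" | port_pvid tap100i0)
b=$(printf '%s\n' "$sample" | port_pvid tap105i0)

if [ "$a" = "$b" ]; then
    echo "same PVID: $a"
else
    echo "PVID mismatch: $a vs $b"
fi
```

With the rows above, both ports report PVID 20, so at the bridge level tap100i0 and tap105i0 look like they are in the same VLAN.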
I usually have the firewall enabled for all guests, but I disabled it on vm100 and vm105 for testing.
vm100 has IPs 10.15.1.220 and 10.20.1.220
vm105 has IP 10.20.1.228
Ping works:
root@cloud:~# tcpdump -nnvvS -i enp6s18 host 10.20.1.228
tcpdump: listening on enp6s18, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:10:03.687792 IP (tos 0x0, ttl 64, id 38461, offset 0, flags [DF], proto ICMP (1), length 84)
10.20.1.220 > 10.20.1.228: ICMP echo request, id 4, seq 1, length 64
09:10:03.688111 IP (tos 0x0, ttl 63, id 57101, offset 0, flags [none], proto ICMP (1), length 84)
10.20.1.228 > 10.20.1.220: ICMP echo reply, id 4, seq 1, length 64
But if I run "wget http://10.20.1.228:8123" from a separate terminal on vm100, nothing comes back.
I know the traffic leaves vm100 and reaches vm105, and that the response comes back, because all of it is visible on vmbr0:
root@bedrock:~# tcpdump -i vmbr0 host 10.20.1.228 and host 10.20.1.220
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vmbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:17:56.342721 IP 10.20.1.220.47944 > 10.20.1.228.8123: Flags [S], seq 3301836575, win 64240, options [mss 1460,sackOK,TS val 3035091219 ecr 0,nop,wscale 7], length 0
09:17:56.342867 IP 10.20.1.228.8123 > 10.20.1.220.47944: Flags [S.], seq 269643130, ack 3301836576, win 65160, options [mss 1460,sackOK,TS val 2921087751 ecr 3035091219,nop,wscale 7], length 0
09:17:57.359655 IP 10.20.1.228.8123 > 10.20.1.220.47944: Flags [S.], seq 269643130, ack 3301836576, win 65160, options [mss 1460,sackOK,TS val 2921088768 ecr 3035091219,nop,wscale 7], length 0
09:17:57.374254 IP 10.20.1.220.47944 > 10.20.1.228.8123: Flags [S], seq 3301836575, win 64240, options [mss 1460,sackOK,TS val 3035092251 ecr 0,nop,wscale 7], length 0
09:17:57.374390 IP 10.20.1.228.8123 > 10.20.1.220.47944: Flags [S.], seq 269643130, ack 3301836576, win 65160, options [mss 1460,sackOK,TS val 2921088782 ecr 3035091219,nop,wscale 7], length 0
09:17:59.407710 IP 10.20.1.228.8123 > 10.20.1.220.47944: Flags [S.], seq 269643130, ack 3301836576, win 65160, options [mss 1460,sackOK,TS val 2921090816 ecr 3035091219,nop,wscale 7], length 0
09:18:01.374173 ARP, Request who-has 10.20.1.228 tell 10.20.1.220, length 28
09:18:01.374313 ARP, Reply 10.20.1.228 is-at 02:8c:44:c6:3f:9c (oui Unknown), length 28
But the traffic never seems to reach vm100's tap device again:
root@bedrock:~# tcpdump -i tap100i0 src 10.20.1.228
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on tap100i0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:18:01.374310 ARP, Reply 10.20.1.228 is-at 02:8c:44:c6:3f:9c (oui Unknown), length 28
09:18:23.075582 IP 10.20.1.228.33981 > 239.255.255.250.1900: UDP, length 324
The only traffic that shows up is some ARP and multicast coming from vm105.
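A possible next check, sketched under assumptions: since the SYN-ACKs are visible on vmbr0 but not on tap100i0, it may be worth confirming which destination MAC they carry, because the bridge forwards by MAC and not by IP. The helpers `dst_mac` and `all_frames_to` below are hypothetical names, and they assume the usual `tcpdump -e` line layout of `<time> <src-mac> > <dst-mac>, ethertype ...`.

```shell
# Hypothetical helper: print the destination MAC of each `tcpdump -e` line on stdin.
# Assumes the layout "<time> <src-mac> > <dst-mac>, ethertype ...".
dst_mac() {
    awk '$3 == ">" { mac = $4; sub(/,$/, "", mac); print tolower(mac) }'
}

# Hypothetical helper: succeed only if every frame on stdin is addressed to MAC $1.
# Note: empty input also counts as success, so check that frames were captured.
all_frames_to() {
    expected=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    dst_mac | awk -v want="$expected" '$0 != want { bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Example use on the PVE host (not run here); BC:24:11:3A:C7:69 is vm100's net0 MAC:
#   tcpdump -e -nn -c 5 -i vmbr0 'src 10.20.1.228 and tcp port 8123' \
#       | all_frames_to BC:24:11:3A:C7:69 && echo "SYN-ACKs target vm100 net0"
```

If the replies turn out to be addressed to some other MAC, that would explain why they never show up on tap100i0; if they do match, the drop is more likely happening between the bridge and the tap device.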
As a test, I've done the same thing in reverse and I believe it's the same result.
From vm105, running "curl http://10.20.1.220":
root@bedrock:~# tcpdump -i vmbr0 host 10.20.1.228 and host 10.20.1.220
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vmbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:27:51.495515 IP 10.20.1.228.54682 > 10.20.1.220.http: Flags [S], seq 3539920993, win 64240, options [mss 1460,sackOK,TS val 2921682903 ecr 0,nop,wscale 7], length 0
09:27:51.495701 IP 10.20.1.228.54682 > 10.20.1.220.http: Flags [S], seq 3539920993, win 64240, options [mss 1460,sackOK,TS val 2921682903 ecr 0,nop,wscale 7], length 0
09:27:51.495895 IP 10.20.1.220.http > 10.20.1.228.54682: Flags [S.], seq 3076438070, ack 3539920994, win 65160, options [mss 1460,sackOK,TS val 1513592998 ecr 2921682903,nop,wscale 7], length 0
09:27:52.495805 IP 10.20.1.228.54682 > 10.20.1.220.http: Flags [S], seq 3539920993, win 64240, options [mss 1460,sackOK,TS val 2921683904 ecr 0,nop,wscale 7], length 0
Nothing ever arrives on tap105i0.
I also have vlan10, which my PCs reside on. I can reach http://10.20.1.228:8123 from 10.10.1.1 and it works fine.
So my conclusion is that there's *something* blocking return traffic between the bridge and some tap devices.