Hello,
I have this setup:
Router (openWRT) <--> Unifi USW2.5G 5 switch <--> Proxmox VE
PVE is installed on a mini PC with 2 NICs. NIC 1 is connected to my primary network.
NIC 2 carries the bridge (vmbr1) that is used by the SDN.
Problem: Most VLANs are not working.
I have 3 zones, named LXCNet, OCP, and BMC.
Each zone has one or more subnets.
My main issue is that I get zero consistency in reachability inside these zones.
Most of the time I can fire up an LXC container in the BMC network and reach it, or reach SOME of the VNets inside the OCP zone. The LXCNet zone fails 99% of the time.
I have tried setting up different VLAN IDs and different subnets, but I still get no consistency. Sometimes VNet A works, but making any change makes it unusable.
Obviously I need some help here.
The configuration is the following:
JSON:
{
"fabrics": {
"ids": {}
},
"controllers": {
"ids": {}
},
"version": 18,
"subnets": {
"ids": {
"BMC-192.168.101.0-28": {
"type": "subnet",
"vnet": "BMCnet",
"gateway": "192.168.101.1"
},
"LXCNet-192.168.201.0-25": {
"vnet": "LXCSub",
"type": "subnet",
"gateway": "192.168.201.1"
},
"OCP-192.168.101.32-28": {
"gateway": "192.168.101.33",
"vnet": "Node",
"type": "subnet"
},
"OCP-192.168.101.128-25": {
"vnet": "VM",
"type": "subnet",
"gateway": "192.168.101.129"
},
"OCP-192.168.101.64-29": {
"gateway": "192.168.101.65",
"vnet": "External",
"type": "subnet"
},
"OCP-192.168.101.48-28": {
"vnet": "Storage",
"type": "subnet",
"gateway": "192.168.101.49"
},
"OCP-192.168.101.16-28": {
"type": "subnet",
"vnet": "Prov"
}
}
},
"vnets": {
"ids": {
"External": {
"tag": 105,
"zone": "OCP",
"type": "vnet",
"alias": "OCP External Network"
},
"VM": {
"zone": "OCP",
"tag": 106,
"type": "vnet",
"alias": "OCP VM Network"
},
"Storage": {
"zone": "OCP",
"tag": 104,
"alias": "OCP Storage Network",
"type": "vnet"
},
"Node": {
"zone": "OCP",
"tag": 103,
"type": "vnet",
"alias": "OCP Node management"
},
"Prov": {
"alias": "OCP Provisioning",
"type": "vnet",
"zone": "OCP",
"tag": 102
},
"BMCnet": {
"alias": "BMC Network",
"type": "vnet",
"tag": 101,
"zone": "BMC"
},
"LXCSub": {
"tag": 201,
"zone": "LXCNet",
"alias": "LXC Network",
"type": "vnet"
}
}
},
"zones": {
"ids": {
"OCP": {
"type": "vlan",
"ipam": "pve",
"bridge": "vmbr1"
},
"LXCNet": {
"bridge": "vmbr1",
"ipam": "pve",
"type": "vlan"
},
"BMC": {
"bridge": "vmbr1",
"ipam": "pve",
"type": "vlan"
}
}
}
}
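To double-check which VNet and VLAN tag each test address should land in, here is a small Python sketch that maps an IP to the subnets from the config above. The subnet table is copied by hand from the JSON; the helper name lookup and the list of test addresses (the ones I try further down) are only my own illustration, not anything Proxmox provides.

```python
import ipaddress

# Subnet -> (VNet, VLAN tag), copied by hand from the SDN config above.
SUBNETS = {
    "192.168.101.0/28": ("BMCnet", 101),
    "192.168.101.16/28": ("Prov", 102),
    "192.168.101.32/28": ("Node", 103),
    "192.168.101.48/28": ("Storage", 104),
    "192.168.101.64/29": ("External", 105),
    "192.168.101.128/25": ("VM", 106),
    "192.168.201.0/25": ("LXCSub", 201),
}


def lookup(ip):
    """Return (vnet, tag) for the subnet containing ip, or None (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    for cidr, (vnet, tag) in SUBNETS.items():
        if addr in ipaddress.ip_network(cidr):
            return vnet, tag
    return None


if __name__ == "__main__":
    # Addresses used in the tests further down in this post.
    for ip in ("192.168.101.4", "192.168.101.36", "192.168.101.68",
               "192.168.101.132", "192.168.201.4"):
        print(ip, lookup(ip))
```

By this mapping 192.168.101.68 falls in the External VNet (tag 105) and 192.168.101.132 in the VM VNet (tag 106), going only by the subnets listed in the config.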
ip link shows all the interfaces up, including the VLANs on the bridge.
Apart from the Provisioning network, which has no gateway, I am trying to see which traffic works and which doesn't.
So, I have this tcpdump running in PVE:
Code:
tcpdump -i enp4s0.VLANID host LXC_IP and '(port 22 or icmp)'
Now, I stop the LXC, assign it an IP from each VNet in turn, and then try to reach it from my PC:
So, the output is this:
Code:
# 101, OK:
ssh root@192.168.101.4
The authenticity of host '192.168.101.4 (192.168.101.4)' can't be established.
ED25519 key fingerprint is: SHA256:86xHkdSc+s/+/YADr+OObLUNsVmj5s7g3MszFeEid+Q
This host key is known by the following other names/addresses:
# 103, NOK:
ssh root@192.168.101.36
ssh: connect to host 192.168.101.36 port 22: Connection refused
tcpdump on the PVE shows NO traffic for 103.
On the router, I see this:
Code:
09:50:02.451091 IP 192.168.0.3.45726 > 192.168.101.36.22: Flags [S], seq 2871565335, win 64240, options [mss 1460,sackOK,TS val 810257169 ecr 0,nop,wscale 10], length 0
09:50:02.451210 IP 192.168.101.36.22 > 192.168.0.3.45726: Flags [R.], seq 0, ack 2871565336, win 0, length 0
Code:
#104, NOK
ssh root@192.168.101.68
ssh: connect to host 192.168.101.68 port 22: Connection refused
Again, I see nothing on the PVE, and the router's tcpdump again shows a RST:
Code:
tcpdump -ni br-lan 'host 192.168.101.68 and tcp port 22'
10:40:37.751717 IP 192.168.0.3.59584 > 192.168.101.68.22: Flags [S], seq 1082890419, win 64240, options [mss 1460,sackOK,TS val 251469019 ecr 0,nop,wscale 10], length 0
10:40:37.751798 IP 192.168.101.68.22 > 192.168.0.3.59584: Flags [R.], seq 0, ack 1082890420, win 0, length 0
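One thing I can measure from the router captures is how quickly the RST follows the SYN. The Python sketch below just subtracts the tcpdump timestamps; the helper name delta_us is my own, and the timestamps are copied from the two captures above.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"  # tcpdump's default time-of-day timestamp format


def delta_us(t_first, t_second):
    """Microseconds between two same-day tcpdump timestamps (hypothetical helper)."""
    d = datetime.strptime(t_second, FMT) - datetime.strptime(t_first, FMT)
    return d // timedelta(microseconds=1)


if __name__ == "__main__":
    # SYN and RST timestamps for 192.168.101.36
    print(delta_us("09:50:02.451091", "09:50:02.451210"))
    # SYN and RST timestamps for 192.168.101.68
    print(delta_us("10:40:37.751717", "10:40:37.751798"))
```

That gives 119 and 81 microseconds. Together with the PVE seeing no traffic at all, this may hint that the reset is generated close to the capture point and not by the container, but that is only my reading of the timing and I have not confirmed it.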
Code:
ssh root@192.168.101.132
The authenticity of host '192.168.101.132 (192.168.101.132)' can't be established.
So, this works!
Finally:
Code:
ssh root@192.168.201.4
ssh: connect to host 192.168.201.4 port 22: No route to host
Router says this:
Code:
tcpdump -ni br-lan 'host 192.168.201.4 and tcp port 22'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-lan, link-type EN10MB (Ethernet), capture size 262144 bytes
10:46:59.930394 IP 192.168.0.3.44192 > 192.168.201.4.22: Flags [S], seq 4172331600, win 64240, options [mss 1460,sackOK,TS val 2207098193 ecr 0,nop,wscale 10], length 0
10:47:00.935243 IP 192.168.0.3.44192 > 192.168.201.4.22: Flags [S], seq 4172331600, win 64240, options [mss 1460,sackOK,TS val 2207099198 ecr 0,nop,wscale 10], length 0
10:47:01.959230 IP 192.168.0.3.44192 > 192.168.201.4.22: Flags [S], seq 4172331600, win 64240, options [mss 1460,sackOK,TS val 2207100222 ecr 0,nop,wscale 10], length 0
10:47:02.983247 IP 192.168.0.3.44192 > 192.168.201.4.22: Flags [S], seq 4172331600, win 64240, options [mss 1460,sackOK,TS val 2207101246 ecr 0,nop,wscale 10], length 0
So, this drives me nuts.
The router has the same configuration for all the VLANs.
On the switch, I unplugged the PVE host from its port and connected my other server, a RHEL box, on which I configured all these VLANs. I have ZERO problems assigning IPs to VMs running on RHEL, which suggests the switch configuration is correct as well.
So, what is wrong with the Proxmox setup?