One out of three nodes networking issue - node unreachable

damjank

Active Member
Apr 2, 2020
Hello,

I have 3 identical servers running the same version of Proxmox, all of them fully upgraded.
Each has 2 network cards installed: one is an on-board quad-port 10Gb card and the other is a PCIe quad-port SFP+ 10Gb card. I have created two bridges, each with all 4 ports assigned. Both bridges have a static address defined, on different subnets, naturally. Also, each card, and in turn each vmbr, is connected to a different switch. Each switch is connected to the router. The router runs Sophos OS and has all LAN traffic allowed, no restrictions.

What I want is for vmbr0 to carry general VM traffic, while the SFP+ cards handle only storage (Ceph) traffic. Since all Ceph traffic should be local, whether the switch is connected to the FW (or not) is immaterial. The problem is comms between the nodes; let's call them node1, node2 and node3. If I ping node2 and node3 from node1 - no traffic, nothing, nada, dead end. If I ping node1 from the other nodes, same result. If I ping node2 from node3 and vice versa, it works properly. So only node1 is isolated. If I do a traceroute from node2 to node3, it immediately hops to that node, since it is actually local. But if I do a traceroute towards node1 or from node1, it first hops to the GW and then does not go any further. All 3 nodes show an identical routing table. The switch is an Arista DCS-7124S without any special config or restrictions.

Please help, as I am pulling my hair out (not that I have much left to go) trying to find the issue. Any suggestion or insight is welcome! Thanks!!
 
But if I do traceroute towards node1 or from node1 it first hops to GW and then it does not go forward
Hi,
if the packets are routed to the gateway rather than directly to the other node in the same subnet, this might indicate that your subnet mask is not configured correctly. Please verify this by checking the output of ip addr, and check whether your nodes show up as neighbors in the output of ip neigh.
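To illustrate why a wrong mask sends traffic to the gateway, here is a minimal Python sketch of the on-link decision. It is a simplification: it models only one connected route plus a default gateway, not the kernel's full routing table. The function name next_hop and all addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical and not from this thread.

```python
import ipaddress

def next_hop(local_cidr: str, destination: str, gateway: str) -> str:
    """Return "direct" if the destination lies inside the local interface's
    subnet (on-link), otherwise return the gateway address.

    Simplified model: one connected route plus a default route.
    """
    iface = ipaddress.ip_interface(local_cidr)
    dest = ipaddress.ip_address(destination)
    return "direct" if dest in iface.network else gateway

# Hypothetical addresses for illustration only.
print(next_hop("192.0.2.10/28", "192.0.2.11", "192.0.2.1"))   # same /28 as the sender
print(next_hop("192.0.2.10/28", "192.0.2.200", "192.0.2.1"))  # outside the sender's /28
```

If a peer that should be local comes back with the gateway address here, the address and prefix length shown by ip addr on the nodes do not place them in the same subnet.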

In general, please post your configs and relevant information for others to help and spot potential configuration issues (best in code tags for better readability).
 
So I have triple-checked - all of the bridges had a /28 netmask - no differences. I tried /26 and /25 netmasks - it did not work. Once I switched to a /24 netmask, it started to work. No clue how or why - I wanted to shrink the network so as not to unnecessarily use 255 IPs, but if it works now, then let that be it. Still, I wonder what could cause it not to work when the netmask is anything other than /24.

Thank you for pointing this out; at least it's working now.
 
My guess: the IP address assigned to that one particular node was outside the subnet defined by the network mask.
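A quick way to test that guess is to check, for each prefix length, whether all node addresses land in the same network. The sketch below uses made-up addresses (the real ones were never posted in this thread), chosen so that only /24 covers all three nodes; shared_subnet is a hypothetical helper name.

```python
import ipaddress

def shared_subnet(addresses, prefix_len):
    """Return the common network if every address falls into the same
    subnet at the given prefix length, otherwise None."""
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return networks.pop() if len(networks) == 1 else None

# Hypothetical addresses: node1 sits far away from node2 and node3.
nodes = {"node1": "192.0.2.200", "node2": "192.0.2.10", "node3": "192.0.2.11"}

for prefix in (28, 26, 25, 24):
    print(f"/{prefix}: {shared_subnet(nodes.values(), prefix)}")
```

With these example addresses, node2 and node3 share a subnet at every prefix length tried, while node1 only joins them at /24, which would match the symptoms described above.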
 
