I set up a test cluster a few months ago with older servers. Everything worked as expected - no problems.
So I decided to buy 3 new machines for a production cluster and of course... things started to get strange.
I have a few public /24 networks, and mixing them on the old test cluster was no problem.
So I did the same on the new servers.
The nodes have e.g. x.y.132.10, x.y.132.11, x.y.132.12
And some test CTs have IPs like x.y.132.48 or x.y.134.48 or x.y.135.48
When I start a CT I can ping it from the outside for a while, roughly 30 seconds. Then the ping stops (it just waits forever).
If I open a console of that CT via the web GUI and start a ping from the CT to the outside (e.g. ping 8.8.8.8), then suddenly I can also ping the CT from the outside again, i.e. the stalled ping from above continues.
It looks something like this (x.y.134.48 is the IP of the CT; note the jumps in icmp_req, e.g. from 312 to 586):
...
64 bytes from x.y.134.48: icmp_req=310 ttl=53 time=6.65 ms
64 bytes from x.y.134.48: icmp_req=311 ttl=53 time=43.4 ms
64 bytes from x.y.134.48: icmp_req=312 ttl=53 time=8.32 ms
64 bytes from x.y.134.48: icmp_req=586 ttl=53 time=7.17 ms
64 bytes from x.y.134.48: icmp_req=587 ttl=53 time=7.06 ms
64 bytes from x.y.134.48: icmp_req=588 ttl=53 time=7.45 ms
64 bytes from x.y.134.48: icmp_req=589 ttl=53 time=39.0 ms
64 bytes from x.y.134.48: icmp_req=590 ttl=53 time=6.65 ms
64 bytes from x.y.134.48: icmp_req=591 ttl=53 time=7.15 ms
64 bytes from x.y.134.48: icmp_req=592 ttl=53 time=7.99 ms
64 bytes from x.y.134.48: icmp_req=593 ttl=53 time=7.62 ms
64 bytes from x.y.134.48: icmp_req=594 ttl=53 time=6.74 ms
64 bytes from x.y.134.48: icmp_req=595 ttl=53 time=32.1 ms
64 bytes from x.y.134.48: icmp_req=596 ttl=53 time=7.03 ms
64 bytes from x.y.134.48: icmp_req=597 ttl=53 time=7.53 ms
64 bytes from x.y.134.48: icmp_req=598 ttl=53 time=7.50 ms
64 bytes from x.y.134.48: icmp_req=599 ttl=53 time=6.78 ms
64 bytes from x.y.134.48: icmp_req=1155 ttl=53 time=8.54 ms
64 bytes from x.y.134.48: icmp_req=1156 ttl=53 time=6.87 ms
64 bytes from x.y.134.48: icmp_req=1157 ttl=53 time=7.11 ms
64 bytes from x.y.134.48: icmp_req=1158 ttl=53 time=7.29 ms
64 bytes from x.y.134.48: icmp_req=1159 ttl=53 time=49.7 ms
...
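To put numbers on those gaps, something like the following could be used. It is only a minimal sketch, assuming the ping output has been saved as text; `find_gaps` is a hypothetical helper name, and the sample is just a few lines around the first gap from the output above.

```python
import re

def find_gaps(lines):
    """Return (last_answered, next_answered, missing) for each jump in the sequence."""
    seqs = []
    for line in lines:
        # Older ping prints icmp_req=, newer versions print icmp_seq=
        m = re.search(r"icmp_(?:req|seq)=(\d+)", line)
        if m:
            seqs.append(int(m.group(1)))
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > 1:
            gaps.append((prev, cur, cur - prev - 1))
    return gaps

# A few lines around the first gap in the output above
sample = """\
64 bytes from x.y.134.48: icmp_req=311 ttl=53 time=43.4 ms
64 bytes from x.y.134.48: icmp_req=312 ttl=53 time=8.32 ms
64 bytes from x.y.134.48: icmp_req=586 ttl=53 time=7.17 ms
64 bytes from x.y.134.48: icmp_req=587 ttl=53 time=7.06 ms
""".splitlines()

for prev, cur, missing in find_gaps(sample):
    print(f"no reply between icmp_req={prev} and icmp_req={cur}: {missing} requests unanswered")
```

If ping was running with its default interval of one request per second, the number of unanswered requests is roughly the outage length in seconds.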
The 3 new machines are attached to the same switch as the old test cluster. Could that cause problems?
I also gave the new machines additional 10Gbit NICs, but those are all on a private range (10.10.1.x).
/etc/network/interfaces on one of the new nodes:

auto lo
iface lo inet loopback

iface eno1 inet manual

auto ens6
iface ens6 inet static
        address 10.10.1.10
        netmask 255.255.255.0

iface eno2 inet manual

auto vmbr0
iface vmbr0 inet static
        address x.y.132.125
        netmask 255.255.255.0
        gateway x.y.132.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

Any suggestions? Thanks!