Weird cluster issue

Pink_Waters

New Member
Jul 22, 2023
So I have a cluster of 3 nodes: pve1, pve2, and pve3.
Lately I started experiencing a weird issue: when I log into the web interface via pve1's IP address and click on the shell for pve3, I get "connection closed". But if I log into the web interface using pve3's IP address and click on the shell of node pve1, it works fine.
I also cannot migrate VMs to pve3 anymore, as it fails.
I tried a bunch of troubleshooting, beginning with trying to SSH directly from pve1 to pve3; it hangs for a minute or two, then I get "connection closed by ip port 22".
I tried SSH from pve3 to pve1 and it connects fine, so it seems like a one-way issue from pve1 to pve3.
I have tried "pvecm updatecerts" and made sure the SSH keys on each node match, but so far I have not found a solution.
Any thoughts?
 
Sounds like pve3 drops the connection.
Is there a firewall on pve3 that blocks port 22?

Or did you check with "netstat -tulpen" on pve3 whether anything is listening on port 22?
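For example, something like this on pve3 would show it quickly (assuming a stock Debian/PVE install; the service and firewall names here are the usual defaults, adjust if yours differ):

    ss -tlnp | grep ':22'       # is sshd actually listening?
    systemctl status ssh        # is the sshd service running and healthy?
    pve-firewall status         # is the Proxmox firewall active on this node?
    iptables -L -n | grep 22    # any stray rule touching port 22?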
 
I tried a bunch of troubleshooting, beginning with trying to SSH directly from pve1 to pve3; it hangs for a minute or two, then I get "connection closed by ip port 22".
Check MTU consistency, and check with long-duration pings for packet loss. Enable debug on the SSH side (both server and client), and try "iperf" tests between the nodes.
Check for traffic errors with ethtool and netstat.
Replace the cable, try different switch ports, try a direct cable connection between pve3 and pve1; do ssh/iperf/ping work OK?
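Roughly along these lines, as a sketch (iperf3 shown here; the interface name is a placeholder for your actual NICs):

    ping -c 600 -i 1 pve3              # long-duration ping from pve1, watch for loss
    iperf3 -s                          # on pve3
    iperf3 -c pve3 -t 60               # on pve1, sustained throughput test
    ssh -vvv root@pve3                 # verbose client-side SSH debug from pve1
    ethtool -S enp1s0f0 | grep -i err  # per-NIC error/drop counters, repeat per interface
    netstat -i                         # interface-level RX/TX error summary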

This is clearly not a hypervisor issue, but rather a basic network-layer problem. Trace back your steps: what changed prior to the issue appearing?


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I have the firewall turned off.
The only change that happened is that initially I had the nodes connected together in a full mesh using the 10GbE NICs, with a private subnet for corosync, and used the 1GbE ports to connect the nodes to the switch that connects them to the network.
Then I removed the mesh network/cables, connected the nodes to a 10GbE switch that is connected to the network, and fed the 1GbE ports into a network-isolated unmanaged switch for corosync, using a private subnet.

The weird part is that pve2 can SSH into pve1 and pve3 fine, and pve3 can SSH into pve2 and pve1 fine, and pve1 can SSH into pve2; it just cannot SSH into pve3.
 
the only change that happened
Even at the high level that you described it here, that's clearly a very significant network reconfiguration... The most likely explanation is that something is not properly configured, or you have a bad piece of hardware.

Given this additional information, I would start with even more basic troubleshooting, i.e. a configuration review.
What is the output of "ip a" on each node? What is the content of /etc/network/interfaces? Does ssh etc. work over 1G vs 10G?
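For instance, collected side by side from each node so you can simply diff them (nothing PVE-specific here, plain Linux tooling):

    ip a                          # addresses, link state, and MTU per interface
    cat /etc/network/interfaces   # the network config to compare across nodes
    ip route                      # confirm all three nodes route the cluster subnet the same way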

Again, this is not a hypervisor (PVE software) problem. You are dealing with basic Linux network troubleshooting here.

good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I will provide "ip a" output when I am back at work, but it's a simple configuration: each node has 2 LACP-bonded 10GbE cables into the 10GbE switch, which carries the network that SSH is trying to use to connect the nodes.
The other 1GbE cables go into that unmanaged switch just for the corosync network, which should be unrelated to this issue because it uses a totally different subnet for that purpose.
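Roughly, the relevant part of /etc/network/interfaces looks like this on each node (typed from memory, so interface names and addresses are placeholders until I can paste the real output):

    auto bond0
    iface bond0 inet manual
        bond-slaves enp1s0f0 enp1s0f1
        bond-mode 802.3ad
        bond-miimon 100

    auto vmbr0
    iface vmbr0 inet static
        address 192.0.2.11/24
        gateway 192.0.2.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0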

It just beats me how only pve3 can drop incoming SSH from only pve1. I have spent like 8 hours troubleshooting this to no avail. Very frustrating.
 
each node has 2 LACP-bonded 10GbE cables into the 10GbE switch
Your story becomes more and more complex with each post, yet there is light at the end of the tunnel.

LACP uses source and destination IP addresses to hash the traffic onto a selected member interface. Most likely the hash between pve3 and pve1 lands on the part of the channel that is not working properly. Remove one port from the LACP bond at a time and keep testing; that should help you identify which area to concentrate on.
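A sketch of how to go about that (bond0 and the NIC name are placeholders for whatever your nodes actually use):

    cat /proc/net/bonding/bond0   # link state of each member and the configured hash policy
    ip link set enp1s0f1 down     # temporarily take one leg out of the bond on pve1
    ssh root@pve3                 # retest
    ip link set enp1s0f1 up       # bring it back, then repeat with the other leg

Do the same on pve3 and on the matching switch ports, one leg at a time, until the failing combination stands out.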


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Your story becomes more and more complex with each post, yet there is light at the end of the tunnel.

LACP uses source and destination IP addresses to hash the traffic onto a selected member interface. Most likely the hash between pve3 and pve1 lands on the part of the channel that is not working properly. Remove one port from the LACP bond at a time and keep testing; that should help you identify which area to concentrate on.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
That's what I was already laughing about too :)

In his first post, it looked like pve3 was dropping packets for whatever reason...

In his second post, he said that "the only thing" he did was change the whole setup :)
from direct connections between the nodes plus one uplink port to a switch, to basically everything going to a switch and the uplink ports going to a separate unmanaged switch...

In the third post he is using LACP xD
which changes everything again :)

It's funny, tbh :)

The next idea that comes to my mind would be to look at /etc/network/interfaces and the switch config.
And probably to ask him which interface is connected to which port on the switch xD

If I were you, I would offer him paid support, bbgeek xD
 
@bbgeek: That makes sense. I was thinking it might be the bond on the switch side, since I just asked the network admin to set up the port groups identically for me. Now I am doubting that this is the case, or there might be one bad 10GbE cable involved here.

@Ramalama: I don't think it is that funny, honestly. Regardless of whether the setup is direct or through a switch, given the info I have, everything should be right on the switch side. I was dealing with an odd issue that affects SSH from only one node to the other, in just one direction.
The funnier part, I would assume, is that your post adds zero contribution nonetheless.
 
@Ramalama: I don't think it is that funny, honestly. Regardless of whether the setup is direct or through a switch, given the info I have, everything should be right on the switch side.
I did find the progression of your thread quite entertaining. You must admit that, absent all the details, your opening post sent the community on a wild goose chase. I do applaud you for coming clean with all the details; that's not always the case with members who come here to blame PVE for every network issue on earth without disclosing what they did.

I was dealing with an odd issue that affects SSH from only one node to the other, in just one direction.
I am confident that, had you spent more time on basic Linux network troubleshooting, you would have found that it was not just SSH.

Good luck with your network admin.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Problem solved. It was not a bonding issue; it was an OpenSSH bug. I had to either change the MTU down from 9000 to 1400 or simply specify the cipher in the ssh_config file, which I did. It connected right away.
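For reference, the workaround was along these lines in /etc/ssh/ssh_config on pve1 (the exact cipher is just what I recall using, so treat it as an example rather than the definitive fix):

    Host pve3
        Ciphers aes128-ctr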
 
Back in comment #4 MTU was my first suggestion to check.

Problem solved. It was not a bonding issue; it was an OpenSSH bug. I had to either change the MTU down from 9000 to 1400 or simply specify the cipher in the ssh_config file, which I did. It connected right away.
Working around MTU inconsistencies by reducing the packet size of one specific application is a road that leads straight back to:

I have spent like 8 hours troubleshooting this to no avail. Very frustrating.

But if you are happy with this solution, sure, go for it. Chat with you in a week or two about another "weird" network issue :)


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Back in comment #4 MTU was my first suggestion to check.


Working around MTU inconsistencies by reducing the packet size of one specific application is a road that leads straight back to:



But if you are happy with this solution, sure, go for it. Chat with you in a week or two about another "weird" network issue :)


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
It's a bug in OpenSSH. It has been reported for a long time.
MTU is set up for jumbo frames on both the switch and the Proxmox side. Which inconsistencies are you referring to?
If there is something I am missing, I am willing to learn more and do things right.

https://lists.debian.org/debian-ssh/2014/07/msg00007.html
 
It's a bug in OpenSSH. It has been reported for a long time.
MTU is set up for jumbo frames on both the switch and the Proxmox side. Which inconsistencies are you referring to?
If there is something I am missing, I am willing to learn more and do things right.

https://lists.debian.org/debian-ssh/2014/07/msg00007.html
A network problem reported by someone 9 years ago, with no follow-up... Millions of devices are using MTU 9000 and SSH right this second.

What you've reported as the solution tells me the following:
a) You had a jumbo MTU, which further complicates your environment and which you did not disclose before.
b) One of your devices generates an SSH payload that fits into 9000 (assuming this client was correctly configured) but was above the "safe" size for the other client/switch.
c) You changed the SSH cipher, and now the payload of the initial SSH session negotiation fits into an unknown "safe" MTU size.
d) You reported that reducing the MTU to 1400 (on that one client?) "solves" your problem. That indicates that a device upstream of your client was not configured for jumbo MTU properly. Now the initial client restricts itself to a ~1400 packet size (fragmenting packets larger than that) and "safely" fits into the upstream path (switch, inter-switch, next client).
e) There is still the outstanding question of why you found 1400 to be a safe size on, presumably, a "simple" flat one-switch network. Are there VLANs involved as well?

The solution is simple: set the MTU on each port (client, client/switch, inter-switch, server/switch, server) to one consistent value.
All you did for now was change the initial payload for SSH. A few weeks from now, if you generate any reasonable load on the network, you will continue experiencing "weird" issues.
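On the PVE side that means adding the same value to both the bond and the bridge stanzas in /etc/network/interfaces on every node (bond0/vmbr0 being placeholders for your actual names), and setting the identical jumbo MTU on every switch port and inter-switch link in the path:

    iface bond0 inet manual
        ...
        mtu 9000

    iface vmbr0 inet static
        ...
        mtu 9000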


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
That makes sense. Thank you so much for explaining that.
A question that might help me make more sense of this: if all the Proxmox node interfaces on the links that connect to the 10GbE switch are set to jumbo frames, along with the ports on the 10GbE switch being configured for jumbo frames, is that still not enough?
 
A question that might help me make more sense of this: if all the Proxmox node interfaces on the links that connect to the 10GbE switch are set to jumbo frames, along with the ports on the 10GbE switch being configured for jumbo frames,
It's "jumbo frames", a reference to the fact that they are quite a bit larger than the standard 1500-byte frames.
is that still not enough?
This is a trick question, as you don't define "enough". The answer to such questions is always "it depends".

Here is the rule for using above-standard Ethernet frame sizes: all connection points, end to end, must be set to the same size*^^** (a quick check is sketched after the notes below).

* - not only must they be set to the same size, the firmware involved must support it properly, which is not always the case.
^^ - special circumstances can require a reduction of the frame size. As described in the 2014 bug report you found, one person's issue was related to a VPN tunnel, where the tunnel encapsulation further reduced the usable frame size for the client. VLANs are another such possibility.
** - in special advanced cases the MTU may be dissimilar. Anyone who does this should have multiple meetings with their Network Team and have everyone sign off near an Ancient Weirwood prior to implementation that they will never come to a public forum with this knowledge.
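A quick end-to-end sanity check, assuming a target MTU of 9000 (9000 minus 20 bytes of IP header minus 8 bytes of ICMP header leaves 8972 bytes of ping payload):

    ping -M do -s 8972 pve3   # do-not-fragment ping at full jumbo payload; run from every node to every other node
    tracepath pve3            # reports the path MTU actually discovered along the route

If the large ping fails while a normal ping works, some hop in the path is not really configured for 9000.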


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
