Error 596 : Issue with communication between two nodes (1 and 2), but no issues between nodes 3 and 4

sbe

Hello everyone,

I'm experiencing a communication issue between two of my Proxmox nodes (1 and 2), while everything works fine between nodes 3 and 4.
The communication test is being performed through the web GUI using the SHELL menu.

Here’s a summary of the situation:

  • pve 1 => 2: ERROR 596
  • pve 1 => 3 and 4: OK
  • pve 2 => 1: ERROR 596
  • pve 2 => 3 and 4: OK
  • pve 3 => 1, 2, 4: OK
  • pve 4 => 1, 2, 3: OK
In the shell, I’m getting the following errors:
Code:
kex_exchange_identification: read: Connection reset by peer
Connection reset by <pve-ip> port 22

I’m also seeing this error in the GUI:
Connection error 596: Connection reset by peer

Already tested:
I’ve tried connecting via SSH using the following command:
Code:
ssh -o "HostKeyAlias=pveX" root@<pve-ip>

Note: An upgrade was recently done on pve 1 and pve 2 via the GUI.

pve1 = v8.2.7
pve2 = v8.1.4
pve3 = v8.1.4
pve4 = v8.1.4

Any advice or troubleshooting steps would be greatly appreciated.

Thanks in advance!
 
Thank you so much for your suggestion, I really appreciate your help!

I performed a manual test to check time synchronization across the servers:

Code:
root@pve1:~# date
Thu Oct 10 02:30:28 PM CEST 2024

root@pve2:~# date
Thu Oct 10 02:30:28 PM CEST 2024

root@pve3:~# date
Thu Oct 10 02:30:29 PM CEST 2024

root@pve4:~# date
Thu Oct 10 02:30:29 PM CEST 2024
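
Since `date` only shows whole seconds and the commands above were typed one after another, a slightly more systematic comparison could look like the sketch below. It is only an illustration: `max_clock_skew` is a hypothetical helper name, and the input format (one `host epoch_seconds` pair per line) is an assumption, not something Proxmox provides.

```shell
# Hypothetical helper: reads "host epoch_seconds" lines on stdin and
# prints the spread (max - min) in seconds across all hosts.
max_clock_skew() {
  awk 'NF >= 2 {
         if (n == 0 || $2 < min) min = $2
         if (n == 0 || $2 > max) max = $2
         n++
       }
       END { if (n) print max - min }'
}

# Possible usage from one node (assumes root SSH access to every node):
#   for h in pve1 pve2 pve3 pve4; do
#     echo "$h $(ssh root@"$h" date +%s)"
#   done | max_clock_skew
```

The SSH round trip adds some delay to each reading, so a spread of a second or two does not by itself indicate a synchronization problem.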


After 24 hours, the issue has evolved without any maintenance operations on the nodes.
  • pve2 => pve1: Now working fine (there was a request to update the SSH certificate).
  • pve1 => pve2: I’m still getting an error in the shell interface: Host key verification failed.
    However:
  • pve1 => pve2 works fine for all other menus.
After running the command ssh -o 'HostKeyAlias=pve2' root@X.X.X.12 from pve1, everything is working again.
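
The two error messages seen in this thread point in different directions: `kex_exchange_identification: ... Connection reset by peer` means the connection was dropped before the SSH key exchange even started, while `Host key verification failed` means the client rejected the key the server presented. A rough sketch for telling them apart when testing several node pairs follows; `classify_ssh_error` and its output labels are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: reads ssh error output on stdin and prints a
# coarse label for the kind of failure.
classify_ssh_error() {
  msg="$(cat)"
  case "$msg" in
    *"Host key verification failed"*)
      # Client side: known_hosts entry missing or not matching.
      echo "host-key" ;;
    *"kex_exchange_identification"*|*"Connection reset by"*)
      # Connection closed before key exchange: look at the target's
      # sshd, firewall, or anything in between.
      echo "reset-before-kex" ;;
    *)
      echo "unknown" ;;
  esac
}

# Possible usage from pve1 (same command as above, stderr captured):
#   ssh -o 'HostKeyAlias=pve2' root@X.X.X.12 true 2>&1 | classify_ssh_error
```

A `host-key` result is usually fixed on the connecting node, whereas `reset-before-kex` is more likely to be explained by the logs of the node being connected to.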

Could it be that after running several commands, we just needed to wait for a few hours? What’s your opinion on this?
Does that look good to you?

I'm changing the status of this issue, even though the root cause wasn’t identified.

Thanks in advance!
 
I don't have any other ideas. Maybe the logs will show why the nodes failed to communicate before. :/

Happy to see that it's solved :)
 
Hello everyone,

The same problem has recurred, with no maintenance operation since.

Any advice or troubleshooting steps would be greatly appreciated.

Thanks in advance!
 
OK,
hi again,

I don't really have any more ideas, but here are some things to investigate...

Network things:
- Do the affected PVE hosts perhaps have an IP address that is duplicated by another VM or device on the same network?
- Same thing for MAC addresses.
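
One way to look for a duplicated IP is to collect the neighbour (ARP) tables from several nodes and see whether the same IP shows up with more than one MAC address. The sketch below assumes the usual `ip neigh show` line format (`<ip> dev <iface> lladdr <mac> <state>`); `find_ip_conflicts` is a hypothetical helper name.

```shell
# Hypothetical helper: reads merged `ip -4 neigh show` output from one or
# more nodes on stdin and prints every IP seen with more than one MAC.
find_ip_conflicts() {
  awk '{
         for (i = 1; i < NF; i++) if ($i == "lladdr") {
           ip = $1; mac = tolower($(i + 1))
           if (!((ip, mac) in seen)) {
             seen[ip, mac] = 1
             count[ip]++
             macs[ip] = macs[ip] " " mac
           }
         }
       }
       END { for (ip in count) if (count[ip] > 1) print ip ":" macs[ip] }'
}

# Possible usage (assumes root SSH access to the nodes that still work):
#   for h in pve3 pve4; do ssh root@"$h" ip -4 neigh show; done | find_ip_conflicts
```

An empty result only means no conflict was visible in those tables at that moment; entries expire, so it may be worth repeating the check while the problem is occurring.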

Firewall / secure login things:

- Have you made any specific changes to the firewall on your hosts? Same thing for Fail2Ban or similar software that may have blacklisted your hosts.
- If VLANs are deployed, check the ARP table and the firewall/routing on the network hardware.

Check that root is allowed to log in over SSH in the SSH service configuration.

Have you checked all SSH connections manually with the root SSH key? (ssh -i /path/to/ssh-priv-key root@pvehost)
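
For the Fail2Ban point, a quick check is whether the other node's IP appears in the ban list. The sketch below assumes Fail2Ban is installed and that the jail is named `sshd` (both are assumptions, nothing in this thread confirms them); `ip_is_banned` is a hypothetical helper that parses the `Banned IP list:` line of the status output.

```shell
# Hypothetical helper: $1 is the IP to look for; stdin is the output of
# `fail2ban-client status <jail>`. Returns 0 if the IP is listed as banned.
ip_is_banned() {
  awk -v ip="$1" '
    /Banned IP list:/ {
      sub(/.*Banned IP list:[ \t]*/, "")
      n = split($0, a, /[ \t]+/)
      for (i = 1; i <= n; i++) if (a[i] == ip) found = 1
    }
    END { exit !found }'
}

# Possible usage on pve2, checking whether pve1's address is banned
# (replace <pve1-ip> with the real address):
#   fail2ban-client status sshd | ip_is_banned <pve1-ip> && echo "pve1 is banned"
```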
 
