Broken Pipe 596 on cluster after changing IPs

harmonyp (Member) · Nov 26, 2020
Our public IPs are 202.55.21.xxx. I am running the following to switch the cluster from public to internal IPs:

Code:
killall -9 corosync
systemctl stop pve-cluster
systemctl stop pvedaemon
systemctl stop pvestatd
systemctl stop pveproxy


sed -i 's/202.55.21/10.0.10/g' /etc/corosync/corosync.conf
sed -i 's/202.55.21/10.0.10/g' /etc/hosts


killall -9 corosync
systemctl restart pve-cluster
systemctl restart pvedaemon
systemctl restart pvestatd
systemctl restart pveproxy
systemctl restart corosync.service

The hostnames ping to the LAN IPs (we keep the same final octet) and I can connect as root@node2, for example, but for some reason the GUI shows Broken Pipe (596).
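For reference, a quick sanity check after the sed pass might look like the block below. Note that on Proxmox the authoritative config lives at /etc/pve/corosync.conf and is only visible while pve-cluster (pmxcfs) is running, so the grep against it assumes the cluster filesystem is mounted again.

Code:
# confirm the hostnames now resolve to the internal addresses
getent hosts NODE1 NODE2 NODE3

# confirm both the local and the cluster-wide corosync config carry the new IPs
grep ring0_addr /etc/corosync/corosync.conf /etc/pve/corosync.conf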
 
SSH host keys might be the problem.
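If it is the host keys, a cleanup along these lines might help; the address is just an example from the ranges above, and /etc/pve/priv/known_hosts is assumed to be the cluster-wide known_hosts that Proxmox maintains.

Code:
# drop host-key entries still tied to an old public IP (example address)
ssh-keygen -R 202.55.21.221

# drop a stale entry for a node from the cluster-wide known_hosts on /etc/pve
ssh-keygen -R NODE2 -f /etc/pve/priv/known_hosts

# re-merge the keys and regenerate node certificates
pvecm updatecerts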
 
@DerDanilo @Moayad

/etc/hosts

Code:
10.0.10.220    NODE1.mydomain.com      NODE1
10.0.10.221    NODE2.mydomain.com      NODE2
10.0.10.222    NODE3.mydomain.com      NODE3

/etc/pve/corosync.conf

Code:
logging {
  debug: off
  to_syslog: yes
}


nodelist {
  node {
    name: NODE1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.10.220
  }
  node {
    name: NODE2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.10.221
  }
  node {
    name: NODE3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.0.10.222
  }
}


quorum {
  provider: corosync_votequorum
}


totem {
  cluster_name: NODE-UK
  config_version: 3
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
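One way to confirm corosync has actually switched to the 10.0.10.x link (and is not still holding the old addresses in memory) would be something like:

Code:
# local link address and link status as corosync sees them
corosync-cfgtool -s

# quorum and membership as seen by the cluster
pvecm status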


There is a connection between the nodes:

[Screenshot: SSH sessions between the nodes; NODE3 prompts for a password]


With everything stopped, I have tried running:

Code:
mv /root/.ssh/known_hosts /root/.ssh/known_hosts_old
mv /root/.ssh/authorized_keys /root/.ssh/authorized_keys_old
pvecm updatecerts -f
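One assumption worth checking here: pvecm updatecerts works through /etc/pve, which is only mounted while pve-cluster (pmxcfs) is running, so if "everything stopped" includes pve-cluster the command cannot do its job. A sequence like this might be needed:

Code:
systemctl start pve-cluster    # mounts /etc/pve (pmxcfs)
pvecm updatecerts -f           # regenerate certs and re-merge SSH keys
systemctl restart pveproxy     # pick up the regenerated certificates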

I noticed that when I ssh into one of the other cluster nodes on its 10.0.10.x address it takes 3-5 seconds to connect, while anything else in the 10.0.10.x range connects instantly. I tried changing the MTU between 1500 and 9000 with no difference.
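A 3-5 second stall usually points at name resolution or authentication negotiation rather than MTU. A rough way to narrow it down (the options below are standard OpenSSH settings, nothing Proxmox-specific):

Code:
# watch where the handshake pauses
ssh -vvv root@10.0.10.221 true

# common culprits worth testing:
#   on the target node, /etc/ssh/sshd_config:  UseDNS no
#   on the client,      /etc/ssh/ssh_config:   GSSAPIAuthentication no
# then reload sshd on the target: systemctl reload sshd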
 
Thank you for the output!

Please, can you provide us with the following information (see the sketch after this list):
- Do you have access to the nodes, i.e., can the nodes see each other?
- Do you see any error messages in the syslog (/var/log/syslog) or in journalctl -f? (you can attach the syslog as well :) )
- Does the pvecm updatecerts -f command return an error?
- From the screenshot you provided above, I see that NODE3 asks for a password, and it shouldn't.
- Can you please also post the output of `pvecm status`?
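For reference, something like this should gather most of what is asked above (units and paths are the usual Proxmox/Debian ones):

Code:
# recent errors from the services involved
journalctl -u corosync -u pve-cluster -u pvedaemon -u pveproxy --since "-15min"

# live view while reproducing the Broken Pipe error in the GUI
tail -f /var/log/syslog

# certificate regeneration and quorum state
pvecm updatecerts -f
pvecm status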
 
