Broken Pipe 596 on cluster after changing IPs

harmonyp

Member
Nov 26, 2020
Our public IPs are 202.55.21.xxx. I am running the following to switch the cluster from public to internal IPs:

Code:
# stop the cluster services before touching the config
killall -9 corosync
systemctl stop pve-cluster
systemctl stop pvedaemon
systemctl stop pvestatd
systemctl stop pveproxy

# rewrite the public prefix to the internal one
sed -i 's/202.55.21/10.0.10/g' /etc/corosync/corosync.conf
sed -i 's/202.55.21/10.0.10/g' /etc/hosts

# bring everything back up
killall -9 corosync
systemctl restart pve-cluster
systemctl restart pvedaemon
systemctl restart pvestatd
systemctl restart pveproxy
systemctl restart corosync.service

The hostnames ping to the LAN IPs (we keep the same last octet) and I can connect as root@node2, for example, but for some reason the GUI shows Broken Pipe (596).
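For reference, a quick way to double-check that corosync actually came back up on the new addresses after the restarts (standard corosync/PVE commands, nothing specific to this setup):

Code:
corosync-cfgtool -s      # ring/link status and the address corosync is bound to
pvecm status             # cluster membership and quorum as PVE sees it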
 
SSH host keys might be the problem.
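If that is the cause, a minimal sketch of clearing the stale entries and redistributing the PVE-managed keys (the address and hostname below are only examples based on this thread):

Code:
# drop old known_hosts entries for the previous public addresses (example values)
ssh-keygen -f /root/.ssh/known_hosts -R 202.55.21.221
ssh-keygen -f /root/.ssh/known_hosts -R NODE2.mydomain.com
# regenerate and redistribute certificates/known_hosts across the cluster
pvecm updatecerts -f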
 
@DerDanilo @Moayad

/etc/hosts

Code:
10.0.10.220    NODE1.mydomain.com      NODE1
10.0.10.221    NODE2.mydomain.com      NODE2
10.0.10.222    NODE3.mydomain.com      NODE3
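Just to confirm which source actually answers for those names on each node, something like this can be run (getent follows the nsswitch.conf ordering, so it shows what SSH and corosync will get):

Code:
getent hosts NODE2      # should print 10.0.10.221 if the /etc/hosts entry wins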

/etc/pve/corosync.conf

Code:
logging {
  debug: off
  to_syslog: yes
}


nodelist {
  node {
    name: NODE1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.10.220
  }
  node {
    name: NODE2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.10.221
  }
  node {
    name: NODE3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.0.10.222
  }
}


quorum {
  provider: corosync_votequorum
}


totem {
  cluster_name: NODE-UK
  config_version: 3
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
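One general Proxmox/corosync note, since only the file content is shown here: the cluster-wide copy normally lives in /etc/pve/corosync.conf and pmxcfs syncs it out to /etc/corosync/corosync.conf on each node, so edits are usually made there with config_version bumped. A rough sketch of applying a change:

Code:
# edit /etc/pve/corosync.conf and increase config_version, then:
corosync-cfgtool -R      # ask all corosync instances to reload the config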


There is a connection between the nodes:

[Screenshot: SSH sessions between the nodes; NODE3 prompts for a password]


With everything stopped I have tried running:

Code:
mv /root/.ssh/known_hosts /root/.ssh/known_hosts_old
mv /root/.ssh/authorized_keys /root/.ssh/authorized_keys_old
pvecm updatecerts -f

I noticed that when I SSH into one of the other cluster nodes on 10.0.10.x it takes 3-5 seconds to connect, while anything else in the 10.0.10.x range connects instantly. I tried changing the MTU between 1500 and 9000 with no difference.
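A 3-5 second stall that only affects the cluster nodes sounds more like a name-resolution or GSSAPI timeout than an MTU issue; a rough way to narrow it down (plain OpenSSH, nothing Proxmox-specific):

Code:
ssh -vvv root@10.0.10.221 true                          # watch which debug step the handshake hangs on
ssh -o GSSAPIAuthentication=no root@10.0.10.221 true    # rule out a GSSAPI/Kerberos lookup delay
sshd -T | grep -i usedns                                # on the target node: is sshd doing reverse-DNS lookups?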
 
Thank you for the output!

Please, can you provide us with the following information (the sketch below gathers most of it):
- Do you have access to the nodes, i.e., can the nodes see each other?
- Do you see error messages in the syslog (/var/log/syslog) or in journalctl -f? (you can attach the syslog as well :))
- Does the pvecm updatecerts -f command return an error?
- From the screenshot you provided above, I see that NODE3 asks for a password, and it shouldn't.
- Can you please also post the output of `pvecm status`?
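For convenience, a small sketch of collecting that output in one go (the time window and output path are only examples):

Code:
pvecm status
pvecm updatecerts -f
journalctl -u corosync -u pve-cluster --since "1 hour ago" > /tmp/cluster.log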
 