Cluster nodes are using IPv6 addresses but corosync config shows only 2 IPv4 addresses

Oct 2, 2022
I have a 3-node cluster. Each node has a static IPv4 and IPv6 address and, additionally, a layer 2 IPv4 connection via a switch.

Until one of the recent PVE updates, the cluster information showed the layer 2 IPv4 addresses; now I see the IPv6 address in the node overview.

In parallel I see replication errors where SSH is attempted via the IPv6 address and then fails, because the further communication expects the IPv4 addresses:

corosync:

Code:
nodelist {
  node {
    name: server4
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.10.0
    ring1_addr: 95.216.100.221
  }
  node {
    name: server5
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.10.1
    ring1_addr: 95.216.38.237
  }
  node {
    name: server6
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.0.10.2
    ring1_addr: 65.21.230.42
  }
}

example error message for replication:

Code:
command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=server6' root@2a01:4f9:6a:1cca::2 pvecm mtunnel -migration_network 10.0.10.0/30 -get_migration_ip' failed: exit code 255

What drives me crazy is that this error does not happen all the time, but only 2-5 times per day with a replication that runs at a 15-minute interval.

Does anyone have an idea why the ring0 IP is not used for the server?
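For reference, one way to check which addresses the node name resolves to locally (this assumes the standard glibc getent utility, with server6 standing in for the affected node). SSH resolves names via getaddrinfo, which normally prefers IPv6 when both an IPv4 and an IPv6 entry exist for the same name:

Code:
# list all addresses the resolver returns for the node name;
# the order reflects which address getaddrinfo would hand to ssh first
getent ahosts server6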
 
What is the content of your /etc/hosts file?
 
It contains all IPv4 and IPv6 entries, starting with the IPv4 ones.

Code:
# /etc/hosts
127.0.0.1 localhost.localdomain localhost
10.0.10.2 server6.lm-consult.com server6
65.21.230.42 server6.lm-consult.com server6

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
2a01:4f9:6a:1cca::2 server6.lm-consult.com server6
 

I think the problem here is that you have multiple IPs for the same hostname, which, from what I can tell, probably leads to this issue. It would make sense to use a different FQDN and hostname for each IP (e.g. server6-ipv6), otherwise they can clash. After you have made sure that only one entry exists per hostname in /etc/hosts, restart the pve-cluster service via systemctl restart pve-cluster. The changed hosts file should then get picked up and migration should work as expected.

You should also be able to check which IP gets used for the SSH connection by looking at the contents of /etc/pve/.members.
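To make that concrete, here is a minimal sketch of what the adjusted /etc/hosts on server6 could look like, assuming the 10.0.10.2 ring0 address should stay associated with the plain server6 name; the server6-public and server6-ipv6 names are only examples:

Code:
# /etc/hosts on server6 - one distinct hostname per address
127.0.0.1 localhost.localdomain localhost
10.0.10.2 server6.lm-consult.com server6
65.21.230.42 server6-public.lm-consult.com server6-public
2a01:4f9:6a:1cca::2 server6-ipv6.lm-consult.com server6-ipv6
# (the standard IPv6 loopback/multicast lines stay as they are)

Afterwards, restart the cluster service and verify which IP PVE now uses per node:

Code:
systemctl restart pve-cluster
cat /etc/pve/.members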
 
@shanreich thanks for the hints. For my understanding, why is it not recommended to use the same FQDN for IPv4 and IPv6? I think it's a pretty common pattern, right?
Yes, it is quite a common pattern and usually poses no problems with other systems. For Proxmox specifically there are some peculiarities in how our clustering works, which require the hostname entries in /etc/hosts to be unique. Because the PVE cluster does not handle this case gracefully in some instances, it can lead to problems with PVE.
 
