I've spent too many hours and pulled out too much hair over this already.
TLDR: Migrating a VM fails because sshd on the target node listens on IPv6 only (as desired and configured), while the migration attempts the connection over IPv4.
All I want is for the connection to go over IPv6, or over the VPN IP.
The error given is something like:
Code:
ssh: connect to host 13.23.7.13 port 22: Connection refused
TASK ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=enaq' root@13.23.7.13 pvecm mtunnel -migration_network 2a01:48:16:402::2/64 -get_migration_ip' failed: exit code 255
In an attempt without a migration network, the attempted command was:
Code:
/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=enaq' root@13.23.7.13 /bin/true
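A hand-rolled variant of that test, with the target's ring0 IPv6 address from corosync.conf substituted for the IPv4 (this is my own construction, not something Proxmox generates), is what I'd expect the migration to run instead:
Code:
/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=enaq' root@2a01:48:21:2a2::2 /bin/true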
Background:
I joined node2 to node1 and created the cluster while both nodes could still communicate over IPv4 and IPv6.
Later, a policy revision stated that IPv4 would not be used for cluster communication, except for the VPN address.
Now neither node is reachable via ssh over IPv4.
The nodes both have a VPN IP in the range 10.10.0.0/16, but their IPv6 addresses are "in different networks", so to speak, because each node sits in its own /64. I'm not sure whether that is problematic in any way.
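To make the "different networks" part concrete, here is a throwaway check of my own (plain Perl, not PVE code) showing that the two ring0 addresses from corosync.conf do not share a /64, so, as far as I understand the option, a single /64 can't serve as a migration network that covers both nodes:
Code:
#!/usr/bin/perl
# Throwaway check: do two IPv6 addresses share the same /64?
use strict;
use warnings;
use Socket qw(inet_pton AF_INET6);

sub in_prefix64 {
    my ($addr, $prefix) = @_;
    # compare the first 64 bits (8 bytes) of the packed addresses
    return substr(inet_pton(AF_INET6, $addr), 0, 8)
        eq substr(inet_pton(AF_INET6, $prefix), 0, 8);
}

# the two ring0 addresses from corosync.conf, checked against node1's /64
for my $addr ('2a01:48:16:402::2', '2a01:48:21:2a2::2') {
    printf "%-20s in 2a01:48:16:402::/64 ? %s\n",
        $addr, in_prefix64($addr, '2a01:48:16:402::') ? 'yes' : 'no';
}
It prints yes for the first address and no for the second, which is what I mean by "different networks".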
Things I tried:
- Changed corosync.conf so that each node has only one IP, the v6 one. Also set the ip_version option to ipv6; previously it was ipv4-6.
- Setting different values for the "Migration Network" option in the cluster options didn't help. I tried a VPN IPv4, a public IPv6, and the default setting of *nothing*. I also tried both the secure and insecure types.
- Restarted both nodes simultaneously (all 2 of them) after making the changes above.
- Studied the Proxmox source code to figure out where the IP comes from, but I couldn't dig past a certain depth. The chain I followed (a standalone re-run of the last step is sketched after this list):
  - $ip = PVE::Cluster::remote_node_ip($node);
  - if (my $ip = $nodelist->{$nodename}->{ip}) {
  - eval { @resolved_raw = PVE::Tools::getaddrinfo_all($hostname); };
  - my ($err, @res) = Socket::getaddrinfo($hostname, '0', \%hints);
- I also went into the C code (I think it was `server.c`) that fetches the `nodelist` structure over the network, and into the code that parses `corosync.conf`... ah, I got lost there.
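That last resolution step can be reproduced outside of Proxmox. A minimal sketch of my own (the SOCK_DGRAM hint and the bare node name 'enaq' are assumptions on my part, not copied from the PVE source) that just prints every address the node name resolves to:
Code:
#!/usr/bin/perl
# Standalone sketch: resolve a node name the same basic way the quoted
# Socket::getaddrinfo call does, and print every address that comes back.
use strict;
use warnings;
use Socket qw(getaddrinfo getnameinfo AF_UNSPEC SOCK_DGRAM NI_NUMERICHOST NIx_NOSERV);

my $hostname = shift // 'enaq';   # node name as it appears in corosync.conf

my %hints = (family => AF_UNSPEC, socktype => SOCK_DGRAM);
my ($err, @res) = getaddrinfo($hostname, '0', \%hints);
die "getaddrinfo($hostname): $err\n" if $err;

for my $ai (@res) {
    my ($nerr, $addr) = getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
    next if $nerr;
    print "$addr\n";
}
If that prints 13.23.7.13 (e.g. because of an /etc/hosts entry), it would at least match the address the failing ssh command is using.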
Appendix:
/etc/pve/corosync.conf with IPv4s removed:
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ola
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 2a01:48:16:402::2
  }
  node {
    name: enaq
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 2a01:48:21:2a2::2
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: Dolphin
  config_version: 2
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv6
  link_mode: passive
  secauth: on
  version: 2
}
/etc/pve/datacenter.cfg:
Code:
console: html5
email_from: proxmox@example.com
keyboard: en-us
max_workers: 6
migration: type=secure
next-id: lower=100,upper=254
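For reference, when a migration network is set, the migration line above gains a network key; a sketch, assuming the property-string syntax from the datacenter.cfg documentation and using node1's /64 purely as an example value:
Code:
migration: type=secure,network=2a01:48:16:402::/64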