Wrong Join Information IP

If the application later uses IPv6 on that interface, that is a flaw in the application. The admin selected an interface AND an IPv4 address.
That is what must be used. If another address family is used, that is wrong.
That will be used: the address from link0 and any optional further links are written out as-is to corosync.conf, with no transformation whatsoever. If it was IPv4 when selected, that is what will be written there, and vice versa. So this won't change.

Again, I'm really sorry, but this is flawed. I select link0 with an IPv4 address, so that's what must be used. I selected a public IPv4. Proxmox can listen on all interfaces, for sure. No problem. But to send traffic, it must use the selected interfaces and addresses. That does not happen.
If people select links and addresses, those are to be used. I don't see how this is conceptually difficult to understand. People select links; those should be used. If another public address is needed for the join info exchange, ask which address to use, or configure it and write it to a conf somewhere.

And again, that happens. If you select something it will be used! But the pre-selected choice, which is automatically made, will be a result of the gai.conf ordering. Change the pre-selected one to your choice and it will be used...

If you want to change the pre-selected choice, change gai.conf.
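For reference, a quick way to see the ordering that getaddrinfo (and therefore gai.conf) produces for a name is `getent ahosts`. The helper below is only a sketch: `unique_addrs` is a hypothetical name, and it assumes the pre-selected address comes from resolving the node's hostname, which this thread suggests but does not spell out.

```shell
# unique_addrs: hypothetical helper. Reads `getent ahosts <name>` output on
# stdin and prints each address once, in the order the resolver returned it.
unique_addrs() {
    awk '!seen[$1]++ { print $1 }'
}

# Usage on a node (not run here):
#   getent ahosts "$(hostname)" | unique_addrs
```

The first address printed is the one the resolver prefers. On Debian-based systems /etc/gai.conf ships with a commented-out `precedence ::ffff:0:0/96  100` line; uncommenting it should make IPv4 results sort first.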

What are links 1 and 2 etc after link 0 used for and how?
Are they not used for corosync?

They are mainly for redundancy:
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_redundancy

Do I have to put any of the public IPs on these links, or can they all be private?

They can all be private; the nodes just need to be able to talk to each other over all of them with good latency (ideally <2 ms; it may work up to <10 ms).
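A rough way to check a link against those numbers is to ping the peer over it and look at the average round-trip time. This is a minimal sketch, assuming iputils `ping` (summary line starting with `rtt min/avg/max/mdev`); `avg_rtt_ms` and `classify_rtt` are hypothetical helper names, and the bands are just the figures quoted above.

```shell
# avg_rtt_ms: reads iputils `ping` output on stdin and prints the average
# round-trip time in ms from "rtt min/avg/max/mdev = a/b/c/d ms".
avg_rtt_ms() {
    awk -F'/' '/^rtt / { print $5 }'
}

# classify_rtt: maps an RTT in ms onto the rough bands quoted above.
classify_rtt() {
    awk -v rtt="$1" 'BEGIN {
        if (rtt + 0 < 2)       print "ideal"
        else if (rtt + 0 < 10) print "may work"
        else                   print "too high"
    }'
}

# Usage between two nodes (not run here; PEER_IP is a placeholder):
#   classify_rtt "$(ping -c 20 -q PEER_IP | avg_rtt_ms)"
```

A single ping run only shows the idle case; latency under load on a shared network can look very different.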

Will that interfere with something?

Normally the main corosync link is used; if no priorities are set, that is the one with the lowest link ID, e.g., link0. That link should not be shared with IO-related traffic, e.g., the ceph private network or NFS traffic. While corosync does not require much bandwidth, it has hard requirements on latency, and IO traffic is normally of a shape that can quickly interfere with that requirement.

https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_cluster_network_requirements

So we recommend a physically separate network for the main link, but adding other networks as fall back links - in order of traffic expected on them. The separate one could be even just a 100Mbps switch for small node counts.
 
That will be used: the address from link0 and any optional further links are written out as-is to corosync.conf, with no transformation whatsoever. If it was IPv4 when selected, that is what will be written there, and vice versa. So this won't change.
Alright so the links selected when Creating Cluster will be used for corosync and only for corosync.
Another link with a public address from whatever available will be used to exchange join info. Did I understand correctly?

And again, that happens. If you select something it will be used! But the pre-selected choice, which is automatically made, will be a result of the gai.conf ordering. Change the pre-selected one to your choice and it will be used...

If you want to change the pre-selected choice, change gai.conf.
Well, it's easy to acknowledge we have different opinions here. I don't like "pre-selected" choices where the machine has a will over the administrator.
Imo the admin would select a link for this purpose and that link would be used always.


So, configuring from GUI where you don't manually select priorities, is it safe to assume link0 will have a higher priority than link1 and so on even if I don't set the priority manually as in the given examples?

They can all be private; the nodes just need to be able to talk to each other over all of them with good latency (ideally <2 ms; it may work up to <10 ms).

I have a latency of around 2 ms and find that high for a private network, but I've got to tackle one issue at a time lol

Normally the main corosync link is used; if no priorities are set, that is the one with the lowest link ID, e.g., link0. That link should not be shared with IO-related traffic, e.g., the ceph private network or NFS traffic. While corosync does not require much bandwidth, it has hard requirements on latency, and IO traffic is normally of a shape that can quickly interfere with that requirement.

https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_cluster_network_requirements

So we recommend a physically separate network for the main link, but adding other networks as fall back links - in order of traffic expected on them. The separate one could be even just a 100Mbps switch for small node counts.
We're on the same page on having separate networks for different traffic. I mistakenly thought that link0 would be the link used to exchange node information.

So now if I want to take that network out of my cluster config, I edit /etc/pve/corosync.conf and just change the IPs there and it's done?
 
Another link with a public address from whatever available will be used to exchange join info. Did I understand correctly?

An arbitrary address, the "peer address", will be used to make the API calls to exchange the join info and cluster authentication key.
This can be one of those "link0", "link1", ... addresses later used for cluster traffic, or something else. The sole requirement is that the joining node can use it to talk to a cluster node's API (https port 8006), even if only during joining.

So, configuring from GUI where you don't manually select priorities, is it safe to assume link0 will have a higher priority than link1 and so on even if I don't set the priority manually as in the given examples?

Exactly, if there are no priorities configured then the lower ID has the higher priority. I do not like that whole priority-value/importance mix, tbh, it just adds confusion. But most of the time manual priorities are not required anyway; they are just nice to have for network changes, where one wants to shift priority temporarily.
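To make the selection rule concrete, here is a toy sketch of it. It assumes what the pvecm documentation describes: the link with the highest priority value is preferred, and on a tie (or with no priorities set) the lower link number wins. `preferred_link` is a hypothetical helper that only models the rule; it does not query corosync.

```shell
# preferred_link: reads "<linknumber> <priority>" lines on stdin and prints
# the link the rule above would pick: highest priority first, then lowest
# link number as the tie-breaker.
preferred_link() {
    sort -k2,2nr -k1,1n | awk 'NR == 1 { print "link" $1 }'
}

# Example (equal priorities, so the lower ID wins):
#   printf '1 1\n0 1\n' | preferred_link
```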

We're on the same page on having separate networks for different traffic. I mistakenly thought that link0 would be the link used to exchange node information.

In small, simple setups with mostly local storage it could even work out OK, but for anything with more than a few nodes, or with ceph or the like involved, it just creates issues at the worst time one can imagine (e.g., ceph rebalancing due to some OSD fallout leading to an overloaded network -> cluster breaks).

So now if I want to take that network out of my cluster config, I edit /etc/pve/corosync.conf and just change the IPs there and it's done?

Normally, yes, provided the cluster is fully working and you follow https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_edit_corosync_conf (since you already mentioned that one, it should be OK).
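The idea in that section is to avoid editing the live file directly: work on a copy, bump config_version, then move the copy into place. A minimal sketch of the preparation step, assuming a hypothetical helper name `prepare_corosync_edit` and that config_version sits on its own line as in a stock corosync.conf:

```shell
# prepare_corosync_edit: given the path to corosync.conf, writes
#   <path>.bak  an untouched backup of the current config
#   <path>.new  a working copy with config_version incremented by one
# The original file is left as it is.
prepare_corosync_edit() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    awk '$1 == "config_version:" { sub(/[0-9]+/, $2 + 1) } { print }' \
        "$conf" > "$conf.new"
}

# Usage on a node (not run here):
#   prepare_corosync_edit /etc/pve/corosync.conf
#   ... edit /etc/pve/corosync.conf.new, double-check it ...
#   mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf
```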

Just ensure that link0 is on the same network on all nodes, and likewise for link1.
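One way to eyeball that is to list the addresses per link across all nodes. The sketch below assumes the usual nodelist layout with `name:` and `ringX_addr:` entries inside each `node { ... }` block; `list_link_addrs` is a hypothetical helper name.

```shell
# list_link_addrs: prints "link<N> <node name> <address>" for every
# ringN_addr in the nodelist of the given corosync.conf, grouped by link.
list_link_addrs() {
    awk '
        $1 == "node" && $2 == "{" { in_node = 1; name = "?"; n = 0; next }
        in_node && $1 == "name:"  { name = $2 }
        in_node && $1 ~ /^ring[0-9]+_addr:$/ {
            id = $1; sub(/^ring/, "", id); sub(/_addr:$/, "", id)
            n++; ids[n] = id; addrs[n] = $2
        }
        in_node && $1 == "}" {
            for (i = 1; i <= n; i++) print "link" ids[i], name, addrs[i]
            in_node = 0
        }
    ' "$1" | LC_ALL=C sort
}

# Usage on a node (not run here):
#   list_link_addrs /etc/pve/corosync.conf
```

All lines starting with the same link number should then show addresses from the same network.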

To tell for sure, I'd need to check the output of:
Bash:
pvecm status
cat /etc/pve/corosync.conf

That would show whether the cluster is healthy and what the config currently looks like.
 
Ok. Thanks a lot for your patience and for the fruitful explanations.

I did edit corosync.conf in the meantime, as described in the documentation: I edited the file and restarted the service with systemctl restart afterwards.
Apparently it worked just fine.

Code:
Cluster information
-------------------
Name:             MyCluster
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Tue Jun 16 17:12:01 2020
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.b6
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2  
Flags:            Quorate 

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.11.49.1 (local)
0x00000002          1 10.11.49.2

So now... Ceph monitors and why it failed upon initial setup is my next issue.
https://forum.proxmox.com/threads/ceph-errors-on-brand-new-install.71486/

Cheers!
 
Hello!

I had a similar issue, but in my case the join information showed the external adapter's IP instead of the internal IPv4 address.

I was able to edit /etc/hosts and change the entry from the external IP to the internal IP (so that it is the first entry seen, on the second line). After that the join info had the correct IP and other nodes could join with no issue!

Hope this helps someone out there. Thanks for all you guys do
 