[SOLVED] Moving cluster operation traffic to another network

kobuki

Renowned Member
Dec 30, 2008
I made a mistake when I created a 5-node cluster. I added all nodes by IP to the cluster and `pvecm status` now shows:

Code:
Cluster information
-------------------
Name:             cluster2
Config Version:   5
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Sep 25 14:42:15 2023
Quorum provider:  corosync_votequorum
Nodes:            5
Node ID:          0x00000001
Ring ID:          1.4d
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   5
Highest expected: 5
Total votes:      5
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.15.15.11 (local)
0x00000002          1 10.15.15.12
0x00000003          1 10.15.15.13
0x00000004          1 10.15.15.15
0x00000005          1 10.15.15.16

The cluster is operational; the problem is that the IPs on the 10.15.15.0/24 subnet don't resolve to the nodes' actual host names. Each of them resolves to a separate host name assigned to that host, because I want to keep the cluster traffic and the admin network traffic apart - they are on different physical interfaces.

The IPs and host names look like this:

Host names for the cluster/corosync IPs:
Code:
pve1c.whatever.local -> 10.15.15.11
pve2c.whatever.local -> 10.15.15.12
etc.

"Real" host names for the admin access network (10.15.2.0/24):
Code:
pve1.whatever.local -> 10.15.2.11
pve2.whatever.local -> 10.15.2.12
etc.

The latter is what `hostname -f` returns on the nodes. Consequently, Proxmox uses that hostname and the associated 10.15.2.0/24 network for every cluster operation aside from corosync. That's not what I want: this network isn't meant for cluster operations like migrations and replication, and it's only gigabit, while the cluster network is 10G.
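A quick way to check which address a node will use for that traffic is to look at what its own FQDN resolves to (the host names and IPs below are just the example ones from above):

Code:
# FQDN that Proxmox derives the node address from
hostname -f
# -> pve1.whatever.local

# IP that name resolves to (via /etc/hosts or DNS)
getent hosts pve1.whatever.local
# -> 10.15.2.11      pve1.whatever.local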

Now, my question is: can I simply change the host names so that pveX.whatever.local (the host names without the trailing 'c') point to the cluster IPs shown in `pvecm status`? Would that cause any issues? There is also a Ceph cluster configured on the nodes, on its own dedicated networks, but Ceph has its own separate network settings, so I don't think it would be affected. Production VMs are already running on the cluster, unfortunately. Worst case, I could schedule full downtime.
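For context, Ceph pins its networks in ceph.conf via the public_network and cluster_network options, independently of the node host names - something like the sketch below, with made-up subnets for illustration:

Code:
[global]
    # client/monitor traffic (made-up subnet for illustration)
    public_network = 10.15.20.0/24
    # OSD replication and heartbeat traffic (made-up subnet)
    cluster_network = 10.15.21.0/24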
 
Well, for whoever finds this thread among the other similar ones: the issue was solved by fixing /etc/hosts so the host names point to the desired subnet, then restarting pve-cluster.service followed by corosync.service. Replication and other heavy traffic is now flowing on the network it was meant for. NB: it also fixed the IPs in /etc/pve/.members.
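In case it helps someone, this is roughly what the fix looks like on one node - host names and IPs are the example ones from above, so adjust them to your own setup. First the /etc/hosts entry for the node itself:

Code:
# /etc/hosts on pve1 (sketch): the node's own FQDN now resolves to the cluster subnet
10.15.15.11     pve1.whatever.local pve1

Then restart the services in that order on every node and sanity-check:

Code:
systemctl restart pve-cluster.service
systemctl restart corosync.service

# verify the name now resolves to the cluster IP and the cluster is quorate
getent hosts "$(hostname -f)"
pvecm status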
 
