Entries on file /etc/hosts - need clarification

stefanodm
May 20, 2023
Hello Community, I have a cluster with 5 nodes (currently) and I need clarification about hostname resolution.
Each node has 4 NICs. During setup I configured one network interface for the web UI, one for the cluster network, one for a bridge dedicated entirely to the VMs, and finally one for the Ceph cluster.
Code:
10.10.25.x/8 Management Network
10.10.35.x/8 Cluster Network
10.10.30.x/8 Guests Network
172.16.10.x/16 Ceph Network
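As an aside on those prefixes: with a /8 mask, the 10.10.25.x and 10.10.35.x ranges are not separate networks at all but the same 10.0.0.0/8 network, which is why things "work" despite the inconsistent /etc/hosts entries. A quick sketch with Python's standard `ipaddress` module illustrates the overlap (addresses taken from the thread):

```python
import ipaddress

# With the /8 mask, a "management" and a "cluster" address
# land in one and the same IP network.
mgmt = ipaddress.ip_interface("10.10.25.1/8")
cluster = ipaddress.ip_interface("10.10.35.1/8")
print(mgmt.network == cluster.network)        # True - both are 10.0.0.0/8

# With /24 masks they become two distinct networks.
mgmt24 = ipaddress.ip_interface("10.10.25.1/24")
cluster24 = ipaddress.ip_interface("10.10.35.1/24")
print(mgmt24.network == cluster24.network)    # False
```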

Code:
pvecm status
Cluster information
-------------------
Name:             Cluster
Config Version:   11
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Tue Aug 22 17:54:43 2023
Quorum provider:  corosync_votequorum
Nodes:            5
Node ID:          0x00000005
Ring ID:          1.52d
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   5
Highest expected: 5
Total votes:      5
Quorum:           3 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.10.35.1
0x00000002          1 10.10.35.2
0x00000003          1 10.10.35.3
0x00000004          1 10.10.35.5
0x00000005          1 10.10.35.4 (local)
Code:
root@pve4:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.35.1
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.35.2
  }
  node {
    name: pve3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.35.3
  }
  node {
    name: pve4
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 10.10.35.4
  }
  node {
    name: pve5
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.10.35.5
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: Cluster
  config_version: 11
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

I want to premise that everything is working fine, although most likely only because of the 255.0.0.0 netmask. Looking at /etc/hosts, some nodes resolve the hostname to 10.10.25.x while others resolve it to 10.10.35.x, which is the cluster network. I can't figure out where this difference comes from. The DNS server on the LAN always resolves a node name to the address of its management UI, but I can't work out which mapping between hostname and IP address is the right one to set in /etc/hosts. The first 3 nodes have Ceph installed, but since it has separate interfaces/networks I don't think that could cause problems.

[Screenshots: cluster overview; note the "server address" column]
I'm also asking because a change of the interface netmasks to 255.255.255.0 is planned, and I believe that, as things stand now, this could cause serious problems.
Thanks in advance to anyone who can offer suggestions.
 
What are the contents of your /etc/hosts files right now?
 
Hello LuKas, as I wrote, in /etc/hosts some nodes resolve the name to 10.10.25.x and others to 10.10.35.x, which is the cluster network.
Here two examples:

Code:
127.0.0.1 localhost.localdomain localhost
10.10.25.1 pve1.domain.local pve1
Code:
127.0.0.1 localhost.localdomain localhost
10.10.35.3 pve3.domain.local pve3

The former points to management network
The latter points to cluster network

Which one should be the right one?
 
Which one should be the right one?

I'd recommend setting all to resolve to the IP on the management network.
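Concretely, that would mean every node's /etc/hosts maps its own name to its management address. For example, on pve3 (domain and addresses taken from the examples above):

```
127.0.0.1 localhost.localdomain localhost
10.10.25.3 pve3.domain.local pve3
```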

Also, somewhat unrelated: consider setting up a redundant network configuration for the cluster engine (corosync). This is useful and recommended if you are running HA-enabled guests in your cluster. With HA guests, a node will fence itself (reboot) if communication on the cluster network is disturbed. A second ring for corosync in a separate network solves that.
In your case you could create the second ring in your management network.

https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pvecm_redundancy
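A sketch of what one node stanza in corosync.conf could look like with a second link on the management network (addresses assumed from the thread; when editing /etc/pve/corosync.conf you also need to increment config_version in the totem section, as the linked guide describes):

```
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.35.1
    ring1_addr: 10.10.25.1
  }
```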
 
Thank you for your support, it's precious. What about narrowing the cluster network from 10.10.35.x/8 to 10.10.35.x/24, as it should have been from the beginning? Is there a specific procedure, or is simply changing the values in the networking config sufficient?
 
Is there a specific procedure, or is simply changing the values in the networking config sufficient?
I think it should be enough to just change the network configuration. If you encounter any breakage, it should hopefully be easy to switch back.
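For reference, the netmask change would be a one-line edit per node in /etc/network/interfaces; the interface name below is a placeholder, and it is prudent to change one node at a time so the cluster keeps quorum. With ifupdown2 (the Proxmox default on recent versions) the change can be applied with `ifreload -a`, otherwise a reboot works:

```
# /etc/network/interfaces (hypothetical interface name, pve4 shown)
auto eno2
iface eno2 inet static
    address 10.10.35.4/24    # was 10.10.35.4/8 (netmask 255.0.0.0)
```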
 