Hello,
We have a production cluster of 6 Proxmox VE 4.4-13 servers (no HA) with DRBD9 backend storage, running 20 VMs and 5 LXC containers. Our public IP addresses sit on bridge vmbr0, and the cluster was built on those addresses, so the cluster was reachable on the public network.
At some point we decided to switch the cluster from the public IP addresses (41.213.15.x) to private addresses (10.146.10.x) that are reachable internally through a VPN. We created a second bridge, vmbr1, on each server as an OVS bridge with a VLAN; this bridge carries the private IP address. We modified /etc/hosts and /etc/network/interfaces accordingly, but we forgot to change the IP in the corosync configuration file, which still points to the old address, as shown below.
totem {
  cluster_name: cluster-run
  config_version: 6
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 41.213.15.10
    ringnumber: 0
  }
}
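For completeness, the vmbr1 stanza in /etc/network/interfaces on node1 looks roughly like the sketch below; the physical port name (eth1) and the VLAN tag (10) are only illustrative, and each node uses its own 10.146.10.x address.

# physical uplink attached to the OVS bridge (port name is an example)
allow-vmbr1 eth1
iface eth1 inet manual
    ovs_type OVSPort
    ovs_bridge vmbr1

# the OVS bridge itself
allow-ovs vmbr1
iface vmbr1 inet manual
    ovs_type OVSBridge
    ovs_ports eth1 vlan10

# internal port carrying the private address (VLAN tag is an example)
allow-vmbr1 vlan10
iface vlan10 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr1
    ovs_options tag=10
    address 10.146.10.1
    netmask 255.255.255.0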
As the corosync configuration shows, the cluster is bound to IP address 41.213.15.10, which is no longer in use. One server in the cluster (node2, IP address 41.213.15.7) was rebooted and now cannot rejoin the cluster.
On this node (node2):
root@node2:~# pvecm status
Cannot initialize CMAP service
Syslog says:
node2 pmxcfs[26779]: [quorum] crit: quorum_initialize failed: 2
node2 pmxcfs[26779]: [confdb] crit: cmap_initialize failed: 2
node2 pmxcfs[26779]: [dcdb] crit: cpg_initialize failed: 2
node2 pmxcfs[26779]: [status] crit: cpg_initialize failed: 2
On the other hand, all of the remaining Proxmox servers are still working in the cluster. We will not reboot them for fear of breaking the cluster; we know that if we reboot them there will be a problem.
How is that possible? Is there a cache or a written copy of the configuration somewhere?
On all other nodes, the command pvecm status gives us the following:
Quorum information
------------------
Date:             Tue Jan 30 13:21:20 2018
Quorum provider:  corosync_votequorum
Nodes:            5
Node ID:          0x00000004
Ring ID:          1/13628
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   6
Highest expected: 6
Total votes:      5
Quorum:           4
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 41.213.15.6
0x00000003          1 41.213.15.8
0x00000004          1 41.213.15.74 (local)
0x00000005          1 41.213.15.75
0x00000006          1 41.213.15.76
We are planning to correct the /etc/hosts file on all the Proxmox servers as shown below:
10.146.10.1 node1.cluster.local node1 pvelocalhost
10.146.10.2 node2.cluster.local node2
10.146.10.3 node3.cluster.local node3
10.146.10.4 node4.cluster.local node4
10.146.10.5 node5.cluster.local node5
10.146.10.6 node6.cluster.local node6
Then we would modify the corosync configuration file, changing bindnetaddr to an address in the new private network (10.146.10.x) and incrementing config_version; a sketch of the resulting totem section follows.
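For illustration, the totem section after the change would look roughly like this; only bindnetaddr and config_version change, and 10.146.10.1 here simply stands for an address in the new private network:

totem {
  cluster_name: cluster-run
  config_version: 7    # incremented from 6
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.146.10.1    # example address in the new private network
    ringnumber: 0
  }
}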
We have 25 guests (VMs and containers) running here in production and we would not like to break the whole cluster.
Is there any other configuration we need to change to make sure the cluster keeps working without any problems?
Thanks for your help
Best regards
Shafeek