Change IP of cluster node

Bubbagump210

Member
Oct 1, 2020
I have a situation where a "test" cluster moved to production out from under me and I need to give a few of the nodes proper IPs. I have read a few threads here that make it sound easy, but I am getting close to making a big mess. The one thread (this one) said to edit:

/etc/network/interfaces
/etc/hosts

on the node you want to update then edit

/etc/pve/corosync.conf

on a node in the cluster assuming quorum was right.

This didn't work at all and I have one node completely in outer space. It appears I am going to have to reinstall. I'd like to not have to do this again. :)

Soo, is there an updated procedure for this?
 
It should work if you change the network configuration first and then Corosync.

This didn't work at all and I have one node completely in outer space. It appears I am going to have to reinstall. I'd like to not have to do this again. :)
What does the syslog say on the node where the configuration was changed?
 
It should work if you change the network configuration first and then Corosync.


What does the syslog say on the node where the configuration was changed?
I ended up reinstalling to get the node back so the logs were lost.

Let me be super sure I understand the process as it sounds like it *should* be easy.

1. On the node to have a new IP - Edit /etc/network/interfaces and /etc/hosts to reflect the new IP. /etc/hosts should ONLY have the new IP.
2. On the node to have the new IP - Run 'ifdown someinterface; ifup someinterface' to apply the new IP.

At this point the cluster is broken as none of the nodes can see their old partner.

3. On a node in the cluster with quorum - Edit /etc/pve/corosync.conf. Change the IP of the node to the new IP and increment the config_version. This should then propagate to all of the nodes that are currently showing in the cluster with 'pvecm status'.

Where I went down a rabbit hole: as soon as I changed the IP of the node, it of course didn't get updated configs from the greater cluster, since /etc/pve (pve-cluster) was essentially broken. Then I was in the weeds running 'pmxcfs -l' and editing /etc/pve/corosync.conf on the new-IP node to match the rest of the cluster. I would reboot, and the node with the changed IP simply wouldn't join the greater cluster. So I am missing something major OR way overcomplicating it, and probably both.
 
At this point the cluster is broken as none of the nodes can see their old partner.
You have to adjust the expected votes with the `pvecm expected 1` command so the node can regain quorum; also, if HA is configured, you should stop it until you finish the operation.
 
So run 'pvecm expected 1' on the node with the new IP? It will then get its new corosync.conf from the main cluster?

And when you say HA, are you meaning if shared storage is used for VM failover? In this case, I don't have shared storage.
 
I tried again and this is what I am getting in the syslog on a node in the cluster:

Code:
Mar 18 13:31:29 pve-firewall2 pmxcfs[924]: [dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 28)
Mar 18 13:31:29 pve-firewall2 corosync[1044]:   [CFG   ] Config reload requested by node 4
Mar 18 13:31:29 pve-firewall2 corosync[1044]:   [TOTEM ] new config has different address for link 0 (addr changed from 192.168.73.102 to 192.168.73.105). Internal value was NOT changed.
Mar 18 13:31:29 pve-firewall2 corosync[1044]:   [CFG   ] Cannot configure new interface definitions: To reconfigure an interface it must be deleted and recreated. A working interface needs to be available to corosync at all times
Mar 18 13:31:29 pve-firewall2 pmxcfs[924]: [dcdb] crit: corosync-cfgtool -R failed with exit code 7#010

The node with the changed IP never got a new corosync even with 'pvecm expected 1'.
 
So run 'pvecm expected 1' on the node with the new IP? It will then get its new corosync.conf from the main cluster?
No, this tells Corosync there is a new expected-votes value, so the node can reach quorum even though the other votes are missing.

And when you say HA, are you meaning if shared storage is used for VM failover? In this case, I don't have shared storage.
No, I mean High Availability (the HA stack), not shared storage.


The node with the changed IP never got a new corosync even with 'pvecm expected 1'.
Did you restart the corosync service (systemctl restart corosync.service)? And did you compare /etc/pve/corosync.conf between the nodes to check that the IPs are correct?
 
Ok, I have this figured out, and the result is largely why I started this thread - every procedure I have found is incomplete or makes major assumptions that may not be obvious. So, future Googlers, here is the deal. Do this in this order on the respective nodes.

NODE = the node that is getting the new IP.
CLUSTER = all other Proxmox nodes that will maintain quorum and can talk to one another throughout this procedure.
ONE CLUSTER = any one single node within CLUSTER

On NODE

1. Edit /etc/pve/corosync.conf.
2. Update the IP for NODE.
3. Increment
Code:
config_version:

This change should push out a new corosync.conf to all nodes in CLUSTER. Confirm all nodes in CLUSTER have the new /etc/pve/corosync.conf. At this point the cluster will be broken. If you run
Code:
 pvecm status
on the NODE, you will see it can't find the rest of the nodes in the cluster. If you run
Code:
 pvecm status
on CLUSTER you will see they can all see each other but NODE is missing.
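
For orientation, here is a minimal sketch of what that corosync.conf edit might look like, reusing the 192.168.73.102 -> 192.168.73.105 addresses from the syslog posted earlier; the node name "pve3", the nodeid, the cluster name, and the version numbers are made-up examples, not values from my cluster.
Code:
nodelist {
  node {
    # hypothetical NODE; address moves from 192.168.73.102 to .105
    name: pve3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.73.105
  }
  # ... other node { } entries stay unchanged ...
}

totem {
  cluster_name: mycluster
  # bump this, e.g. 28 -> 29
  config_version: 29
  # ... rest of totem stays unchanged ...
}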

Still on NODE
1. Edit /etc/network/interfaces and update the address to the new IP (a sample stanza is sketched after this list).
2. Edit /etc/hosts and update the IP to the new IP.
3.
Code:
ifdown vmbr0; ifup vmbr0
to get your interface to have the new static IP. Change "vmbr0" to the name of your interface.
4. Restart corosync and pve-cluster.
Code:
systemctl restart corosync
Code:
systemctl restart pve-cluster
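
For reference, this is roughly what the relevant pieces might look like after steps 1 and 2, again using a made-up node "pve3" with the new address 192.168.73.105; the bridge port, prefix length, gateway, and domain below are placeholders you would swap for your own values. In /etc/network/interfaces:
Code:
auto vmbr0
iface vmbr0 inet static
        address 192.168.73.105/24
        gateway 192.168.73.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
and in /etc/hosts:
Code:
127.0.0.1 localhost.localdomain localhost
192.168.73.105 pve3.example.local pve3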

On CLUSTER
1. Restart corosync on EVERY member of CLUSTER.
Code:
systemctl restart corosync
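
If CLUSTER has more than a couple of members, one way to knock this out quickly is a small shell loop over their hostnames; the names below are placeholders, not nodes from this thread.
Code:
# restart corosync on every remaining cluster member (example hostnames)
for host in pve1 pve2 pve4; do
    ssh root@"$host" systemctl restart corosync
done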

At this point
Code:
pvecm status
should show all nodes as being in the cluster, good quorum, and NODE has its proper IP. Be patient as this can take a minute. To be extra sure, run
Code:
cat /etc/pve/.members
on NODE and this should show all the correct IPs.
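
As an extra sanity check on NODE, you can also look at corosync directly; both of these are standard commands, nothing specific to this procedure.
Code:
# show the local node's link status (should reflect the new address)
corosync-cfgtool -s
# follow the corosync log while the cluster re-forms
journalctl -fu corosync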

Additional cleanup (a catch-all grep is sketched after this list).

On NODE:

1. Optional: Edit /etc/issue. Update to the new IP on NODE. This ensures the console login screen shows the right IP.
2. Edit /etc/pve/storage.cfg and update any references to the old NODE IP - likely only an issue if you run PVE and PBS next to each other.
3. Optional: Edit /etc/pve/priv/known_hosts and update the IP of NODE.
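
To catch anything the list above misses, a quick grep for the old address over the usual config locations is handy; the address below is just an example, substitute the real old IP of NODE.
Code:
# 192.168.73.102 stands in for the old IP of NODE
grep -r "192.168.73.102" /etc/hosts /etc/issue /etc/network/interfaces /etc/pve/ 2>/dev/null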

Other weirdness: In this process I have found sometimes VMs and containers lose their network connection and need to be rebooted. I haven't found a good way to avoid this or fix it beyond a VM/CT reboot. If anyone has an idea to make this 100% zero downtime (or near zero downtime), let me know and I'll add that step.
 
[Quoted the full step-by-step procedure from the post above.]

So this looks like it assumes you're doing the re-IP at the same time you're doing the above.
Does anyone have any idea what to do if you've already re-IP'd the cluster devices and moved them to another site with a completely new IP address scheme and no way to use the old IPs?
 
[Quoted the full step-by-step procedure from the post above.]
Do I need to update the hosts file on the other nodes in the cluster?
 
[Quoted the full step-by-step procedure from the post above.]
Thank you so much. I would love for the Proxmox VE team to add this to the docs. I followed this guide and it worked easily on my 3 node cluster.
 
[Quoted the full step-by-step procedure from the post above.]

This is exactly what I was trying to do - thank you for sharing. Following the steps as outlined worked flawlessly!
 
That recipe did not work for me.
I needed to modify the "secondary IP" on a node.
Whenever I tried to write the edit, 'journalctl -fu corosync' complained:
"Cannot configure new interface definitions: To reconfigure an interface it must be deleted and recreated. ..."
My solution was to create an intermediary config with everything related to "secondary IP" removed.
Write that modification and see it immediately propagate through the cluster.
Then reintroduce the original config with the single modification I wanted.
Write and propagate again - this time without complaints.

File to edit: 'nano /etc/pve/corosync.conf'.
The file can be edited without stopping anything, and the effects can be monitored meanwhile in other terminals on every cluster node with:
'journalctl -fu corosync'

---------------

The three stages of the config - before, intermediary, and after - with irrelevant parts removed:
Stuff in <> is not literal but explanatory.

---- BEFORE: ----
nodelist {
  node {
    name: blambox
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.10.1.12
    ring1_addr: 10.102.1.12
  }
  node {
    name: frenchie
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.1.10
    ring1_addr: 10.2.1.10
  }
  node {
    name: mediamox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.1.3
    ring1_addr: 10.2.1.3
  }
}

totem {
  cluster_name: intermox
  config_version: <X>
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv4
  link_mode: passive
  secauth: on
  version: 2
}


---- INTERMEDIARY: ----
nodelist {
  node {
    name: blambox
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.10.1.12
    <--- REMOVED>
  }
  node {
    name: frenchie
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.1.10
    <--- REMOVED>
  }
  node {
    name: mediamox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.1.3
    <--- REMOVED>
  }
}

totem {
  cluster_name: intermox
  config_version: <INCREMENT>
  interface {
    linknumber: 0
  }
  <--- REMOVED>
  ip_version: ipv4
  link_mode: passive
  secauth: on
  version: 2
}



---- AFTER: ----
nodelist {
  node {
    name: blambox
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.10.1.12
    ring1_addr: 10.2.1.12 <--- CHANGED>
  }
  node {
    name: frenchie
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.1.10
    ring1_addr: 10.2.1.10
  }
  node {
    name: mediamox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.1.3
    ring1_addr: 10.2.1.3
  }
}

totem {
  cluster_name: intermox
  config_version: <INCREMENT>
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv4
  link_mode: passive
  secauth: on
  version: 2
}
 
I don't know how you all got this right, because on my 7.4 system the files in /etc/pve are read-only. Any advice?
 
You have to adjust the expected votes with the `pvecm expected 1` command so the node can regain quorum; also, if HA is configured, you should stop it until you finish the operation.
I ended up here when I had trouble after changing IPs in a cluster. Things seem to be good to go now, but I wanted to confirm that
Code:
pvecm expected 1
is a temporary change, is that right? It appears as if my cluster is back to expecting 3 votes with quorum set to 2...
thanks for reading
 
