[SOLVED] changing corosync ring0 address and adding ring 1 notes

RobFantini

Famous Member
May 24, 2012
Boston,Mass
Hello, I was having issues changing the corosync ring 0 address.

Before that I had set up ring 1. It seemed to be set up correctly, but was not.

See the last post for the solution.

I am removing useless lines from this thread. I'll leave the error lines from syslog in case that helps people searching for similar issues.
 
The wiki says:
Your hosts file entry for the corosync hostname and the one in ring0_addr from corosync.conf do not match or could not be resolved.

Fix them up and reboot/restart. If you need to change something in corosync.conf but have no write permissions see Write config when not quorate.

So it looks like a reboot/restart is needed.

Is there a systemctl restart of some service that can be tried before resorting to shutting down and starting all nodes? Answer, added later: when adding a ring or changing an address, all nodes need to be rebooted.
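
To check the wiki's point about name resolution, one could list every ringN_addr value in corosync.conf and try to resolve each one. This is only a rough sketch: the helper name list_ring_addrs is made up for illustration, and /etc/pve/corosync.conf is assumed to be the config path on a Proxmox node.

```shell
# list_ring_addrs FILE
# Print "ringN_addr VALUE" for every ringN_addr entry in a corosync.conf.
# Hypothetical helper, not part of corosync or Proxmox.
list_ring_addrs() {
    awk '$1 ~ /^ring[0-9]+_addr:$/ { sub(/:$/, "", $1); print $1, $2 }' "$1"
}

# Example use on a node (assumes the standard Proxmox config path).
# For literal IPs, getent only succeeds if /etc/hosts or reverse DNS
# knows the address, so a miss here is a hint, not proof of a problem.
#   list_ring_addrs /etc/pve/corosync.conf | while read -r key addr; do
#       getent hosts "$addr" >/dev/null || echo "cannot resolve $key $addr"
#   done
```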
 
One thing was wrong with the config. I found out when a node was restarted.
Code:
Oct 26 09:23:14 pve9 corosync[4664]:  [MAIN  ] parse error in config: 2 is too many configured interfaces for the rrp_mode setting none.
Oct 26 09:23:14 pve9 corosync[4664]: error   [MAIN  ] parse error in config: 2 is too many configured interfaces for the rrp_mode setting none.
So I checked Wolfgang's corosync.conf in the network forum, and this was needed to fix that:
Code:
rrp_mode: passive

The above error was not in syslog when the ring1 network was added to corosync.conf. It only showed up once a node was restarted.
 

So after office hours I'll try using the new conf with the changed ring0 address.

Adding rrp_mode did not fix it:
Code:
Oct 26 17:26:14 pve2 corosync[4125]: notice  [CFG   ] Config reload requested by node 7
Oct 26 17:26:14 pve2 corosync[4125]: notice  [CFG   ] Modified entry 'totem.rrp_mode' in corosync.conf cannot be changed at run-time
Oct 26 17:26:14 pve2 corosync[4125]:  [CFG   ] Config reload requested by node 7
Oct 26 17:26:14 pve2 corosync[4125]:  [CFG   ] Modified entry 'totem.rrp_mode' in corosync.conf cannot be changed at run-time
Oct 26 17:26:14 pve2 corosync[4125]: crit    [VOTEQ ] configuration error: nodelist or quorum.expected_votes must be configured!
Oct 26 17:26:14 pve2 corosync[4125]: crit    [VOTEQ ] will continue with current runtime data
Oct 26 17:26:14 pve2 corosync[4125]:  [VOTEQ ] configuration error: nodelist or quorum.expected_votes must be configured!
Oct 26 17:26:14 pve2 corosync[4125]:  [VOTEQ ] will continue with current runtime data
Oct 26 17:26:14 pve2 pmxcfs[887717]: [status] notice: update cluster info (cluster name  20170226, version = 63)
 
After reverting to the prior config, there is a clue:
Code:
Oct 26 17:28:20 pve2 corosync[4125]:  [CFG   ] Config reload requested by node 7
Oct 26 17:28:20 pve2 corosync[4125]:  [CFG   ] Modified entry 'totem.rrp_mode' in corosync.conf cannot be changed at run-time
Oct 26 17:28:20 pve2 pmxcfs[887717]: [status] notice: update cluster info (cluster name  20170226, version = 64)

Edit:
I will try this when I can schedule a little downtime tomorrow:
copy the new config into place, then run
Code:
corosync-cfgtool -R

From the man page:
-R Tell all instances of corosync in this cluster to reload corosync.conf

That is a nice utility. However, AFAIK rebooting the nodes is still needed after adding a ring or changing network/addresses.
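
When preparing the new config it is easy to forget to raise config_version in the totem section (the "version = 63" / "version = 64" in the pmxcfs log lines above). A small sketch of a helper that does the bump on a copy; the name bump_config_version and the /root paths in the usage comment are only illustrative assumptions.

```shell
# bump_config_version FILE
# Print FILE with the config_version value incremented by one.
# Hypothetical helper; it does not touch FILE itself.
bump_config_version() {
    awk '$1 == "config_version:" { sub(/[0-9]+/, $2 + 1) } { print }' "$1"
}

# Possible workflow (paths are assumptions, adjust to taste):
#   cp /etc/pve/corosync.conf /root/corosync.conf.bak
#   bump_config_version /etc/pve/corosync.conf > /root/corosync.conf.new
#   ... edit /root/corosync.conf.new (addresses, rrp_mode) ...
#   cp /root/corosync.conf.new /etc/pve/corosync.conf
```

Working on a copy keeps a backup around in case the new file has to be reverted, as happened earlier in this thread.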
 
Restarting all nodes fixed the FAULTY errors. See below:
Code:
# corosync-cfgtool -s
Printing ring status.
Local node ID 7
RING ID 0
       id      = 10.1.10.2
       status  = ring 0 active with no faults
RING ID 1
       id      = 10.10.1.2
       status  = ring 1 active with no faults
Will try to change the ring0 address next.
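
The ring status output above can also be checked by a script, for example after each node comes back up. A minimal sketch, assuming the output format shown above; rings_ok is a made-up helper name.

```shell
# rings_ok
# Read `corosync-cfgtool -s` output on stdin. Succeed only if at least one
# "status =" line was seen and every one says "active with no faults".
# Hypothetical helper, based on the output format quoted in this thread.
rings_ok() {
    awk '/status[ \t]*=/ { seen++; if ($0 !~ /active with no faults/) bad++ }
         END { exit (seen > 0 && bad == 0) ? 0 : 1 }'
}

# Example use on a node:
#   corosync-cfgtool -s | rings_ok && echo "all rings healthy"
```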

* After changing the address without restarting all nodes, syslog had many of these. Check each system:
Code:
 # grep ringid /var/log/syslog
Oct 27 07:50:55 pve3 corosync[6597]: error   [TOTEM ] Marking ringid 0 interface 10.1.10.3 FAULTY
Oct 27 07:50:55 pve3 corosync[6597]:  [TOTEM ] Marking ringid 0 interface 10.1.10.3 FAULTY

Oct 27 07:54:55 pve10 corosync[10143]: error   [TOTEM ] Marking ringid 0 interface 10.10.0.10 FAULTY
Oct 27 07:54:55 pve10 corosync[10143]:  [TOTEM ] Marking ringid 0 interface 10.10.0.10 FAULTY

Oct 27 07:55:11 pve15 corosync[2808]: error   [TOTEM ] Marking ringid 0 interface 10.10.0.15 FAULTY
Oct 27 07:55:11 pve15 corosync[2808]:  [TOTEM ] Marking ringid 0 interface 10.10.0.15 FAULTY
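
Since there can be many of these lines, a per-interface count is easier to read than the raw grep. A rough sketch; faulty_summary is a made-up name and the log path in the usage comment is the one used above.

```shell
# faulty_summary FILE
# Count "Marking ringid N interface ADDR FAULTY" lines per ring and address.
# Output: "ringN ADDR COUNT", one line per interface.
# Note: syslog shows each corosync message twice in this thread, so the
# counts are doubled accordingly.
faulty_summary() {
    grep 'Marking ringid .* FAULTY' "$1" |
        sed 's/.*Marking ringid \([0-9]*\) interface \([^ ]*\) FAULTY.*/ring\1 \2/' |
        LC_ALL=C sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Example use on each node:
#   faulty_summary /var/log/syslog
```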
 
Tried copying the new config into place, then:
Code:
corosync-cfgtool -R
Syslog:
Code:
Oct 27 03:52:13 pve15 corosync[2831]: notice  [CFG   ] Config reload requested by node 1
Oct 27 03:52:13 pve15 corosync[2831]:  [CFG   ] Config reload requested by node 1
Oct 27 03:52:13 pve15 corosync[2831]: crit    [VOTEQ ] configuration error: nodelist or quorum.expected_votes must be configured!
Oct 27 03:52:13 pve15 corosync[2831]: crit    [VOTEQ ] will continue with current runtime data
Oct 27 03:52:13 pve15 corosync[2831]:  [VOTEQ ] configuration error: nodelist or quorum.expected_votes must be configured!
Oct 27 03:52:13 pve15 corosync[2831]:  [VOTEQ ] will continue with current runtime data

So reloading the corosync config did not work. Next I'll try restarting all nodes.
 
All issues fixed.

Lessons learned:

1- Setting up dual ring:
- add to the totem section of corosync.conf: rrp_mode: passive
- add the ring1 config info [network addresses]
- after adding ring1, all nodes need to be restarted, one at a time.
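
For anyone searching later, the dual-ring points above might look roughly like this in corosync.conf. This is a sketch in corosync 2.x syntax, not my actual file: the two addresses and the node ID are taken from the corosync-cfgtool -s output earlier in the thread, while the node name pve2, the config_version value and everything left out are assumptions.

```
totem {
  # other totem settings (version, cluster_name, ...) unchanged
  config_version: 65        # assumed value; must be higher than before
  rrp_mode: passive         # needed as soon as a second interface exists
  interface {
    ringnumber: 0
    bindnetaddr: 10.1.10.2
  }
  interface {
    ringnumber: 1
    bindnetaddr: 10.10.1.2
  }
}

nodelist {
  node {
    name: pve2              # assumed name for this node
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 10.1.10.2
    ring1_addr: 10.10.1.2
  }
  # one node block per cluster member, each with both ring addresses
}
```

Without rrp_mode, corosync refuses to start with the "2 is too many configured interfaces for the rrp_mode setting none" parse error shown earlier.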

2- Changing the ring0 address:
- change the config
- reboot all nodes; one at a time is fine.

Of course, before changing corosync.conf, test the new network with ping and omping between all nodes.
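
The ping part of that test can be looped over the new addresses on each node. A minimal sketch; check_reachable is a made-up helper, and the PING variable is just an illustrative hook for swapping the command. The addresses in the usage comment follow the ring1 network seen above and are examples only.

```shell
# check_reachable ADDR...
# Ping each address and print "ok" or "FAIL" per address.
# Returns non-zero if any address did not answer.
# Hypothetical helper; PING is an assumed override, default is plain ping.
check_reachable() {
    failed=0
    for addr in "$@"; do
        if ${PING:-ping} -c 3 -W 2 "$addr" >/dev/null 2>&1; then
            echo "ok   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    return "$failed"
}

# Example use, run on every node with the addresses of the new network:
#   check_reachable 10.10.1.2 10.10.1.3 10.10.1.10
# The multicast test with omping has to be started on all nodes at about
# the same time, e.g. (as suggested in the Proxmox docs, as far as I know):
#   omping -c 10000 -i 0.001 -F -q 10.10.1.2 10.10.1.3 10.10.1.10
```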


Thanks to Wolfgang for the help. His configuration files are at https://forum.proxmox.com/threads/bond_mode-active-backup-issue.47837/#post-224850
 