Tried to change corosync IP, lost quorum

afrugone

I have had a 7-node cluster running the default installation for some time. I followed the procedure in https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pvecm_redundancy and after that node 1 lost quorum. How can I recover it?

The goal was to move corosync from 172.27.111.131-137 to a new interface at 10.111.111.131-137.

I copied the original corosync.conf to corosync.conf.bak and corosync.conf.new. In corosync.conf.new on node 1 I replaced the "ring0_addr: 172.27.111.13x" lines with "ring0_addr: 10.111.111.13x", increased the config version, and then moved the file to corosync.conf with mv.
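
The edit described above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name `rewrite_corosync_copy` and its arguments are hypothetical, and it writes an edited copy of whatever file it is given rather than touching the live corosync.conf.

```shell
# Hypothetical helper: write an edited copy of a corosync.conf-style file.
# $1 = source file, $2 = destination file,
# $3 = old address prefix, $4 = new address prefix.
rewrite_corosync_copy() {
    awk -v old="$3" -v new="$4" '
        # Swap the prefix on ring0_addr lines that start with the old one.
        $1 == "ring0_addr:" && index($2, old) == 1 {
            sub(/[^ \t]+$/, new substr($2, length(old) + 1))
        }
        # Increase config_version by one.
        $1 == "config_version:" { sub(/[0-9]+$/, $2 + 1) }
        { print }
    ' "$1" > "$2"
}

# Example call (file names are placeholders):
#   rewrite_corosync_copy corosync.conf corosync.conf.new 172.27.111. 10.111.111.
```

Comparing the copy with the original (for example with `diff`) before moving it into place seems prudent, since a wrong address in this file affects every node.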
Then I ran
"systemctl restart corosync"
and lost quorum: node 1 is out, while all other nodes are OK. I tried to restore the original corosync.conf, but got "permission denied".

journalctl -b -u corosync
corosync[1202]: [TOTEM ] new config has different address for link 0 (addr changed from 172.27.111.131 to 10.111.111.131). Internal value was NOT changed.
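
The log line says the running link kept its old address even though the file changed. A minimal sketch for pulling such lines out of the journal, assuming the message format shown above (the helper name `ignored_addr_changes` is hypothetical):

```shell
# Hypothetical helper: read corosync journal lines on stdin and print
# "<link> <old address> <new address>" for each ignored address change.
ignored_addr_changes() {
    sed -n 's/.*different address for link \([0-9][0-9]*\) (addr changed from \([0-9A-Fa-f.:]*\) to \([0-9A-Fa-f.:]*\))\. Internal value was NOT changed.*/\1 \2 \3/p'
}

# Example call:
#   journalctl -b -u corosync | ignored_addr_changes
```

Any output would indicate that the corosync instance on that node is still using the old address for the listed link.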

Node 1:
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Mar 14 18:49:20 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.ada8
Quorate: No

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 1
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.111.111.131 (local)

Node 2:
pvecm status
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Mar 14 18:30:41 2022
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000002
Ring ID: 2.ad9d
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 6
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.27.111.132 (local)
0x00000003 1 172.27.111.133
0x00000004 1 172.27.111.134
0x00000005 1 172.27.111.135
0x00000006 1 172.27.111.136
0x00000007 1 172.27.111.137
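
The two outputs are consistent with simple majority voting: with 7 expected votes the quorum is 4, so node 1 alone (1 vote) is blocked while the other six nodes (6 votes) stay quorate. A small sketch of that arithmetic, with hypothetical function names:

```shell
# Votes needed for quorum: strictly more than half of the expected votes.
# $1 = expected votes.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds when the total votes reach the threshold.
# $1 = expected votes, $2 = total votes currently present.
is_quorate() {
    [ "$2" -ge "$(quorum_threshold "$1")" ]
}

# Example calls:
#   quorum_threshold 7      # quorum for 7 expected votes
#   is_quorate 7 1          # node 1 alone
#   is_quorate 7 6          # the remaining six nodes
```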
 
I still cannot recover node 1. The corosync.conf file has changed on all nodes, including node 1, and I can't restore the old configuration (access denied). I don't know what will happen if I restart any node. Any hint on what to do is welcome.
 
Thanks Fabian.

I used "pvecm expected 1", and now I can edit the corosync.conf file and open the web console for node 1. But it is still out of the cluster.

The status shows that node 1 uses the new IP 10.111.111.131, while the other nodes are still using the old IPs 172.27.111.x (although their corosync.conf has the new IPs).

Do I have to reboot the other nodes for them to pick up the new configuration?

Status and version output are below.

Thanks for your help
Alfredo


root@pve001:/etc/pve# pvecm status
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Mar 15 11:12:59 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.ada8
Quorate: Yes

Votequorum information
----------------------
Expected votes: 1
Highest expected: 1
Total votes: 1
Quorum: 1
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.111.111.131 (local)



On the other nodes
root@pve007:~# pvecm status
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Mar 15 11:17:19 2022
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000007
Ring ID: 2.adbd
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 6
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.27.111.132
0x00000003 1 172.27.111.133
0x00000004 1 172.27.111.134
0x00000005 1 172.27.111.135
0x00000006 1 172.27.111.136
0x00000007 1 172.27.111.137 (local)

corosync.conf on all nodes (1 to 7) is the same:

logging {
debug: off
to_syslog: yes
}

nodelist {
node {
name: pve001
nodeid: 1
quorum_votes: 1
ring0_addr: 10.111.111.131
}
node {
name: pve002
nodeid: 2
quorum_votes: 1
ring0_addr: 10.111.111.132
}
node {
name: pve003
nodeid: 3
quorum_votes: 1
ring0_addr: 10.111.111.133
}
node {
name: pve004
nodeid: 4
quorum_votes: 1
ring0_addr: 10.111.111.134
}
node {
name: pve005
nodeid: 5
quorum_votes: 1
ring0_addr: 10.111.111.135
}
node {
name: pve006
nodeid: 6
quorum_votes: 1
ring0_addr: 10.111.111.136
}
node {
name: pve007
nodeid: 7
quorum_votes: 1
ring0_addr: 10.111.111.137
}
}

quorum {
provider: corosync_votequorum
}

totem {
cluster_name: Flex
config_version: 17
interface {
linknumber: 0
}
ip_version: ipv4-6
link_mode: passive
secauth: on
version: 2
}
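
Since the file already lists the 10.111.111.x addresses while the membership tables above still show 172.27.111.x, it can help to compare the two directly. A rough sketch, assuming the `pvecm status` output has been saved to a file; the function name `stale_member_addrs` is hypothetical:

```shell
# Hypothetical helper: print member addresses that appear in saved
# `pvecm status` output but not as a ring0_addr in the config file.
# $1 = corosync.conf-style file, $2 = file with `pvecm status` output.
stale_member_addrs() {
    awk '
        # First file: remember every configured ring0_addr.
        NR == FNR { if ($1 == "ring0_addr:") conf[$2] = 1; next }
        # Second file: membership rows start with a hex node ID.
        $1 ~ /^0x[0-9a-f]+$/ && NF >= 3 && !($3 in conf) { print $3 }
    ' "$1" "$2"
}

# Example call (the status file name is a placeholder):
#   pvecm status > /tmp/status.txt
#   stale_member_addrs /etc/pve/corosync.conf /tmp/status.txt
```

Each address printed would belong to a node whose running corosync has not yet picked up the address from the file.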

All nodes run the same PVE version.
root@pve001:/etc/pve# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-5-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-12
pve-kernel-5.13: 7.1-8
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-5-pve: 5.13.19-13
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-3
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-5
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1
 
With everything backed up, I took the risk and restarted corosync on all nodes. Now everything works fine with the new IPs.

systemctl restart corosync
pvecm status

Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Mar 15 16:56:34 2022
Quorum provider: corosync_votequorum
Nodes: 7
Node ID: 0x00000007
Ring ID: 1.adda
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 7
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.111.111.131
0x00000002 1 10.111.111.132
0x00000003 1 10.111.111.133
0x00000004 1 10.111.111.134
0x00000005 1 10.111.111.135
0x00000006 1 10.111.111.136
0x00000007 1 10.111.111.137 (local)
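
A quick way to confirm a result like the one above is to check that the node is quorate and that total votes equal expected votes. A minimal sketch that reads `pvecm status` output on stdin; the function name `all_votes_present` is hypothetical:

```shell
# Hypothetical check: read `pvecm status` output on stdin and succeed only
# if the node reports "Quorate: Yes" and total votes equal expected votes.
all_votes_present() {
    awk '
        $1 == "Expected" && $2 == "votes:" { expected = $3 }
        $1 == "Total" && $2 == "votes:"    { total = $3 }
        $1 == "Quorate:"                   { quorate = $2 }
        END { exit !(quorate == "Yes" && expected != "" && total == expected) }
    '
}

# Example call:
#   pvecm status | all_votes_present && echo "all nodes joined"
```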


Many thanks for your help
Best Regards
Alfredo
 
