Lost quorum after trying to change the corosync IP

afrugone

I have had a 7-node cluster running the default installation for some time. I want to move corosync from 172.27.111.131-137 to a new interface at 10.111.111.131-137, so I followed the procedure in https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pvecm_redundancy. After that I lost quorum on node 1. How can I recover it?

On node 1 I copied the original corosync.conf to corosync.conf.bak and corosync.conf.new. In corosync.conf.new I replaced the "ring0_addr" lines (172.27.111.13x ---> 10.111.111.13x) and increased the config version, then moved the file to corosync.conf.
Then I ran
"systemctl restart corosync"
and lost quorum: node 1 is out, all other nodes are OK. I tried to restore the original corosync.conf, but got "permission denied".
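
For reference, the edit steps described above can be sketched roughly as follows. This is only an illustration on a scratch copy in a temporary directory, not on /etc/pve/corosync.conf. The sample file content, the starting config_version of 16, and the use of sed are assumptions for the example, not the exact commands that were run.

```shell
#!/bin/sh
# Sketch only: works on a scratch copy, never on the live /etc/pve/corosync.conf.
set -eu

workdir=$(mktemp -d)
conf="$workdir/corosync.conf"

# Minimal stand-in for the real file (two nodes, hypothetical version 16).
cat > "$conf" <<'EOF'
nodelist {
  node {
    name: pve001
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.27.111.131
  }
  node {
    name: pve002
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.27.111.132
  }
}
totem {
  cluster_name: Flex
  config_version: 16
}
EOF

# Keep a backup and edit a separate .new file, as in the steps above.
cp "$conf" "$conf.bak"
cp "$conf" "$conf.new"

# Read the current version and compute the next one.
old_version=$(awk '$1 == "config_version:" { print $2 }' "$conf")
new_version=$((old_version + 1))

# Swap the subnet on every ring0_addr line and bump config_version.
sed -i \
  -e 's/ring0_addr: 172\.27\.111\./ring0_addr: 10.111.111./' \
  -e "s/config_version: $old_version\$/config_version: $new_version/" \
  "$conf.new"

# Review the change before moving the file into place (diff exits 1 on differences).
diff -u "$conf" "$conf.new" || true

mv "$conf.new" "$conf"
```

Reviewing the diff before the final mv makes it easy to confirm that only the addresses and the version changed.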

journalctl -b -u corosync
corosync[1202]: [TOTEM ] new config has different address for link 0 (addr changed from 172.27.111.131 to 10.111.111.131). Internal value was NOT changed.

Node 1:
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Mar 14 18:49:20 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.ada8
Quorate: No

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 1
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.111.111.131 (local)

Node 2
pvecm status
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Mar 14 18:30:41 2022
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000002
Ring ID: 2.ad9d
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 6
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.27.111.132 (local)
0x00000003 1 172.27.111.133
0x00000004 1 172.27.111.134
0x00000005 1 172.27.111.135
0x00000006 1 172.27.111.136
0x00000007 1 172.27.111.137
 
I still cannot recover node 1. The corosync.conf file has changed on all nodes, including node 1, and I can't restore the old configuration (access denied). I don't know what will happen if I restart any node. Any hint on what to do is welcome.
 
Thanks Fabian.

I used "pvecm expected 1", and now I can edit corosync.conf and open the web console for node 1. But it is still out of the cluster.
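
As a side note on why "pvecm expected 1" unblocks the node: the quorum values in the status outputs are consistent with a simple majority of the expected votes, floor(expected / 2) + 1. The helper name quorum_for below is made up for this sketch, and the formula assumes a default votequorum setup without special options.

```shell
#!/bin/sh
# Majority threshold under the assumption above: floor(expected / 2) + 1.
quorum_for() {
  echo $(( $1 / 2 + 1 ))
}

quorum_for 7   # seven expected votes, as in the full cluster
quorum_for 1   # after "pvecm expected 1" on the isolated node
```

With 7 expected votes a single node holds 1 of the 4 votes needed, so activity is blocked; with 1 expected vote the node is quorate on its own.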

The status shows node 1 with the new IP 10.111.111.131, while the other nodes are still using the old IPs 172.27.111.x (although corosync.conf has the new IPs).

Do I have to reboot the other nodes for them to pick up the new configuration?

Below are the status and version.

Thanks for your help
Alfredo


root@pve001:/etc/pve# pvecm status
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Mar 15 11:12:59 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.ada8
Quorate: Yes

Votequorum information
----------------------
Expected votes: 1
Highest expected: 1
Total votes: 1
Quorum: 1
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.111.111.131 (local)



On the other nodes
root@pve007:~# pvecm status
Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Mar 15 11:17:19 2022
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000007
Ring ID: 2.adbd
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 6
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.27.111.132
0x00000003 1 172.27.111.133
0x00000004 1 172.27.111.134
0x00000005 1 172.27.111.135
0x00000006 1 172.27.111.136
0x00000007 1 172.27.111.137 (local)

corosync.conf is the same on all nodes (1 to 7):

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve001
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.111.111.131
  }
  node {
    name: pve002
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.111.111.132
  }
  node {
    name: pve003
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.111.111.133
  }
  node {
    name: pve004
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.111.111.134
  }
  node {
    name: pve005
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 10.111.111.135
  }
  node {
    name: pve006
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 10.111.111.136
  }
  node {
    name: pve007
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 10.111.111.137
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: Flex
  config_version: 17
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
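
To make the mismatch between the file and the running membership easier to spot, something like the following could compare the two. It is only a sketch: the helper names conf_addrs and member_addrs are made up, the inputs are shortened samples from this thread, and on a real node the status text would come from saving the output of "pvecm status" to a file.

```shell
#!/bin/sh
# Sketch: which addresses from corosync.conf are missing from the live membership?
set -eu

conf_addrs() {
  # Print the value of every ring0_addr line in the given config file.
  awk '$1 == "ring0_addr:" { print $2 }' "$1" | sort
}

member_addrs() {
  # Membership rows in saved "pvecm status" output start with a hex node ID;
  # the address is the third field.
  awk '$1 ~ /^0x[0-9a-f]+$/ { print $3 }' "$1" | sort
}

tmp=$(mktemp -d)

# Shortened samples taken from the outputs in this thread.
cat > "$tmp/corosync.conf" <<'EOF'
nodelist {
  node {
    name: pve001
    ring0_addr: 10.111.111.131
  }
  node {
    name: pve002
    ring0_addr: 10.111.111.132
  }
}
EOF

cat > "$tmp/status.txt" <<'EOF'
Node ID: 0x00000002
Nodeid Votes Name
0x00000002 1 172.27.111.132 (local)
0x00000003 1 172.27.111.133
EOF

conf_addrs "$tmp/corosync.conf" > "$tmp/conf.list"
member_addrs "$tmp/status.txt" > "$tmp/member.list"

# comm -23 keeps lines that appear only in the first sorted list.
not_in_use=$(comm -23 "$tmp/conf.list" "$tmp/member.list")
echo "Configured but not in the running membership:"
echo "$not_in_use"
```

Any address listed here is one that the file names but the running corosync has not picked up, which matches the "Internal value was NOT changed" message in the journal.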

All nodes run the same PVE version.
root@pve001:/etc/pve# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-5-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-12
pve-kernel-5.13: 7.1-8
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-5-pve: 5.13.19-13
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-3
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-5
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1
 
I took the risk: with everything backed up, I restarted corosync on all nodes, and now everything works fine with the new IPs.

systemctl restart corosync
pvecm status

Cluster information
-------------------
Name: Flex
Config Version: 17
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Mar 15 16:56:34 2022
Quorum provider: corosync_votequorum
Nodes: 7
Node ID: 0x00000007
Ring ID: 1.adda
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 7
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.111.111.131
0x00000002 1 10.111.111.132
0x00000003 1 10.111.111.133
0x00000004 1 10.111.111.134
0x00000005 1 10.111.111.135
0x00000006 1 10.111.111.136
0x00000007 1 10.111.111.137 (local)
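
For anyone repeating this, the per-node restart could be planned with a small loop like the one below. It is a hedged sketch that only prints the commands; it assumes root SSH access to each node on the new addresses, and the function name plan_restarts is made up for the example.

```shell
#!/bin/sh
# Sketch: print one restart command per node (131-137) without running anything.
set -eu

plan_restarts() {
  for last_octet in $(seq 131 137); do
    echo "ssh root@10.111.111.$last_octet 'systemctl restart corosync && pvecm status'"
  done
}

plan_restarts
```

Printing the commands first leaves room to review them and run them one node at a time, checking "pvecm status" after each.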


Many thanks for your help
Best regards
Alfredo