I have a 3-node cluster that was working just fine. I added some NICs to the servers and planned to move the main cluster interface to a bond. I started on node 1 and botched the configuration, so the node got kicked out of the cluster. I have since corrected the network config, but the node remains isolated and I can't figure out the correct steps to bring it back in. As part of my first attempt at moving interfaces I also edited corosync.conf, which is what I think is breaking the cluster. I can ping every node by name and IP from every other node, so I know the networking itself is fine.
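In case it helps, these are the kinds of checks I've been running on node 1 to see what corosync itself is complaining about (I'm assuming systemctl/journalctl are the right places to look):
Code:
# on node 1: check whether corosync and the cluster filesystem are running
systemctl status corosync pve-cluster

# corosync's own view of membership and quorum
pvecm status

# recent corosync / pve-cluster log messages since the last boot
journalctl -b -u corosync -u pve-cluster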
On node 1, here are some of the details:
corosync.conf
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.100.100
  }
  node {
    name: pve02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.100.105
  }
  node {
    name: pve03
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.10.100.110
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: proxmox
  config_version: 5
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
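I haven't yet compared the two copies of the config on node 1, or checked that the authkey still matches the other nodes; I assume something like this is the way to do that:
Code:
# the pmxcfs copy vs. the file corosync actually loads on this node
diff /etc/pve/corosync.conf /etc/corosync/corosync.conf

# checksum of the cluster auth key; should match the other nodes
sha256sum /etc/corosync/authkey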
Node 1 network info:
Code:
auto lo
iface lo inet loopback

iface enp0s31f6 inet manual
#NiC on motherboard

iface wlo1 inet manual

auto enp1s0
iface enp1s0 inet manual
#10GbE Card Port 0

auto enp1s0d1
iface enp1s0d1 inet manual
#10GbE Card Port 1

auto enp6s0f0
iface enp6s0f0 inet manual
#1GbE Card Port 0

auto enp6s0f1
iface enp6s0f1 inet manual
#1GbE Card Port 1

auto bond0
iface bond0 inet manual
        bond-slaves enp6s0f0 enp6s0f1
        bond-miimon 100
        bond-mode 802.3ad
        # bond-xmit-hash-policy layer2+3
        # bond-downdelay 200
        # bond-updelay 200
        # bond-lacp-rate 1
#LACP of 1GbE Card Ports

auto vmbr0
iface vmbr0 inet manual
        address 10.10.100.100/24
        gateway 10.10.100.1
        bridge-ports enp0s31f6
        bridge-stp off
        bridge-fd 0
#Bridge on motherboard NIC

auto vmbr1
iface vmbr1 inet static
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#Bridge on bond0

auto vmbr2
iface vmbr2 inet manual
        bridge-ports enp1s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#Bridge for 10GbE Ports
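The bond itself seems to come up; this is roughly how I've been checking it (assuming /proc/net/bonding is the right place to confirm the LACP negotiation):
Code:
# quick overview of links, bridges and assigned addresses
ip -br link
ip -br addr

# LACP / 802.3ad details for the new bond
cat /proc/net/bonding/bond0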
pvecm status on node 1:
Code:
Cluster information
-------------------
Name:             proxmox
Config Version:   5
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed Apr 12 09:23:00 2023
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1.9e1
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.10.100.100 (local)
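Since node 1 only ever sees itself, I also looked at the knet link status from its side (I'm assuming corosync-cfgtool is the right tool for that):
Code:
# knet link status as seen from node 1
corosync-cfgtool -s

# quorum state straight from corosync
corosync-quorumtool -s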
On the other nodes, pvecm status shows:
Code:
Cluster information
-------------------
Name:             proxmox
Config Version:   5
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed Apr 12 08:42:06 2023
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000004
Ring ID:          2.9d9
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 10.10.100.105
0x00000004          1 10.10.100.110 (local)
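From the quorate side I was planning to check whether pve02/pve03 can even see node 1 at the corosync level, something like this (again, assuming these are the right commands):
Code:
# on pve02 or pve03: knet link status towards each node, including node 1
corosync-cfgtool -s

# node list as the quorate partition sees it
pvecm nodes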
Network interfaces on the other nodes:
Code:
auto lo
iface lo inet loopback

iface enp0s31f6 inet manual
#NiC on motherboard

auto enp6s0f0
iface enp6s0f0 inet manual
#1GbE Card Port 0

auto enp6s0f1
iface enp6s0f1 inet manual
#1GbE Card Port 1

auto enp1s0
iface enp1s0 inet manual
#10GbE Card Port 0

auto enp1s0d1
iface enp1s0d1 inet manual
#10GbE Card Port 1

auto bond0
iface bond0 inet manual
        bond-slaves enp6s0f0 enp6s0f1
        bond-miimon 100
        bond-mode 802.3ad
#LACP of 1GbE Card Ports

auto vmbr0
iface vmbr0 inet static
        address 10.10.100.105/24
        gateway 10.10.100.1
        bridge-ports enp0s31f6
        bridge-stp off
        bridge-fd 0
#Bridge on motherboard NIC

auto vmbr1
iface vmbr1 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#Bridge on bond0

auto vmbr2
iface vmbr2 inet manual
        bridge-ports enp1s0d1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#Bridge on 10Gb port
I am pretty sure all I need to do is edit the corosync.conf file on node 1 and increase the config_version. I tried that, but the node still didn't rejoin the cluster. I'm sure I have the order of steps wrong, and at this point I don't want to make things even worse.
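For what it's worth, this is the rough order I was thinking of trying next, but I'm not confident it's correct, hence asking here first. It assumes the corosync.conf on pve02/pve03 is the good one and that it's safe to copy it over node 1's local copies:
Code:
# on node 1: stop the cluster stack so the local files can be replaced
systemctl stop pve-cluster corosync

# start the cluster filesystem in local mode so /etc/pve is writable
pmxcfs -l

# replace node 1's copies with the config from a healthy node (pve02)
scp root@10.10.100.105:/etc/corosync/corosync.conf /etc/corosync/corosync.conf
cp /etc/corosync/corosync.conf /etc/pve/corosync.conf

# stop the local-mode filesystem and bring the normal services back
killall pmxcfs
systemctl start corosync pve-cluster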