Hi,
I can't get my cluster to work after a power failure on both nodes (someone in my office thought these machines didn't need to keep running). It is not a production system; I use it only to evaluate Proxmox. So I would like to know how to fix it.
The error message I get:
Code:
Nov 27 18:58:31 pve2 systemd[1]: Starting Corosync Cluster Engine...
Nov 27 18:58:31 pve2 corosync[29910]: [MAIN ] Corosync Cluster Engine ('2.4.4-dirty'): started and ready to provide service.
Nov 27 18:58:31 pve2 corosync[29910]: [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog systemd xmlconf qdevices qnetd snmp pie relro bindnow
Nov 27 18:58:31 pve2 corosync[29910]: notice [MAIN ] Corosync Cluster Engine ('2.4.4-dirty'): started and ready to provide service.
Nov 27 18:58:31 pve2 corosync[29910]: info [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog systemd xmlconf qdevices qnetd snmp pie relro bindnow
Nov 27 18:58:31 pve2 corosync[29910]: warning [MAIN ] interface section bindnetaddr is used together with nodelist. Nodelist one is going to be used.
Nov 27 18:58:31 pve2 corosync[29910]: warning [MAIN ] Please migrate config file to nodelist.
Nov 27 18:58:31 pve2 corosync[29910]: [MAIN ] interface section bindnetaddr is used together with nodelist. Nodelist one is going to be used.
Nov 27 18:58:31 pve2 corosync[29910]: [MAIN ] Please migrate config file to nodelist.
Nov 27 18:58:31 pve2 corosync[29910]: notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Nov 27 18:58:31 pve2 corosync[29910]: notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Nov 27 18:58:31 pve2 corosync[29910]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Nov 27 18:58:31 pve2 corosync[29910]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Nov 27 18:58:31 pve2 corosync[29910]: notice [TOTEM ] The network interface is down.
Nov 27 18:58:31 pve2 corosync[29910]: [TOTEM ] The network interface is down.
Nov 27 18:58:31 pve2 corosync[29910]: notice [SERV ] Service engine loaded: corosync configuration map access [0]
Nov 27 18:58:31 pve2 corosync[29910]: info [QB ] server name: cmap
Nov 27 18:58:31 pve2 corosync[29910]: notice [SERV ] Service engine loaded: corosync configuration service [1]
Nov 27 18:58:31 pve2 corosync[29910]: info [QB ] server name: cfg
Nov 27 18:58:31 pve2 corosync[29910]: notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Nov 27 18:58:31 pve2 corosync[29910]: info [QB ] server name: cpg
Nov 27 18:58:31 pve2 corosync[29910]: notice [SERV ] Service engine loaded: corosync profile loading service [4]
Nov 27 18:58:31 pve2 corosync[29910]: notice [SERV ] Service engine loaded: corosync resource monitoring service [6]
Nov 27 18:58:31 pve2 corosync[29910]: warning [WD ] Watchdog not enabled by configuration
Nov 27 18:58:31 pve2 corosync[29910]: warning [WD ] resource load_15min missing a recovery key.
Nov 27 18:58:31 pve2 corosync[29910]: warning [WD ] resource memory_used missing a recovery key.
Nov 27 18:58:31 pve2 corosync[29910]: info [WD ] no resources configured.
Nov 27 18:58:31 pve2 corosync[29910]: notice [SERV ] Service engine loaded: corosync watchdog service [7]
Nov 27 18:58:31 pve2 corosync[29910]: notice [QUORUM] Using quorum provider corosync_votequorum
Nov 27 18:58:31 pve2 corosync[29910]: crit [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Nov 27 18:58:31 pve2 corosync[29910]: error [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Nov 27 18:58:31 pve2 corosync[29910]: error [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
Nov 27 18:58:31 pve2 systemd[1]: corosync.service: Main process exited, code=exited, status=20/n/a
Nov 27 18:58:31 pve2 systemd[1]: Failed to start Corosync Cluster Engine.
Nov 27 18:58:31 pve2 systemd[1]: corosync.service: Unit entered failed state.
Nov 27 18:58:31 pve2 systemd[1]: corosync.service: Failed with result 'exit-code'.
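For completeness: the log above is what the corosync unit writes on pve2, and as far as I can tell it is the same on every start attempt (standard systemd commands):
Code:
systemctl status corosync      # unit is in failed state
journalctl -b -u corosync      # the messages quoted above
systemctl restart corosync     # retrying gives the same result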
According to the wiki, this error is caused by a misconfigured /etc/hosts, but I have checked that on both nodes and the entries are correct. The two nodes of the cluster are connected directly, without a switch: one link for the cluster traffic and a bonded interface (two 1 Gb NICs) for Ceph.
Ping is possible between the two nodes on bond0.
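For reference, these are the kinds of checks I mean, shown for pve1 (plain getent/ip/ping commands, nothing Proxmox-specific; the same on pve2 with the addresses swapped):
Code:
# name resolution from /etc/hosts
getent hosts pve1 pve2

# addresses on the two direct links (see the interfaces files below)
ip -4 addr show eno2      # 10.212.212.1 - cluster network used by corosync
ip -4 addr show bond0     # 10.222.222.1 - Ceph network

# connectivity to pve2
ping -c 3 10.222.222.2    # works over bond0
ping -c 3 10.212.212.2    # ring0 address of pve2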
My config is:
Code:
root@pve1:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.212.212.1
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.212.212.2
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: pvCluster
  config_version: 2
  interface {
    bindnetaddr: 10.212.212.1
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
Code:
root@pve2:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.212.212.1
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.212.212.2
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: pvCluster
  config_version: 2
  interface {
    bindnetaddr: 10.212.212.1
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
Code:
root@pve1:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.33.201
    netmask 255.255.255.0
    gateway 192.168.33.254
    bridge_ports eno1
    bridge_stp off
    bridge_fd 0
    up sysctl -w net.ipv4.ip_forward=1
    up iptables -t nat -A POSTROUTING -o $IFACE -j MASQUERADE
    down iptables -t nat -D POSTROUTING -o $IFACE -j MASQUERADE

auto vmbr1
iface vmbr1 inet static
    address 192.168.233.254
    netmask 255.255.255.0
    bridge_ports none
    bridge_stp off
    bridge_fd 0

iface eno2 inet static
    address 10.212.212.1
    netmask 255.255.255.0

iface eno3 inet manual

iface eno4 inet manual

auto bond0
iface bond0 inet static
    address 10.222.222.1
    netmask 255.255.255.0
    bond-slaves eno3 eno4
    bond-mode balance-rr
    bond-miimon 100
Code:
root@pve2:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.33.202
    netmask 255.255.255.0
    gateway 192.168.33.254
    bridge_ports eno1
    bridge_stp off
    bridge_fd 0
    up sysctl -w net.ipv4.ip_forward=1
    up iptables -t nat -A POSTROUTING -o $IFACE -j MASQUERADE
    down iptables -t nat -D POSTROUTING -o $IFACE -j MASQUERADE

auto vmbr1
iface vmbr1 inet static
    address 192.168.233.254
    netmask 255.255.255.0
    bridge_ports none
    bridge_stp off
    bridge_fd 0

iface eno2 inet static
    address 10.212.212.2
    netmask 255.255.255.0

iface eno3 inet manual

iface eno4 inet manual

iface enp6s0f0 inet manual

iface enp6s0f1 inet manual

auto bond0
iface bond0 inet static
    address 10.222.222.2
    netmask 255.255.255.0
    bond-slaves eno3 eno4
    bond-mode balance-rr
    bond-miimon 100