Hi,
I keep having a problem with one node in my cluster. The web interface always shows it as offline, even though the server itself is running.
The cause seems to be the corosync service, which fails to start on that node:
service corosync status
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: failed (Result: exit-code) since Wed 2017-09-13 10:50:16 CEST; 8min ago
Process: 19975 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)
Sep 13 10:49:15 node-111 corosync[19982]: [QB] server name: cmap
Sep 13 10:49:15 node-111 corosync[19982]: [SERV] Service engine loaded: corosync configuration service [1]
Sep 13 10:49:15 node-111 corosync[19982]: [QB] server name: cfg
Sep 13 10:49:15 node-111 corosync[19982]: [SERV] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Sep 13 10:49:15 node-111 corosync[19982]: [QB] server name: cpg
Sep 13 10:49:15 node-111 corosync[19982]: [SERV] Service loading: corosync loading service [4]
Sep 13 10:50:16 node-111 corosync[19975]: Starting Corosync Cluster Engine (corosync): [FAILED]
Sep 13 10:50:16 node-111 systemd[1]: corosync.service: control process exited, code=exited status=1
Sep 13 10:50:16 node-111 systemd[1]: Failed to start Corosync Cluster Engine.
Sep 13 10:50:16 node-111 systemd[1]: Unit corosync.service entered failed state.
My corosync configuration looks like this:
cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}
nodelist {
  node {
    name: node-112
    nodeid: 1
    quorum_votes: 1
    ring0_addr: node-112
  }
  node {
    name: node-110
    nodeid: 2
    quorum_votes: 1
    ring0_addr: node-110
  }
  node {
    name: node-114
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 172.30.10.114
  }
  node {
    name: node-113
    nodeid: 3
    quorum_votes: 1
    ring0_addr: node-113
  }
  node {
    name: node-111
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.2.111
  }
}
quorum {
  provider: corosync_votequorum
}
totem {
  cluster_name: INTEGART
  config_version: 9
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 172.30.10.112
    ringnumber: 0
  }
}
For comparison, this is the corosync configuration on a node that works properly:
cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}
nodelist {
  node {
    name: node-112
    nodeid: 1
    quorum_votes: 1
    ring0_addr: node-112
  }
  node {
    name: node-110
    nodeid: 2
    quorum_votes: 1
    ring0_addr: node-110
  }
  node {
    name: node-114
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 172.30.10.114
  }
  node {
    name: node-113
    nodeid: 3
    quorum_votes: 1
    ring0_addr: node-113
  }
  node {
    name: node-111
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.2.111
  }
}
quorum {
  provider: corosync_votequorum
}
totem {
  cluster_name: INTEGART
  config_version: 9
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 172.30.10.112
    ringnumber: 0
  }
}
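In case it helps with the diagnosis, here is a rough sketch for pulling the ring addresses and the bind address out of the file so they can be compared side by side. The helper names (ring_addrs, bind_addr, prefix24, compare_rings) are my own and not part of corosync or Proxmox, and comparing only the first three octets assumes a /24 network, which is a guess on my part.

```shell
#!/bin/sh
# Sketch only: the helper names below are made up for this post.
# The default path is the one shown above; pass another file as $1 to override.

# Print every ring0_addr value from the nodelist, one per line.
ring_addrs() {
    awk '$1 == "ring0_addr:" { print $2 }' "${1:-/etc/pve/corosync.conf}"
}

# Print the bindnetaddr value from the totem interface section.
bind_addr() {
    awk '$1 == "bindnetaddr:" { print $2 }' "${1:-/etc/pve/corosync.conf}"
}

# Print the first three octets of an IPv4 address.
# ASSUMPTION: a /24 network; the real netmask is not shown in the config.
prefix24() {
    echo "$1" | cut -d. -f1-3
}

# For each ring0_addr, report whether it is a hostname or whether its
# first three octets match those of bindnetaddr.
compare_rings() {
    conf="${1:-/etc/pve/corosync.conf}"
    bind_prefix=$(prefix24 "$(bind_addr "$conf")")
    for addr in $(ring_addrs "$conf"); do
        case "$addr" in
            *[!0-9.]*)
                # Contains something other than digits and dots: treat as a hostname.
                echo "$addr hostname"
                ;;
            *)
                if [ "$(prefix24 "$addr")" = "$bind_prefix" ]; then
                    echo "$addr same-prefix"
                else
                    echo "$addr different-prefix"
                fi
                ;;
        esac
    done
}
```

Calling compare_rings against the configuration above should report 172.30.10.114 as same-prefix and 192.168.2.111 (node-111) as different-prefix relative to bindnetaddr 172.30.10.112. I do not know whether that difference is what stops corosync from starting.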
Please help.