We had a critical server power failure taking down the entire cluster (inverter on ups failed, knocking out mains breakers and killing all servers before maintenance staff could reach power room)
6 server cluster (vwk-prox01 to vwk-prox06)
Shared storage folder mounts properly on all servers, however there seems to be some issue with pmxcfs
The following shows up on all servers.
6 server cluster (vwk-prox01 to vwk-prox06)
Shared storage folder mounts properly on all servers, however there seems to be some issue with pmxcfs
The following shows up on all servers.
Code:
journalctl -xn
-- Logs begin at Tue 2018-06-19 09:16:08 SAST, end at Tue 2018-06-19 09:29:59 SAST. --
Jun 19 09:29:47 vwk-prox02 pmxcfs[1350]: [dcdb] crit: cpg_initialize failed: 2
Jun 19 09:29:47 vwk-prox02 pmxcfs[1350]: [status] crit: cpg_initialize failed: 2
Jun 19 09:29:53 vwk-prox02 pmxcfs[1350]: [quorum] crit: quorum_initialize failed: 2
Jun 19 09:29:53 vwk-prox02 pmxcfs[1350]: [confdb] crit: cmap_initialize failed: 2
Jun 19 09:29:53 vwk-prox02 pmxcfs[1350]: [dcdb] crit: cpg_initialize failed: 2
Jun 19 09:29:53 vwk-prox02 pmxcfs[1350]: [status] crit: cpg_initialize failed: 2
Jun 19 09:29:59 vwk-prox02 pmxcfs[1350]: [quorum] crit: quorum_initialize failed: 2
Jun 19 09:29:59 vwk-prox02 pmxcfs[1350]: [confdb] crit: cmap_initialize failed: 2
Jun 19 09:29:59 vwk-prox02 pmxcfs[1350]: [dcdb] crit: cpg_initialize failed: 2
Jun 19 09:29:59 vwk-prox02 pmxcfs[1350]: [status] crit: cpg_initialize failed: 2
Code:
systemctl status corosync.service -l
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: failed (Result: timeout) since Tue 2018-06-19 09:18:06 SAST; 7min ago
Process: 1421 ExecStart=/usr/share/corosync/corosync start (code=killed, signal=TERM)
Jun 19 09:16:36 vwk-prox02 corosync[1431]: [MAIN ] Corosync Cluster Engine ('2.4.2'): started and ready to provide service.
Jun 19 09:16:36 vwk-prox02 corosync[1431]: [MAIN ] Corosync built-in features: augeas systemd pie relro bindnow
Jun 19 09:18:06 vwk-prox02 systemd[1]: corosync.service start operation timed out. Terminating.
Jun 19 09:18:06 vwk-prox02 systemd[1]: Failed to start Corosync Cluster Engine.
Jun 19 09:18:06 vwk-prox02 systemd[1]: Unit corosync.service entered failed state.
Jun 19 09:18:06 vwk-prox02 corosync[1421]: Starting Corosync Cluster Engine (corosync):
.