I am trying to add a qdevice to a 2-node setup.
I fixed some issues, and I am almost there but the last step fails:
Code:
node 'p3': Importing cluster certificate and key
node 'p3': pk12util: PKCS12 IMPORT SUCCESSFUL
node 'p5': Importing cluster certificate and key
node 'p5': pk12util: PKCS12 IMPORT SUCCESSFUL
INFO: add QDevice to cluster configuration
INFO: start and enable corosync qdevice daemon on node 'p3'...
Job for corosync-qdevice.service failed because the control process exited with error code.
See "systemctl status corosync-qdevice.service" and "journalctl -xe" for details.
command 'ssh -o 'BatchMode=yes' -lroot 10.0.0.3 systemctl start corosync-qdevice' failed: exit code 1
In the journal, these lines caught my attention:
Code:
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] got nodeinfo message from cluster node 4
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
A more complete (but somewhat filtered) transcript:
Code:
Mar 2 00:31:12 p3 pmxcfs[3044]: [dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 65)
Mar 2 00:31:12 p3 corosync[3396]: notice [CFG ] Config reload requested by node 1
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] Configuration reloaded. Dumping actual totem config.
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] Token Timeout (1000 ms) retransmit timeout (238 ms)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] token hold (180 ms) retransmits before loss (4 retrans)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] join (50 ms) send_join (0 ms) consensus (1200 ms) merge (200 ms)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Mar 2 00:31:12 p3 corosync[3396]: [CFG ] Config reload requested by node 1
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1301
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] missed count const (5 messages)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] RRP token expired timeout (238 ms)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] RRP token problem counter (2000 ms)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] RRP threshold (10 problem count)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] RRP multicast threshold (100 problem count)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] RRP mode set to none.
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] heartbeat_failures_allowed (0)
Mar 2 00:31:12 p3 corosync[3396]: debug [TOTEM ] max_network_delay (50 ms)
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] Reading configuration (runtime: 1)
Mar 2 00:31:12 p3 corosync[3396]: crit [VOTEQ ] configuration error: quorum.device.votes is too high or expected_votes is too low
Mar 2 00:31:12 p3 corosync[3396]: crit [VOTEQ ] disabling quorum device operations
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] ev_tracking=0, ev_tracking_barrier = 0: expected_votes = 2
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] Configuration reloaded. Dumping actual totem config.
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] Token Timeout (1000 ms) retransmit timeout (238 ms)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] token hold (180 ms) retransmits before loss (4 retrans)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] join (50 ms) send_join (0 ms) consensus (1200 ms) merge (200 ms)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1301
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] missed count const (5 messages)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] RRP token expired timeout (238 ms)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] RRP token problem counter (2000 ms)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] RRP threshold (10 problem count)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] RRP multicast threshold (100 problem count)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] RRP mode set to none.
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] heartbeat_failures_allowed (0)
Mar 2 00:31:12 p3 corosync[3396]: [TOTEM ] max_network_delay (50 ms)
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] Reading configuration (runtime: 1)
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] configuration error: quorum.device.votes is too high or expected_votes is too low
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] disabling quorum device operations
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] ev_tracking=0, ev_tracking_barrier = 0: expected_votes = 2
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] got nodeinfo message from cluster node 4
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] nodeinfo message[4]: votes: 1, expected: 2 flags: 5
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] got nodeinfo message from cluster node 4
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] got nodeinfo message from cluster node 1
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 2 flags: 5
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] total_votes=3, expected_votes=2
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] Sending expected votes callback
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] node 1 state=1, votes=1, expected=3
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] node 4 state=1, votes=1, expected=2
Mar 2 00:31:12 p3 corosync[3396]: notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] got nodeinfo message from cluster node 1
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] got nodeinfo message from cluster node 4
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] nodeinfo message[4]: votes: 1, expected: 2 flags: 5
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] got nodeinfo message from cluster node 4
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] got nodeinfo message from cluster node 1
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 2 flags: 5
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] total_votes=3, expected_votes=2
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] Sending expected votes callback
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] node 1 state=1, votes=1, expected=3
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] node 4 state=1, votes=1, expected=2
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] got nodeinfo message from cluster node 1
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar 2 00:31:12 p3 corosync-qdevice[32770]: Initializing votequorum
Mar 2 00:31:12 p3 corosync-qdevice[32770]: Initializing votequorum
Mar 2 00:31:12 p3 corosync[3396]: info [VOTEQ ] Registration of quorum device is disabled by incorrect corosync.conf. See logs for more information
Mar 2 00:31:12 p3 corosync-qdevice[32770]: Can't register votequorum device. Error CS_ERR_ACCESS
Mar 2 00:31:12 p3 systemd[1]: corosync-qdevice.service: Main process exited, code=exited, status=1/FAILURE
Mar 2 00:31:12 p3 corosync[3396]: debug [CMAP ] exit_fn for conn=0x555c3215c840
Mar 2 00:31:12 p3 systemd[1]: corosync-qdevice.service: Unit entered failed state.
Mar 2 00:31:12 p3 systemd[1]: corosync-qdevice.service: Failed with result 'exit-code'.
Mar 2 00:31:12 p3 corosync[3396]: [VOTEQ ] Registration of quorum device is disabled by incorrect corosync.conf. See logs for more information
Mar 2 00:31:12 p3 corosync-qdevice[32770]: Can't register votequorum device. Error CS_ERR_ACCESS
Mar 2 00:31:12 p3 corosync[3396]: [CMAP ] exit_fn for conn=0x555c3215c840
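With debug logging on, the decisive messages are easy to miss among the TOTEM and nodeinfo lines. Below is a minimal sketch, in Python, that keeps only the lines corosync tagged as crit. The function name severe_lines and the regular expression are my own illustration, not part of any corosync tool, and it only recognizes the "crit [SUBSYS ]" form seen in the transcript above.

```python
import re

# Matches lines such as:
#   "... corosync[3396]: crit [VOTEQ ] configuration error: ..."
# Only the "crit" level is matched, because that is the only severe
# level that appears in the transcript (assumption: add more as needed).
LINE_RE = re.compile(
    r"corosync\[\d+\]:\s+(?P<level>crit)\s+\[(?P<subsys>\w+)\s*\]\s+(?P<msg>.*)"
)

def severe_lines(log_text):
    """Return (level, subsystem, message) for each crit-level corosync line."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            hits.append((m.group("level"), m.group("subsys"), m.group("msg")))
    return hits

# A few lines copied from the transcript above.
sample = """\
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] Reading configuration (runtime: 1)
Mar 2 00:31:12 p3 corosync[3396]: crit [VOTEQ ] configuration error: quorum.device.votes is too high or expected_votes is too low
Mar 2 00:31:12 p3 corosync[3396]: crit [VOTEQ ] disabling quorum device operations
Mar 2 00:31:12 p3 corosync[3396]: debug [VOTEQ ] ev_tracking=0, ev_tracking_barrier = 0: expected_votes = 2
"""

for level, subsys, msg in severe_lines(sample):
    print(level, subsys, msg)
```

On the sample it prints the two VOTEQ lines about the configuration error and the disabled quorum device, which is where the actual cause is reported.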
corosync.conf before adding the qdevice (after pvecm qdevice remove):
Code:
logging {
  debug: on
  to_syslog: yes
}
nodelist {
  node {
    name: p3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.3
  }
  node {
    name: p5
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.0.0.5
  }
}
quorum {
  expected_votes: 2
  provider: corosync_votequorum
}
totem {
  cluster_name: ourcluster
  config_version: 64
  interface {
    bindnetaddr: 10.0.0.3
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
I fixed it:
After adding the qdevice, I had this in the configuration:
Code:
quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: <HIDDEN_IP>
      tls: on
    }
    votes: 1
  }
  expected_votes: 2
  provider: corosync_votequorum
}
I changed expected_votes to 3 (one vote for each of the two nodes plus one vote for the qdevice) and ran
> systemctl start corosync-qdevice
After that, it worked.
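The arithmetic behind the fix can be sketched in a few lines of Python. This assumes the rule is that expected_votes must be at least the sum of the node votes plus the qdevice votes. I inferred that from the error message and from the fact that 3 worked, not from the corosync source, and check_expected_votes is a made-up helper name.

```python
def check_expected_votes(node_votes, device_votes, expected_votes):
    """Return (ok, required), where required = sum of node votes + device votes.

    Assumed rule (inferred from the log message, not taken from corosync):
    expected_votes must be at least `required`.
    """
    required = sum(node_votes) + device_votes
    return expected_votes >= required, required

# Values from this post: p3 and p5 have quorum_votes 1 each,
# and the qdevice has votes 1.
print(check_expected_votes([1, 1], 1, 2))  # (False, 3): the failing configuration
print(check_expected_votes([1, 1], 1, 3))  # (True, 3): after the fix
```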