[Solved] QDevice - service not starting on node

le_top

I am trying to add a QDevice to a 2-node cluster.

I fixed some issues and I am almost there, but the last step fails:

Code:
node 'p3': Importing cluster certificate and key
node 'p3': pk12util: PKCS12 IMPORT SUCCESSFUL
node 'p5': Importing cluster certificate and key
node 'p5': pk12util: PKCS12 IMPORT SUCCESSFUL
INFO: add QDevice to cluster configuration

INFO: start and enable corosync qdevice daemon on node 'p3'...
Job for corosync-qdevice.service failed because the control process exited with error code.
See "systemctl status corosync-qdevice.service" and "journalctl -xe" for details.
command 'ssh -o 'BatchMode=yes' -lroot 10.0.0.3 systemctl start corosync-qdevice' failed: exit code 1

In the journal, these lines caught my attention:
Code:
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] got nodeinfo message from cluster node 4
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0

Here is a more complete (but somewhat filtered) transcript:
Code:
Mar  2 00:31:12 p3 pmxcfs[3044]: [dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 65)
Mar  2 00:31:12 p3 corosync[3396]: notice  [CFG   ] Config reload requested by node 1
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] Configuration reloaded. Dumping actual totem config.
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] Token Timeout (1000 ms) retransmit timeout (238 ms)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] token hold (180 ms) retransmits before loss (4 retrans)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] join (50 ms) send_join (0 ms) consensus (1200 ms) merge (200 ms)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Mar  2 00:31:12 p3 corosync[3396]:  [CFG   ] Config reload requested by node 1
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1301
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] missed count const (5 messages)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] RRP token expired timeout (238 ms)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] RRP token problem counter (2000 ms)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] RRP threshold (10 problem count)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] RRP multicast threshold (100 problem count)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] RRP mode set to none.
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] heartbeat_failures_allowed (0)
Mar  2 00:31:12 p3 corosync[3396]: debug   [TOTEM ] max_network_delay (50 ms)
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] Reading configuration (runtime: 1)
Mar  2 00:31:12 p3 corosync[3396]: crit    [VOTEQ ] configuration error: quorum.device.votes is too high or expected_votes is too low
Mar  2 00:31:12 p3 corosync[3396]: crit    [VOTEQ ] disabling quorum device operations
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] ev_tracking=0, ev_tracking_barrier = 0: expected_votes = 2
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] Configuration reloaded. Dumping actual totem config.
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] Token Timeout (1000 ms) retransmit timeout (238 ms)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] token hold (180 ms) retransmits before loss (4 retrans)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] join (50 ms) send_join (0 ms) consensus (1200 ms) merge (200 ms)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1301
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] missed count const (5 messages)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] RRP token expired timeout (238 ms)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] RRP token problem counter (2000 ms)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] RRP threshold (10 problem count)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] RRP multicast threshold (100 problem count)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] RRP mode set to none.
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] heartbeat_failures_allowed (0)
Mar  2 00:31:12 p3 corosync[3396]:  [TOTEM ] max_network_delay (50 ms)
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] Reading configuration (runtime: 1)
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] configuration error: quorum.device.votes is too high or expected_votes is too low
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] disabling quorum device operations
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] ev_tracking=0, ev_tracking_barrier = 0: expected_votes = 2
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] got nodeinfo message from cluster node 4
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] nodeinfo message[4]: votes: 1, expected: 2 flags: 5
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] got nodeinfo message from cluster node 4
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] got nodeinfo message from cluster node 1
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 2 flags: 5
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] total_votes=3, expected_votes=2
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] Sending expected votes callback
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] node 1 state=1, votes=1, expected=3
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] node 4 state=1, votes=1, expected=2
Mar  2 00:31:12 p3 corosync[3396]: notice  [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] got nodeinfo message from cluster node 1
Mar  2 00:31:12 p3 corosync[3396]: debug   [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] got nodeinfo message from cluster node 4
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] nodeinfo message[4]: votes: 1, expected: 2 flags: 5
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] got nodeinfo message from cluster node 4
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] got nodeinfo message from cluster node 1
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 2 flags: 5
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: Yes First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] total_votes=3, expected_votes=2
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] Sending expected votes callback
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] node 1 state=1, votes=1, expected=3
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] node 4 state=1, votes=1, expected=2
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] got nodeinfo message from cluster node 1
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Mar  2 00:31:12 p3 corosync-qdevice[32770]: Initializing votequorum
Mar  2 00:31:12 p3 corosync-qdevice[32770]: Initializing votequorum
Mar  2 00:31:12 p3 corosync[3396]: info    [VOTEQ ] Registration of quorum device is disabled by incorrect corosync.conf. See logs for more information
Mar  2 00:31:12 p3 corosync-qdevice[32770]: Can't register votequorum device. Error CS_ERR_ACCESS
Mar  2 00:31:12 p3 systemd[1]: corosync-qdevice.service: Main process exited, code=exited, status=1/FAILURE
Mar  2 00:31:12 p3 corosync[3396]: debug   [CMAP  ] exit_fn for conn=0x555c3215c840
Mar  2 00:31:12 p3 systemd[1]: corosync-qdevice.service: Unit entered failed state.
Mar  2 00:31:12 p3 systemd[1]: corosync-qdevice.service: Failed with result 'exit-code'.
Mar  2 00:31:12 p3 corosync[3396]:  [VOTEQ ] Registration of quorum device is disabled by incorrect corosync.conf. See logs for more information
Mar  2 00:31:12 p3 corosync-qdevice[32770]: Can't register votequorum device. Error CS_ERR_ACCESS
Mar  2 00:31:12 p3 corosync[3396]:  [CMAP  ] exit_fn for conn=0x555c3215c840

corosync.conf before adding the QDevice (after pvecm qdevice remove):
Code:
logging {
  debug: on
  to_syslog: yes
}

nodelist {
  node {
    name: p3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.3
  }
  node {
    name: p5
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.0.0.5
  }
}

quorum {
  expected_votes: 2
  provider: corosync_votequorum
}

totem {
  cluster_name: ourcluster
  config_version: 64
  interface {
    bindnetaddr: 10.0.0.3
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}



I fixed it.
After adding the QDevice, the quorum section of the configuration looked like this:
Code:
quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: <HIDDEN_IP>
      tls: on
    }
    votes: 1
  }
  expected_votes: 2
  provider: corosync_votequorum
}

I changed expected_votes from 2 to 3 (two node votes plus one QDevice vote) and then ran:

> systemctl start corosync-qdevice

After that, the service started and it worked.
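
For anyone hitting the same "quorum.device.votes is too high or expected_votes is too low" message, the arithmetic behind the fix can be sketched in a few lines of Python. This is only an illustration, not how corosync itself validates the file: it assumes the rule that expected_votes has to be at least the sum of the node quorum_votes plus the device votes, which matches what worked in this thread. The function names `parse_corosync_conf` and `vote_budget` and the trimmed `SAMPLE` config are made up for the example.

```python
# Minimal sketch: compare expected_votes with node votes + device votes.
# Assumption (not taken from the corosync source): expected_votes must be
# at least sum(node quorum_votes) + quorum.device.votes.

SAMPLE = """
nodelist {
  node {
    name: p3
    nodeid: 1
    quorum_votes: 1
  }
  node {
    name: p5
    nodeid: 4
    quorum_votes: 1
  }
}
quorum {
  device {
    model: net
    votes: 1
  }
  expected_votes: 2
  provider: corosync_votequorum
}
"""


def parse_corosync_conf(text):
    """Parse corosync.conf-style text into nested dicts.

    Every section name maps to a list of dicts, so repeated
    sections such as 'node' are all kept.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            child = {}
            stack[-1].setdefault(line[:-1].strip(), []).append(child)
            stack.append(child)
        elif line == "}":
            stack.pop()
        else:
            key, _, value = line.partition(":")
            stack[-1][key.strip()] = value.strip()
    return root


def vote_budget(conf):
    """Return (expected_votes, node_votes + device_votes)."""
    quorum = conf["quorum"][0]
    nodes = conf.get("nodelist", [{}])[0].get("node", [])
    node_votes = sum(int(n.get("quorum_votes", 1)) for n in nodes)
    # Hypothetical default: count 0 votes when no device (or no votes key) is set.
    device_votes = sum(int(d.get("votes", 0)) for d in quorum.get("device", []))
    return int(quorum["expected_votes"]), node_votes + device_votes


if __name__ == "__main__":
    expected, needed = vote_budget(parse_corosync_conf(SAMPLE))
    if expected < needed:
        print(f"expected_votes={expected} is below node+device votes={needed}")
    else:
        print(f"expected_votes={expected} covers node+device votes={needed}")
```

With the configuration from this thread (two nodes with one vote each, one device vote, expected_votes: 2) the check reports a shortfall; after raising expected_votes to 3 the two numbers match.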
 
