Qdevice is not voting

lukash

New Member
Feb 21, 2021
Hello guys,
I have a problem with my Qdevice. If I type "pvecm status", my first node gives the following result:
Code:
root@pve1:~# pvecm status
Cluster information
-------------------
Name:             server
Config Version:   7
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Sat Feb 27 00:16:05 2021
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.42
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2 
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 10.10.10.1 (local)
0x00000002          1  NA,NV,NMW 10.10.10.2
0x00000000          1            Qdevice
root@pve1:~#

That looks normal, but the second node returns the following:

Code:
root@pve2:~# pvecm status
Cluster information
-------------------
Name:             server
Config Version:   7
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Sat Feb 27 00:16:13 2021
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000002
Ring ID:          1.42
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2 
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 10.10.10.1
0x00000002          1  NA,NV,NMW 10.10.10.2 (local)
0x00000000          0            Qdevice (votes 1)
root@pve2:~#

The Qdevice is not casting its vote for this node.

The node and the Pi are on the same network:
Code:
root@pve2:~# ping 10.10.10.99
PING 10.10.10.99 (10.10.10.99) 56(84) bytes of data.
64 bytes from 10.10.10.99: icmp_seq=1 ttl=64 time=0.883 ms

Code:
root@raspberrypi:~# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
64 bytes from 10.10.10.2: icmp_seq=1 ttl=64 time=0.487 ms
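
Ping only proves basic IP reachability, though. As a quick sketch, this also checks that the qnetd TCP port is reachable from pve2 (5403 is the corosync-qnetd default; bash's /dev/tcp is used so no extra tools are needed):

Code:
# 5403 is the default corosync-qnetd port; adjust if qnetd was configured differently
timeout 3 bash -c 'exec 3<>/dev/tcp/10.10.10.99/5403' && echo "qnetd port reachable"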

What I have found is that corosync-qdevice.service on node 2 runs into an error.

Code:
root@pve2:~# systemctl start corosync-qdevice.service
Job for corosync-qdevice.service failed because the control process exited with error code.
See "systemctl status corosync-qdevice.service" and "journalctl -xe" for details.

Code:
root@pve2:~# systemctl status corosync-qdevice.service
● corosync-qdevice.service - Corosync Qdevice daemon
   Loaded: loaded (/lib/systemd/system/corosync-qdevice.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2021-02-27 00:26:14 CET; 1min 25s ago
     Docs: man:corosync-qdevice
  Process: 676324 ExecStart=/usr/sbin/corosync-qdevice -f $COROSYNC_QDEVICE_OPTIONS (code=exited, status=1/FAILURE)
 Main PID: 676324 (code=exited, status=1/FAILURE)

Feb 27 00:26:14 pve2 systemd[1]: Starting Corosync Qdevice daemon...
Feb 27 00:26:14 pve2 corosync-qdevice[676324]: Can't init nss (-8174): security library: bad database.
Feb 27 00:26:14 pve2 systemd[1]: corosync-qdevice.service: Main process exited, code=exited, status=1/FAILURE
Feb 27 00:26:14 pve2 systemd[1]: corosync-qdevice.service: Failed with result 'exit-code'.
Feb 27 00:26:14 pve2 systemd[1]: Failed to start Corosync Qdevice daemon.


Code:
-- A start job for unit corosync-qdevice.service has begun execution.
--
-- The job identifier is 288886.
Feb 27 00:28:42 pve2 corosync-qdevice[677667]: Can't init nss (-8174): security library: bad database.
Feb 27 00:28:42 pve2 systemd[1]: corosync-qdevice.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- An ExecStart= process belonging to unit corosync-qdevice.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 1.
Feb 27 00:28:42 pve2 systemd[1]: corosync-qdevice.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit corosync-qdevice.service has entered the 'failed' state with result 'exit-code'.
Feb 27 00:28:42 pve2 systemd[1]: Failed to start Corosync Qdevice daemon.
-- Subject: A start job for unit corosync-qdevice.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit corosync-qdevice.service has finished with a failure.
--
-- The job identifier is 288886 and the job result is failed.
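
If I read the error correctly, "Can't init nss (-8174): security library: bad database" points at the NSS certificate database that corosync-qdevice uses for TLS towards the qnetd host. A sketch of how to compare that database between the two nodes, assuming the default location /etc/corosync/qdevice/net/nssdb used by the "net" model (certutil comes with the libnss3-tools package):

Code:
# Check that the Qdevice NSS database exists and list the certificates in it
# (path assumed from the corosync-qdevice "net" model defaults)
ls -l /etc/corosync/qdevice/net/nssdb
certutil -L -d /etc/corosync/qdevice/net/nssdb
# Run the same on pve1 and compare; a missing directory or an empty listing
# on pve2 would explain the "bad database" error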


Is there a way to fix this problem?

I have looked around for many hours, but I haven't found a solution.

Can you help me?
Thanks
 
Hi,

please also post the output of pveversion -v and your Corosync config (cat /etc/pve/corosync.conf).
 
pve1 is the node showing 3 total votes
pve2 is the node showing only 2 total votes (the problem)

Code:
root@pve1:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.8-pve2
ceph-fuse: 15.2.8-pve2
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
root@pve1:~#


Code:
root@pve2:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.8-pve2
ceph-fuse: 15.2.8-pve2
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
root@pve2:~#

Code:
root@pve1:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
  }
}

quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: 10.10.10.99
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}

totem {
  cluster_name: server
  config_version: 7
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

root@pve1:~#

Code:
root@pve2:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
  }
}

quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: 10.10.10.99
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}

totem {
  cluster_name: server
  config_version: 7
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

root@pve2:~#
 
It seems that the solution was to copy the /etc/corosync folder from the working PVE node to the non-working one.
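
Roughly what that looks like, as a sketch (assuming /etc/corosync, including the Qdevice NSS database under qdevice/net/nssdb, is intact on the working node and that rsync is available; scp -r works as well):

Code:
# On pve2: keep a backup of the broken state first
cp -a /etc/corosync /etc/corosync.bak
# Pull the directory, including the Qdevice certificates, from the working node
rsync -a root@10.10.10.1:/etc/corosync/ /etc/corosync/
# Restart so the copied config and certificates are picked up
systemctl restart corosync corosync-qdevice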
 
For me, it turned out that the problem node wasn't able to SSH to itself without a password.
Once that was fixed, I could add the Qdevice again and everything worked as expected.
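
As a sketch, the check and the re-setup could look like this (addresses taken from this thread; pvecm qdevice setup needs corosync-qnetd running on the Pi, and the --force flag is only needed if leftovers from an earlier setup attempt are in the way):

Code:
# On the problem node: this must log in without asking for a password
ssh root@10.10.10.2 true
# If it prompts, refresh the cluster SSH/SSL files and test again
pvecm updatecerts
# Then remove and re-add the Qdevice from any cluster node
pvecm qdevice remove
pvecm qdevice setup 10.10.10.99 --force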