SOLVED: 2-Node Cluster and QDevice not voting?

jlficken

I removed the 3rd node from my cluster and added a QDevice in its place, but the QDevice doesn't seem to be voting.

Output from adding QDevice (set up on a Raspbian VM):
Code:
root@FSPVE1:~# pvecm qdevice setup 192.168.1.55 -f
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
                (if you think this is a mistake, you may want to use -f option)


INFO: initializing qnetd server
Certificate database (/etc/corosync/qnetd/nssdb) already exists. Delete it to initialize new db

INFO: copying CA cert and initializing on all nodes

node 'FSPVE1': Creating /etc/corosync/qdevice/net/nssdb
password file contains no data
node 'FSPVE1': Creating new key and cert db
node 'FSPVE1': Creating new noise file /etc/corosync/qdevice/net/nssdb/noise.txt
node 'FSPVE1': Importing CA
node 'FSPVE2': Creating /etc/corosync/qdevice/net/nssdb
password file contains no data
node 'FSPVE2': Creating new key and cert db
node 'FSPVE2': Creating new noise file /etc/corosync/qdevice/net/nssdb/noise.txt
node 'FSPVE2': Importing CA
INFO: generating cert request
Creating new certificate request


Generating key.  This may take a few moments...

Certificate request stored in /etc/corosync/qdevice/net/nssdb/qdevice-net-node.crq

INFO: copying exported cert request to qnetd server

INFO: sign and export cluster cert
Signing cluster certificate
Certificate stored in /etc/corosync/qnetd/nssdb/cluster-FSProxmox.crt

INFO: copy exported CRT

INFO: import certificate
Importing signed cluster certificate
Notice: Trust flag u is set automatically if the private key is present.
pk12util: PKCS12 EXPORT SUCCESSFUL
Certificate stored in /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12

INFO: copy and import pk12 cert to all nodes

node 'FSPVE1': Importing cluster certificate and key
node 'FSPVE1': pk12util: PKCS12 IMPORT SUCCESSFUL
node 'FSPVE2': Importing cluster certificate and key
node 'FSPVE2': pk12util: PKCS12 IMPORT SUCCESSFUL
INFO: add QDevice to cluster configuration

INFO: start and enable corosync qdevice daemon on node 'FSPVE1'...
Synchronizing state of corosync-qdevice.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable corosync-qdevice
Created symlink /etc/systemd/system/multi-user.target.wants/corosync-qdevice.service -> /lib/systemd/system/corosync-qdevice.service.

INFO: start and enable corosync qdevice daemon on node 'FSPVE2'...
Synchronizing state of corosync-qdevice.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable corosync-qdevice
Created symlink /etc/systemd/system/multi-user.target.wants/corosync-qdevice.service -> /lib/systemd/system/corosync-qdevice.service.
Reloading corosync.conf...
Done
root@FSPVE1:~# pvecm status
Cluster information
-------------------
Name:             FSProxmox
Config Version:   21
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Sep 29 08:37:14 2022
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.248
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1   A,NV,NMW 192.168.1.52 (local)
0x00000002          1   A,NV,NMW 192.168.1.53
0x00000000          0            Qdevice (votes 1)
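
For anyone reading the output above: in the Qdevice column, A/NA means alive/not alive, V/NV means vote/no vote, and MW/NMW means master wins/not master wins. Both nodes show NV here, which is why Total votes stays at 2 while Expected votes is 3. A rough sketch of an automated check, assuming the column layout shown above (qdevice_is_voting is a made-up helper name, not a Proxmox or corosync command):

```shell
# Hypothetical helper: reads `pvecm status` text on stdin and exits 0 only if
# every node row carries the "V" (vote) flag in its Qdevice column.
qdevice_is_voting() {
  awk '
    # Node rows start with a hex node ID; the flags field contains commas.
    # The "Qdevice (votes N)" row has no comma-separated flags and is skipped.
    $1 ~ /^0x[0-9a-fA-F]+$/ && $3 ~ /,/ {
      nodes++
      n = split($3, flags, ",")
      for (i = 1; i <= n; i++) if (flags[i] == "V") voting++
    }
    END { exit !(nodes > 0 && voting == nodes) }
  '
}
```

Usage on a node would be something like: pvecm status | qdevice_is_voting && echo "voting" || echo "not voting"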

corosync.conf contents:
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: FSPVE1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.1.52
  }
  node {
    name: FSPVE2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.1.53
  }
}

quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: 192.168.1.55
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}

totem {
  cluster_name: FSProxmox
  config_version: 21
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

corosync-qdevice service status:
Code:
root@FSPVE1:~# systemctl status corosync-qdevice.service
● corosync-qdevice.service - Corosync Qdevice daemon
     Loaded: loaded (/lib/systemd/system/corosync-qdevice.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2022-09-29 08:37:09 CDT; 2min 55s ago
       Docs: man:corosync-qdevice
   Main PID: 1441867 (corosync-qdevic)
      Tasks: 2 (limit: 77064)
     Memory: 1.5M
        CPU: 201ms
     CGroup: /system.slice/corosync-qdevice.service
             ├─1441867 /usr/sbin/corosync-qdevice -f
             └─1441868 /usr/sbin/corosync-qdevice -f

Sep 29 08:39:40 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:43 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:43 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:47 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:51 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:51 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:52 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:56 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:39:59 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server
Sep 29 08:40:00 FSPVE1 corosync-qdevice[1441867]: Unhandled error when reading from server. Disconnecting from server

I've tried Googling the above message but can't seem to find the answer. Any help would be appreciated.

ETA: I can ping the QDevice IP from both nodes.
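
Side note on the ping test: ping only proves ICMP gets through. corosync-qnetd listens on TCP port 5403 by default, so a TCP-level check is a bit more telling. A minimal sketch using bash's /dev/tcp (tcp_port_open is a made-up helper name, and this assumes bash and coreutils timeout are available on the node):

```shell
# Hypothetical helper (not part of Proxmox or corosync): succeeds if a TCP
# connection to host:port can be opened within 3 seconds.
tcp_port_open() {
  local host=$1 port=$2
  # Run the connect attempt in a child bash so `timeout` can bound it.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

On each node, something like: tcp_port_open 192.168.1.55 5403 && echo "port reachable" || echo "port blocked or qnetd not listening". An open port doesn't rule out a certificate or setup problem, it only rules out basic connectivity.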

ETA 2: I think I'll try creating the device again using these instructions - https://forum.proxmox.com/threads/qdevice-not-voting.108871/post-468136

ETA 3: The link above worked!! The only step missing from the script in that post was setting a password for the root account with "sudo passwd root".
 