PVE5 and quorum device

May 22, 2016
Can someone point me to proper docs about creating an HA environment with 2 nodes plus an external quorum device? I know this is possible (even with a Raspberry Pi), but I don't remember how to set it up.
 

Jeff Billimek

New Member
Feb 16, 2018
I'd like to report that I got the corosync-qdevice thing to work for my 2-node cluster.

Previously I was using the raspberry-pi-as-a-third-node approach, which seemed like a hacky solution. The dummy node shows up in the proxmox cluster info as an unusable node (because it is), and it blocks me from creating a new VM until I temporarily remove the dummy node from the corosync config and restart corosync. It wasn't really ideal.

Based on this nugget of information from a pve mail post, I did the following to make this work in my 2-node cluster environment.

For context this is my environment:
  • host: proxmox (one of the nodes in my cluster)
  • host: proxmox-b (the other node in my cluster)
  • host: witness (the non-proxmox raspberry pi node that I use as a corosync 'witness' for quorum votes)
On all three hosts, I installed corosync-qdevice & corosync-qnetd (I think that qnetd is only needed on the non-proxmox host but not sure):
Code:
apt-get install corosync-qdevice
apt-get install corosync-qnetd

Next I made sure that proxmox, proxmox-b, and witness could all ssh to each-other cleanly as root.
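
A quick way to confirm the SSH prerequisite before going further is sketched below. This is only an illustration: the function name check_root_ssh and the SSH_CMD override are made up here (the override exists only so the function can be tried without real hosts), and BatchMode=yes is used so that ssh fails instead of falling back to a password prompt.

```shell
#!/bin/sh
# Hypothetical helper, not part of corosync or Proxmox.
# check_root_ssh HOST...
# Succeeds only if every host accepts a non-interactive (key-based) root login.
check_root_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so a missing key is a hard failure.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
            echo "ok: ${host}"
        else
            echo "FAILED: ${host}"
            failed=1
        fi
    done
    return "$failed"
}
```

It would be called as check_root_ssh <ip address for witness> <ip address for proxmox> <ip address for proxmox-b> from the node where the certutil command is going to be run.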

On proxmox, ran the following (the last three arguments are the IP addresses of the witness and of the two cluster nodes, witness first):
Code:
corosync-qdevice-net-certutil -Q -n <cluster name> <ip address for witness> <ip address for proxmox> <ip address for proxmox-b>
(You can determine the cluster name by looking at the cluster_name value in /etc/corosync/corosync.conf.)
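
If you would rather not read the file by eye, the lookup can be scripted. The helper below is a sketch with a made-up name (cluster_name_from_conf); it assumes the usual "cluster_name: value" line inside the totem section and simply prints the first match.

```shell
#!/bin/sh
# Hypothetical helper: print the cluster_name value from a corosync.conf file.
# Defaults to the path mentioned in the post when no argument is given.
cluster_name_from_conf() {
    conf="${1:-/etc/corosync/corosync.conf}"
    # Strip the key and leading whitespace; keep only the first match.
    sed -n 's/^[[:space:]]*cluster_name:[[:space:]]*//p' "$conf" | head -n 1
}
```

The result could then be passed to the -n option of corosync-qdevice-net-certutil instead of typing the name by hand.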

Edited /etc/corosync/corosync.conf and added the following to the quorum section:
Code:
quorum {
  provider: corosync_votequorum
     device {
         model: net
         votes: 1
         net {
           tls: on
           host: <ip address for witness>
           algorithm: ffsplit
         }
     }
}

Then restarted corosync service and corosync-qdevice service:
Code:
service corosync restart
service corosync-qdevice start

Repeated the same steps (editing corosync.conf and restarting the services) on proxmox-b.

Afterwards, corosync-quorumtool shows the following:
Code:
root@proxmox:/etc/corosync# corosync-quorumtool
Quorum information
------------------
Date:             Sun May 27 00:54:41 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/4656
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1    A,V,NMW <ip address of proxmox> (local)
         2          1    A,V,NMW <ip address of proxmox-b>
         0          1            Qdevice

Tested this by rebooting witness. Everything was fine. While witness was rebooting, corosync-quorumtool showed that witness did not have any votes, but the cluster was otherwise healthy.

Further tested this by rebooting proxmox-b. While it was rebooting, corosync-quorumtool showed that the node had dropped off, but the cluster still had 2 votes and quorum and was otherwise healthy.
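
Both test results are consistent with simple majority arithmetic: with 3 expected votes the output above shows a quorum of 2, so losing any single vote leaves the cluster quorate. The sketch below reproduces that arithmetic. The function names are made up for illustration, and the formula (expected votes divided by 2, plus 1, using integer division) is my understanding of the default majority rule rather than something taken from this thread.

```shell
#!/bin/sh
# Hypothetical helpers, not part of corosync.

# quorum_threshold EXPECTED_VOTES
# Prints the smallest number of votes that forms a strict majority.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate TOTAL_VOTES EXPECTED_VOTES
# Succeeds (exit 0) when the votes currently present reach the threshold.
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Two nodes plus one qdevice vote, with one voter missing.
if is_quorate 2 3; then
    echo "2 of 3 votes: quorate"
fi

# Only one vote left.
if ! is_quorate 1 3; then
    echo "1 of 3 votes: not quorate"
fi
```

This is also why a plain 2-node cluster without a third vote is fragile: with 2 expected votes the threshold is 2, so a single node on its own is never quorate.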

This may or may not work for you, so beware if you start tinkering!
 

de Thysebaert

Member
Mar 12, 2017
Thanks for this doc, very interesting.
At the moment I always get errors when I run this command:
corosync-qdevice-net-certutil -Q -n proxmox <ip address for lb> <ip address for proxmox> <ip address for proxmox-

In the doc, the first IP is the IP of the box with the quorum device, and the others are the IPs of the nodes.
The first error is:
Certificate database (/etc/corosync/qnetd/nssdb) already exists. Delete it to initialize new db
and then after that:
Can't open certificate file /tmp/qnetd-cacert.crt

....

I am unable to create the qdevice at this time.

Thanks if you have an idea.

fr
 

de Thysebaert

Member
Mar 12, 2017
Hi again,
After a long time, the solution now seems to be working as expected.
In my case I just had to set up authentication with a public key (so without a password) between each node and the box with the qdevice. After this setup, the command corosync-qdevice-net-certutil -Q -n proxmox ran without any error, and the rest of your setup also ran without any issues.
Thanks
 

Michal Michalac

New Member
Jul 11, 2018
Thanks for your article. I used it and it works as expected.

Have some minor notes:
On all three hosts, I installed corosync-qdevice & corosync-qnetd (I think that qnetd is only needed on the non-proxmox host but not sure):
corosync-qdevice needs to be installed only on the real proxmox nodes (proxmox and proxmox-b).
corosync-qnetd needs to be installed only on the quorum device host (witness).
Both can be installed on all servers, but that is unnecessary.

After configuring corosync-qnetd (corosync-qdevice-net-certutil ...), this service must be manually enabled and started on witness:
Code:
systemctl enable corosync-qnetd
systemctl start corosync-qnetd

Edited /etc/corosync/corosync.conf and added the following to the quorum section:
...

Then restarted corosync service and corosync-qdevice service:
Code:
service corosync restart
service corosync-qdevice start
Did the same steps of editing corosync.conf and restarting stuff on proxmox-b
A better way is to edit the shared version of corosync.conf at /etc/pve/corosync.conf on one of the proxmox nodes. (Don't forget to raise totem.config_version.) Changes are then automatically distributed to all nodes, copied to /etc/corosync/corosync.conf and applied.
There is no need to restart corosync.
corosync-qdevice must be enabled and started manually on all nodes (proxmox and proxmox-b).
Code:
systemctl enable corosync-qdevice
systemctl start corosync-qdevice
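
The config_version bump mentioned above is easy to forget, so here is a small sketch that does it mechanically. The function name bump_config_version is made up, and it assumes the file contains a single "config_version: N" line, as in a typical totem section. It prints the modified file to standard output rather than editing in place, so the result can be reviewed before it replaces the real config.

```shell
#!/bin/sh
# Hypothetical helper: print a corosync.conf-style file with config_version raised by one.
# $1: path to the file to read. Nothing is modified on disk.
bump_config_version() {
    awk '
        # Only the first config_version line is touched.
        $1 == "config_version:" && !done {
            n = $2 + 1
            sub(/[0-9]+/, n)   # replace the old number, keeping the indentation
            done = 1
        }
        { print }
    ' "$1"
}
```

One cautious way to use it would be to write the output to a separate file next to the original (for example /etc/pve/corosync.conf.new), add the device section there, check it, and only then move it over /etc/pve/corosync.conf.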
 

de Thysebaert

Member
Mar 12, 2017
Hi,
Coming back to this post: I forget whether the command "corosync-qdevice-net-certutil -Q -n proxmox <ip address for lb> <ip address for proxmox> <ip address for proxmox-b>" must be run on each proxmox node?
Thanks for your advice.
 

t.lamprecht

Proxmox Staff Member
Jul 28, 2015
Author of the "nugget" this tutorial is based on here...

No, only once, on one PVE node. It connects to all passed hosts through SSH (thus it is recommended to have SSH public key auth set up) and sets up the certificates for all of them. Just swap 'proxmox' with your respective cluster name and ensure that the first host is the "witness", not a PVE node.
 

de Thysebaert

Member
Mar 12, 2017
thanks
 

de Thysebaert

Member
Mar 12, 2017
Another question about this configuration.
I want to change my "witness" server. After installing the prerequisites on the new one, may I just run "corosync-qdevice-net-certutil -Q -n proxmox <ip address for lb> <ip address for proxmox> <ip address for proxmox-b>" again to update the corosync-qdevice setup?

Thanks
 
Sep 27, 2018
I'm having a hard time getting this to work.
No matter what I try I always end up with the same errors:

Code:
root@pve1:~# corosync-qdevice-net-certutil -Q -n vdbcluster 10.0.200.159 10.0.200.153 10.0.200.156
Creating /etc/corosync/qnetd/nssdb
Creating new key and cert db
password file contains no data
Creating new noise file /etc/corosync/qnetd/nssdb/noise.txt
Creating new CA


Generating key.  This may take a few moments...

Is this a CA certificate [y/N]?
Enter the path length constraint, enter to skip [<0 for unlimited path]: > Is this a critical extension [y/N]?


Generating key.  This may take a few moments...

Notice: Trust flag u is set automatically if the private key is present.
QNetd CA certificate is exported as /etc/corosync/qnetd/nssdb/qnetd-cacert.crt
Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,password).
lost connection
Can't open certificate file /tmp/qnetd-cacert.crt
Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,password).
lost connection
Can't open certificate file /tmp/qnetd-cacert.crt
Certificate database doesn't exists. Use /usr/sbin/corosync-qdevice-net-certutil -i to create it
/etc/corosync/qdevice/net/nssdb/qdevice-net-node.crq: No such file or directory
Can't open certificate file /tmp/qdevice-net-node.crq
Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,password).
lost connection
Can't open certificate file /etc/corosync/qdevice/net/nssdb/cluster-vdbcluster.crt
/etc/corosync/qdevice/net/nssdb//qdevice-net-node.p12: No such file or directory
Can't open certificate file /etc/corosync/qdevice/net/nssdb//qdevice-net-node.p12

SSH public key auth is used:

Code:
root@pve1:~# ssh 10.0.200.159
Linux raspberrypi 4.14.50+ #1122 Tue Jun 19 12:21:21 BST 2018 armv6l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sun Feb 17 19:50:49 2019 from 10.0.200.153

root@raspberrypi:~#

Does anyone have a clue what might be wrong?

Thanks in advance
 

JTRealms

New Member
Feb 18, 2019
I was actually having the same issue. It turned out I was simply adding the wrong key (for a user account, not root) to the PVE host's authorized_keys. Once corrected, it worked straight away.
 

JTRealms

New Member
Feb 18, 2019
That's not the case here. I can ssh to the qdevice as root without entering any password: ssh root@10.0.200.159
I think the problem is (at least in my case) that root@pi (the qdevice) was unable to ssh into the two PVE hosts using a public key. Can you confirm the qdevice can successfully ssh in as root via public key on each PVE host?
 
Sep 27, 2018
That is one of my problems. I tried all the possible solutions I could google for, but no luck so far; it keeps asking me for a password...
 
Sep 27, 2018
I got a lot further, but now I cannot start the qdevice on the Raspberry Pi:

Code:
root@pve1:~# pvecm status
Quorum information
------------------
Date:             Tue Feb 19 15:35:19 2019
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1/16
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1   A,NV,NMW 10.0.200.220 (local)
0x00000002          1   A,NV,NMW 10.0.200.221
0x00000000          0            Qdevice (votes 1)

Starting qnetd on the Raspberry Pi:
Code:
Feb 19 15:35:23 pvew systemd[1]: Starting Corosync Qdevice Network daemon...
-- Subject: Unit corosync-qnetd.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit corosync-qnetd.service has begun starting up.
Feb 19 15:35:23 pvew corosync-qnetd[799]: Feb 19 15:35:23 crit    NSS error (-8015): The certificate/key database is in an old, unsupported format.
Feb 19 15:35:23 pvew systemd[1]: corosync-qnetd.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:35:23 pvew systemd[1]: Failed to start Corosync Qdevice Network daemon.
-- Subject: Unit corosync-qnetd.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit corosync-qnetd.service has failed.
--
-- The result is failed.
Feb 19 15:35:23 pvew systemd[1]: corosync-qnetd.service: Unit entered failed state.
Feb 19 15:35:23 pvew systemd[1]: corosync-qnetd.service: Failed with result 'exit-code'.

Does anyone have an idea?
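
In the status output above, the Qdevice row reports 0 votes and Total votes is 2 rather than 3, which fits with qnetd failing to start on the Pi. For anyone who wants to check for that condition from a script, here is a small sketch that pulls the Votes column of the Qdevice row out of pvecm status or corosync-quorumtool output. The function name qdevice_votes is made up, and the parsing relies only on the column layout shown in this thread (node ID 0 or 0x00000000 in the first column).

```shell
#!/bin/sh
# Hypothetical helper: read quorum status text on stdin and print the
# Votes column of the Qdevice membership row (node ID 0 / 0x00000000).
qdevice_votes() {
    awk '$1 ~ /^(0|0x0+)$/ && /Qdevice/ { print $2; exit }'
}
```

Usage would be along the lines of pvecm status | qdevice_votes, treating anything other than 1 as a sign that the qdevice is not currently contributing its vote.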
 
