[SOLVED] qdevice setup always fails with "Certificate database doesn't exist"

Nov 5, 2021
I recently rebuilt my Proxmox cluster. I have 2 Proxmox nodes, and this time I am using a Raspberry Pi as the qnetd server. I followed the guide.

For the purposes of this post, I am going to use these names:

1st proxmox node: NODE1
2nd proxmox node: NODE2
cluster name: CLUSTER

As soon as I try to run pvecm qdevice setup <QDEVICE-IP> I get nearly the same results on both nodes.

I get this when I execute it on NODE1:

Code:
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
(if you think this is a mistake, you may want to use -f option)


INFO: initializing qnetd server
Certificate database (/etc/corosync/qnetd/nssdb) already exists. Delete it to initialize new db

INFO: copying CA cert and initializing on all nodes
Host key verification failed.

node 'NODE2': Creating /etc/corosync/qdevice/net/nssdb
password file contains no data
node 'NODE2': Creating new key and cert db
node 'NODE2': Creating new noise file /etc/corosync/qdevice/net/nssdb/noise.txt
node 'NODE2': Importing CA
INFO: generating cert request
Certificate database doesn't exists. Use /sbin/corosync-qdevice-net-certutil -i to create it
command 'corosync-qdevice-net-certutil -r -n CLUSTER' failed: exit code 1

And I get this when I execute it on NODE2:

Code:
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
(if you think this is a mistake, you may want to use -f option)


INFO: initializing qnetd server
Certificate database (/etc/corosync/qnetd/nssdb) already exists. Delete it to initialize new db

INFO: copying CA cert and initializing on all nodes

node 'NODE1': Creating /etc/corosync/qdevice/net/nssdb
password file contains no data
node 'NODE1': Creating new key and cert db
node 'NODE1': Creating new noise file /etc/corosync/qdevice/net/nssdb/noise.txt
node 'NODE1': Importing CAHost key verification failed.

INFO: generating cert request
Certificate database doesn't exists. Use /sbin/corosync-qdevice-net-certutil -i to create it
command 'corosync-qdevice-net-certutil -r -n CLUSTER' failed: exit code 1

  • Host key verification failed. appears in both outputs. I have no idea what it refers to, because both hosts can SSH into each other and into the Raspberry Pi qdevice just fine. If I run "ssh IP_ADDRESS" as root on either Proxmox node, where IP_ADDRESS is the address of the other node or of the qdevice, I am logged in as root without being asked for verification.

  • The guide says "If you receive an error such as Host key verification failed. at this stage, running pvecm updatecerts could fix the issue." I ran pvecm updatecerts on both nodes and it had no effect.

  • I have turned up the firewall logs to debug level on both nodes and can see that the firewall allows the SSH connections from both hosts. The syslog on each node shows that it accepted the pubkey for root from the other node.

  • The error Certificate database doesn't exists. Use /sbin/corosync-qdevice-net-certutil -i to create it appears in both as well. If I try to execute the /sbin/corosync-qdevice-net-certutil -i command as root on both nodes I get the same error: "Can't open certificate file".

  • If I try to start the corosync-qdevice service on either Proxmox node, it fails to start with the error Can't read quorum.device.model cmap key.

  • The corosync-qnetd service on the qdevice is running without error.
I am happy to provide any additional information.
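
A successful interactive "ssh IP_ADDRESS" can hide settings that ssh_config applies silently. As a minimal sketch, assuming an OpenSSH client recent enough to support `ssh -G` (which prints the resolved client options for a host), the following filter pulls out the port the client would actually use. The helper name `effective_ssh_port` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ssh -G <host>` output on stdin and print
# the value of the resolved "port" option.
effective_ssh_port() {
    awk '$1 == "port" { print $2; exit }'
}

# Usage, as root on a Proxmox node (NODE2 is the placeholder name from this post):
#   ssh -G NODE2 | effective_ssh_port
```

A value other than 22 would mean the client configuration overrides the default port for that host, which a tool that ignores that configuration would not pick up.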


*EDIT* - I figured it out. I had all of my nodes and my qdevice correctly configured to use a non-standard SSH port (ssh_config files, sshd_config files, and firewalls), but some of the commands that pvecm qdevice setup <QDEVICE-IP> executes are hard-coded to use the default SSH port. No amount of port configuration can work around this. It all worked correctly once I did the following, in this order:

  • Switched all of the servers back to using port 22 for SSH.
  • Cleared out the nssdb path on the Proxmox nodes (rm -R /etc/corosync/qdevice/net/nssdb/).
  • Cleared out the nssdb path on the qdevice (rm -R /etc/corosync/qnetd/nssdb/).
  • Re-ran pvecm qdevice setup <QDEVICE-IP>.
  • Restarted corosync-qdevice.service on the Proxmox nodes.
  • Restarted corosync-qnetd.service on the qdevice.
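
The recovery steps in the edit above can be sketched as a set of small shell functions. This is only an illustration under a few assumptions: the function names and the `RUN` variable are hypothetical, `systemctl restart` is assumed as the way the services were restarted (the post only says they were restarted), and switching SSH back to port 22 is not scripted here. By default each function only prints the command it would run; set `RUN=1` to execute it.

```shell
#!/bin/sh
# Print the command by default; execute it only when RUN=1 (hypothetical switch).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# On each Proxmox node: remove the half-initialized qdevice certificate database.
reset_node_nssdb() {
    run rm -R /etc/corosync/qdevice/net/nssdb/
}

# On the qdevice host: remove the existing qnetd certificate database.
reset_qnetd_nssdb() {
    run rm -R /etc/corosync/qnetd/nssdb/
}

# On one Proxmox node: re-run the setup; $1 is the qdevice IP address.
rerun_qdevice_setup() {
    run pvecm qdevice setup "$1"
}

# On each Proxmox node, after setup succeeds.
restart_node_qdevice() {
    run systemctl restart corosync-qdevice.service
}

# On the qdevice host, after setup succeeds.
restart_qnetd() {
    run systemctl restart corosync-qnetd.service
}
```

The functions are meant to be called on the host named in each comment, in the order listed. Leaving `RUN` unset first gives a chance to review the `rm -R` paths before anything is deleted.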
Great that you could fix your problem - and thanks for including the solution and "Solved" tag! :)