[SOLVED] Error when adding QDevice to existing cluster

framf

New Member
May 8, 2024
Hi everyone,

I'm currently facing a problem when adding a QDevice to a 2-node cluster.

Current cluster: 2x Proxmox VE 8.2.2
QDevice: 1x Proxmox Backup Server 3.2-2

- I can SSH from both nodes into the PBS as root
- corosync-qdevice is installed on both nodes
- corosync-qnetd and corosync-qdevice are installed on the PBS

When I try to add the QDevice to my cluster, I get the following error:

Code:
root@proxmox-n1:~# pvecm qdevice setup 10.0.0.3
user config - ignore invalid acl token 'user1@pve!migrate'
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
                (if you think this is a mistake, you may want to use -f option)

INFO: initializing qnetd server
Certificate database (/etc/corosync/qnetd/nssdb) already exists. Delete it to initialize new db


INFO: copying CA cert and initializing on all nodes
Host key verification failed.
Host key verification failed.


INFO: generating cert request
Certificate database doesn't exists. Use /sbin/corosync-qdevice-net-certutil -i to create it
command 'corosync-qdevice-net-certutil -r -n proxmox1' failed: exit code 1
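
The two "Host key verification failed." lines appear in the step that copies the CA cert to all nodes, so one thing worth ruling out is whether non-interactive root SSH works to every host involved, not only from the nodes to the PBS. A minimal sketch of such a check follows. The helper name `check_ssh_hosts` is hypothetical, and any node addresses passed to it other than 10.0.0.3 are placeholders that do not come from the output above.

```shell
# Sketch: test non-interactive root SSH to every host passed as an argument.
# BatchMode=yes makes ssh fail instead of prompting, so a host-key or
# authentication problem shows up as a non-zero exit status.
# check_ssh_hosts is a hypothetical helper name, not part of Proxmox VE.
check_ssh_hosts() {
    rc=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
            echo "OK   ${host}"
        else
            echo "FAIL ${host}"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (the first two addresses are assumed placeholders):
# check_ssh_hosts 10.0.0.1 10.0.0.2 10.0.0.3
```

Running this on each node in turn, including against the node's own address and the other cluster node, would show which connection trips the host key check.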

I also discovered that corosync-qdevice.service isn't running properly on either node.

Code:
root@proxmox-n1:~# systemctl status corosync-qdevice.service
× corosync-qdevice.service - Corosync Qdevice daemon
     Loaded: loaded (/lib/systemd/system/corosync-qdevice.service; disabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-05-27 11:47:02 CEST; 1h 10min ago
       Docs: man:corosync-qdevice
   Main PID: 74240 (code=exited, status=1/FAILURE)
        CPU: 8ms

May 27 11:47:02 proxmox-n1 systemd[1]: corosync-qdevice.service: Scheduled restart job, restart counter is at 5.
May 27 11:47:02 proxmox-n1 systemd[1]: Stopped corosync-qdevice.service - Corosync Qdevice daemon.
May 27 11:47:02 proxmox-n1 systemd[1]: corosync-qdevice.service: Start request repeated too quickly.
May 27 11:47:02 proxmox-n1 systemd[1]: corosync-qdevice.service: Failed with result 'exit-code'.
May 27 11:47:02 proxmox-n1 systemd[1]: Failed to start corosync-qdevice.service - Corosync Qdevice daemon.

When I try to run corosync-qdevice manually, I get this error:

Code:
root@proxmox-n1:~# /usr/sbin/corosync-qdevice -f -d
May 27 12:57:19 debug   Initializing votequorum
May 27 12:57:19 debug   Initializing local socket
May 27 12:57:19 debug   Registering qdevice models
May 27 12:57:19 debug   Configuring qdevice
May 27 12:57:19 error   Can't read quorum.device.model cmap key.
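
The "Can't read quorum.device.model cmap key" message would be consistent with the cluster configuration not containing a quorum device section yet, in which case the failing service may be a symptom of the aborted setup rather than its cause. That is an assumption, not something the output above confirms. A rough way to check is to look for a `model:` entry in the corosync configuration. The function name `has_qdevice_model` is hypothetical, and the default path is assumed to be the usual Proxmox VE location.

```shell
# Sketch: succeed if the given corosync.conf defines a quorum device model.
# has_qdevice_model is a hypothetical helper name.
# The default path is assumed to be the cluster-wide config on Proxmox VE.
has_qdevice_model() {
    conf="${1:-/etc/pve/corosync.conf}"
    # Matches lines such as "    model: net" inside the quorum device block
    grep -Eq '^[[:space:]]*model:[[:space:]]*[[:alnum:]]+' "$conf"
}

# Example call on a node:
# has_qdevice_model && echo "device section present" || echo "no device section"
```

If the function reports no device section, the daemon has nothing to start with, and the earlier certificate and SSH errors are the more likely place to look.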

I have already tried purging and reinstalling the packages on both nodes and on the PBS. I also rebooted every machine just to make sure.
On my PBS, corosync-qnetd is running and corosync-qdevice.service is inactive.

Maybe some of you have already encountered this problem and can help me fix this issue.

Greetings!
 
