[SOLVED] QDevice setup issue/bug

mmhx

Member
Nov 26, 2021
I set up a QDevice inside Docker on a third (non-Proxmox) system.
The cluster (2 nodes) was only created today; the nodes themselves have existed a bit longer.
I installed corosync-qdevice on both nodes and ran the `pvecm qdevice setup` command. It ran for a while and then failed:

Code:
INFO: import certificate
Importing signed cluster certificate
Notice: Trust flag u is set automatically if the private key is present.
pk12util: PKCS12 EXPORT SUCCESSFUL
Certificate stored in /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12


INFO: copy and import pk12 cert to all nodes
Host key verification failed.
command 'ssh -o 'BatchMode=yes' -lroot 192.168.2.112 corosync-qdevice-net-certutil -m -c /etc/pve/qdevice-net-node.p12' failed: exit code 255

.112 is the other node (not the one I ran the pvecm qdevice command from). I also tested it from the other node, with the same result: host key verification fails against the respective other node. I've checked, and each node's key is present in the other node's authorized_keys file.

I've also tried running the failing command manually, with the same result (obviously).

My guess is that it should be -l root instead of -lroot.
I could run this command manually, but I have no idea where to adjust it so that the following steps would run.
(Actually, I still get "Host key verification failed" with that change, so no clue.)
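
A note on the error itself: "Host key verification failed" is reported by the ssh client when the remote host's key is missing from (or does not match) the known_hosts files. It does not involve authorized_keys. With BatchMode=yes, ssh cannot prompt to confirm an unknown host key, so it fails instead of asking. As far as ssh's option parsing goes, -lroot and -l root should be equivalent. The following is a minimal sketch for checking whether a node already has a known_hosts entry. The function name host_key_status is made up for illustration, and the file to check is an assumption: on Proxmox VE the cluster-wide file is usually reachable as /etc/ssh/ssh_known_hosts, but verify that on your own nodes.

```shell
# Sketch: report whether HOST has an entry in a known_hosts file.
# Usage: host_key_status HOST [KNOWN_HOSTS_FILE]
# (host_key_status is a hypothetical helper, not part of pvecm or OpenSSH.)
host_key_status() {
    hks_host=$1
    hks_file=${2:-$HOME/.ssh/known_hosts}
    # ssh-keygen -F searches the file for the host (hashed entries included)
    # and exits non-zero when no entry is found or the file is unreadable.
    if ssh-keygen -F "$hks_host" -f "$hks_file" >/dev/null 2>&1; then
        echo "known"
    else
        echo "unknown"
    fi
}

# Example calls (commented out so that sourcing this file has no side effects):
# host_key_status 192.168.2.112
# host_key_status 192.168.2.112 /etc/ssh/ssh_known_hosts
```

If this prints "unknown" for the address used in the failing command, the batch-mode ssh call has no way to accept the key on its own.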
 
Try running on one of your nodes:
Code:
pvecm updatecerts --force
systemctl restart pveproxy
 
That ran through. When I then tried pvecm qdevice setup <IP>, I had to use the force parameter.
But I still end up with the same error (and the -lroot). Entering the command manually with -l root also still fails.

I tried one more time with your commands and then just the ssh command... --> still "Host key verification failed".

If I remove the BatchMode option, the ssh command runs through... but the qdevice is not added yet at that point.
Running qdevice setup again afterwards goes through!

Code:
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1   A,NV,NMW 192.168.2.111
0x00000002          1   A,NV,NMW 192.168.2.112 (local)
0x00000000          0            Qdevice (votes 1)
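
The last row of this table is the one showing that the QDevice contributes a vote. Below is a minimal sketch for pulling that number out of the status output, assuming the table comes from pvecm status and the row keeps the "Qdevice (votes N)" form shown above. The function name qdevice_votes is hypothetical.

```shell
# Sketch: read status output on stdin and print the QDevice vote count.
# Prints nothing if no "Qdevice (votes N)" row is present.
# (qdevice_votes is a made-up helper name, not a Proxmox command.)
qdevice_votes() {
    sed -n 's/.*Qdevice (votes \([0-9][0-9]*\)).*/\1/p'
}

# Example call (commented out so that sourcing this file has no side effects):
# pvecm status | qdevice_votes
```

An empty result would suggest that the QDevice row is missing, that is, that the setup did not complete.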

BUT!
I believe the -lroot is a typo in some script and needs fixing.
Something about the handling/usage of the authorized keys also seems broken... or did I miss a step?
(It's not as if I have password or public-key access disabled anywhere (yet).)

I guess I should open an issue on the bug tracker for that?
 