Joining a node to a cluster fails with authkey.key missing

proxuser10

New Member
May 14, 2025
I am setting up a cluster. I have created it on my first node:

code_language.shell:
# cat /etc/pve/.members
{
"nodename": "proxmox1",
"version": 3,
"cluster": { "name": "proxcluster", "version": 1, "nodes": 1, "quorate": 1 },
"nodelist": {
  "proxmox1": { "id": 1, "online": 1, "ip": "172.16.1.2"}
  }
}

I have checked that my second node has an auth key in the priv folder:
code_language.shell:
# ls /etc/pve/priv/
acme  authkey.key  authorized_keys  lock  pve-root-ca.key  pve-root-ca.srl

When I use assisted join on this second node, it gets stuck on "Request addition of this node":


code_language.shell:
Establishing API connection with host '172.16.1.2'
Login succeeded.
check cluster join API version
No cluster network links passed explicitly, fallback to local node IP '172.16.1.3'
Request addition of this node

On node 1, proxmox2 was added but shows up as offline
code_language.shell:
# cat /etc/pve/.members
{
"nodename": "proxmox1",
"version": 4,
"cluster": { "name": "proxcluster", "version": 2, "nodes": 2, "quorate": 0 },
"nodelist": {
  "proxmox1": { "id": 1, "online": 1, "ip": "172.16.1.2"},
  "proxmox2": { "id": 2, "online": 0}
  }
}
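The two fields that matter in the output above are "quorate" and each node's "online" flag. A minimal sketch for pulling them out, assuming .members keeps the one-node-per-line layout shown here; the function names are hypothetical, and a real JSON parser would be more robust than this line-oriented sed:

```shell
# members_quorate FILE: print the "quorate" value from a .members-style file
members_quorate() {
  sed -n 's/.*"quorate": *\([0-9]*\).*/\1/p' "$1"
}

# members_offline FILE: print the names of nodes whose "online" flag is 0
# (assumes each node entry sits on its own line, as in the output above)
members_offline() {
  sed -n 's/^ *"\([^"]*\)": *{ *"id": *[0-9]*, *"online": *0.*/\1/p' "$1"
}

# Usage on a cluster node:
#   members_quorate /etc/pve/.members
#   members_offline /etc/pve/.members
```

Run against the second listing from node 1, this would report that the cluster is not quorate and that proxmox2 is the offline member.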

On node 2, the priv folder was deleted by the join process but was not recreated:
code_language.shell:
# ls /etc/pve/
corosync.conf  local  lxc  openvz  qemu-server

On node 2, I see an authentication error from pvedaemon for my existing UI tab session:
May 30 06:49:01 proxmox2 pvedaemon[1221]: authentication failure; rhost=::ffff:my.public.ip user=root@pam msg=Authenti>

code_language.shell:
# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.16.1.2
  }
  node {
    name: proxmox2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.16.1.3
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: proxcluster
  config_version: 2
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

code_language.shell:
# journalctl -u pve-cluster.service --since "30min ago"
May 30 06:48:50 proxmox2 systemd[1]: Stopping pve-cluster.service - The Proxmox VE cluster filesystem...
May 30 06:48:50 proxmox2 pmxcfs[897]: [main] notice: teardown filesystem
May 30 06:48:50 proxmox2 pmxcfs[897]: [main] notice: exit proxmox configuration filesystem (0)
May 30 06:48:50 proxmox2 systemd[1]: pve-cluster.service: Deactivated successfully.
May 30 06:48:50 proxmox2 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
May 30 06:48:50 proxmox2 systemd[1]: pve-cluster.service: Consumed 1.082s CPU time.
May 30 06:48:51 proxmox2 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
May 30 06:48:51 proxmox2 pmxcfs[7400]: [main] notice: resolved node name 'proxmox2' to '172.16.1.3' for default node IP>
May 30 06:48:51 proxmox2 pmxcfs[7400]: [main] notice: resolved node name 'proxmox2' to '172.16.1.3' for default node IP>
May 30 06:48:51 proxmox2 pmxcfs[7413]: [quorum] crit: quorum_initialize failed: 2
May 30 06:48:51 proxmox2 pmxcfs[7413]: [quorum] crit: can't initialize service
May 30 06:48:51 proxmox2 pmxcfs[7413]: [confdb] crit: cmap_initialize failed: 2
May 30 06:48:51 proxmox2 pmxcfs[7413]: [confdb] crit: can't initialize service
May 30 06:48:51 proxmox2 pmxcfs[7413]: [dcdb] crit: cpg_initialize failed: 2
May 30 06:48:51 proxmox2 pmxcfs[7413]: [dcdb] crit: can't initialize service
May 30 06:48:51 proxmox2 pmxcfs[7413]: [status] crit: cpg_initialize failed: 2
May 30 06:48:51 proxmox2 pmxcfs[7413]: [status] crit: can't initialize service
May 30 06:48:52 proxmox2 systemd[1]: Started pve-cluster.service - The Proxmox VE cluster filesystem.
May 30 06:48:57 proxmox2 pmxcfs[7413]: [status] notice: update cluster info (cluster name  proxcluster, version = 2)
May 30 06:48:57 proxmox2 pmxcfs[7413]: [dcdb] notice: members: 2/7413
May 30 06:48:57 proxmox2 pmxcfs[7413]: [dcdb] notice: all data is up to date
May 30 06:48:57 proxmox2 pmxcfs[7413]: [status] notice: members: 2/7413
May 30 06:48:57 proxmox2 pmxcfs[7413]: [status] notice: all data is up to date
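To see how many of the pmxcfs "crit" messages appear in a longer journal, the output can be filtered. This is only a sketch: the helper name is hypothetical and the pattern assumes the log format shown above.

```shell
# pmxcfs_crit_count: read journal text on stdin and print the number of
# pmxcfs lines logged at "crit" level (pattern based on the log format above)
pmxcfs_crit_count() {
  grep -c -E 'pmxcfs\[[0-9]+\]: \[[a-z]+\] crit:'
}

# Usage on the node:
#   journalctl -u pve-cluster.service --since "30min ago" | pmxcfs_crit_count
```

Comparing the timestamps of the crit lines with the later "members" and "all data is up to date" notices shows whether the errors stop once the service has settled.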

Any suggestions on how to fix this issue? Thank you for your time.
 
Hi!

Could you post the PVE versions of the 2 nodes? Also, please post the content of /etc/pve/corosync.conf of the other node as well.
 
Oh well, the versions are slightly off. Does node 1 need an update?

On node 1
code_language.shell:
# pveversion
pve-manager/8.3.3/f157a38b211595d6 (running kernel: 6.8.12-8-pve)

On node 2
code_language.shell:
# pveversion
pve-manager/8.4.1/2a5fa54a8503f96d (running kernel: 6.8.12-11-pve)
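Comparing the two pveversion lines by eye works for two nodes; for more, something like the following could help. It is a sketch with hypothetical helper names, and it assumes the "pve-manager/VERSION/HASH" layout seen in the output above.

```shell
# pve_manager_version LINE: extract the version field from a pveversion line
pve_manager_version() {
  printf '%s\n' "$1" | sed -n 's|^pve-manager/\([^/]*\)/.*|\1|p'
}

# same_pve_version LINE_A LINE_B: succeed only if both lines carry the same
# (non-empty) pve-manager version
same_pve_version() {
  pv_a=$(pve_manager_version "$1")
  pv_b=$(pve_manager_version "$2")
  [ -n "$pv_a" ] && [ "$pv_a" = "$pv_b" ]
}

# Usage: paste the pveversion output of each node as the two arguments
#   same_pve_version "$(pveversion)" "<line copied from the other node>" \
#     && echo "versions match" || echo "versions differ"
```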

On node 1
code_language.shell:
# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.16.1.2
  }
  node {
    name: proxmox2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.16.1.3
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: proxcluster
  config_version: 2
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
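The corosync.conf files from both nodes look identical here. One field worth comparing mechanically is config_version in the totem section. A minimal sketch, with hypothetical function names, assuming the "config_version: N" line format shown above:

```shell
# corosync_config_version FILE: print the config_version value from a
# corosync.conf-style file
corosync_config_version() {
  sed -n 's/^[[:space:]]*config_version:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}

# corosync_versions_match FILE_A FILE_B: succeed only if both files carry the
# same (non-empty) config_version
corosync_versions_match() {
  cv_a=$(corosync_config_version "$1")
  cv_b=$(corosync_config_version "$2")
  [ -n "$cv_a" ] && [ "$cv_a" = "$cv_b" ]
}

# Usage on one node, comparing the cluster-wide copy with the local copy:
#   corosync_versions_match /etc/pve/corosync.conf /etc/corosync/corosync.conf
```

For a full comparison between nodes, running diff on copies of the two files covers every field, not just the version.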
 
I have updated both nodes. The node addition is still stuck at the same spot. Since I am not able to sign in afterwards, I have to clear the cluster info, update the certs, and restart the cluster service on node2 after each attempt. Once I do that, I can see a more detailed error in the UI from the join operation.
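The exact commands used for that reset are not listed in this post. As an assumption, the sequence below follows the "separate a node without reinstalling" steps from the Proxmox VE cluster documentation, followed by pvecm updatecerts. The run and reset_cluster_state helpers are hypothetical, and the sketch only prints the commands unless DRY_RUN=0 is set, because these steps wipe the node's cluster configuration.

```shell
# run: print the command when DRY_RUN is 1 (the default), execute it otherwise
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# reset_cluster_state: assumed reset sequence for a node whose join failed.
# Destructive when DRY_RUN=0: removes the local corosync configuration.
reset_cluster_state() {
  run systemctl stop pve-cluster corosync   # stop the cluster services
  run pmxcfs -l                             # start pmxcfs in local mode
  run rm -f /etc/pve/corosync.conf          # drop the cluster-wide config
  run sh -c 'rm -f /etc/corosync/*'         # drop the local corosync files
  run killall pmxcfs                        # stop the local-mode pmxcfs
  run systemctl start pve-cluster           # start the service normally again
  run pvecm updatecerts                     # regenerate node certificates
}

# Usage: review the printed plan first, then run for real only if it fits
#   reset_cluster_state
#   DRY_RUN=0 reset_cluster_state
```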

On node1:
code_language.shell:
# pveversion
pve-manager/8.4.1/2a5fa54a8503f96d (running kernel: 6.8.12-11-pve)

On node2:
code_language.shell:
# pveversion
pve-manager/8.4.1/2a5fa54a8503f96d (running kernel: 6.8.12-11-pve)

code_language.shell:
Establishing API connection with host '172.16.1.2'
Login succeeded.
check cluster join API version
No cluster network links passed explicitly, fallback to local node IP '172.16.1.3'
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1748669426.sql.gz'
waiting for quorum...OK
(re)generate node files
generate new node certificate
genrsa: Can't open "/etc/pve/priv/authkey.key" for writing, Software caused connection abort
TASK ERROR: command 'openssl genrsa -out /etc/pve/priv/authkey.key 2048' failed: exit code 1
 
For what it is worth, the /etc/pve folder does not have write permission for the www-data group. Could that be the reason for the error?

code_language.shell:
# ls -al /etc/pve/
total 1
drwxr-xr-x 2 root www-data    0 Jan  1  1970 ./