[SOLVED] pvecm add -> 401 authentication failure

francoisd

I had to remove pve1 from the cluster (motherboard failure), and I now want to re-add it. It has been completely removed from the Proxmox cluster, and the contents of /etc/pve/ on pve1 have been removed.

I can't re-join pve1 to the cluster.
Depending on the hostname I use, I get one of these two errors:

Code:
root@pve1:~# pvecm add pve2.home.hsh.org
Please enter superuser (root) password for 'pve2.home.hsh.org': *************
Establishing API connection with host 'pve2.home.hsh.org'
500 Can't connect to pve2.home.hsh.org:8006 (hostname verification failed)

Code:
root@pve1:~# pvecm add pve2.hsh.org
Please enter superuser (root) password for 'pve2.hsh.org': *************
Establishing API connection with host 'pve2.hsh.org'
401 authentication failure

* Both pve2.hsh.org and pve2.home.hsh.org point to the same IP, but the Let's Encrypt certificate is only valid for the pve2.hsh.org hostname (which is why the second attempt gets past hostname verification and on to the authentication step).
* Of course, the password I provide is the correct Linux PAM root password, and 2FA has been disabled for that user.
* pve1.hsh.org is still running the self-signed certificate.

The commands to reset pve1 were:
Code:
systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service
systemctl stop corosync
systemctl stop pve-cluster

mkdir /root/$(date +%Y%m%d)
mv -f /var/lib/pve-cluster/config.db /root/$(date +%Y%m%d)
mv /etc/pve/* /root/$(date +%Y%m%d)
rm -rf /etc/corosync/*
rm /var/lib/corosync/*

Since pve1 is also part of the ceph storage, reinstalling pve1 is my last option, and I'm not sure it would solve the issue.

What should I do to re-join pve1 to the cluster?
 
Replying to myself.

The problem was that 2FA was still enabled. It was shown as disabled in the cluster view, but it was still set in `/etc/pve/domains.cfg`.

I just removed it from the `pam: pam` authentication entry, directly in the file, and I was then able to join the node.
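
For anyone hitting the same 401, here is a rough sketch of how to check for a leftover realm-level 2FA setting before editing the file by hand. The helper name `list_realm_tfa` is made up for illustration, and it assumes the setting appears as an indented `tfa ...` line under the realm header (e.g. `pam: pam`); the exact layout may differ between Proxmox versions, so check your own file.

```shell
# Hypothetical helper (not part of Proxmox): list realm-level "tfa" lines
# in a domains.cfg-style file. Assumes each realm starts with an unindented
# header such as "pam: pam" and its options follow as indented lines.
list_realm_tfa() {
    awk '
        /^[a-z]+:[ \t]/ { realm = $0; next }   # realm header, e.g. "pam: pam"
        /^[ \t]+tfa[ \t]/ {                    # indented tfa option (assumed layout)
            sub(/^[ \t]+/, "")
            print realm " -> " $0
        }
    ' "${1:-/etc/pve/domains.cfg}"
}
```

Run `list_realm_tfa` on a node that is still in the cluster (it reads `/etc/pve/domains.cfg` by default, or pass another path). If it prints a line for the `pam: pam` realm, that is probably the setting that needs removing, even when the GUI shows 2FA as disabled. Back up the file before changing it.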