Adding node failure - maybe due to SSL cert?

ewuewu

I have a problem adding a new node to our running cluster (version 3.4-11/6502936f).
I assume it is related to our GeoTrust wildcard SSL cert.

Because of some problems in the past, we decided to equip our cluster with the official GeoTrust SSL cert. We did this by following this howto: https://pve.proxmox.com/wiki/HTTPSCertificateConfiguration

Everything went fine until today, when we tried to add the new node.

Code:
pvecm add 172.17.0.38
The authenticity of host '172.17.0.38 (172.17.0.38)' can't be established.
ECDSA key fingerprint is f9:1a:08:d3:fb:a0:f1:84:c9:75:35:78:03:78:11:44.
Are you sure you want to continue connecting (yes/no)? yes
root@172.17.0.38's password:
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-cluster.
Starting cluster:
  Checking if cluster has been disabled at boot... [  OK  ]
  Checking Network Manager... [  OK  ]
  Global setup... [  OK  ]
  Loading kernel modules... [  OK  ]
  Mounting configfs... [  OK  ]
  Starting cman... [  OK  ]
  Waiting for quorum... [  OK  ]
  Starting fenced... [  OK  ]
  Starting dlm_controld... [  OK  ]
  Tuning DLM kernel config... [  OK  ]
  Unfencing self... [  OK  ]
waiting for quorum...OK
generating node certificates
Signature ok
subject=/OU=PVE Cluster Node/O=Proxmox Virtual Environment/CN=lx-vmhost-hh3.datamart.de
Getting CA Private Key
CA certificate and CA private key do not match
140356343514792:error:0B080074:x509 certificate routines:X509_check_private_key:key values mismatch:x509_cmp.c:330:
unable to generate pve ssl certificate:
command 'openssl x509 -req -in /tmp/pvecertreq-5041.tmp -days 3650 -out /etc/pve/nodes/lx-vmhost-hh3/pve-ssl.pem -CAkey /etc/pve/priv/pve-root-ca.key -CA /etc/pve/pve-root-ca.pem -CAserial /etc/pve/priv/pve-root-ca.srl -extfile /tmp/pvesslconf-5041.tmp' failed: exit code 1
root@lx-vmhost-hh3:~#

Afterwards the new node is visible in the web interface. But passwordless SSH connections from this node to the others, and vice versa, do not work. Migrating VMs is not possible either. The error is ‘problem with migration tunnel’.

How can we solve this problem?

Any help is appreciated.
 
The problem is that pvecm add did not complete because your SSL configuration is broken. pvecm add tries to generate a certificate for the new node and sign it with the cluster CA, which only works if the cluster CA key and certificate are both available and match. You probably replaced the cluster CA certificate (with the intermediate or root certificate of your CA), so the node certificate cannot be created, and pvecm add dies with this error message before it creates the SSH keys and configuration. That is also why passwordless SSH and migration do not work.

You will need to
  1. remove this failed node
  2. temporarily restore the original cluster CA certificate and key
  3. add the failed node again using pvecm add, which should work now
  4. restore your changed SSL files again (on the cluster / CA level and for your newly added node)
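
Before re-running pvecm add (step 3), it may help to confirm that the CA certificate and key really match again. The following is only a sketch, not part of pvecm: ca_matches is a hypothetical helper name, and it assumes the openssl command line tool is installed. It compares the public key embedded in the certificate with the public key derived from the private key.

```shell
#!/bin/sh
# ca_matches CERT KEY  (hypothetical helper, not a Proxmox tool)
# Returns 0 if the certificate's public key equals the public key
# derived from the private key, 1 if they differ, 2 if a file
# could not be read or parsed by openssl.
ca_matches() {
    # Public key stored inside the certificate (PEM, SubjectPublicKeyInfo)
    cm_cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
    # Public key computed from the private key (same PEM format)
    cm_key_pub=$(openssl pkey -in "$2" -pubout) || return 2
    [ "$cm_cert_pub" = "$cm_key_pub" ]
}
```

With the paths from the failing openssl command above, a check could look like this: ca_matches /etc/pve/pve-root-ca.pem /etc/pve/priv/pve-root-ca.key && echo "CA cert and key match". If it reports a mismatch, the certificate in /etc/pve/pve-root-ca.pem is most likely not the original cluster CA certificate.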
 
We've also had this issue lately, after our SSL stopped working on Proxmox for no apparent reason.

Has something changed with updates in the last 30 days or so that affects the way SSL on PM works?

Perhaps someone who is more knowledgeable on this issue could review the wiki documentation to ensure it's still all correct.

Long-term, Proxmox could really use an area in the web UI to manage certificates - this would make things MUCH easier!

Jon
 
