tls_process_certificate: certificate verify failed

Feb 8, 2023
13
2
8
I have a three node cluster with all three nodes using Lets Encrypt certificates. These nodes are fresh install of PVE 7.3 and have all of the latest updates. Both salt-cloud and terraform fail to create containers with a "tls_process_certificate: certificate verify failed" error. However, they successfully call a few APIs but fail when calling the API to actually create the container. From the terraform log, it appears that PVE receives the request, calls another API internally, and that's where the verification is failing, NOT between the client and the server.

Code:
2023-02-12T19:40:22.579-0800 [INFO]  provider.terraform-provider-proxmox_v2.9.11: 2023/02/12 19:40:22 >>>>>>>>>> REQUEST:
GET /api2/json/cluster/nextid HTTP/1.1
Host: [REDACTED]:8006
User-Agent: Go-http-client/1.1
Accept: application/json
Authorization: [REDACTED]
Accept-Encoding: gzip
: timestamp=2023-02-12T19:40:22.578-0800
2023-02-12T19:40:22.595-0800 [INFO]  provider.terraform-provider-proxmox_v2.9.11: 2023/02/12 19:40:22 <<<<<<<<<< RESULT:
HTTP/1.1 200 OK
Content-Length: 14
Cache-Control: max-age=0
Connection: Keep-Alive
Content-Type: application/json;charset=UTF-8
Date: Mon, 13 Feb 2023 03:40:22 GMT
Expires: Mon, 13 Feb 2023 03:40:22 GMT
Pragma: no-cache
Server: pve-api-daemon/3.0

{"data":"100"}: timestamp=2023-02-12T19:40:22.595-0800
2023-02-12T19:40:22.595-0800 [INFO]  provider.terraform-provider-proxmox_v2.9.11: 2023/02/12 19:40:22 >>>>>>>>>> REQUEST:
POST /api2/json/nodes/[REDACTED]/lxc HTTP/1.1
Host: [REDACTED]:8006
User-Agent: Go-http-client/1.1
Content-Length: 434
Accept: application/json
Authorization: [REDACTED]
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip

arch=amd64&cmode=tty&console=1&cores=2&cpulimit=0&cpuunits=100&hostname=salt&memory=2048&net0=bridge%3Dvmbr0%2Cname%3Deth0%2Cip6%3Dauto%2Cip%3Ddhcp&onboot=1&ostemplate=nas%3Adebian-11-standard_11.6-1_amd64.tar.zst&protection=0&rootfs=nvme%3A8&searchdomain=[REDACTED]&ssh-public-keys=[REDACTED]&start=1&storage=local&swap=2048&tty=2&unique=1&unprivileged=1&vmid=100: timestamp=2023-02-12T19:40:22.595-0800
2023-02-12T19:40:22.606-0800 [INFO]  provider.terraform-provider-proxmox_v2.9.11: 2023/02/12 19:40:22 <<<<<<<<<< RESULT:
HTTP/1.1 596 tls_process_server_certificate: certificate verify failed
Connection: close
Cache-Control: max-age=0
Date: Mon, 13 Feb 2023 03:40:22 GMT
Expires: Mon, 13 Feb 2023 03:40:22 GMT
Pragma: no-cache
Server: pve-api-daemon/3.0
: timestamp=2023-02-12T19:40:22.606-0800
 
can you check that all three nodes are actually using the (expected) certificates? could you try issuing systemctl reload-or-restart pveproxy on all nodes and see if that solves the problem?
 
  • Like
Reactions: Stoiko Ivanov
Thanks Fabian.

1. Running
Code:
openssl s_client -connect [REDACTED]:8006 -showcerts
against all three nodes I get back the Let's Encrypt certificate for each.
1. I ran
Code:
systemctl restart pveproxy
on all three nodes (I've restarted the nodes multiple times, and rebuilt them from scratch once, and this problem has persisted) and terraform still fails.

I haven't done much in the way of configuration on these nodes, they're pretty close to an out-of-the-box install.
 
is the [REDACTED] part of the endpoint correct? is the request a proxied one (i.e., the node you are connected to is not the one that is handling the request - this is decided based on the {node} parameter from the request URL)? do all nodes correctly resolve eachother's hostname?
 
Just worked out what the issue was! In the API call I was making the target node to deploy the container on was a fully qualified domain name, rather than just the node name. When I changed that from an FQDN to just the node name, it worked!
 
  • Like
Reactions: fabian