New nodes joined to existing cluster: SSH key trust not auto-fixed after 8.2 updates, manual steps needed

BloodyIron

So I have two new PVE nodes that were recently provisioned with PVE v8.2 and then updated to the latest version as of today (v8.2.4?).

Anyway, I'm now hitting the silly SSH key management junk that's been going around with the PVE v8.1/v8.2 changes. The other cluster nodes are on older versions of PVE (v8.1-ish) as they need some updating love (tell me about it), but threads like this demonstrate to me that there's no proper solution, nor documentation, for scenarios like this: https://forum.proxmox.com/threads/host-key-verification-failed-when-trying-to-migrate-vm.146299/

running "pve updatecerts -f" (or without -f) on the "older" nodes does nothing to fix the situation.

So I'm probably going to have to go to each "old" PVE node and SSH to the new PVE nodes just to skirt this silly situation. This is the very kind of thing the cluster automation should handle, and this update shouldn't have shipped in a "RELEASE" state like this. Considering it's been a known issue for months now, it blows me away that there still isn't a nice fix in the non-subscription repos.

Now I'm going to do the hacky workaround for now because I have work to do; if there's something I should do instead to correct this, please let me know.
 
OMFG, and that method doesn't even actually solve the problem. This is just such a shit show... I just wanted to add two new PVE nodes to this cluster, and I'm burning way too much time on this SSH BS that should've been solved months ago >:|
 
Okay, I actually need to SSH FROM every node in the cluster TO every node in the cluster to generate the known_hosts trust. Guh, this is a cluster-truck.
 
For future human purposes:

I created a list of commands, one per node in the cluster, and executed it on every node via the CLI. THIS IS NOT THE IDEAL WAY TO DO THIS AND I KNOW IT IS BAD PRACTICE, BUT FOR NOW THIS IS GOOD ENOUGH:

ssh -o StrictHostKeyChecking=accept-new -t HostName1 'exit'
ssh -o StrictHostKeyChecking=accept-new -t HostName2 'exit'
ssh -o StrictHostKeyChecking=accept-new -t HostName3 'exit'

...and so on, one line per node in the cluster.


What this does is auto-accept the host key fingerprint of whatever it connects to, then issue the exit command to disconnect. Pasting all the lines at once means this happens in rapid succession. IF YOU DO NOT UNDERSTAND THE SECURITY RAMIFICATIONS OF WHY THIS MAY NOT BE A GOOD IDEA, PLEASE SLOW DOWN AND MAYBE DON'T USE THIS METHOD! Blindly accepting SSH server key fingerprints has security implications: if you're connecting to a compromised or untrusted system, you might blindly trust something you don't want to!
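If you'd rather not paste a long list of one-off commands, here's a minimal loop sketch of the same workaround, assuming a bash shell; HostName1/2/3 are placeholders you'd swap for your actual node names:

# Same workaround as a loop (bash). HostName1/2/3 are placeholders for your node names.
for node in HostName1 HostName2 HostName3; do
    # accept-new records the host key on first contact; 'exit' disconnects immediately
    ssh -o StrictHostKeyChecking=accept-new "$node" 'exit'
done

Same caveat as above applies: this blindly accepts whatever host key each node presents.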

I'm posting this to help other humans and hopefully light a fire under the devs to correct this silliness, because PVE is awesome and I'd love for this to be fixed.
 
Ugh, this didn't even properly fix it anyway... I logged into the web GUI of a random node, tried to open a web CLI shell to one of the new nodes, and it still asked me for the fingerprint... this is so stupid.
 
Yeah, doing the same method by IP for all nodes, on all nodes, did the trick. Hoping there's a proper, cluster-automated solution soon. Hopefully this workaround helps someone in the meantime.
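For reference, here's the same loop sketch extended to cover both hostnames and IPs in one pass (again bash; all names and addresses below are placeholders), run on every node in the cluster:

# Combined hostname + IP pass (bash). All names and addresses below are placeholders.
for target in HostName1 HostName2 HostName3 192.0.2.11 192.0.2.12 192.0.2.13; do
    ssh -o StrictHostKeyChecking=accept-new "$target" 'exit'
done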
 