ssh keys across nodes

jhmc93

Member
Feb 22, 2022
Hello,
Is it possible to generate your own SSH key and use it across your Proxmox cluster? Would it work, or would the cluster break due to the SSH key change?
I'm looking at generating my own SSH key in a Linux environment and having just one key for all of my SSH devices, so I was wondering if it would work on Proxmox,
as each time I update I can't access my VMs on my other cluster because an SSH key error comes up.
Regards,
J
 
Is it possible to generate your own SSH key and use it across your Proxmox cluster? Would it work, or would the cluster break due to the SSH key change?
Clustering itself is unrelated to the SSH key, so it would not break in any way due to SSH key changes, but the commands that currently use SSH as a data channel (live migration and replication are the most prominent ones) could indeed need some adaptation.

I'm looking at generating my own SSH key in a Linux environment and having just one key for all of my SSH devices, so I was wondering if it would work on Proxmox,
What do you mean here? That you would create a single key and then use the same private key for all nodes?
as each time I update I can't access my VMs on my other cluster because an SSH key error comes up.
What happens when you update, and what breaks exactly?
Because accessing the VMs themselves should not be influenced by the Proxmox VE cluster and its SSH configs or whatever.
 
What do you mean here, that you would create a single key and then use the same private key for all nodes?
Yes, that's exactly what I mean. I was also gonna use the same key for my Linux VMs.

What happens when you update and what breaks exactly?
Because accessing the VMs themselves should not be influenced by the Proxmox VE cluster and its SSH configs or whatever.
The problem is in the below screenshot:
1700077444221.png
 
This is presumably (you may wish to confirm) you attempting to access SSH on one node via GUI access to another.

Your SSH configs get corrupted by running pvecm updatecerts, a behaviour that is undocumented (in the general docs) but has since been filed as a bug:
https://bugzilla.proxmox.com/show_bug.cgi?id=4886

If you also ran ssh-keygen -f as advised, that is another bug which makes it even worse (for the node on which you run it):
https://bugzilla.proxmox.com/show_bug.cgi?id=4252

From the reactions of staff, you may notice the plan is to move everything possible over to SSL (although NoVNC access across nodes will be interesting). It affects mostly migrations and replications and anything SSH-related, which as of now is also undocumented.

If you don't have any SSH keys of your own that you need on the nodes, you may as well follow the advice from:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-2#post-604699

If you do have your own keys, you'd best back them up and append them to /etc/pve/priv/known_hosts (from any single node, it's a shared location). Even if you end up with duplicates (old and new alike), that is actually not a problem: SSH accepts any matching key found in any file it goes through. This often results in going off on tangents looking for duplicates, which are not the problem. The problem is pvecm updatecerts corrupting (removing) your fresh keys but retaining the old ones. If it had correctly removed all the keys for the node, SSH would simply ask you whether you want to accept the new one.
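To see which keys are actually on file for a node before and after an update, the entries can be grouped by host name. The following is only a minimal Python sketch, assuming plain (unhashed) host names; the function name keys_by_host, the key blobs and the second node name are made up for illustration:

```python
from collections import defaultdict


def keys_by_host(known_hosts_text):
    """Group (key type, key blob) pairs by host name.

    Assumes unhashed entries of the form: host1,host2 keytype base64key [comment].
    Blank lines, comment lines and marker lines (starting with '@') are skipped.
    """
    entries = defaultdict(list)
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("@"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed line, ignore it
        hosts, key_type, key_blob = fields[0], fields[1], fields[2]
        for host in hosts.split(","):
            entries[host].append((key_type, key_blob))
    return dict(entries)


# Hypothetical sample: one node with an old and a new RSA key on file.
sample = """\
nuc-i3 ssh-rsa AAAAB3OLDKEY
nuc-i3 ssh-rsa AAAAB3NEWKEY
nuc-i5 ssh-ed25519 AAAAC3OTHER
"""

for host, keys in keys_by_host(sample).items():
    # Print the host, how many keys are on file, and which key types they use.
    print(host, len(keys), sorted({key_type for key_type, _ in keys}))
```

On a node, the text could be read from /etc/pve/priv/known_hosts. Two entries for one host are harmless as long as the node's current key is among them.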

When I wanted to push this further, I felt as if I was advised to propose a patch myself, but SSH certificates are deemed too risky for upgrades across PVE versions to be taken in, so it won't be accepted. You may wish to just DIY, e.g. see [1] below, and you will never have to worry about the broken implementation again; just add it manually once to any joining node at the time of joining. Note that you are best off using modern Ed25519 keys, which may also avoid pvecm updatecerts corrupting your files, as it ignores anything but RSA (or rather includes non-RSA ones as-is). If you have many existing keys, you may wish to have a look at [2]. Unlike the author there, I would refrain from commenting on 2048-bit RSA keys (which PVE uses) being too weak, as you are not supposed to have PVE exposed to any public access anyhow.
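For the certificate route, OpenSSH clients trust host certificates through a known_hosts line that starts with the @cert-authority marker, followed by a host pattern and the CA's public key. A small Python sketch of building such a line is below; the helper name, the host pattern and the key blob are hypothetical, and each node would additionally need its own signed host certificate configured in sshd:

```python
def cert_authority_line(host_pattern, ca_public_key_line):
    """Build a known_hosts line that trusts host certificates signed by a CA.

    ca_public_key_line is expected as 'keytype base64key [comment]',
    i.e. the content of the CA's .pub file.
    """
    fields = ca_public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected 'keytype base64key [comment]'")
    key_type, key_blob = fields[0], fields[1]
    return f"@cert-authority {host_pattern} {key_type} {key_blob}"


# Hypothetical CA key and host pattern, for illustration only.
print(cert_authority_line("*.example.lan", "ssh-ed25519 AAAAC3FAKECAKEY ca@example"))
```

With one such line in place, a re-installed or re-keyed node does not need a fresh known_hosts entry as long as its new host key is signed by the same CA.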

If any of this resolved your issues, you may want to mark this thread as solved so that others looking to fix the same problem can find it. You may not get much support from staff, as the code is very old and they aspire to move on rather than dig back in. The person who replied to you is the CTO (I did not know before either), so no one would hijack it further.

[1] If you’re not using SSH certificates you’re doing SSH wrong
https://smallstep.com/blog/use-ssh-certificates/

[2] Upgrade Your SSH Key to Ed25519
https://medium.com/risan/upgrade-your-ssh-key-to-ed25519-c6e8d60d3c54
 
Are you just running forum threads through ChatGPT and posting its answers without regard for the context in which it is asked? Please read and understand the thread before adding off-topic noise.

EDIT: I was completely wrong about this. Sorry for the unnecessary noise in this thread.
 

Please have a look at this and the other forum posts about issues related to pvecm updatecerts and ssh-keygen -f. They all boil down to the same thing, as do the three inter-related issues in Bugzilla. So I compiled what I had found over 4 weeks of tedious debugging into a single post, with the issue discussed there in detail. It was me who referred to this new post from https://bugzilla.proxmox.com/show_bug.cgi?id=4252#c20. There were also some ill-advised posts to people before (https://bugzilla.proxmox.com/show_bug.cgi?id=4252#c21) from regular staff who were not aware of these issues, which they cannot be aware of as long as the issues are undocumented because they are unrecognized.

I will not detour from the subject matter anymore as I did before (when I did not know who I was talking to), so that I won't myself be accused of flaming just because it's hard to get the point across by means of civilised conversation.

For what it's worth, the forum is moderated, so feel free to censor, but this is the one thing I would give to the folks here, including Thomas: they do not censor disagreeing posts.

Feel free to comment on the Bugzilla; I am happy to reply there, and it's watched by fewer people too. Also, I am too old to have much to do with GPT anything.
 
NB: I did pay attention to exactly what he asked. I waited for further input and looked at the SSH error output: "nuc-i3" is hardly a VM's key. I then let 24 hours pass for anyone from staff to debug this further with him (as it's hard to dig into anything from a single screenshot). I believe I saved the OP a few weeks' time.
 
Apologies, I did not read your answer in the way you meant it. I might be a little oversensitive. Thank you for explaining.
 
No worries. When I have a point to make, I typically get so focused on the subject matter that I might not be the best at communicating it to others, but I really did wish to get to the bottom of this one. I really do not mean to incite anyone, but I do think it needs documenting at the least.
 
yes that's exactly what I mean, I was also gonna use the same key for my Linux VMs.
You'd be putting all your eggs into a single basket then; if any one of the keys leaks, you'd need to update quite a few servers.
So in general, I'd recommend against that. If anything, you should indeed use certificates, or find out what causes this for you.
In any case, VM and Proxmox VE host keys should be rather unrelated and should not interfere with each other (well, at least if you do not use the Proxmox VE hosts as jump hosts).
The problem is in the below screenshot:
Again, can you please tell me exactly what you do between everything working and this state? It shouldn't really happen from just doing an update. It might happen if you re-install nodes and re-add them to the cluster, but it should not happen by just "updating". Or does this not constantly break again when you update, and you rather meant to say that this has been broken for a while and thus shell access to other nodes won't work?

If it gets re-broken constantly, could you please check whether just calling pvecm updatecerts actually breaks it for you?
Because if this isn't the not-so-well-supported case of re-adding a different node under the same name as a previous one, then the SSH merge code might get thrown off; there are many ideas for improving that.

No worries, when I have a point to make, I typically get focused on the subject matter so much I might not be the best in communicating it to others, but I really did wish to get to the bottom of this one. I really do not mean to incite anyone, but I do think it needs documenting at the least.
Actually, I actively put this on the back burner because the communication was a bit too nerve-racking for my taste. The walls of text, the demoting of everything as messy or old, and the stretching of staff's words were rather off-putting and diverted from the cause, so I did not see it as efficient to continue working on this. Take this very thread: I never said you must patch it yourself, only that calling things trivial without sending a single patch that actually proves that this is the case is not for me. I can already imagine that this will again be picked apart, and that a lengthy text will add ten meanings, which are then linked to from three other posts on three other platforms, again with words and intentions laid into my sayings. Some links to the actual issue are naturally thrown in too, otherwise it wouldn't look like one actually wants to fix something rather than just argue in circles without adding information (the first posts of the bug reports already had all that's needed to know).
The reason I (and others; I'm not the only one of my colleagues being put off) could afford to do so is that this isn't a recent regression, which we would naturally want to fix in a timely manner, but rather a long-standing design drawback (one that certainly can, and will, be improved) that one cannot (well, at least should not) just shoot from the hip at, and that most of the time only shows up in niche scenarios, so there is no urgent need to spend time there.
For what it's worth, the forum is moderated, so feel free to censor, but this is the one thing I would give to the folks here, including Thomas: they do not censor disagreeing posts.
We never did censor anything of yours, and all our channels, including Bugzilla, are moderated; we need to comply with the legal frameworks after all, like GDPR right-to-be-forgotten requests or simply removing illegal content.
 
I won't hijack the OP's thread, as I cannot second-guess his case. I just have a hunch and will see if I was right; it's not important at this point.

EDIT: Keeping this thread for the OP, I pulled most of the side remarks out. @t.lamprecht Please see the direct reply from me.

We never did censor anything of yours, and all our channels, including Bugzilla are moderated – we need to comply with the legal frameworks after all, like GDPR right-to-be-forgotten requests or simply removing illegal content.

Yes, and that is what I recognise and RESPECT. However impertinent you might have found my input, at least it is searchable there for the references (not for my communication style). Apologies to anyone offended by my bringing this up in this particular manner; I do not, however, hide that I am happy it got the attention I hoped it deserved.

I will provide further input in Bugzilla on this if asked, but I believe there's plenty already in #4886, i.e. not the one where we had our disagreements.
 
It breaks every time I run the dist-upgrade. All I'm trying to do is access VNC on one of the VMs from a different machine in my cluster; for example, I'm trying to use the VNC console of the VM that's running on node-2 from node-1, but it shows that error.
 

Related to: https://forum.proxmox.com/threads/safely-changing-ssh-keys.133065/

Interestingly, a post from the same day with the same issue after an "upgrade": https://forum.proxmox.com/threads/p...-in-cluster-after-upgrade.133030/#post-606358

It might have been something in the underlying Debian or the installer that caused ~/.ssh/id_rsa to be regenerated, or that triggered the pvecm updatecerts code path, which, if you ever had a similar issue before, caused it to come back.
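One way to check whether a key really was regenerated by an upgrade is to compare fingerprints taken before and after. The sketch below computes the SHA256 fingerprint of a public key line in pure Python, in the format OpenSSH prints (unpadded base64 of the SHA-256 digest of the key blob). It assumes the line comes from a .pub file such as ~/.ssh/id_rsa.pub; the key material in the example is made up and is not a real key:

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line):
    """Return the OpenSSH-style SHA256 fingerprint of a public key line.

    Expects 'keytype base64blob [comment]', as found in *.pub files.
    """
    key_blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as base64 without the trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Made-up blobs standing in for the key before and after an upgrade.
old_line = "ssh-rsa " + base64.b64encode(b"old key material").decode("ascii") + " root@node-1"
new_line = "ssh-rsa " + base64.b64encode(b"new key material").decode("ascii") + " root@node-1"

if sha256_fingerprint(old_line) != sha256_fingerprint(new_line):
    print("key changed:", sha256_fingerprint(new_line))
```

Saving the fingerprint before running the dist-upgrade and comparing it afterwards would show whether the key itself changed or only the known_hosts bookkeeping did.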
 
So is there a solution?
 
