ssh key changed nightly

scottrus71
Sep 7, 2023
It seems that my SSH key is changed nightly, and it's not clear why. My setup is a Proxmox 8.0.4 cluster with 3 identical nodes: pve-node-01, 02, and 03. I've observed this consistent pattern when using the node shell from the cluster web UI:

Code:
https://pve-node-01:
    shell pve-node-01 - yes
    shell pve-node-02 - yes
    shell pve-node-03 - no
https://pve-node-02:
    shell pve-node-01 - yes
    shell pve-node-02 - yes
    shell pve-node-03 - no
https://pve-node-03:
    shell pve-node-01 - yes
    shell pve-node-02 - yes
    shell pve-node-03 - yes

Read this as: if I'm on the web UI of pve-node-01, then I can access pve-node-01 and pve-node-02 fine via the node shell, but pve-node-03 does not work. The only time I can shell to all three hosts in the cluster is when I access the web UI from pve-node-03. The error message about the host key is shown below.

I can resolve this temporarily by ssh'ing into pve-node-03 and running pvecm updatecerts. After that, pve-node-03 is reachable via the shell from pve-node-01 and pve-node-02 again.

Code:
root@pve-node-03:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 pve-node-03 (local)
         2          1 pve-node-02
         3          1 pve-node-01

This is what I see from the shell when trying to access pve-node-03 from the web UI of pve-node-01:
Code:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:mYBW+vBgs2iPwZzrOtPTI0R5Gq5SYtzMDXM6743LQtk.
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending RSA key in /etc/ssh/ssh_known_hosts:2
  remove with:
  ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "192.168.200.103"
Host key for 192.168.200.103 has changed and you have requested strict checking.
Host key verification failed.
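One way to pin down which side is wrong is to compare the fingerprint of the key the node actually serves against the entry recorded in the shared ssh_known_hosts. A minimal sketch of the two commands, using a throwaway key in a temp directory in place of pve-node-03's real files so it is safe to run anywhere; on the node you would point them at /etc/ssh/ssh_host_rsa_key.pub and /etc/ssh/ssh_known_hosts instead:

```shell
tmp=$(mktemp -d)

# Throwaway stand-in for the node's real /etc/ssh/ssh_host_rsa_key:
ssh-keygen -q -t rsa -N '' -f "$tmp/ssh_host_rsa_key"

# Stand-in for the cluster-wide ssh_known_hosts entry for 192.168.200.103:
awk '{print "192.168.200.103", $1, $2}' "$tmp/ssh_host_rsa_key.pub" > "$tmp/ssh_known_hosts"

# Fingerprint of the key the node actually serves:
fp_node=$(ssh-keygen -lf "$tmp/ssh_host_rsa_key.pub" | awk '{print $2}')

# Fingerprint recorded for that IP in known_hosts:
fp_recorded=$(ssh-keygen -l -F 192.168.200.103 -f "$tmp/ssh_known_hosts" | awk '/SHA256/{print $3}')

echo "served:   $fp_node"
echo "recorded: $fp_recorded"
```

If the two fingerprints differ, the recorded entry is the stale one, and removing it with `ssh-keygen -R` (as the error message suggests) or running `pvecm updatecerts` will refresh it.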
 
That is really not good. Can you try to find out when the key changes (the modification timestamp of the file), or is it only the IP that changed? Normally the SSH key setup is done automatically by PVE and never changed after that.
 
I’m using static IPs for all 3 nodes, so I don’t think it’s the IP address changing. I’m fairly confident this is not a bad actor at play, since none of these nodes are exposed outside my network. Of course, who knows what the kids download.

I’ve already updated the keys for the day, but I’ll make a note of the ssh_known_hosts file timestamp. Any other files I should be checking? Everything in the /etc/ssh/ directory on all nodes, maybe?
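One low-effort way to catch the culprit file is to snapshot modification times before and after a reboot and diff the two snapshots. A rough sketch with a hypothetical helper (the directories /etc/ssh and /etc/pve/priv are the ones discussed in this thread; the helper itself is not a PVE tool):

```shell
# Hypothetical helper: list each regular file in a directory with its mtime,
# newest first, so two snapshots taken around a reboot can be diffed.
snapshot_mtimes() {
    find "$1" -maxdepth 1 -type f -exec stat -c '%y %n' {} \; | sort -r
}

# On a node you might run, before and after the event:
#   snapshot_mtimes /etc/ssh      > /root/ssh-before.txt
#   snapshot_mtimes /etc/pve/priv > /root/priv-before.txt
```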
 
So I was wrong about the timing: I was able to trigger this with a reboot too. Last night I shut down the entire cluster and moved it to a new physical location. On boot-up, pve-node-03 was again not responding to the shell when connected to the web UI on pve-node-01 or pve-node-02. The timestamps in the /etc/pve/priv directory align with the reboot timestamp.

Notice that the timestamps on authorized_keys and known_hosts match the last system reboot time.

For some reason on reboot, pve-node-03 isn't setting the right keys?

Code:
root@pve-node-03:/etc/pve/priv# ls -l
total 4
drwx------ 2 root www-data    0 Aug 19 11:24 acme
-rw------- 1 root www-data 1675 Sep  7 20:41 authkey.key
-rw------- 1 root www-data 4379 Sep  7 19:59 authorized_keys
-rw------- 1 root www-data 3402 Sep  7 19:59 known_hosts
drwx------ 2 root www-data    0 Aug 19 11:24 lock
drwx------ 2 root www-data    0 Sep  6 10:01 metricserver
-rw------- 1 root www-data 3272 Aug 19 11:24 pve-root-ca.key
-rw------- 1 root www-data    3 Aug 25 15:35 pve-root-ca.srl
-rw------- 1 root www-data    2 Sep  7 13:27 tfa.cfg
-rw------- 1 root www-data  118 Sep  7 13:28 token.cfg

root@pve-node-03:/etc/pve/priv# last reboot
reboot   system boot  6.2.16-12-pve    Thu Sep  7 19:59   still running
reboot   system boot  6.2.16-12-pve    Wed Sep  6 13:49 - 18:42 (1+04:52)
 
I was able to trigger this by doing a reboot too.
Yes, the program behind the /etc/pve mount will restart on this, so maybe you can also trigger it by restarting it.

Now the question: What is the difference in those files? Are just the old entries missing, or what does the difference look like?
 
Yes, the program behind the /etc/pve mount will restart on this, so maybe you can also trigger it by restarting it.
Now the question: What is the difference in those files? Are just the old entries missing, or what does the difference look like?

There should be no difference across cluster nodes? The directory /etc/pve is FUSE-mounted and shared across all nodes in the cluster. What's not clear is how known_hosts and authorized_keys are constructed on boot vs. when running pvecm updatecerts.
 
No, I checked all three nodes. The /etc/pve/priv directory files are all identical. I'll double check again next time it happens.
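The "all identical" check is easy to make objective with checksums: compare the output of `sha256sum /etc/pve/priv/known_hosts` from each node. A sketch simulated on local copies (with placeholder content) so it runs anywhere:

```shell
tmp=$(mktemp -d)

# Stand-ins for the same shared file as seen from two different nodes:
printf 'pve-node-03 ssh-ed25519 PLACEHOLDER\n' > "$tmp/from_node01"
cp "$tmp/from_node01" "$tmp/from_node02"

sum1=$(sha256sum "$tmp/from_node01" | awk '{print $1}')
sum2=$(sha256sum "$tmp/from_node02" | awk '{print $1}')

# Matching hashes mean both nodes see identical content:
[ "$sum1" = "$sum2" ] && echo "identical"
```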
 
Not sure why, but I seem to have it resolved for now. Steps I took:

  1. Used pvecm updatecerts and tested the shell from the UI on all nodes. Everything was good.
  2. Backed up the known_hosts and authorized_keys files in /etc/pve/priv.
  3. Rebooted pve-node-03 and verified that afterwards no changes were found between the active and the backed-up known_hosts and authorized_keys.
  4. Rebooted pve-node-01 and identified that known_hosts was updated after the reboot.
  5. Examining known_hosts showed 2 entries for node pve-node-03 with incorrect keys compared to the known_hosts.bak made in step 2.
  6. Manually cleaned up known_hosts to remove the incorrect keys and rebooted pve-node-01 again. Everything survived the reboot.
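Steps 2–6 above can be sketched end to end. This simulation uses throwaway keys and a temp directory in place of /etc/pve/priv so it is safe to run anywhere; on a real node the file in question is /etc/pve/priv/known_hosts:

```shell
tmp=$(mktemp -d)

# Throwaway keys standing in for pve-node-03's good and stale host keys:
ssh-keygen -q -t ed25519 -N '' -f "$tmp/good"
ssh-keygen -q -t ed25519 -N '' -f "$tmp/stale"

kh="$tmp/known_hosts"
awk '{print "pve-node-03", $1, $2}' "$tmp/good.pub" > "$kh"
cp "$kh" "$kh.bak"                                   # step 2: back up the healthy file

awk '{print "pve-node-03", $1, $2}' "$tmp/stale.pub" >> "$kh"  # the reboot appends a stale entry
diff "$kh.bak" "$kh" || true                         # step 5: the diff shows the extra line

ssh-keygen -f "$kh" -R pve-node-03 >/dev/null 2>&1   # step 6: drop every entry for the host...
cat "$kh.bak" > "$kh"                                # ...then restore the known-good one
```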
 
It's not clear why a reboot of pve-node-01 was adding bad entries to the known_hosts file for pve-node-03, or where it was sourcing them from. There is no /root/.ssh/known_hosts on pve-node-01, so I'm not sure where it was picking up the bad entries.
 
