Can anyone please help me? I am burning out! Issues after deleting and upgrading a node.

jtwicker23

New Member
Mar 9, 2024
TL;DR: I deleted and upgraded a node, and now I get "Host key verification failed" on the old nodes and cannot view the console on my VMs due to a "failed to connect to server" error.

Back story:
I set up my config many moons ago, but always wanted to have 3 nodes for some replication/HA. Worst timing ever, but I came down with COVID, so I get to spend the holidays mostly alone, which is beyond depressing for me. Now that I am feeling well enough to get out of bed, I am taking this alone time to get everything configured correctly, and I have run into issue after issue, further depressing me. I realized I should have slowed down and read more documentation before going forward, but I am hopeful someone is jolly enough to help! :)

What I did was back up my VMs to an external NAS and restore them onto one of my ZFS nodes. It took me a while to figure that out, and I am not 100% sure I did it by the book, but all the VMs are working correctly and, most importantly, my Pi-hole is running.

Next, I took the original node offline and "reimaged" the computer with the latest version of Proxmox VE via USB. Once I got that set up, I realized I hadn't fully removed the old node and had to research that. I finally got it removed.

Now when I try to open the Shell via the GUI, I get "Host key verification failed". I also cannot see my VMs' console, as I get a "failed to connect to server" error. I can reach the nodes via SSH, though. I have tried all the commands I could find online to edit the known_hosts files and whatnot, but nothing seems to work. I then thought it was because of a mismatch of node versions, since the newest one works fine; it is just the oldest two that do not.

I kept trying to get updates, but they would fail. I think it might have been my Pi-hole, so I set the network interface gateway on all nodes to 8.8.8.8 and ran apt-get dist-upgrade, and this thing has been going forever. Hopefully this is the fix, but I am getting dangerously tired and need to rest before my symptoms kick my ass even more, so I figured I would make this post and hope someone can tell me whether I am going down the right path, or suggest other things I should try when I get up tomorrow.

It would really mean a lot. Happy Holidays!
 
So I assume:

1. You have already added the new node correctly to the cluster.
2. You have finally deleted/removed that old node.
3. All nodes have been rebooted.
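To verify points 1 and 2 before anything else, a quick, non-destructive check of the cluster state from any node (standard pvecm tooling):

Code:
pvecm status
pvecm nodes

The membership list and node count should match what you expect, and the removed node must no longer appear.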

Have you tried the following:

Code:
pvecm updatecerts
systemctl restart pveproxy

As a further step, you could try the following on every node:
Code:
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "<new.node.ip>"
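On Proxmox VE, /etc/ssh/ssh_known_hosts is typically a symlink to the cluster-wide /etc/pve/priv/known_hosts, so a stale entry can also linger in root's per-user file. If the error persists, clearing that one as well may help (a sketch; the IP is a placeholder):

Code:
ssh-keygen -f "/root/.ssh/known_hosts" -R "<new.node.ip>"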

so I set the network interface gateway on all nodes to 8.8.8.8
I don't understand. 8.8.8.8 is Google's public DNS server, not a gateway; you should choose the actual gateway of that network, e.g. your router at 192.168.1.1. If you wanted 8.8.8.8 for name resolution, it belongs in /etc/resolv.conf instead.
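For reference, a typical static setup on a default PVE install looks roughly like this (a sketch only; the 192.168.1.0/24 addresses and vmbr0 are assumptions, adjust to your network):

Code:
# /etc/network/interfaces (excerpt, assumed addresses)
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1

# /etc/resolv.conf (8.8.8.8 goes here, if anywhere)
nameserver 192.168.1.1
nameserver 8.8.8.8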
 
Sounds like your newly installed host has DNS resolution issues.

Check that:
- your /etc/hosts entry for the hostname matches the actual hostname, and that the hostnames of the rest of the cluster resolve to the IP addresses you use for corosync
- your /etc/resolv.conf has valid DNS servers (use dig to verify; see the checks below)
- both systemctl status pvestatd and systemctl status pveproxy show the services up and running
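Something like this covers all three checks (hostnames/domains are placeholders; dig comes from the dnsutils/bind9-dnsutils package if it is not already installed):

Code:
hostname
getent hosts "$(hostname)"   # should resolve via /etc/hosts
cat /etc/resolv.conf
dig +short proxmox.com       # should return IP addresses
systemctl status pvestatd pveproxy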
 
