Proxmox SSH Keys & Migrations

Chris Rivera
With some nodes being taken offline due to bad hardware and new nodes being brought online to replace them, we are having issues migrating from one node to another.


Some nodes complain:

Dec 05 10:24:16 # /usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@63.***.***.153 /bin/true
Dec 05 10:24:16 Host key verification failed.
Dec 05 10:24:16 ERROR: migration aborted (duration 00:00:01): Can't connect to destination address using public key
TASK ERROR: migration aborted
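That first error looks like a stale entry in the cluster-wide known_hosts for the destination IP. A minimal sketch of clearing a single entry, assuming a stock PVE layout where /etc/ssh/ssh_known_hosts is a symlink to /etc/pve/priv/known_hosts (IP masked as in the log above):

# remove the cached host key for the destination node, then connect
# once so the new key gets recorded again
ssh-keygen -f /etc/pve/priv/known_hosts -R 63.***.***.153
ssh root@63.***.***.153 /bin/true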

#################

Dec 05 10:23:45 starting migration of CT 440 to node 'proxmox6' (63.***.***.156)
Dec 05 10:23:45 starting rsync phase 1
Dec 05 10:23:45 # /usr/bin/rsync -aHAX --delete --numeric-ids --sparse /var/lib/vz/private/440 root@63.***.***.156:/var/lib/vz/private
Dec 05 10:23:46 dump 2nd level quota
Dec 05 10:23:46 # vzdqdump 440 -U -G -T > /var/lib/vz/dump/quotadump.440
Dec 05 10:23:46 ERROR: Failed to dump 2nd level quota: sh: cannot create /var/lib/vz/dump/quotadump.440: Directory nonexistent
Dec 05 10:23:46 aborting phase 1 - cleanup resources
Dec 05 10:23:46 removing copied files on target node
Dec 05 10:23:46 start final cleanup
Dec 05 10:23:46 ERROR: migration aborted (duration 00:00:02): Failed to dump 2nd level quota: sh: cannot create /var/lib/vz/dump/quotadump.440: Directory nonexistent
TASK ERROR: migration aborted
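The second failure seems simpler: the migration script redirects the vzdqdump output into /var/lib/vz/dump, and the error just says that directory is missing on the source node. A minimal workaround sketch, assuming default OpenVZ paths:

# recreate the dump directory the quota dump is redirected into
mkdir -p /var/lib/vz/dump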

####################



Is it possible to run a command to remove all old keys and have the cluster regenerate the SSH keys to clear this up?

This is holding us back from migrating and alleviating client issues.
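From reading around, something like the sketch below might be what I'm after, assuming our PVE version already ships the updatecerts subcommand of pvecm (newer versions do; I'm not certain about ours):

# assumption: this PVE version provides 'pvecm updatecerts', which
# regenerates node certificates and refreshes the cluster-wide
# known_hosts; run on a node showing the verification error
pvecm updatecerts --force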
 
If you rejoin hosts, you need to use the --force flag.

See 'man pvecm'.
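For example (<cluster-ip> is a placeholder for the address of a node already in the cluster):

# run on the node being re-joined
pvecm add <cluster-ip> --force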

How did YOU re-join your nodes?
 
I did the -force option.


All nodes are online and in sync, but I cannot migrate anything. It's weird because some nodes can migrate to specific nodes while other nodes can migrate to other specific nodes... so it's kind of a guessing game: find out which node can migrate to which node, then figure out what needs to be done to get a VM from node 1 to node 3.


node 1 -> node 8
node 8 -> node 3

Sometimes this can be more complicated and take 3 or 4 steps.


Is it possible to have the cluster drop all keys and regenerate them, or will I need to remove all nodes from the cluster one at a time and re-add them using the -force option again?
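In the meantime I can at least map which node reaches which the same way the migration code does. A rough sketch, with placeholder hostnames for our real nodes:

# probe every ordered node pair with the same check migration uses;
# hostnames are placeholders for the actual cluster members
for src in proxmox1 proxmox2 proxmox3; do
  for dst in proxmox1 proxmox2 proxmox3; do
    [ "$src" = "$dst" ] && continue
    if ssh root@"$src" "ssh -o BatchMode=yes root@$dst /bin/true"; then
      echo "$src -> $dst OK"
    else
      echo "$src -> $dst FAILED"
    fi
  done
done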
 
Don't panic. Read all logs (syslog) and error messages on node startup and in the GUI, and analyse your issues step by step. If you have errors, find the reason. If you run a cluster, make sure the cluster communication is up and running, and check quorum.
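For example:

# show quorum state, membership and expected votes
pvecm status
# list the nodes the cluster currently sees
pvecm nodes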
 
I did that... then removed node 2 from the cluster.

Trying to add it back, but it will not work:


root@proxmox2:~# pvecm add 63.***.***.158
authentication key already exists


root@proxmox2:~# pvecm add 63.***.***.158 -force
unable to copy ssh ID
 
Since connecting to node 1 failed, I thought about connecting it to node 3.



root@proxmox2:~# pvecm add 63.***.***.158
authentication key already exists
root@proxmox2:~# pvecm add 63.***.***.158 -force
unable to copy ssh ID
root@proxmox2:~# pvecm add 63.***.***.160 -force
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-clusterfuse: failed to access mountpoint /etc/pve: Transport endpoint is not connected
[main] crit: fuse_mount error: Transport endpoint is not connected
[main] notice: exit proxmox configuration filesystem (-1)
(warning).
starting pve-cluster failed
root@proxmox2:~# service pve-cluster restart
Restarting pve cluster filesystem: pve-clusterstart-stop-daemon: warning: failed to kill 573902: No such process
fuse: failed to access mountpoint /etc/pve: Transport endpoint is not connected
[main] crit: fuse_mount error: Transport endpoint is not connected
[main] notice: exit proxmox configuration filesystem (-1)
(warning).
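The 'Transport endpoint is not connected' lines point at a stale fuse mount on /etc/pve. A sketch of the usual recovery for that, assuming nothing else is holding /etc/pve open:

# stop the cluster filesystem, clear the stale fuse mount,
# then start it again; lazy unmount helps if the mountpoint is wedged
service pve-cluster stop
umount -l /etc/pve
service pve-cluster start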
 
After 2 days of being down... we re-ran the commands and it finally joined the cluster.
 
