Node in Cluster keeps going offline

LunarMagic

Member
Mar 14, 2024
I have been experiencing an issue with one of the nodes I just joined to my cluster. It was in an old cluster and worked perfectly fine there, but in this new cluster it keeps going offline. I can get the error to show itself when I migrate a VM, which is why the log below starts with a failing migration.


Code:
drive-scsi0: transferred 13.3 GiB of 80.0 GiB (16.64%) in 1m 23s
client_loop: send disconnect: Broken pipe

drive-scsi0: Cancelling block job
drive-scsi0: Done.
2024-11-03 21:34:13 ERROR: online migrate failure - block job (mirror) error: drive-scsi0: Input/output error (io-status: ok)
2024-11-03 21:34:13 aborting phase 2 - cleanup resources
2024-11-03 21:34:13 migrate_cancel
2024-11-03 21:34:27 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=R830-4' -o 'UserKnownHostsFile=/etc/pve/nodes/R830-4/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.50.4 qm stop 103 --skiplock --migratedfrom R830-2' failed: exit code 255
2024-11-03 21:34:28 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=R830-4' -o 'UserKnownHostsFile=/etc/pve/nodes/R830-4/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.50.4 pvesm free Virtual-Machines:vm-103-disk-0' failed: exit code 255
2024-11-03 21:34:28 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=R830-4' -o 'UserKnownHostsFile=/etc/pve/nodes/R830-4/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.50.4 pvesm free Virtual-Machines:vm-103-disk-1' failed: exit code 255
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:hLTfEMqqm7y8Z/NIsXR2tXHQq8AHG5XMTacxMGMG3vY.
Please contact your system administrator.
Add correct host key in /etc/pve/nodes/R830-4/ssh_known_hosts to get rid of this message.
Offending RSA key in /etc/pve/nodes/R830-4/ssh_known_hosts:1
  remove with:
  ssh-keygen -f "/etc/pve/nodes/R830-4/ssh_known_hosts" -R "r830-4"
Host key for r830-4 has changed and you have requested strict checking.
Host key verification failed.

2024-11-03 21:34:29 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=R830-4' -o 'UserKnownHostsFile=/etc/pve/nodes/R830-4/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.50.4 rm -f /run/qemu-server/103_nbd.migrate /run/qemu-server/103.migrate' failed: exit code 255
2024-11-03 21:34:29 ERROR: migration finished with problems (duration 00:01:50)
TASK ERROR: migration problems
 
I think that the key in the known_hosts file (line 1) is no longer valid and needs to be removed.

I would do this:
Code:
  ssh-keygen -f "/etc/pve/nodes/R830-4/ssh_known_hosts" -R "r830-4"
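
If you want to see what that command does before touching the real file, here is a minimal sketch that rehearses it on a throwaway copy. Everything in it is an assumption for illustration: the temporary directory, the freshly generated ed25519 key standing in for the node's host key, and the one-line known_hosts file. Note that `ssh-keygen -l` prints a fingerprint, which is a hash of the key and not the key itself.

```shell
#!/bin/sh
# Rehearse the host-key removal on a throwaway known_hosts file.
# Nothing here touches /etc/pve; all paths live in a temp directory.
set -eu

tmp=$(mktemp -d)

# Throwaway key pair standing in for the node's host key (illustrative only).
ssh-keygen -q -t ed25519 -N "" -f "$tmp/hostkey"

# Build a one-line known_hosts file: "<host> <keytype> <base64 key>".
printf 'r830-4 %s\n' "$(cut -d' ' -f1,2 "$tmp/hostkey.pub")" > "$tmp/known_hosts"

# Show the fingerprint of the stored entry (a hash of the key, not the key).
ssh-keygen -l -f "$tmp/known_hosts"

# Same form as the suggested command, pointed at the throwaway file.
ssh-keygen -f "$tmp/known_hosts" -R "r830-4"

# -F searches for a host and exits non-zero when no entry is found.
if ssh-keygen -F "r830-4" -f "$tmp/known_hosts" >/dev/null; then
    status="entry still present"
else
    status="entry removed"
fi
echo "$status"

rm -rf "$tmp"
```

On the real node the same two checks apply: `ssh-keygen -F` before and after the removal tells you whether an entry for the host is still stored.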
 
So do I replace the key there with hLTfEMqqm7y8Z/NIsXR2tXHQq8AHG5XMTacxMGMG3vY?

The command gives me the error in the attached screenshot, but I can edit the file with nano.

[Attached screenshot: 1730729075842.png]
