I've upgraded my cluster from Proxmox 4.4 to Proxmox 5, following the upgrade instructions verbatim. I encountered no errors during the upgrade. Before the upgrade, my cluster was fully working with no issues, including the use of pve-zsync. Hostnames and IPs on the cluster members have never changed since Proxmox was installed on them. After upgrading I tried to set up Storage Replication, and I am running into the following error whenever a job tries to run (full text from the replication log):
2017-07-05 03:36:01 100-0: start replication job
2017-07-05 03:36:01 100-0: guest => VM 100, running => 0
2017-07-05 03:36:01 100-0: volumes => SSDs:vm-100-disk-1
2017-07-05 03:36:01 100-0: (remote_prepare_local_job) Host key verification failed.
2017-07-05 03:36:01 100-0: (remote_prepare_local_job)
2017-07-05 03:36:01 100-0: end replication job with error: command '/usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=813MTQ' root@192.168.10.14 -- pvesr prepare-local-job 100-0 SSDs:vm-100-disk-1 --last_sync 0' failed: exit code 255
This is on a replication job from "KGPE-D16" (host A, 192.168.10.27) to "813MTQ" (host B, 192.168.10.14). Any replication jobs in the opposite direction have the same error. If I log into either host (KGPE-D16 in this example) and run the command manually, I get:
root@KGPE-D16:~# /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=813MTQ' root@192.168.10.14
Host key verification failed.
But if I run it without the HostKeyAlias option I get:
root@KGPE-D16:~# /usr/bin/ssh -o 'BatchMode=yes' root@192.168.10.14
Linux 813MTQ 4.10.15-1-pve #1 SMP PVE 4.10.15-15 (Fri, 23 Jun 2017 08:57:55 +0200) x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Jul 5 02:12:53 2017 from 192.168.10.199
root@813MTQ:~#
I have looked up the "Host key verification failed" message and have run a few commands that came up in the results, including "ssh-keygen -R 192.168.10.14" and adding a host section (including HostKeyAlias) to /root/.ssh/config. Nothing has helped. Here is my apt update output (using the no-subscription repo; this is the test cluster):
root@813MTQ:~# apt update
Ign:1 http://ftp.us.debian.org/debian stretch InRelease
Hit:2 http://security.debian.org stretch/updates InRelease
Hit:3 http://ftp.us.debian.org/debian stretch Release
Hit:5 http://download.proxmox.com/debian stretch InRelease
Reading package lists... Done
Building dependency tree
Reading state information... Done
All packages are up to date.
This seems to be either a bug with the ssh command that Storage Replication is using (HostKeyAlias), or possibly with the upgrade from 4.4 to 5.0, but if there is anything I can do to fix this locally I'm willing to try.
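
Since HostKeyAlias makes ssh look up the host key under the alias name (813MTQ here) rather than under the IP address, one check that might narrow this down is whether the known_hosts files contain an entry under that name at all. Below is a minimal sketch of such a check. It assumes ssh-keygen is available; check_alias is a hypothetical helper name of my own, and the file paths in the usage comments are only the OpenSSH defaults, so they may not be the files actually consulted on a given node.

```shell
#!/bin/sh
# Sketch only: report whether a known_hosts file has an entry for a name.
# check_alias is a hypothetical helper, not part of Proxmox or OpenSSH.
check_alias() {
    name="$1"    # name ssh will look up, e.g. the HostKeyAlias value
    file="$2"    # known_hosts file to search
    # ssh-keygen -F searches the file for the given host name
    # (it also matches hashed entries) and fails if nothing is found.
    if ssh-keygen -F "$name" -f "$file" >/dev/null 2>&1; then
        echo "found: $name in $file"
    else
        echo "missing: $name in $file"
        return 1
    fi
}

# Example usage (paths are assumptions, adjust as needed):
# check_alias 813MTQ /root/.ssh/known_hosts
# check_alias 813MTQ /etc/ssh/ssh_known_hosts
# check_alias 192.168.10.14 /root/.ssh/known_hosts
```

If the IP address is found but the alias is not, that would be consistent with the behaviour above, where the plain ssh command succeeds and the one with HostKeyAlias fails. It would not by itself say why the entry is missing.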