TASK ERROR: Failed to run vncproxy.

mhayhurst

Hello everyone!

I have two Proxmox machines in a cluster (Proxmox1 and Proxmox2), both running Proxmox 5.3-5. If I log into Proxmox1's web UI and select any VM console on Proxmox2, I receive this error:

Code:
Permission denied (publickey).

TASK ERROR: Failed to run vncproxy.

The same thing happens if I go to Proxmox2's web UI and select any VM console in Proxmox1. This appears to only be related to the console as everything else is working correctly.

I've also tried restarting pveproxy on both Proxmox1 and Proxmox2 and I can SSH from Proxmox1 to Proxmox2 and vice versa.

Would anyone be able to tell me how to correct this issue?
 
can you ssh without any password as root from Proxmox1 to Proxmox2 (and the other direction)?
PVE relies on ssh-public-key auth for some of its operations (including the vncproxy).

In any case you can try running `pvecm updatecerts` and see if this helps.
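A minimal sketch of the passwordless check, assuming plain OpenSSH on both nodes. The helper name `check_batch_ssh` and the `SSH_BIN` override are hypothetical and only there for illustration; they are not part of PVE. `BatchMode=yes` makes ssh fail instead of prompting, so a password request or a host-key confirmation shows up as an error rather than a question:

```shell
# Hypothetical helper: returns 0 only if root can log in with a key and no prompt.
# BatchMode=yes disables password/passphrase prompts and host-key questions,
# so anything that would normally ask the user makes the command fail.
check_batch_ssh() {
    node="$1"
    "${SSH_BIN:-ssh}" -o BatchMode=yes "root@${node}" true
}

# Example call (10.10.10.10 is a placeholder for the other node's IP):
# check_batch_ssh 10.10.10.10 && echo "key auth OK" || echo "key auth FAILED"
```

Run it in both directions, since each node needs to reach the other.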
 
Yes, I can SSH from Proxmox1 --> Proxmox2 and vice-versa as root without using a password. I executed:
Code:
pvecm updatecerts
on both Proxmox1 and Proxmox2 and restarted both of them but that did not fix the issue. I see this in: /var/log/pveproxy/access.log

Code:
"GET /api2/json/nodes/proxmox2/qemu/109/vncwebsocket?port=5900&vncticket=PVEVNC%3A5C24...(random key characters) HTTP/1.1" 101 -
It does not appear there are any errors (at least in that log file) regarding the VNC connection.
 
The access.log line indicates that the websocket connection seems successful (HTTP code 101).

You could try running the ssh command that invokes the vncproxy (on Proxmox1):
/usr/bin/ssh -e none -T -o BatchMode=yes 10.10.10.10 /usr/sbin/qm vncproxy $VMID

(replace 10.10.10.10 with Proxmox2 IP)

What's the output you get?
 
I added some verbosity as it only showed:

Code:
root@proxmox1:~/.ssh# /usr/bin/ssh -e none -T -o BatchMode=yes 192.168.1.5 /usr/sbin/qm vncproxy 104
Permission denied (publickey).

Code:
root@proxmox1:~# /usr/bin/ssh -v -e none -T -o BatchMode=yes 192.168.1.5 /usr/sbin/qm vncproxy 104
OpenSSH_7.4p1 Debian-10+deb9u4, OpenSSL 1.0.2q  20 Nov 2018
debug1: Reading configuration data /root/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to 192.168.1.5 [192.168.1.5] port 22.
debug1: Connection established.
debug1: permanently_set_uid: 0/0
debug1: identity file /root/.ssh/id_rsa type 1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_dsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: key_load_public: No such file or directory
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.4p1 Debian-10+deb9u4
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.4p1 Debian-10+deb9u4
debug1: match: OpenSSH_7.4p1 Debian-10+deb9u4 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 192.168.1.5:22 as 'root'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: aes128-ctr MAC: umac-64-etm@openssh.com compression: none
debug1: kex: client->server cipher: aes128-ctr MAC: umac-64-etm@openssh.com compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:H9ESbHJ9QEJpzJOBhpVs5tJ9skK3LWsp3r1uuuMxNVA
debug1: Host '192.168.1.5' is known and matches the ECDSA host key.
debug1: Found key in /root/.ssh/known_hosts:2
debug1: rekey after 4294967296 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey after 4294967296 blocks
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,ssh-rsa,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /root/.ssh/id_rsa
debug1: Server accepts key: pkalg ssh-rsa blen 279
debug1: Authentications that can continue: publickey
debug1: Trying private key: /root/.ssh/id_dsa
debug1: Trying private key: /root/.ssh/id_ecdsa
debug1: Trying private key: /root/.ssh/id_ed25519
debug1: No more authentication methods to try.
Permission denied (publickey).

The private key: id_rsa DOES exist in: /root/.ssh/

Code:
root@proxmox1:~# ls -alh /root/.ssh/id_rsa
-rw------- 1 root root 1.7K May 22  2018 /root/.ssh/id_rsa

If I change the id_rsa private key filename to: id_dsa then I can view any VM's console on Proxmox2 from Proxmox1's web UI (as they are both in a cluster), so that fixed it! But why? A few questions:
  1. It's not accepting the id_rsa (even though the key works) so I'm assuming that's a config setting?
  2. If so, I wonder why this changed as I did not alter any config settings?
  3. Why all the complaints of `debug1: key_load_public: No such file or directory`? Is that an issue?
 
I guess something in the `sshd_config` on the servers might be the culprit - It seems there were some modifications from the shipped defaults
(password-authentication is disabled in your config, but enabled in the default config)

* check `/etc/ssh/sshd_config` for differences from the defaults
* try to let sshd log at level DEBUG (or DEBUG2) and see why it refuses the key
* try with more verbosity (`ssh -vvv` vs. `ssh -v`)
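For the first bullet, one way to compare against the defaults is to look only at the directives that are actually active. This is a rough sketch; the function name `active_directives` is made up for illustration:

```shell
# Print only the active (uncommented, non-empty) lines of an sshd_config-style file.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example: active_directives /etc/ssh/sshd_config
```

Running this on both nodes and comparing the output should make any non-default setting easy to spot. For the second bullet, the relevant directive is `LogLevel`, which the stock file ships commented out as `#LogLevel INFO`.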
 
I apologize that it's taken so long to get back to this. I was able to dedicate some time yesterday to troubleshooting, but the issue still remains.

I decided to start fresh, so I set both Proxmox1 and Proxmox2 to allow password login: PasswordAuthentication yes. I then deleted everything in ~/.ssh/, created new key pairs, copied the public keys to the appropriate box, made sure I was able to log in using the key pairs, set PasswordAuthentication no, and rebooted. One thing I did notice is that after a reboot Proxmox created both id_rsa.pub and id_rsa in the ~/.ssh directory, which I found a bit odd. I'm not sure why Proxmox did that.



However, here is what I discovered:

Proxmox1 can SSH to Proxmox2 and vice versa using the new SSH keypairs

Executing both:
Code:
/usr/bin/ssh -vvvv -e none -T -o BatchMode=yes proxmox2 /usr/sbin/qm vncproxy 104
Code:
/usr/bin/ssh -vvvv -e none -T -o BatchMode=yes 192.168.1.5 /usr/sbin/qm vncproxy 104

from Proxmox1 works:

Code:
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
RFB 003.008

Executing both:
Code:
/usr/bin/ssh -vvvv -e none -T -o BatchMode=yes proxmox1 /usr/sbin/qm vncproxy 100
Code:
/usr/bin/ssh -vvvv -e none -T -o BatchMode=yes 192.168.1.4 /usr/sbin/qm vncproxy 100

from Proxmox2 works:

Code:
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
RFB 003.008

I should mention, I remember these complaining about the /etc/ssh/ssh_known_hosts at first so I removed that file thinking Proxmox would recreate it but it has not. This is also odd as I initially removed this file and it was recreated after some time...not sure what's going on there?

Logging into Proxmox1's Web UI, I'm able to successfully connect to Proxmox2's console but not to the console of any of Proxmox2's VMs:

Code:
Host key verification failed.
TASK ERROR: Failed to run vncproxy.

Logging into Proxmox2's Web UI, I'm able to successfully connect to Proxmox1's console but not to the console of any of Proxmox1's VMs:

Code:
Host key verification failed.
TASK ERROR: Failed to run vncproxy.

This is extremely confusing at this point, as I feel Proxmox has built in layers of complication to something that should be simple and straightforward. It appears the CLI SSH, the CLI vncproxy SSH and the Web UI VNC SSH are all doing something different. Do you or does anyone know what the Web UI is doing differently to connect via VNC to a VM's console?
 
The layers of complication PVE adds are the price for the usually quite comfortable cluster you get:
* /root/.ssh/authorized_keys is symlinked into `/etc/pve/priv` so that the keys are synchronized across the cluster. Please don't change the symlink or turn it into a regular file, else debugging is far harder and issues like the current one happen
* the key pair /root/.ssh/id_rsa(.pub) is the one PVE uses for communicating within the cluster. Please make sure that the public key of that pair is in /etc/pve/priv/authorized_keys (and that this in turn is where /root/.ssh/authorized_keys points)
* try logging in as user root with the id_rsa private key from/to both nodes

The WebUI should do the same as the qm cli.
please also post the /etc/ssh/sshd_config from both servers
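The first two points can be checked with something like the following. It is only a sketch: `is_link_to` and `pubkey_listed` are hypothetical helper names, and it assumes the public key file holds a single key on its first line:

```shell
# Hypothetical helper: succeed if $1 is a symlink whose target is exactly $2.
is_link_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Hypothetical helper: succeed if the key blob (2nd field) of public key file $1
# appears somewhere in authorized_keys file $2.
pubkey_listed() {
    blob=$(awk 'NR==1 {print $2}' "$1")
    [ -n "$blob" ] && grep -qF -- "$blob" "$2"
}

# Example calls on a node:
# is_link_to /root/.ssh/authorized_keys /etc/pve/priv/authorized_keys || echo "not the expected symlink"
# pubkey_listed /root/.ssh/id_rsa.pub /etc/pve/priv/authorized_keys || echo "id_rsa.pub not listed"
```

Both checks are read-only, so they are safe to run on a live cluster node.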
 
Thank you for your reply! Looks like those files are not symlinked:

Code:
root@proxmox1:/etc/pve/priv# ls -alh
total 2.5K
drwx------ 2 root www-data    0 Dec 19  2017 .
drwxr-xr-x 2 root www-data    0 Dec 31  1969 ..
-rw------- 1 root www-data 1.7K Dec 19  2017 authkey.key
-rw------- 1 root www-data 1.6K Jan 11 18:37 authorized_keys
-rw------- 1 root www-data 1.6K Jan 11 18:37 known_hosts
drwx------ 2 root www-data    0 Dec 20  2017 lock
-rw------- 1 root www-data 3.2K Dec 19  2017 pve-root-ca.key
-rw------- 1 root www-data    3 Dec 27 08:57 pve-root-ca.srl

I will correct that and try again but had a couple of questions:

1. What is the authkey.key and does it need symlinked?

2. Is what I deleted, /etc/ssh/ssh_known_hosts, the symlink of /etc/pve/priv/known_hosts? If so, should I delete /etc/pve/priv/known_hosts as well and let Proxmox recreate it, then symlink it?

3. Are the pve-root-ca.key and pve-root-ca.srl the certificates for the Web UI?

4. You said the WebUI should do the same as the qm cli but they appear to do different things as I'm getting different results (you can see that from my previous post). You would think the qm cli would fail since the Web UI fails and I'm missing symlinks but it does not. Am I overlooking something?

Also, here are my /etc/ssh/sshd_config from both Proxmox1 and Proxmox2

Proxmox1:

Code:
#    $OpenBSD: sshd_config,v 1.100 2016/08/15 12:32:04 naddy Exp $

# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented.  Uncommented options override the
# default value.

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Ciphers and keying
#RekeyLimit default none

# Logging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
PermitRootLogin yes
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

PubkeyAuthentication yes

# Expect .ssh/authorized_keys2 to be disregarded by default in future.
AuthorizedKeysFile    .ssh/authorized_keys .ssh/authorized_keys2

#AuthorizedPrincipalsFile none

#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here!
PasswordAuthentication no
PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication.  Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#UseLogin no
#UsePrivilegeSeparation sandbox
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS no
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none

# no default banner path
#Banner none

# Allow client to pass locale environment variables
AcceptEnv LANG LC_*

# override default of no subsystems
Subsystem    sftp    /usr/lib/openssh/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
#    X11Forwarding no
#    AllowTcpForwarding no
#    PermitTTY no
#    ForceCommand cvs server


Proxmox2:

Code:
#    $OpenBSD: sshd_config,v 1.100 2016/08/15 12:32:04 naddy Exp $

# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented.  Uncommented options override the
# default value.

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Ciphers and keying
#RekeyLimit default none

# Logging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
PermitRootLogin yes
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

PubkeyAuthentication yes

# Expect .ssh/authorized_keys2 to be disregarded by default in future.
AuthorizedKeysFile    .ssh/authorized_keys .ssh/authorized_keys2

#AuthorizedPrincipalsFile none

#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here!
PasswordAuthentication no
PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication.  Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#UseLogin no
#UsePrivilegeSeparation sandbox
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS no
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none

# no default banner path
#Banner none

# Allow client to pass locale environment variables
AcceptEnv LANG LC_*

# override default of no subsystems
Subsystem    sftp    /usr/lib/openssh/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
#    X11Forwarding no
#    AllowTcpForwarding no
#    PermitTTY no
#    ForceCommand cvs server
 
"Looks like those files are not symlinked": it's the other way around. /root/.ssh/authorized_keys is a symlink to /etc/pve/priv/authorized_keys, i.e. /root/.ssh/authorized_keys -> /etc/pve/priv/authorized_keys.
* pve-root-ca.key, pve-root-ca.srl - belong to the CA that PVE sets up for the nodes' certificates and their communication
* authkey.key - is used for the web GUI auth tickets
* /etc/ssh/ssh_known_hosts is a symlink to /etc/pve/priv/known_hosts - just recreate that, and make sure the host keys of all your cluster nodes are in that file

* the ssh config looks ok.

does logging in as root (with id_rsa) work ?

EDIT: rephrased the symlink explanation
 
Hello again Stoiko,

Believe it or not, starting with a clean slate again and by this I mean:

Code:
rm ~/.ssh/*
rm /etc/ssh/
rm /etc/ssh/ssh_known_hosts
rm /etc/pve/priv/authorized_keys
rm /etc/pve/priv/known_hosts

and then executing

Code:
pvecm updatecerts

created new SSH keypairs as well as symlinked the appropriate files on both Proxmox1 and Proxmox2

Code:
root@proxmox1:~# ls -alh ~/.ssh/
total 45K
drwxr-xr-x 2 root root    7 Feb 21 09:39 .
drwx------ 6 root root   13 Feb 21 11:19 ..
lrwxrwxrwx 1 root root   29 Feb 21 09:39 authorized_keys -> /etc/pve/priv/authorized_keys
-rw-r----- 1 root root  117 Feb 21 09:39 config
-rw------- 1 root root 1.7K Feb 21 09:39 id_rsa
-rw-r--r-- 1 root root  395 Feb 21 09:39 id_rsa.pub

Code:
root@proxmox1:~# ls -alh /etc/ssh/ssh_known_hosts
lrwxrwxrwx 1 root root 25 Feb 21 09:39 /etc/ssh/ssh_known_hosts -> /etc/pve/priv/known_hosts


Code:
root@proxmox1:~# ls -alh /etc/pve/priv/
total 2.5K
drwx------ 2 root www-data    0 Dec 19  2017 .
drwxr-xr-x 2 root www-data    0 Dec 31  1969 ..
-rw------- 1 root www-data 1.7K Dec 19  2017 authkey.key
-rw------- 1 root www-data  791 Feb 21 09:39 authorized_keys
-rw------- 1 root www-data 1.6K Feb 21 09:39 known_hosts
drwx------ 2 root www-data    0 Dec 20  2017 lock
-rw------- 1 root www-data 3.2K Dec 19  2017 pve-root-ca.key
-rw------- 1 root www-data    3 Dec 27 08:57 pve-root-ca.srl


However, I get this message when I SSH from Proxmox1 to Proxmox2 and vice versa?

Code:
root@proxmox1:~# ssh proxmox2
Warning: the ECDSA host key for 'proxmox2.jam.lan' differs from the key for the IP address '192.168.1.5'
Offending key for IP in /etc/ssh/ssh_known_hosts:4
Matching host key in /root/.ssh/known_hosts:1
Are you sure you want to continue connecting (yes/no)? yes


Not sure why ssh_known_hosts has entries for the hostnames as well as the IP addresses for both Proxmox1 and Proxmox2? The only thing I can think of is it has something to do with how I set the cluster up. I remember Proxmox's guide stating that IP addresses had to be used instead of hostnames for clusters.


Code:
root@proxmox1:~# cat /etc/ssh/ssh_known_hosts
proxmox1 ssh-rsa AAAAB3NzaC1yc2EA..........
192.168.1.4 ssh-rsa AAAAB3NzaC1.........
proxmox2 ssh-rsa AAAAB3NzaC1yc2EAA........
192.168.1.5 ssh-rsa AAAAB3NzaC1yc2EAA.......
 
"Not sure why ssh_known_hosts has entries for the hostnames as well as the IP addresses for both Proxmox1 and Proxmox2?"
ssh always saves the host key for both the IP address and the DNS name you connect to (to prevent someone hijacking your DNS name and presenting you with the wrong box where you enter your password).

"Warning: the ECDSA host key for 'proxmox2.jam.lan' differs from the key for the IP address '192.168.1.5' Offending key for IP in /etc/ssh/ssh_known_hosts:4 Matching host key in /root/.ssh/known_hosts:1"
This is probably the root cause: delete the entry that maps the IP to the old key and reconnect.
Any interactive question of that sort will make the vncproxy fail.
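To see which entries are involved, something along these lines may help. It is a sketch with a made-up helper name (`known_hosts_entries`), and it assumes unhashed host names as in the listing above; hashed entries (lines starting with `|1|`) are not matched, and `ssh-keygen -F <host>` is the tool for those:

```shell
# Hypothetical helper: list every entry for host/IP $1 in the given known_hosts
# files as "file:line: keytype". Only plain (unhashed) host names are matched.
known_hosts_entries() {
    host="$1"; shift
    awk -v h="$host" '{
        n = split($1, names, ",")          # first field may hold "name,ip"
        for (i = 1; i <= n; i++)
            if (names[i] == h) print FILENAME ":" FNR ": " $2
    }' "$@"
}

# Example:
# known_hosts_entries 192.168.1.5 /etc/ssh/ssh_known_hosts /root/.ssh/known_hosts
```

If the same address shows up with different key types or in both files, that is the kind of mismatch the warning is complaining about.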

Hope this helps!
 
Yes, you've been a great help...thank you!

I believe that's initially what put me in this dilemma. I understand I could delete the offending key, but I never put that key in there; it appears that executing `pvecm updatecerts`, combined with SSH logins or logging into the Web UI, did all of that. So is this something in Proxmox that needs to be corrected?
 
Hello forum,

I have a similar problem: TASK ERROR: Failed to run vncproxy.

BUT I also get this error when trying to start a console to a VM on the very proxmox server I am connected to. So I get the error when trying to start a console to ANY VM in my cluster.

Proxmox 5.3-11
4 server cluster with Ceph

Has been running smoothly BEFORE updating from 5.2 to 5.3 :-(

Would anybody be able to give a hint what the problem could be?

Greetings,

hans

P.S. ssh works from any server to any other server in the cluster without problem. Only thing I noticed is that host names are NOT resolved. I have to ssh to the IP-address. Is that normal?
 
This seems like an unrelated problem!
PVE needs to be able to resolve the hostnames of the cluster nodes to IP addresses and expects that it can.
The first thing to try would be to add all nodes to '/etc/hosts' on all nodes and try again.
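A quick way to see which node names are still missing from a hosts file, as a sketch only (`missing_in_hosts` is a hypothetical helper, and it looks at the given file alone, not at DNS or other resolver sources):

```shell
# Hypothetical helper: print each given node name that has no entry in hosts file $1.
# Only this file is checked; DNS or other resolver sources are not consulted.
missing_in_hosts() {
    hosts_file="$1"; shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }                                    # strip comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }  # names follow the IP
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done
}

# Example (node names are placeholders):
# missing_in_hosts /etc/hosts node1 node2 node3 node4
```

No output means every listed name has an entry in that file.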
 
Hello. I too am running into the original problem posted in this thread, specifically I'm getting
Code:
Host key verification failed.
when testing using the vncproxy command given above, yet I can ssh between the nodes using their host names and get details of the remote node and the VMs on it via the Web UI just fine.
I have a dual-stack (IPv4 and IPv6) network and I think the problem may be that the systems prefer IPv6 and there are no entries in /etc/pve/priv/known_hosts for the IPv6 addresses of the nodes. Indeed, if I put the IPv4 address instead of the host name into the vncproxy command, it works correctly.
So is there an automated way for pvecm or another tool to add IPv6 host keys to the shared file or must I add them manually?

Also, /root/.ssh/known_hosts has ecdsa-sha2-nistp256 keys while /etc/pve/priv/known_hosts has only ssh-rsa ones, if that matters.
 
I just tried fixing this again and now even if I use an IPv6 address in the vncproxy command,
Code:
/usr/bin/ssh -e none -T -o BatchMode=yes IP:V6:address:of:other:node /usr/sbin/qm vncproxy $VMID
the connection from the command line succeeds. (I see RFB 003.008.) I also tried the command using the host name and the FQDN of the other node. Both also succeed (though I had to allow the key in the latter case.) So why the heck is the Web console connection failing?! What else is happening?? Is there a way to get the Web interface to show precisely which user and command it's trying to execute and on which host?
 
Thanks to some folks in the IRC channel, I finally fixed this (at least partially: the remote Web console works again) by doing the following on the node whose Web page I'm connected to:
Code:
ssh -o HostKeyAlias=IPv4.address.of.other.node IPv4.address.of.other.node
ssh -o HostKeyAlias=IPv6.address.of.other.node IPv6.address.of.other.node
ssh -o HostKeyAlias=HostnameOfOtherNode HostnameOfOtherNode
For each one, delete the entry in /root/.ssh/known_hosts it complains about
 