[SOLVED] Proxmox VE Cluster: noVNC console does not work for other servers

Jan 7, 2022
Good evening,
I've set up a cluster with three Proxmox VE servers and joined them into one datacenter. To secure SSH, I've restricted root login via SSH with the following setting

Code:
PermitRootLogin no
in combination with
Code:
Match Address 10.0.0.0/24
    PermitRootLogin yes
    PasswordAuthentication yes
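
For reference, the effective value for a given client address can be checked with sshd's test mode, e.g. like this (the address and host name below are just examples from my subnet):

Code:
# show the effective PermitRootLogin for a client connecting from 10.0.0.2
sshd -T -C addr=10.0.0.2,user=root,host=host2 | grep -i permitrootlogin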

As a result I'm unable to use the noVNC console on the other servers, although I can still SSH as root from one host to another on the console.

My question is how I have to adjust the settings so that noVNC works across the nodes.

Thank you in advance

Solution:
  • The Proxmox services resolve the hostname on each node itself
  • Each node had both the FQDN and the plain hostname configured to resolve to its external IP
  • Proxmox therefore always used the external IP.
  • This issue can be prevented by setting the IP addresses explicitly in /etc/hosts: see the resolution post further down in this thread
 
do you have multiple addresses for your nodes? which one does the hostname resolve to? likely PVE doesn't use an address from your matched subnet..
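
e.g. something like this, run on the node itself (host2 is just a placeholder for the other node's name):

Code:
hostname -f          # the FQDN the node uses for itself
hostname -i          # which address that name resolves to
getent hosts host2   # what the resolver / /etc/hosts returns for the other node's name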
 
PVE uses the resolved hostname as the SSH connection target, e.g. what the following will return (replace HOST with the actual target hostname):

Code:
perl -e 'use strict; use warnings; use PVE::Cluster; my $ip = PVE::Cluster::remote_node_ip("HOST"); print "$ip\n";'
 
Thanks for your prompt reply. I get the local IP within the 10.0.0.0/24 subnet here, for both the hostname and the FQDN.
 
could you post the full sshd config and the exact error you get? any log messages visible on either node when you attempt to open the console?
 
The /etc/ssh/sshd_config is the following:
Code:
PermitRootLogin no
ChallengeResponseAuthentication no
UsePAM yes
X11Forwarding yes
PrintMotd no
PrintLastLog yes
TCPKeepAlive yes
AcceptEnv LANG LC_*
Subsystem    sftp    /usr/lib/openssh/sftp-server

Match Address 10.0.0.0/24
    PermitRootLogin yes
    PasswordAuthentication yes


The error shown in noVNC is "Failure to connect to server".

in the syslog on host1
Code:
pvedaemon[14064]: starting vnc proxy UPID::000036F0:0B19F30A:61F26125:vncproxy:163:root@pam:
pvedaemon[22069]: <root@pam> starting task UPID::000036F0:0B19F30A:61F26125:vncproxy:163:root@pam:
pvedaemon[14064]: Failed to run vncproxy.
pvedaemon[22069]: <root@pam> end task UPID::000036F0:0B19F30A:61F26125:vncproxy:163:root@pam: Failed to run vncproxy.

on host2
Code:
sshd[3021]: ROOT LOGIN REFUSED FROM $EXTERNAL_IP_HOST1 port 36204
sshd[3021]: ROOT LOGIN REFUSED FROM $EXTERNAL_IP_HOST1 port 36204 [preauth]
sshd[3021]: Connection closed by authenticating user root $EXTERNAL_IP_HOST1 port 36204 [preauth]
 
sounds like it does use the external IP for some reason (any routing or ssh client config peculiarities that might explain it?).. you could try dumping the full command by adding

Code:
use Data::Dumper;
warn Dumper($cmd), "\n";

before the run_command here: https://git.proxmox.com/?p=qemu-ser...6af48cc51d04ac408895d01d9f9594a;hb=HEAD#l1904 (in /usr/share/perl5/PVE/API2/Qemu.pm) and reloading pveproxy/pvedaemon afterwards (systemctl reload pveproxy pvedaemon). re-installing qemu-server will revert to the stock code again (apt install --reinstall qemu-server).
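
in shell terms, roughly (these are just the commands from above, in order):

Code:
# reload the services so they pick up the edited module
systemctl reload pveproxy pvedaemon
# ...reproduce the console error and check the syslog for the dumped command...
# afterwards restore the stock code
apt install --reinstall qemu-server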
 
There is no special routing. I've already rebooted the server to clear any caches.

Regarding the dump: the SSH command itself looks fine, but it contains the external IP.

Code:
$VAR1 = [
          '/usr/bin/ssh',
          '-e',
          'none',
          '-o',
          'BatchMode=yes',
          '-o',
          'HostKeyAlias=host2',
          '-T',
          'root@EXTERNAL_IP',
          '/usr/sbin/qm',
          'vncproxy',
          '167'
        ];
 
could you try this updated command:

Code:
perl -e 'use strict; use warnings; use PVE::Cluster; PVE::Cluster::cfs_update(); my $ip = PVE::Cluster::remote_node_ip("HOST"); print "$ip\n";'
 
I guess that means your hostname doesn't resolve to the internal, but the external IP.. if you change that (and possibly restart pve-cluster) it should work.
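
e.g. something along these lines (assuming /etc/hosts has already been corrected; host2 stands for the target node name):

Code:
# make the cluster service pick up the corrected name resolution
systemctl restart pve-cluster.service
# the node name should now resolve to the internal address again
getent hosts host2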
 
Yeah, but at host level it resolves correctly. I've restarted pve-cluster explicitly, and as a result I'm now unable to use the web console either, since it connects to the external IP.

The updated perl command shows the local IP in both cases.
 
Thanks for your answer - I think I've found the problem: since each host resolves the IP on its own* and is only then joined, the /etc/hosts is basically ignored…

* hostname -i returns an external IP
 
After some investigation, I've resolved the issue the following way (a rough example follows below):
  • explicit mappings in /etc/hosts on each node, mapping the local hostname to the local address
  • explicit mappings for the FQDNs of all nodes in /etc/hosts
  • distributing the /etc/hosts file to all nodes
  • restarting pve-cluster.service on all nodes
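
On each node, /etc/hosts then looks roughly like this (host names and the 10.0.0.x addresses are placeholders for my setup):

Code:
127.0.0.1   localhost
10.0.0.1    host1.example.local host1
10.0.0.2    host2.example.local host2
10.0.0.3    host3.example.local host3

After distributing the file to all nodes, restart the cluster service everywhere:

Code:
systemctl restart pve-cluster.service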
 
I know this thread is a little older.
But I stumbled upon this problem as well after changing my /etc/ssh/sshd_config.
I fixed it by checking the output of the following command:

Code:
/usr/bin/ssh -e none -T -o BatchMode=yes <IP of other node> /usr/sbin/qm vncproxy <ID of VM on that node>

It returned the error "Host key verification failed."

So I checked the file /root/.ssh/known_hosts.
This file had all the correct entries, but it was ignored.
After trying to ssh directly to one of the nodes, I noticed that the file /etc/ssh/ssh_known_hosts was used instead of the file /root/.ssh/known_hosts.
I made sure that the entries in /root/.ssh/known_hosts were indeed correct and that the file included entries for all nodes of the cluster, including the node where the file itself is stored.

I then fixed it by copying the file /root/.ssh/known_hosts to /etc/ssh/ssh_known_hosts and distributing that file to all the other nodes with rsync:

Code:
rsync -avh --info=progress2 /etc/ssh/ssh_known_hosts <IP of proxmox-node>:/etc/ssh/ssh_known_hosts
rsync -avh --info=progress2 /etc/ssh/ssh_known_hosts <IP of proxmox-node>:/root/.ssh/known_hosts
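
To double-check that the entries for a node are really present in both files, something like this can be used (host2 is just an example node name, matching the HostKeyAlias from the dumped command earlier in the thread):

Code:
ssh-keygen -F host2 -f /etc/ssh/ssh_known_hosts   # entry in the system-wide file
ssh-keygen -F host2 -f /root/.ssh/known_hosts     # entry in root's personal file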

Then I finally restarted the sshd service on every node (systemctl restart sshd.service) and ran the first command again.
Now I got the correct output, and when checking the web GUI on every node, everything worked normally.
 
