VNC does not work in cluster

sherbmeister

Member
Dec 7, 2022
Hi, I have three nodes in a cluster, and when I try to VNC into the other two nodes' VMs, it doesn't work. Only the host node works. From my understanding, it has something to do with SSH? Can someone give a few pointers on how and what to set to have this working properly? Thanks
 
Does the console of the other nodes (accessed from other than their own GUI) also fail to load up?
 
Thanks for your swift reply.

If I access the nodes through their own IP's GUI, it loads just fine. It only happens when I'm logged into the GUI of the cluster's host node.
 
tried the updatecerts thing, it didn't really fix it for me.
I know; there are three options you have until it gets fixed officially:

1) Follow this "clean up all" manual intervention:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-2#post-604699

2) Manually apply the pending patch for the affected codepath:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-3#post-606413

3) Wait for the official patch.

I will just name-drop @t.lamprecht here to summarize: it affects not only migration/replication but also SSH proxying across nodes.
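If you want to verify whether the SSH proxying between nodes is actually affected before patching, a rough check (run from any node's shell; <other-node> is a placeholder for one of your other nodes' names or IPs) would be:

Code:
# Non-interactive SSH hop to another cluster node - BatchMode makes it fail
# immediately instead of prompting, which is roughly how the GUI-driven
# proxying behaves. A host key error here points at the known_hosts issue.
ssh -o BatchMode=yes root@<other-node> /bin/true && echo "SSH hop OK"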
 
Went ahead and tried option 1, the "clean up all" manual intervention:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-2#post-604699

but it did not fix my problem.

I suppose I'll have to wait for a fix or a new version.
Oh, that's odd to be honest.

Can you confirm you actually did:

1) SSH in from outside of the cluster (e.g. your workstation) only.
2) Ran on every node separately:
rm -rf ~/.ssh/known_hosts
rm -rf /etc/ssh/ssh_known_hosts
3) Then ran once (from any node):
rm -rf /etc/pve/priv/known_hosts
4) Then again on every node separately:
pvecm updatecerts

The order, and not using SSH across nodes or the GUI to connect, is important.
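Put together (a rough sketch only, with <node1>..<node3> as placeholders for your three nodes), the whole sequence run from an outside workstation would look like this:

Code:
# 1) SSH into each node directly from the workstation, never via another node,
#    and clear the per-node known_hosts copies (repeat on <node1>, <node2>, <node3>):
rm -rf ~/.ssh/known_hosts
rm -rf /etc/ssh/ssh_known_hosts

# 2) Once, on any single node, remove the cluster-wide file:
rm -rf /etc/pve/priv/known_hosts

# 3) Again on every node separately, regenerate the entries:
pvecm updatecerts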

Because if you ran all of the above, there are literally no keys left except the nodes' own (most recent) ones. So if you were still getting the same key error, that would be odd. Can you post the exact error shown? Even if it looks similar to or the same as the other poster's, can you post your full error?
 
There's no error to be shown tbh, it's just VNC not connecting to the node's VM. I will try again when I get home from work, maybe I've done it wrong.
 
So it is the blank screen with (guessing from the back of my mind) Error 1006 momentarily shown at the top? You can look down in the list of TASKS at the last failed one and open it for more details. It would be nice to see what exactly was failing. There will be output from the command run, even if it's just termproxy exit code 1 or something like that.
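If digging it out of the GUI is a pain, the task logs can usually also be read from the node's shell - on a default install they should sit under /var/log/pve/tasks (path assumed, adjust if yours differs):

Code:
# List the most recent console-related tasks recorded on this node; each line
# starts with a UPID whose full log file sits in one of the hex subdirectories
grep -E 'vncproxy|termproxy' /var/log/pve/tasks/index | tail -n 5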
 
So when I go to the noVNC console, it just says: "Failed to connect to server"

In the logs I have:


Code:
Nov 21 11:52:05 newton systemd[1]: Stopping user@0.service - User Manager for UID 0...
Nov 21 11:52:05 newton systemd[1079468]: Activating special unit exit.target...
Nov 21 11:52:05 newton systemd[1079468]: Stopped target default.target - Main User Target.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target basic.target - Basic System.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target paths.target - Paths.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target sockets.target - Sockets.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target timers.target - Timers.
Nov 21 11:52:05 newton systemd[1079468]: Closed dirmngr.socket - GnuPG network certificate management daemon.
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent-browser.socket - GnuPG cryptographic agent and passphrase cache (access for web browsers).
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
Nov 21 11:52:05 newton systemd[1079468]: Removed slice app.slice - User Application Slice.
Nov 21 11:52:05 newton systemd[1079468]: Reached target shutdown.target - Shutdown.
Nov 21 11:52:05 newton systemd[1079468]: Finished systemd-exit.service - Exit the Session.
Nov 21 11:52:05 newton systemd[1079468]: Reached target exit.target - Exit the Session.
Nov 21 11:52:05 newton systemd[1]: user@0.service: Deactivated successfully.
Nov 21 11:52:05 newton systemd[1]: Stopped user@0.service - User Manager for UID 0.
Nov 21 11:52:05 newton systemd[1]: Stopping user-runtime-dir@0.service - User Runtime Directory /run/user/0...
Nov 21 11:52:05 newton systemd[1]: run-user-0.mount: Deactivated successfully.
Nov 21 11:52:05 newton systemd[1]: user-runtime-dir@0.service: Deactivated successfully.
Nov 21 11:52:05 newton systemd[1]: Stopped user-runtime-dir@0.service - User Runtime Directory /run/user/0.
Nov 21 11:52:05 newton systemd[1]: Removed slice user-0.slice - User Slice of UID 0.
Nov 21 11:52:05 newton systemd[1]: user-0.slice: Consumed 2.952s CPU time.
 
Just to clarify: at the bottom of the GUI, in the list of all TASKS performed, when you open that particular failed task, do you only see that "Failed to connect to server" in the entire OUTPUT? (Example - note the task selected at the bottom to show its details):
[example screenshot of the task list]
Thanks for the logs; however, there won't be anything in the machine logs for this issue unfortunately - if it is the one suspected.
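If you still want to double-check the journal, it helps to narrow it down to the PVE daemons around the time of the failed console task (timestamps below are just examples matching your excerpt):

Code:
# Only pveproxy/pvedaemon messages in the minutes around the failed task
journalctl -u pveproxy -u pvedaemon --since "2023-11-21 11:50" --until "2023-11-21 11:55"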
 
Oh! There's nothing there, it just says TASK OK even though noVNC fails.
 
Just double-click the line at the bottom starting Nov 21 13:23 ... the OUTPUT will be shown (presumably just TASK OK), but there's a tab for STATUS too.

Also, is this the case only for accessing a VM, or also when you try to open the console of the node itself (say "Shell" under "newton")? I suppose this is the GUI of another node all along, correct?
 
Oh apologies, just got that. Also, the shell works just fine. Apparently, even if I select a VM on the Newton or Auriga node, the cluster GUI itself tries to do it through the Yautja node, which is the main node of the cluster.
 

Attachments

  • Screenshot 2023-11-21 at 13.49.57.png
Sorry too, it's a bit weird (I have v8 here, not familiar with the 7.4 GUI), but anyhow, I hope this is not futile ... if you go in the tree to the NODE from which you were running this task, then check the SYSLOG (collapsed under SYSTEM on my v8) and scroll to the same time mark when this command was run - is there nothing more specific there either?
 

One thing here: there's no "main" node. Even if it was the first node you used to create the cluster and the others then "joined" in, they are all equal. The only thing is, if you are on one node (accessing its GUI), for instance Auriga, and you want to VNC to some VM running on Yautja, it will have to proxy through that node - which is why I originally thought it must have to do with SSH.

But you are saying that from the GUI of Auriga, you can actually see the SHELL of e.g. Yautja or Newton just fine? Just the VM access is toast?
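If you want to test that proxy path by hand (assuming the node names resolve as hostnames - substitute the IP otherwise - and using the ID of a VM actually running on Yautja instead of 100):

Code:
# From Auriga's shell: the GUI console to a VM on Yautja has to hop over SSH
# first, so exercise that hop with a harmless qm call
ssh -o BatchMode=yes root@yautja qm status 100
# A host key or permission error here points at the SSH/known_hosts side;
# a clean "status: running" means the hop itself is fine.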
 
