VNC does not work in cluster

sherbmeister

Member
Dec 7, 2022
Hi, I have three nodes in a cluster, and when I try to VNC into the other two nodes' VMs, it doesn't work. Only the host node works. From my understanding, it has something to do with SSH? Can someone give a few pointers on how and what to set to have this working properly? Thanks
 
Does the console of the other nodes (accessed from other than their own GUI) also fail to load up?
 
Thanks for your swift reply.

If I access the nodes through their own IP's GUI, it loads just fine. It only happens when I'm logged into the GUI of the cluster's host node.
 
tried the updatecerts thing, it didn't really fix it for me.
I know; there are three options you have until it gets fixed officially:

1) Follow this "clean up all" manual intervention:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-2#post-604699

2) Manually apply the pending patch for the affected codepath:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-3#post-606413

3) Wait for the official patch.

I will just name-drop @t.lamprecht here to summarize: it affects not only migration/replication but also SSH proxying across nodes.
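If you want to verify whether the SSH proxying between nodes is actually affected before patching, a rough check (run from any node's shell; <other-node> is a placeholder for one of your other nodes' names or IPs) would be:

Code:
# Non-interactive SSH hop to another cluster node - BatchMode makes it fail
# immediately instead of prompting, which is roughly how the GUI-driven
# proxying behaves. A host key error here points at the known_hosts issue.
ssh -o BatchMode=yes root@<other-node> /bin/true && echo "SSH hop OK"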
 
Went ahead and tried option 1, the "clean up all" manual intervention:
https://forum.proxmox.com/threads/pvecm-updatecert-f-not-working.135812/page-2#post-604699

but it did not fix my problem.

I suppose I'll have to wait for a fix or a new version.
Oh, that's odd to be honest.

Can you confirm you actually did:

1) SSH in from outside of the cluster (e.g. your workstation) only.
2) Ran on every node separately:
rm -rf ~/.ssh/known_hosts
rm -rf /etc/ssh/ssh_known_hosts
3) Then ran once (from any node):
rm -rf /etc/pve/priv/known_hosts
4) Then again on every node separately:
pvecm updatecerts

The order, and not using SSH across nodes or the GUI to connect, is important.
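Put together (a rough sketch only, with <node1>..<node3> as placeholders for your three nodes), the whole sequence run from an outside workstation would look like this:

Code:
# 1) SSH into each node directly from the workstation, never via another node,
#    and clear the per-node known_hosts copies (repeat on <node1>, <node2>, <node3>):
rm -rf ~/.ssh/known_hosts
rm -rf /etc/ssh/ssh_known_hosts

# 2) Once, on any single node, remove the cluster-wide file:
rm -rf /etc/pve/priv/known_hosts

# 3) Again on every node separately, regenerate the entries:
pvecm updatecerts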

Because if you ran all of the above, there are literally no keys left except the nodes' own (most recent) ones. So if you were still getting the same key error, that would be odd. Can you post the exact error shown? Even if it looks similar to or the same as the other poster's, can you post your full error?
 
There's no error to be shown tbh, it's just VNC not connecting to the node's VM. I will try again when I get home from work, maybe I've done it wrong.
 
So it is the blank screen with (guessing from the back of my mind) Error 1006 momentarily shown at the top? You can look down in the list of TASKS at the last failed one and open it for more details. It would be nice to see what exactly was failing. There will be output from the command run, even if it's just termproxy exit code 1 or something like that.
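If digging it out of the GUI is a pain, the task logs can usually also be read from the node's shell - on a default install they should sit under /var/log/pve/tasks (path assumed, adjust if yours differs):

Code:
# List the most recent console-related tasks recorded on this node; each line
# starts with a UPID whose full log file sits in one of the hex subdirectories
grep -E 'vncproxy|termproxy' /var/log/pve/tasks/index | tail -n 5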
 
So when I go to the noVNC console, it just says: "Failed to connect to server"

In the logs I have:


Code:
Nov 21 11:52:05 newton systemd[1]: Stopping user@0.service - User Manager for UID 0...
Nov 21 11:52:05 newton systemd[1079468]: Activating special unit exit.target...
Nov 21 11:52:05 newton systemd[1079468]: Stopped target default.target - Main User Target.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target basic.target - Basic System.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target paths.target - Paths.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target sockets.target - Sockets.
Nov 21 11:52:05 newton systemd[1079468]: Stopped target timers.target - Timers.
Nov 21 11:52:05 newton systemd[1079468]: Closed dirmngr.socket - GnuPG network certificate management daemon.
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent-browser.socket - GnuPG cryptographic agent and passphrase cache (access for web browsers).
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent-extra.socket - GnuPG cryptographic agent and passphrase cache (restricted).
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent-ssh.socket - GnuPG cryptographic agent (ssh-agent emulation).
Nov 21 11:52:05 newton systemd[1079468]: Closed gpg-agent.socket - GnuPG cryptographic agent and passphrase cache.
Nov 21 11:52:05 newton systemd[1079468]: Removed slice app.slice - User Application Slice.
Nov 21 11:52:05 newton systemd[1079468]: Reached target shutdown.target - Shutdown.
Nov 21 11:52:05 newton systemd[1079468]: Finished systemd-exit.service - Exit the Session.
Nov 21 11:52:05 newton systemd[1079468]: Reached target exit.target - Exit the Session.
Nov 21 11:52:05 newton systemd[1]: user@0.service: Deactivated successfully.
Nov 21 11:52:05 newton systemd[1]: Stopped user@0.service - User Manager for UID 0.
Nov 21 11:52:05 newton systemd[1]: Stopping user-runtime-dir@0.service - User Runtime Directory /run/user/0...
Nov 21 11:52:05 newton systemd[1]: run-user-0.mount: Deactivated successfully.
Nov 21 11:52:05 newton systemd[1]: user-runtime-dir@0.service: Deactivated successfully.
Nov 21 11:52:05 newton systemd[1]: Stopped user-runtime-dir@0.service - User Runtime Directory /run/user/0.
Nov 21 11:52:05 newton systemd[1]: Removed slice user-0.slice - User Slice of UID 0.
Nov 21 11:52:05 newton systemd[1]: user-0.slice: Consumed 2.952s CPU time.
 
Just to clarify: at the bottom of the GUI, in the list of all TASKS performed, when you open that particular failed task, do you only see that "Failed to connect to server" in the entire OUTPUT? (Example - note the task selected at the bottom to show its details):
[example screenshot of the task list]
Thanks for the logs; however, there won't be anything in the machine logs for this issue unfortunately - if it is the one suspected.
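If you still want to double-check the journal, it helps to narrow it down to the PVE daemons around the time of the failed console task (timestamps below are just examples matching your excerpt):

Code:
# Only pveproxy/pvedaemon messages in the minutes around the failed task
journalctl -u pveproxy -u pvedaemon --since "2023-11-21 11:50" --until "2023-11-21 11:55"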
 
Oh! There's nothing there, it just says TASK OK even though noVNC fails.
 
Just double-click the line at the bottom starting Nov 21 13:23 ... the OUTPUT will be shown (presumably just TASK OK), but there's a tab for STATUS too.

Also, is this the case only for accessing a VM, or also when you try to open the console of the node itself (say "Shell" under "newton")? I suppose this is the GUI of another node all along, correct?
 
Oh apologies, just got that. Also, the shell works just fine. Apparently, even if I select a VM on the Newton or Auriga node, the cluster GUI itself tries to do it through the Yautja node, which is the main node of the cluster.
 

Attachments

  • Screenshot 2023-11-21 at 13.49.57.png
Sorry too, it's a bit weird (I have v8 here, not familiar with the 7.4 GUI), but anyhow, I hope this is not futile ... if you go in the tree to the NODE from which you were running this task, then check the SYSLOG (collapsed under SYSTEM on my v8) and scroll to the same time mark when this command was run - is there nothing more specific there either?
 

One thing here: there's no "main" node. Even if it was the first node you used to create the cluster and the others then "joined" in, they are all equal. The only thing is, if you are on one node (accessing its GUI), for instance Auriga, and you want to VNC to some VM running on Yautja, it will have to proxy through that node - which is why I originally thought it must have to do with SSH.

But you are saying that from the GUI of Auriga, you can actually see the SHELL of e.g. Yautja or Newton just fine? Just the VM access is toast?
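If you want to test that proxy path by hand (assuming the node names resolve as hostnames - substitute the IP otherwise - and using the ID of a VM actually running on Yautja instead of 100):

Code:
# From Auriga's shell: the GUI console to a VM on Yautja has to hop over SSH
# first, so exercise that hop with a harmless qm call
ssh -o BatchMode=yes root@yautja qm status 100
# A host key or permission error here points at the SSH/known_hosts side;
# a clean "status: running" means the hop itself is fine.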
 
