I can't access my second node anymore. Only my first node seems to work, and I can't update anything on it.

cyqpann

Hello,

I have two nodes in a Proxmox cluster: alpha and bravo. I think I accidentally deleted the domain certificate on the alpha node while experimenting with some things. When I tried to log back in, I got a certificate error and could not log in at all. I had to reset NGINX Proxy Manager (NPM) and do the steps below.

I reset the pveproxy-ssl.key and ssl.key files, then ran the following commands on alpha:

pvecm updatecerts --force
systemctl restart pveproxy

Everything went back to normal on alpha. I reapplied the certificate through NGINX Proxy Manager, and the node is now back online with SSL.


However, on bravo, I see a green checkmark and I can click on all its tabs, but it always says "server offline."

I cannot log into bravo via SSH. I don't know why, and I have never been able to. When I log into the datacenter and then choose the bravo node, it also shows "server offline," but I still have shell access through the Datacenter view in the Proxmox web interface. SSH access is not working on 10.0.100.11.
I tried running the same commands on bravo:

pvecm updatecerts --force
systemctl restart pveproxy

I then rebooted the node, but the issue remains the same.

Here is the pvecm status output from bravo:

root@bravo:/etc/ssl/certs# pvecm status
Cluster information
-------------------
Name: Koralie
Config Version: 2
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon May 19 13:43:58 2025
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000002
Ring ID: 1.51
Quorate: Yes

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 fd10:0:0:150::10%31849
0x00000002 1 fd10:0:0:150::11%31849 (local)
root@bravo:/etc/ssl/certs#

Can you help me bring bravo back to a functional state without destroying alpha, which holds all my LXC containers?
 
If someone knows what I can do to fix this, or what I need to provide to get help, please let me know. I think this is causing me problems at the moment, because I cannot update any LXC container or VM, and my node doesn't ping 8.8.8.8.

Also, my mistake: it says "Connection error - server offline" in every tab of this node except the shell.
 
Hi!

Are there any messages relevant to this in the syslog? Are there any errors in the browser's Dev Tools (F12) while one of the web GUIs is open? Have you tried restarting both the API server and its proxy (systemctl restart pvedaemon pveproxy)?
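
In case it helps, here is a minimal sketch of how those checks could be gathered on bravo from the shell that still works. It assumes a root shell on a standard Proxmox VE install with systemd. The function names are hypothetical helpers made up for illustration, not Proxmox tools. The pvecm status output above already reports "Quorate: Yes", which suggests cluster membership itself is fine, so the service state and the logs are probably the more interesting part.

```shell
#!/bin/sh
# Sketch only: meant to be run as root on the affected node (bravo in this thread).
# The function names below are hypothetical helpers, not part of Proxmox VE.

# Print "<unit>: <state>" for the API daemon and its proxy.
pve_service_states() {
    for unit in pvedaemon pveproxy; do
        state=$(systemctl is-active "$unit" 2>/dev/null || true)
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
}

# Show the most recent journal lines for both services from the current boot.
# Usage: recent_pve_logs [line_count]   (defaults to 50)
recent_pve_logs() {
    journalctl -b --no-pager -n "${1:-50}" -u pvedaemon -u pveproxy
}

# Read `pvecm status` output on stdin and print the value of the "Quorate:" field.
quorate_field() {
    awk -F': *' '$1 == "Quorate" { print $2 }'
}
```

After pasting the functions into the shell, running pve_service_states, recent_pve_logs 100 and pvecm status | quorate_field and posting the output should give others something concrete to look at.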
 
Hello,

I was able to gather this in the attached pictures. If you need me to use the shell on bravo, I can do it through the access I have or physically. I have done three physical reboots on it so far, along with every systemctl restart I could think of. On the physical monitor, it now offers me two kernels, the latest one and the newest one. Alpha and bravo were upgraded on Saturday with all pending updates. I think bravo was working after that, and I lost the connection to it when I removed the expired cert from alpha. I'm not quite sure, since I never go on bravo to make a 1:1 backup of all of alpha's LXC containers and VMs.

This part has never worked so far, but:
alpha's internal address is 10.0.100.10/24
bravo's is 10.0.100.11/24
There is a bond0 created between them with an SFP card.
Let me know if you need more information.

Thanks
 

Attachments

  • msedge_j8JiowHglP.png (309.9 KB)
  • msedge_N41p9DR6RD.png (801.7 KB)
  • msedge_OqQ6eQG6i1.png (275 KB)
  • msedge_OrQ0Rcmdal.png (384.3 KB)
  • msedge_S59YOBQ4s4.png (196.1 KB)
  • msedge_sCxGiSrTZn.png (52.4 KB)

Bump :) I honestly don't know how to fix this. If you have an idea, let me know, because these logs and such are totally new to me.