Search results

  1. A

    Reboot initiated from Proxmox crashed the node.

    I suspect the bootloader in PVE is gone. I took the SSD out of the problematic node and placed it in another similar server. It doesn’t boot there either. It looks like the PVE bootloader is missing somehow. Is there any way we can resolve it?
  2. A

    Reboot initiated from Proxmox crashed the node.

    I rebooted one of the Proxmox nodes from the GUI. The node is not booting up (what I see from IPMI is a boot error), but it was working fine until I initiated the reboot. Is it because the reboot from Proxmox was incomplete? Boot error PXE-E51: No DHCP or proxyDHCP offers were received...
  3. A

    Network change caused the node to be down

    Hi @aaron, we have managed to revert the network changes. The changes were made from the Proxmox GUI, but the interfaces file on the backend was somehow empty. We copied an interfaces file from another node and adapted it to work for N7. N7 is back online now. The change made earlier was...
  4. A

    Network change caused the node to be down

    Hi @aaron, Thank you for your response. I know the ID is 7 and the name is 'satapx-xxx-n7'. So, executing the command "pvecm delnode satapx-xxx-n7" will completely remove the node from this Proxmox cluster (after powering down node N7)? Is there anything else that I should follow to keep...
  5. A

    Network change caused the node to be down

    Hi, By mistake, we made a network change to one of our Proxmox nodes and it is totally down now. After referring to a few articles, I learned that network changes made after the cluster setup may break the connection, and that we need to remove the node and set it up as a fresh one again. The...
  6. A

    Which network should use for Corosync?

    Thank you guys for your explanation. Regarding RING/LINK - If we want to have another ring/link, how should we do it?
  7. A

    Which network should use for Corosync?

    Hi, Currently, our Proxmox corosync is on a public IP. Is this best practice? A few articles mention 'RING' when discussing corosync. # corosync-cfgtool -s Printing link status. Local node ID 5 LINK ID 0 addr = 117.xxx.x.x status: nodeid 1...
  8. A

    [SOLVED] 2 nodes stopped responding on port 8006

    I just moved the existing SSL files back and regenerated the 'node files' & 'node certificate' using 'pvecm updatecerts --force'. I then ran 'service pvedaemon restart', which resolved the 'communication failure error'. The web GUI is accessible now.
  9. A

    [SOLVED] 2 nodes stopped responding on port 8006

    Hi @tburger, I think I found the issue. I ran 'systemctl status pveproxy' on both nodes, and both display an error like: Mar 16 01:25:59 sg1-n1 pveproxy[97079]: /etc/pve/local/pve-ssl.pem: failed to use local certificate chain (cert_file or cert) at...
  10. A

    [SOLVED] 2 nodes stopped responding on port 8006

    No, I cleared the cache already. It is still not working. Also, neither node1 nor node3 displays any options such as 'summary'. Attached is a screenshot from node1.
  11. A

    [SOLVED] 2 nodes stopped responding on port 8006

    Hi @tburger, Thanks for your reply. I examined two nodes and below are the results. Node1 ----------- :~# netstat -ntlp |grep 8006 tcp 129 0 0.0.0.0:8006 0.0.0.0:* LISTEN 85935/pveproxy work :~# telnet localhost 8006 Trying 127.0.0.1... Node3...
  12. A

    [SOLVED] 2 nodes stopped responding on port 8006

    Hi, Among 6 nodes, 2 nodes suddenly stopped responding on port 8006 (the web interface is not working). We recently applied updates on each node. Has anyone had the same experience? Any idea how to troubleshoot? Thanks in advance.