Regarding the automatic offline issue of the Proxmox VE cluster

zml

New Member
Sep 1, 2025
As shown in the figure, after logging in to the web interface I can see that some nodes are offline, although I can still connect to those nodes over SSH. I have checked the system time and it is synchronized on all nodes. Restarting Proxmox VE did not fix the problem. At times I also could not log in to the web interface at all and got a "permission denied" message, but after a while it returned to normal.
[Attached image: 1756707480362.png]
This is normal and by design. You have lost quorum (more than half of the nodes need to be available) and therefore cannot change anything.
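To make the "more than half" rule concrete, here is a minimal sketch of the arithmetic. It assumes one vote per node, and the 5-node cluster size is purely hypothetical (the thread does not state how many nodes the cluster has). The function names are made up for illustration and are not part of any Proxmox or corosync API.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest vote count that is strictly more than half of total_votes."""
    return total_votes // 2 + 1


def is_quorate(visible_votes: int, total_votes: int) -> bool:
    """True if a partition seeing visible_votes of total_votes holds quorum."""
    return visible_votes >= quorum_threshold(total_votes)


if __name__ == "__main__":
    # Hypothetical cluster size; one vote per node is an assumption.
    total = 5
    for visible in range(1, total + 1):
        state = "quorate" if is_quorate(visible, total) else "no quorum"
        print(f"{visible}/{total} nodes visible: {state}")
```

Under these assumptions, a node that can only see itself is never quorate in a cluster of three or more nodes, whatever the state of its local services.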
I didn't quite understand your point. After I set up the cluster, it worked normally. However, nodes often get disconnected like this. Is this by design? What should I do to solve this problem?
 
Verify the pvestatd status and the pvecm status. Restart pvestatd if necessary: systemctl restart pvestatd.service.

Re-check your corosync connection for low latency. If possible, establish a second "ring". The links that corosync uses must never become saturated. Note that using separate VLANs does not help at all in this regard.
 
Are you referring to the network interfaces? If so, I currently have two network interfaces connected and combined into a bond interface. Running the "pvestatd status" command on the CDX-1 node shows it in the "running" state. However, the "pvecm status" command lists only the three nodes that are currently online. On the CDN-1 node, "pvecm status" lists only CDN-1 itself. Restarting the "pvestatd" service on the faulty CDN-1 node did not restore it either.
 
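This approach is not recommended for corosync. Try to give corosync another, separate connection.

If you only have two interfaces, you may need to break up the bond...

Edit: see also https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pvecm_redundancy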
After one night, the node status changed again, as shown in the picture.
1756777517428.png


The second picture shows the bond interface I created. I also made the corresponding configuration on the switch. Could this also have an impact?
However, CDN-1 does not have a bond interface and still ran into the problem.
1756777543634.png