Node showing as status unknown

  • Thread starter: Deleted member 188440
I have a small 3-node cluster. About a week ago I noticed that Node 2 randomly changes its status to unknown (the ? mark), along with every LXC or VM running on it. While it is in this state I can still reach the cluster through Node 2's management IP, and Node 2's own tabs and dashboards still load, but I cannot open the options of any LXC running on that node.

I upgraded this node to Proxmox 8 to see if that would correct it, but the same issue persists. I've tried restarting the following services:
pve-cluster
corosync
pvestatd
pveproxy
pvedaemon
Restarting them brings the node back online for a few minutes before it switches back to offline. The LXCs and storage remain unknown even after the restarts. Rebooting the node fixes the issue for a few hours to a few days.
 

Attachments

  • Screenshot_20231004_091251.png
The unknown status is the result of a network probe/exchange failure. You need to examine a few places to get a good picture:

a) cluster status at the time of the issue, from each node: pvecm status
b) ability to ping/ssh from each node to every other node
c) log output from 10-15 min before the issue onward, e.g.: journalctl --since 8:00
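
The three checks above could be gathered in one pass with something like the sketch below. It is only an illustration: the node names in NODES and the 08:00 start time in SINCE are placeholders, and run_if_present and snapshot are hypothetical helper names, not Proxmox tools. It assumes the Linux ping found on a Proxmox host.

```shell
#!/bin/sh
# Sketch only: NODES and SINCE are placeholders, not values from this thread.
NODES="${NODES:-node1 node2 node3}"   # hypothetical peer names or IPs
SINCE="${SINCE:-08:00}"               # start of the log window

# Run a command if it exists; otherwise report that it was skipped.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}

# Collect the three data points on the node where this is run.
snapshot() {
    echo "== a) cluster status =="
    run_if_present pvecm status

    echo "== b) peer reachability =="
    for peer in $NODES; do
        # One probe per peer, 2 second timeout (Linux iputils ping).
        if ping -c 1 -W 2 "$peer" >/dev/null 2>&1; then
            echo "$peer reachable"
        else
            echo "$peer UNREACHABLE"
        fi
    done

    echo "== c) journal since $SINCE =="
    run_if_present journalctl --since "$SINCE" --no-pager
}
```

Source the file on each node and call snapshot while the node is showing the ? status, redirecting the output to a file so the results from all three nodes can be compared side by side.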

good luck


 
Thanks. I restarted the node so I can have it working for now, but I'll check in on it throughout the day until I can pinpoint the next status change.
 
Update:
I found the source of the issue to be one of my containers. All the journalctl logs from the times the node went to ? status mentioned that particular container. I moved it to another host by itself to troubleshoot, and this morning it made that new node change to ? status as well. I'll be looking into the root cause more today.
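
For anyone wanting to repeat this kind of check, a rough way to see how often a given container shows up in the journal is sketched below. The container ID 101 in the usage note is made up for illustration, count_mentions is a hypothetical helper name, and the log wording will differ between setups, so treat the count as a hint rather than proof.

```shell
#!/bin/sh
# Count the input lines that mention the given ID as a whole word.
# Reads log text on stdin; prints 0 when nothing matches.
count_mentions() {
    # -F: fixed string, -w: whole word (so 101 does not match 2101), -c: count lines
    grep -c -w -F -- "$1" || true
}
```

A possible use would be `journalctl --since 08:00 --no-pager | count_mentions 101`, run once per suspect container ID and compared across the time windows in which the node went to ? status.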