After updating, grey question marks in the UI. VMs, LXCs, and Ceph continue to run properly in the background.

mtesm

Hey, after updating to the latest 8.4 (I didn't realize I was doing this when performing updates via the UI), my entire cluster now shows grey question marks instead of status icons, and the VM and LXC names and statuses no longer appear.

If I run

- service pvedaemon restart
- service pvestatd restart

the icons change back to green checkmarks for a while, but then revert to grey question marks.

Any ideas as to what I can do to troubleshoot or resolve this? I'd be happy to provide any information I can.

(Definitely not doing any updates blindly again in the future)
 
Look at the logs of pvestatd; they likely contain a clue as to why it's unable to collect status information.
 
I already took one node out of the cluster so I'd have a working node to fall back on, which is why only two of the three nodes are showing up.

Where do I find the logs for pvestatd?

Code:
root@small1:~# pvecm status
Cluster information
-------------------
Name:             bigcluster
Config Version:   7
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Fri Apr 25 17:11:48 2025
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000003
Ring ID:          1.211
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.1.9
0x00000003          1 192.168.1.5 (local)
 
journalctl -b -u pvestatd and/or systemctl status pvestatd would be a starting point
 
I just see the entries from when I was starting and stopping things a little while ago while trying to troubleshoot the issue.

Code:
root@small1:~# journalctl -b -u pvestatd
Apr 22 02:29:42 small1 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 02:29:42 small1 pvestatd[1249]: starting server
Apr 22 02:29:42 small1 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Apr 22 02:29:58 small1 pvestatd[1249]: storage 'shelfDownloads' is not online
Apr 22 02:30:05 small1 pvestatd[1249]: storage 'shelfMedia' is not online
Apr 22 02:41:24 small1 systemd[1]: Reloading pvestatd.service - PVE Status Daemon...
Apr 22 02:41:24 small1 pvestatd[4153]: send HUP to 1249
Apr 22 02:41:24 small1 pvestatd[1249]: received signal HUP
Apr 22 02:41:24 small1 systemd[1]: Reloaded pvestatd.service - PVE Status Daemon.
Apr 22 02:51:42 small1 systemd[1]: Stopping pvestatd.service - PVE Status Daemon...
Apr 22 02:51:43 small1 pvestatd[1249]: received signal TERM
Apr 22 02:51:43 small1 pvestatd[1249]: server closing
Apr 22 02:51:43 small1 pvestatd[1249]: server stopped
Apr 22 02:51:44 small1 systemd[1]: pvestatd.service: Deactivated successfully.
Apr 22 02:51:44 small1 systemd[1]: Stopped pvestatd.service - PVE Status Daemon.
Apr 22 02:51:44 small1 systemd[1]: pvestatd.service: Consumed 1.343s CPU time.
Apr 22 02:51:44 small1 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 02:51:44 small1 pvestatd[5565]: starting server
Apr 22 02:51:44 small1 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Apr 22 11:05:06 small1 systemd[1]: Stopping pvestatd.service - PVE Status Daemon...
Apr 22 11:05:07 small1 pvestatd[5565]: received signal TERM
Apr 22 11:05:07 small1 pvestatd[5565]: server closing
Apr 22 11:05:07 small1 pvestatd[5565]: server stopped
Apr 22 11:05:08 small1 systemd[1]: pvestatd.service: Deactivated successfully.
Apr 22 11:05:08 small1 systemd[1]: Stopped pvestatd.service - PVE Status Daemon.
Apr 22 11:05:08 small1 systemd[1]: pvestatd.service: Consumed 2.679s CPU time.
Apr 22 11:05:08 small1 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 11:05:08 small1 pvestatd[173462]: starting server
Apr 22 11:05:08 small1 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Apr 22 11:05:18 small1 pvestatd[173462]: modified cpu set for lxc/101: 0
Apr 22 11:05:18 small1 pvestatd[173462]: modified cpu set for lxc/905: 1-2

root@small1:~# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
     Active: active (running) since Tue 2025-04-22 11:05:08 EDT; 6 days ago
    Process: 173458 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 173462 (pvestatd)
      Tasks: 3 (limit: 115464)
     Memory: 104.4M
        CPU: 31.904s
     CGroup: /system.slice/pvestatd.service
             ├─173462 pvestatd
             ├─173536 /bin/mount -t nfs box:/mnt/user/media /mnt/pve/media
             └─173537 /sbin/mount.nfs box:/mnt/user/media /mnt/pve/media -o rw

Apr 22 11:05:08 small1 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 11:05:08 small1 pvestatd[173462]: starting server
Apr 22 11:05:08 small1 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Apr 22 11:05:18 small1 pvestatd[173462]: modified cpu set for lxc/101: 0
Apr 22 11:05:18 small1 pvestatd[173462]: modified cpu set for lxc/905: 1-2
 
what about the other nodes? could you also check the "pve-cluster" service?
 
I omitted some information to make it fit into a post.

It was just
small3 pmxcfs[1209]: [status] notice: received log
small3 pmxcfs[1209]: [dcdb] notice: data verification successful
over and over again.

Code:
root@small3:~# journalctl -b -u pvestatd
Apr 22 02:30:38 small3 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 02:30:39 small3 pvestatd[1453]: starting server
Apr 22 02:30:39 small3 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Apr 22 02:30:49 small3 pvestatd[1453]: mount error: Job failed. See "journalctl -xe" for details.
Apr 22 02:53:43 small3 systemd[1]: Stopping pvestatd.service - PVE Status Daemon...
Apr 22 02:53:43 small3 pvestatd[1453]: received signal TERM
Apr 22 02:53:43 small3 pvestatd[1453]: server closing
Apr 22 02:53:43 small3 pvestatd[1453]: server stopped
Apr 22 02:53:44 small3 systemd[1]: pvestatd.service: Deactivated successfully.
Apr 22 02:53:44 small3 systemd[1]: Stopped pvestatd.service - PVE Status Daemon.
Apr 22 02:53:44 small3 systemd[1]: pvestatd.service: Consumed 1.036s CPU time.
Apr 22 02:53:44 small3 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 02:53:45 small3 pvestatd[6153]: starting server
Apr 22 02:53:45 small3 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Apr 22 02:53:55 small3 pvestatd[6153]: modified cpu set for lxc/999: 0-1
Apr 22 11:09:54 small3 systemd[1]: Stopping pvestatd.service - PVE Status Daemon...
Apr 22 11:09:54 small3 pvestatd[6153]: received signal TERM
Apr 22 11:09:54 small3 pvestatd[6153]: server closing
Apr 22 11:09:54 small3 pvestatd[6153]: server stopped
Apr 22 11:09:55 small3 systemd[1]: pvestatd.service: Deactivated successfully.
Apr 22 11:09:55 small3 systemd[1]: Stopped pvestatd.service - PVE Status Daemon.
Apr 22 11:09:55 small3 systemd[1]: pvestatd.service: Consumed 2.467s CPU time.
Apr 22 11:09:55 small3 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 11:09:56 small3 pvestatd[72520]: starting server
Apr 22 11:09:56 small3 systemd[1]: Started pvestatd.service - PVE Status Daemon.
root@small3:~# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
     Active: active (running) since Tue 2025-04-22 11:09:56 EDT; 1 week 0 days ago
   Main PID: 72520 (pvestatd)
      Tasks: 3 (limit: 115464)
     Memory: 104.5M
        CPU: 32.482s
     CGroup: /system.slice/pvestatd.service
             ├─72520 pvestatd
             ├─72684 /bin/mount -t nfs box:/mnt/user/media /mnt/pve/media
             └─72685 /sbin/mount.nfs box:/mnt/user/media /mnt/pve/media -o rw

Apr 22 11:09:55 small3 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Apr 22 11:09:56 small3 pvestatd[72520]: starting server
Apr 22 11:09:56 small3 systemd[1]: Started pvestatd.service - PVE Status Daemon.
root@small3:~# pve-cluster
-bash: pve-cluster: command not found
root@small3:~# systemctl status pve-cluster
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
     Active: active (running) since Tue 2025-04-22 02:30:38 EDT; 1 week 0 days ago
   Main PID: 1209 (pmxcfs)
      Tasks: 10 (limit: 115464)
     Memory: 67.4M
        CPU: 9min 39.800s
     CGroup: /system.slice/pve-cluster.service
             └─1209 /usr/bin/pmxcfs

Apr 29 19:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
Apr 29 20:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
Apr 29 21:16:06 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:14 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:23 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:24 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:25 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:25 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:26 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:17:57 small3 pmxcfs[1209]: [status] notice: received log
root@small3:~# journalctl -b -u pve-cluster
Apr 22 02:30:37 small3 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Apr 22 02:30:37 small3 pmxcfs[1191]: [main] notice: resolved node name 'small3' to '192.168.1.9' for default node IP ad>
Apr 22 02:30:37 small3 pmxcfs[1191]: [main] notice: resolved node name 'small3' to '192.168.1.9' for default node IP ad>
Apr 22 02:30:37 small3 pmxcfs[1209]: [quorum] crit: quorum_initialize failed: 2
Apr 22 02:30:37 small3 pmxcfs[1209]: [quorum] crit: can't initialize service
Apr 22 02:30:37 small3 pmxcfs[1209]: [confdb] crit: cmap_initialize failed: 2
Apr 22 02:30:37 small3 pmxcfs[1209]: [confdb] crit: can't initialize service
Apr 22 02:30:37 small3 pmxcfs[1209]: [dcdb] crit: cpg_initialize failed: 2
Apr 22 02:30:37 small3 pmxcfs[1209]: [dcdb] crit: can't initialize service
Apr 22 02:30:37 small3 pmxcfs[1209]: [status] crit: cpg_initialize failed: 2
Apr 22 02:30:37 small3 pmxcfs[1209]: [status] crit: can't initialize service
Apr 22 02:30:38 small3 systemd[1]: Started pve-cluster.service - The Proxmox VE cluster filesystem.
Apr 22 02:30:43 small3 pmxcfs[1209]: [status] notice: update cluster info (cluster name  bigcluster, version = 7)
Apr 22 02:30:43 small3 pmxcfs[1209]: [dcdb] notice: members: 1/1209
Apr 22 02:30:43 small3 pmxcfs[1209]: [dcdb] notice: all data is up to date
Apr 22 02:30:43 small3 pmxcfs[1209]: [status] notice: members: 1/1209
Apr 22 02:30:43 small3 pmxcfs[1209]: [status] notice: all data is up to date
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: members: 1/1209, 3/1024, 4/1007
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: starting data syncronisation
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: cpg_send_message retried 1 times
Apr 22 02:30:44 small3 pmxcfs[1209]: [status] notice: node has quorum
Apr 22 02:30:44 small3 pmxcfs[1209]: [status] notice: members: 1/1209, 3/1024, 4/1007
Apr 22 02:30:44 small3 pmxcfs[1209]: [status] notice: starting data syncronisation
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: received sync request (epoch 1/1209/00000002)
Apr 22 02:30:44 small3 pmxcfs[1209]: [status] notice: received sync request (epoch 1/1209/00000002)
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: received all states
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: leader is 3/1024
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: synced members: 3/1024, 4/1007
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: waiting for updates from leader
Apr 22 02:30:44 small3 pmxcfs[1209]: [status] notice: received all states
Apr 22 02:30:44 small3 pmxcfs[1209]: [status] notice: all data is up to date
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: update complete - trying to commit (got 8 inode updates)
Apr 22 02:30:44 small3 pmxcfs[1209]: [dcdb] notice: all data is up to date
Apr 22 02:30:50 small3 pmxcfs[1209]: [status] notice: received log
...
Apr 22 03:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
...
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: members: 1/1209, 3/1024
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: starting data syncronisation
Apr 22 18:07:23 small3 pmxcfs[1209]: [status] notice: members: 1/1209, 3/1024
Apr 22 18:07:23 small3 pmxcfs[1209]: [status] notice: starting data syncronisation
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: received sync request (epoch 1/1209/00000003)
Apr 22 18:07:23 small3 pmxcfs[1209]: [status] notice: received sync request (epoch 1/1209/00000003)
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: received all states
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: leader is 1/1209
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: synced members: 1/1209, 3/1024
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: start sending inode updates
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: sent all (0) updates
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: cpg_send_message retried 1 times
Apr 22 18:07:23 small3 pmxcfs[1209]: [dcdb] notice: all data is up to date
Apr 22 18:07:23 small3 pmxcfs[1209]: [status] notice: received all states
Apr 22 18:07:23 small3 pmxcfs[1209]: [status] notice: all data is up to date
Apr 22 18:22:23 small3 pmxcfs[1209]: [status] notice: received log
Apr 22 18:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
...
Apr 29 18:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
Apr 29 19:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
Apr 29 20:30:37 small3 pmxcfs[1209]: [dcdb] notice: data verification successful
Apr 29 21:16:06 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:14 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:23 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:24 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:25 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:25 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:16:26 small3 pmxcfs[1209]: [status] notice: received log
Apr 29 21:17:57 small3 pmxcfs[1209]: [status] notice: received log
 
" └─72685 /sbin/mount.nfs box:/mnt/user/media /mnt/pve/media -o rw"

This one looks suspiciously like a hanging NFS mount...
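One way to confirm a hung mount without the check itself getting stuck could look like this (a sketch using standard Linux tools; the /mnt/pve/media path is taken from the pvestatd process list above):

```shell
# Processes stuck in uninterruptible sleep ("D" state) are the classic
# signature of a hung NFS mount:
ps -eo pid,stat,cmd | awk 'NR == 1 || $2 ~ /^D/'

# Probe the mountpoint with a timeout so the check itself cannot hang:
timeout 5 stat -t /mnt/pve/media || echo "mount is not responding"
```

If the mount processes sit in D state for minutes, the server side stopped answering; pvestatd then blocks on that storage and the UI falls back to grey question marks.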
 
Hm, weird. It definitely is.

I don't get why NFS would cause errors like this, or how it becomes "hanging".

I'm using those NFS shares inside multiple LXC containers. Is there a better way to use network storage in LXC containers? I've seen a few posts where people switched to CIFS or SMB and got better performance on Proxmox.
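One common pattern, rather than mounting NFS inside each container, is to let the host handle the NFS mount and bind-mount the directory into the container, so the containers never speak NFS directly. A sketch with the standard pct CLI (the container ID 101 and in-container path /mnt/media are placeholders; /mnt/pve/media is the host-side mount from this thread):

```
# Bind-mount a host directory into an LXC container instead of mounting
# NFS inside the container (ID and paths are placeholders):
pct set 101 -mp0 /mnt/pve/media,mp=/mnt/media
```

The mount point is persisted in the container's config, so it survives restarts, and a hung share then only affects the host-side mount rather than every container individually.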
 
CIFS/SMB is also often used because it has better support for authentication and ACLs ;) But yes, worth a shot. I suspect the culprit is a flaky network connection or the uptime of the NFS server, though; maybe the logs give you more details?
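If the share stays on NFS, mount options that fail fast instead of blocking forever can at least keep pvestatd from hanging on a dead export. A hedged sketch of what the storage definition could look like (storage name, server, and paths are from this thread; the option values are illustrative, see nfs(5) and your /etc/pve/storage.cfg):

```
# /etc/pve/storage.cfg sketch -- option values illustrative:
# soft      -> fail I/O after retries instead of hanging forever
# timeo=150 -> 15 s per retry (in tenths of a second)
# retrans=3 -> retries before reporting an error
nfs: media
        server box
        export /mnt/user/media
        path /mnt/pve/media
        options soft,timeo=150,retrans=3
```

Note that soft mounts trade hangs for possible I/O errors on an unreliable network, so they're best combined with fixing the underlying connectivity.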