Grey question mark after CIFS share went offline

Still, another problem:
Although I am able to access the terminal of nodes 3 and 4 via the PVE UI of the other nodes, I am not able to log in directly on node 3 and node 4.
(screenshot attached)

Both nodes are shown green again in the UI of node 1 and node 2, but the System tab cannot be displayed (communication failure (0)).

Maybe another service needs to be restarted?
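For context, this is what I would try to check and restart next (only a guess that the status daemon, which delivers the green/grey node state and the live data in the UI, is the part that is stuck, not corosync):

Code:
systemctl status pvestatd pvedaemon pveproxy   # check whether any of the PVE daemons is hanging
systemctl restart pvestatd                     # the daemon that collects the node/VM status shown in the UI
systemctl restart pvedaemon pveproxy           # API daemon and web proxy, if the UI itself misbehaves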
 
Here is a part of some more logs; the same entries keep repeating like this...
Code:
Oct 11 10:00:57 p-virt-sw-3 pveproxy[1821979]: proxy detected vanished client connection
Oct 11 10:00:59 p-virt-sw-3 qmeventd[2953]: error parsing vmid for 23349: no matching qemu.slice cgroup entry
Oct 11 10:00:59 p-virt-sw-3 qmeventd[2953]: could not get vmid from pid 23349
Oct 11 10:01:04 p-virt-sw-3 qmeventd[2953]: error parsing vmid for 23349: no matching qemu.slice cgroup entry
Oct 11 10:01:04 p-virt-sw-3 qmeventd[2953]: could not get vmid from pid 23349
[... the same two qmeventd lines repeat every 5 seconds until 10:05:24; the only other entries in between are these ...]
Oct 11 10:01:31 p-virt-sw-3 pmxcfs[3294]: [status] notice: received log
Oct 11 10:02:01 p-virt-sw-3 pveproxy[1816718]: proxy detected vanished client connection
Oct 11 10:02:24 p-virt-sw-3 pmxcfs[3294]: [status] notice: received log
Oct 11 10:02:31 p-virt-sw-3 pveproxy[1821979]: proxy detected vanished client connection
Oct 11 10:03:36 p-virt-sw-3 pveproxy[1813610]: proxy detected vanished client connection
Oct 11 10:04:06 p-virt-sw-3 pveproxy[1816718]: proxy detected vanished client connection
Oct 11 10:05:10 p-virt-sw-3 pveproxy[1816718]: proxy detected vanished client connection

 
To sum it up, on 2 of the 4 nodes the following problems appear:
1.) Direct login to the web UI fails with the error "Login failed. Please try again".
2.) Cluster remote access works and everything is green. When selecting a VM, only its configuration data shows up, while the live data (CPU load, remote console, ...) just keeps loading endlessly.
 
Hi,

Thank you for sharing more of the syslog!

Have you tried to restart the pveproxy service? `systemctl restart pveproxy`

Have you tried to log in from a different/private browser as well? If the private login works, you may have to clear the browser cache. If not, try to update the certificates using the `pvecm updatecerts --force` command.
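To be explicit about the order (a rough sketch; the certificate update only helps if the certificates are actually the problem):

Code:
systemctl restart pveproxy      # 1. restart the web proxy first
pvecm updatecerts --force       # 2. regenerate the node certificates if the restart alone does not help
systemctl restart pveproxy      # 3. restart pveproxy again so the new certificate is picked up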
 
Thanks for your answer!

Tried all three steps but none of them helped.

Still no direct WebUI login on node 3 and node 4
And no live VM data is shown for those two nodes.

Another problem is that I need to restore a VM today, and the backup is on the storage that cannot be mounted any more.
:(
Besides this, the whole mess started (as far as I can see) when the mounted network share disappeared. If this cannot be resolved without a reboot of the physical nodes, I am really not sure whether I can keep the setup this way.
Can you confirm that losing a network share can have such a high impact on a cluster or on some of its nodes?
Is it more stable to use shares only on non-clustered nodes?
How can I improve this?
 
It has got worse...

I executed pvecm updatecerts --force on node 3.
Now I cannot access it at all from the UI of a remote node.

(screenshot attached)
 
Now I have removed node 3 from the cluster in the hope that the web UI would work again.
Nope.
I can access node 3 via SSH and I see the web UI login screen, but when I try to log in, it fails.
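For reference, the service logs that should show why the login fails can be checked like this (unit names as on a default PVE install):

Code:
journalctl -u pvedaemon -u pveproxy --since "-1h"   # recent entries from the API daemon and the web proxy
journalctl -u pve-cluster --since "-1h"             # pmxcfs log, relevant after removing a node from the cluster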
 
I tried to restart the pvedaemon and pveproxy services.

pvedaemon fails:

Code:
● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; preset: enabled)
     Active: deactivating (stop-sigterm) (Result: exit-code) since Mon 2024-10-14 09:50:23 CEST; 31min ago
    Process: 3715761 ExecStart=/usr/bin/pvedaemon start (code=exited, status=255/EXCEPTION)
      Tasks: 3 (limit: 629145)
     Memory: 163.1M
        CPU: 786ms
     CGroup: /system.slice/pvedaemon.service
             ├─1399185 "pvedaemon worker"
             ├─1399248 "pvedaemon worker"
             └─1399262 "pvedaemon worker"

Oct 14 10:20:12 p-virt-sw-3 systemd[1]: Starting pvedaemon.service - PVE API Daemon...
Oct 14 10:20:13 p-virt-sw-3 pvedaemon[3715761]: start failed - unable to create socket - Address already in use
Oct 14 10:20:13 p-virt-sw-3 pvedaemon[3715761]: start failed - unable to create socket - Address already in use
Oct 14 10:20:13 p-virt-sw-3 systemd[1]: pvedaemon.service: Control process exited, code=exited, status=255/EXCEPTION
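In case someone else runs into the same "Address already in use" error: a sketch of what I would check, assuming the old pvedaemon workers are still holding the local API port (85 on a default install):

Code:
ss -tlnp | grep ':85 '                 # which PID is still listening on the pvedaemon port
pkill -f 'pvedaemon worker'            # stop the stale workers (they may be blocked on the dead CIFS mount)
systemctl restart pvedaemon pveproxy   # then try a clean restart of both services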
 
OK, I gave this another try.

I have another node that is for testing only and is not part of the production cluster.
This node also had the same CIFS shares.

The initial situation there was the same:
The network drive dropped.
The UI became unstable, with the grey icons and no live data.
The old mount points for the shares are inaccessible, and the console freezes when I try to access or remove the directory.
If I had to guess, I would say these network drive mount points are the main cause of all this trouble.

I was not able to remount the network drive (although the share is up and running).
I cannot remove the mount point directory, and any kind of access to this directory freezes applications.
Then I took the step that I want to avoid on the production system: I rebooted the test node.
Afterwards everything runs stable again. The network drive is mounted.
All fine.

So the question remains whether there is a way to heal a running system that is in this situation. I guess it is not acceptable for server-grade software to be killed by something as simple as a network share drop.
Or maybe "kill" is the wrong word: the core functions are still available and the VMs continue running, but the UI and network drive support break away, and only a node reboot resolves it.
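For reference, a rough set of checks for such a hung CIFS mount point (the path is only an example; note that some of these commands can themselves hang on a dead mount):

Code:
mount | grep cifs                               # which CIFS mounts the kernel still thinks are there
fuser -vm /mnt/pve/SwFileServer-cifs            # processes blocked on the mount (can hang itself)
ps axo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'   # processes stuck in uninterruptible sleep (D state)
umount -l /mnt/pve/SwFileServer-cifs            # lazy unmount as a last resort before rebooting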

Still hoping for the big enlightenment.

Thanks.
 
Hi,

Have you tried to unmount the CIFS share manually using the `umount -f /mnt/pve/SwFileServer-cifs` command? If not, please can you try it? After you unmount the CIFS share, try to remove the storage and re-add it from your PVE web UI.
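A rough sketch of that sequence on the CLI (assuming the storage ID is SwFileServer-cifs, i.e. it matches the mount directory, since PVE mounts CIFS storages under /mnt/pve/<storage-id>):

Code:
pvesm set SwFileServer-cifs --disable 1    # stop PVE from touching the storage
umount -f -l /mnt/pve/SwFileServer-cifs    # force + lazy unmount of the dead CIFS mount
pvesm remove SwFileServer-cifs             # remove the storage definition (or do it in the web UI)
# then re-add the CIFS storage under Datacenter -> Storage in the web UI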
 
