Proxmox GUI and LXC containers no longer working after server reboot

Mulled2368

Member
Sep 12, 2022
Hi

We just had to replace the electricity meter at our house, so I had to shut down my Proxmox node (single-node installation).
Now it no longer works. Very annoying!

The GUI won't start on its own, and when I start it manually with systemctl start pveproxy, it only shows question marks.
Below is the output of some useful commands. If you need any more information to troubleshoot, let me know!

Bash:
[root@pve ~]$ journalctl -xe
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[1] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[2] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[3] failed: Connection refused
Sep 12 15:12:23 pve systemd[1]: pvescheduler.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit pvescheduler.service has entered the 'failed' state with result 'exit-code'.
Sep 12 15:12:23 pve systemd[1]: Failed to start Proxmox VE scheduler.
░░ Subject: A start job for unit pvescheduler.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pvescheduler.service has finished with a failure.
░░
░░ The job identifier is 120 and the job result is failed.
Sep 12 15:12:26 pve pveproxy[1656]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1943.

Bash:
[root@pve ~]$ systemctl status pvescheduler.service
● pvescheduler.service - Proxmox VE scheduler
     Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-09-12 15:12:23 CEST; 18min ago
    Process: 1606 ExecStart=/usr/bin/pvescheduler start (code=exited, status=111)
        CPU: 658ms

Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[1] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[2] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[3] failed: Connection refused
Sep 12 15:12:23 pve systemd[1]: pvescheduler.service: Failed with result 'exit-code'.
Sep 12 15:12:23 pve systemd[1]: Failed to start Proxmox VE scheduler.
[root@pve ~]$ systemctl start pvescheduler.service
[root@pve ~]$ systemctl status pvescheduler.service
● pvescheduler.service - Proxmox VE scheduler
     Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-09-12 15:30:34 CEST; 1s ago
    Process: 3543 ExecStart=/usr/bin/pvescheduler start (code=exited, status=0/SUCCESS)
   Main PID: 3544 (pvescheduler)
      Tasks: 1 (limit: 18409)
     Memory: 94.7M
        CPU: 654ms
     CGroup: /system.slice/pvescheduler.service
             └─3544 pvescheduler
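
The `ipcc_send_rec ... Connection refused` lines and the unreadable `/etc/pve/local/pve-ssl.key` both involve `/etc/pve`, which is provided by pmxcfs (the pve-cluster service). One quick check is whether that directory is currently a mount point. This is a minimal sketch, assuming `mountpoint` from util-linux is installed; `describe_mount` is a hypothetical helper name, not a Proxmox tool.

```shell
# Print whether a directory is currently a mount point.
# Assumption: `mountpoint` (util-linux) is available on the host.
describe_mount() {
    if mountpoint -q "$1"; then
        echo "$1 is mounted"
    else
        echo "$1 is NOT mounted"
    fi
}
```

On the node this would be called as `describe_mount /etc/pve`. If it reports that the directory is not mounted, services that read their configuration and keys from there would be expected to fail in the way shown above.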

Code:
[root@pve ~]$ cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.100.10 pve.lietaert.be pve

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Bash:
[root@pve ~]$ journalctl -u pve-cluster
-- Journal begins at Tue 2022-03-15 08:53:04 CET, ends at Mon 2022-09-12 15:14:22 CEST. --
Mar 15 08:53:08 pve systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 15 08:53:09 pve systemd[1]: Started The Proxmox VE cluster filesystem.
Mar 15 08:56:24 pve systemd[1]: Stopping The Proxmox VE cluster filesystem...
Mar 15 08:56:24 pve pmxcfs[1158]: [main] notice: teardown filesystem
Mar 15 08:56:26 pve pmxcfs[1158]: [main] notice: exit proxmox configuration filesystem (0)
Mar 15 08:56:26 pve systemd[1]: pve-cluster.service: Succeeded.
Mar 15 08:56:26 pve systemd[1]: Stopped The Proxmox VE cluster filesystem.
Mar 15 08:56:26 pve systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 15 08:56:27 pve systemd[1]: Started The Proxmox VE cluster filesystem.
-- Boot f156b6f81e7648f888b98f03e7fb0a48 --
May 07 08:57:27 pve systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
May 07 08:57:27 pve systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
May 07 08:57:27 pve systemd[1]: pve-cluster.service: Failed with result 'timeout'.
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found ordering cycle on pve-guests.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on mnt-ds718EmbyMedia.mount/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on remote-fs.target/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on rrdcached.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on pve-cluster.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on pve-firewall.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Job pve-guests.service/stop deleted to break ordering cycle starting with pve-firewall.service/stop
-- Boot 29bf268db44f4c65b7f7edb9bf264c61 --
Jun 11 08:17:23 pve systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
Jun 11 08:17:23 pve systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
Jun 11 08:17:23 pve systemd[1]: pve-cluster.service: Failed with result 'timeout'.
 
Bash:
[root@pve ~]$ pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-10
pve-kernel-helper: 7.2-10
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.35-3-pve: 5.15.35-6
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 16.2.9-pve1
ceph-fuse: 16.2.9-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
 
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found ordering cycle on pve-guests.service/stop
That looks wrong. Did you modify any systemd services?
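
To see which units take part in the ordering cycle, the unit names can be pulled out of the journal lines. This is a rough sketch that only assumes the message format shown in the journal excerpt earlier in the thread; `cycle_units` is a hypothetical helper name.

```shell
# Read journal lines on stdin and print the unit/job named in every
# "Found ordering cycle on ..." or "Found dependency on ..." message.
cycle_units() {
    sed -nE 's/.*Found (ordering cycle|dependency) on ([^ ]+)$/\2/p'
}
```

It could be used as `journalctl -u pve-cluster | cycle_units`. For the excerpt in this thread, the chain includes `mnt-ds718EmbyMedia.mount/stop`. That is a mount unit rather than a stock Proxmox service, so it may be worth a closer look.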

What happens when you try to start pmxcfs manually (for debugging only)?
Code:
pmxcfs -l
 
Hi Dominik

Not that I know of. I have added two services to mount my NAS shares for some LXC containers, but that's it.

Below is the output of the command:

Code:
[root@pve ~]$ pmxcfs -l
[main] notice: unable to acquire pmxcfs lock - trying again
[main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
[main] notice: exit proxmox configuration filesystem (-1)

Broadcast message from systemd-journald@pve (Tue 2022-09-13 21:03:26 CEST):

pmxcfs[794404]: [main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
 
OK, it tries to flock the file '/var/lib/pve-cluster/.pmxcfs.lockfile' and fails.

Is your filesystem OK? Does the directory exist?
Is another pmxcfs process already running? (Check with e.g. 'ps ax | grep pmxcfs'.)
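
The lock failure can be illustrated with a non-blocking flock: it fails immediately, instead of waiting, when another process already holds the lock. That is what "Resource temporarily unavailable" means here. This is a minimal sketch, assuming `flock` from util-linux is installed; `lock_is_held` is a hypothetical helper name, not part of Proxmox.

```shell
# Succeed (exit 0) only if another process holds a lock on the file.
# `flock -n` exits with status 1 when the lock is already taken;
# any other status (0 = lock was free, 66 = file could not be opened, ...)
# is treated as "not held".
lock_is_held() {
    rc=0
    flock -n "$1" true || rc=$?
    [ "$rc" -eq 1 ]
}
```

On the node it could be called as `lock_is_held /var/lib/pve-cluster/.pmxcfs.lockfile && echo "lock is held"`. If the lock is free, the probe takes it for an instant and releases it again, so this is only suitable for debugging.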
 
Hi Dominik

The filesystem is ZFS on two SSDs in the PC. A scrub found no errors:
Code:
[root@pve ~]$ zpool status -v rpool
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:08:19 with 0 errors on Sat Sep 17 17:42:27 2022
config:

    NAME                                                   STATE     READ WRITE CKSUM
    rpool                                                  ONLINE       0     0     0
      mirror-0                                             ONLINE       0     0     0
        nvme-eui.002538bb61b52f08-part3                    ONLINE       0     0     0
        ata-Samsung_SSD_870_EVO_1TB_S6PUNX0T103947M-part3  ONLINE       0     0     0

errors: No known data errors

The folder and file do exist:

Code:
[root@pve ~]$ cd /var/lib/pve-cluster/
[root@pve pve-cluster]$ ls -a
.  ..  .pmxcfs.lockfile  config.db  config.db-shm  config.db-wal

I'm not familiar with the ps output, so I'm not sure whether this is what we are looking for:

Code:
[root@pve ~]$ ps ax | grep pmxcfs
   1886 ?        Ssl    4:08 /usr/bin/pmxcfs
3421942 pts/0    S+     0:00 grep pmxcfs
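
For reading this output: the first line (PID 1886) is a running `/usr/bin/pmxcfs`, while the second line is only the `grep` command matching its own arguments. A small filter can drop that noise. This is a sketch assuming the default `ps ax` column layout (PID, TTY, STAT, TIME, COMMAND); `pmxcfs_pids` is a hypothetical helper name.

```shell
# Read `ps ax` output on stdin and print the PID of every process whose
# command (5th column) is the pmxcfs binary, with or without a path.
# The `grep pmxcfs` line is skipped because its command is `grep`.
pmxcfs_pids() {
    awk '$5 ~ "(^|/)pmxcfs$" { print $1 }'
}
```

Usage would be `ps ax | pmxcfs_pids`. Any PID it prints means pmxcfs is already running, which would be consistent with a second `pmxcfs -l` being unable to acquire the lock.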
 
Meanwhile, everything on the server is working: all LXC containers and the VM have booted, but the web interface still looks the same...
Attachment: SCR-20220917-ozg.png