Proxmox GUI and LXC containers no longer working after rebooting the server

Mulled2368

New Member
Sep 12, 2022
Hi

We just had to replace the electricity meter at the house, so I had to shut down my Proxmox node (single-node installation).
Now it won't work anymore. Very annoying!

The GUI won't launch, and when I do get it to start using systemctl start pveproxy, I only see question marks in the GUI.
Below are the outputs of some commands that may be useful. If you need any more information to troubleshoot, let me know!

Bash:
[root@pve ~]$ journalctl -xe
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[1] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[2] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[3] failed: Connection refused
Sep 12 15:12:23 pve systemd[1]: pvescheduler.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit pvescheduler.service has entered the 'failed' state with result 'exit-code'.
Sep 12 15:12:23 pve systemd[1]: Failed to start Proxmox VE scheduler.
░░ Subject: A start job for unit pvescheduler.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pvescheduler.service has finished with a failure.
░░
░░ The job identifier is 120 and the job result is failed.
Sep 12 15:12:26 pve pveproxy[1656]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1943.

Bash:
[root@pve ~]$ systemctl status pvescheduler.service
● pvescheduler.service - Proxmox VE scheduler
     Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-09-12 15:12:23 CEST; 18min ago
    Process: 1606 ExecStart=/usr/bin/pvescheduler start (code=exited, status=111)
        CPU: 658ms

Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[1] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[2] failed: Connection refused
Sep 12 15:12:23 pve pvescheduler[1606]: ipcc_send_rec[3] failed: Connection refused
Sep 12 15:12:23 pve systemd[1]: pvescheduler.service: Failed with result 'exit-code'.
Sep 12 15:12:23 pve systemd[1]: Failed to start Proxmox VE scheduler.
[root@pve ~]$ systemctl start pvescheduler.service
[root@pve ~]$ systemctl status pvescheduler.service
● pvescheduler.service - Proxmox VE scheduler
     Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-09-12 15:30:34 CEST; 1s ago
    Process: 3543 ExecStart=/usr/bin/pvescheduler start (code=exited, status=0/SUCCESS)
   Main PID: 3544 (pvescheduler)
      Tasks: 1 (limit: 18409)
     Memory: 94.7M
        CPU: 654ms
     CGroup: /system.slice/pvescheduler.service
             └─3544 pvescheduler

Code:
[root@pve ~]$ cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.100.10 pve.lietaert.be pve

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Bash:
[root@pve ~]$ journalctl -u pve-cluster
-- Journal begins at Tue 2022-03-15 08:53:04 CET, ends at Mon 2022-09-12 15:14:22 CEST. --
Mar 15 08:53:08 pve systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 15 08:53:09 pve systemd[1]: Started The Proxmox VE cluster filesystem.
Mar 15 08:56:24 pve systemd[1]: Stopping The Proxmox VE cluster filesystem...
Mar 15 08:56:24 pve pmxcfs[1158]: [main] notice: teardown filesystem
Mar 15 08:56:26 pve pmxcfs[1158]: [main] notice: exit proxmox configuration filesystem (0)
Mar 15 08:56:26 pve systemd[1]: pve-cluster.service: Succeeded.
Mar 15 08:56:26 pve systemd[1]: Stopped The Proxmox VE cluster filesystem.
Mar 15 08:56:26 pve systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 15 08:56:27 pve systemd[1]: Started The Proxmox VE cluster filesystem.
-- Boot f156b6f81e7648f888b98f03e7fb0a48 --
May 07 08:57:27 pve systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
May 07 08:57:27 pve systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
May 07 08:57:27 pve systemd[1]: pve-cluster.service: Failed with result 'timeout'.
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found ordering cycle on pve-guests.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on mnt-ds718EmbyMedia.mount/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on remote-fs.target/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on rrdcached.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on pve-cluster.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found dependency on pve-firewall.service/stop
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Job pve-guests.service/stop deleted to break ordering cycle starting with pve-firewall.service/stop
-- Boot 29bf268db44f4c65b7f7edb9bf264c61 --
Jun 11 08:17:23 pve systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
Jun 11 08:17:23 pve systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
Jun 11 08:17:23 pve systemd[1]: pve-cluster.service: Failed with result 'timeout'.
 
Bash:
[root@pve ~]$ pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-10
pve-kernel-helper: 7.2-10
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.35-3-pve: 5.15.35-6
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 16.2.9-pve1
ceph-fuse: 16.2.9-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
 
May 19 09:37:27 pve systemd[1]: pve-firewall.service: Found ordering cycle on pve-guests.service/stop
That looks wrong. Did you modify some systemd services?

What happens when you try to start pmxcfs manually (just for debugging)?
Code:
pmxcfs -l
 
Hi Dominik

Not that I know of. I added two systemd services to mount my NAS shares for some LXC containers, but that's it.

Below is the output of the command:

Code:
[root@pve ~]$ pmxcfs -l
[main] notice: unable to acquire pmxcfs lock - trying again
[main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
[main] notice: exit proxmox configuration filesystem (-1)

Broadcast message from systemd-journald@pve (Tue 2022-09-13 21:03:26 CEST):

pmxcfs[794404]: [main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
 
OK, it tries to flock the file '/var/lib/pve-cluster/.pmxcfs.lockfile' and fails.

Is your filesystem OK? Does the directory exist?
Is there another pmxcfs process already running? (Check with e.g. 'ps ax | grep pmxcfs'.)
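For reference, the error message matches plain flock() semantics: pmxcfs takes an exclusive lock on its lockfile at startup, so a second non-blocking attempt fails with EAGAIN ("Resource temporarily unavailable"). A minimal sketch of that behavior, using a temp file as a stand-in for the real /var/lib/pve-cluster/.pmxcfs.lockfile:

```shell
# Stand-in lockfile; NOT the real pve-cluster file.
lockfile=$(mktemp)

exec 9>"$lockfile"            # fd 9 plays the already-running pmxcfs
flock -n 9                    # first exclusive lock succeeds

# A second non-blocking attempt fails while the lock is held,
# just like 'pmxcfs -l' failing against a running instance.
if flock -n "$lockfile" -c true; then second="acquired"; else second="busy"; fi

exec 9>&-                     # close fd 9: the lock is released
if flock -n "$lockfile" -c true; then released="acquired"; else released="busy"; fi

rm -f "$lockfile"
```

So "unable to acquire pmxcfs lock" usually just means some process still has the lockfile open, not that the file or filesystem is damaged.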
 
Hi Dominik

The filesystem is ZFS on two SSDs in the PC. A scrub found no errors:
Code:
[root@pve ~]$ zpool status -v rpool
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:08:19 with 0 errors on Sat Sep 17 17:42:27 2022
config:

    NAME                                                   STATE     READ WRITE CKSUM
    rpool                                                  ONLINE       0     0     0
      mirror-0                                             ONLINE       0     0     0
        nvme-eui.002538bb61b52f08-part3                    ONLINE       0     0     0
        ata-Samsung_SSD_870_EVO_1TB_S6PUNX0T103947M-part3  ONLINE       0     0     0

errors: No known data errors

The folder and file do exist:

Code:
[root@pve ~]$ cd /var/lib/pve-cluster/
[root@pve pve-cluster]$ ls -a
.  ..  .pmxcfs.lockfile  config.db  config.db-shm  config.db-wal

I'm not familiar with ps output, so I'm not sure if this is what we're looking for:

Code:
[root@pve ~]$ ps ax | grep pmxcfs
   1886 ?        Ssl    4:08 /usr/bin/pmxcfs
3421942 pts/0    S+     0:00 grep pmxcfs
 
Meanwhile, everything is running on the server: all LXC containers and the VM have booted, but the interface still looks the same...
[screenshot attached: SCR-20220917-ozg.png]
 
