[SOLVED] Pvescheduler not starting

O1ez

New Member
May 5, 2022
Hello, I have been using Proxmox for a couple of days, and yesterday I ran into a problem: the web interface no longer accepts my password. I can still log in to the server via SSH or the Android app. In the app I saw that the "pvescheduler" service is not running, so I tried starting it manually over SSH with "pvescheduler start", but it only prints
trying to acquire cfs lock 'file-jobs_cfg' ...
trying to acquire cfs lock 'file-replication_cfg' ...
and that repeats indefinitely. There was no update before this error occurred. I just went to sleep, and when I woke up it wasn't working anymore.
I run Proxmox on 4 PCs joined into one cluster. Only two VMs are running on it.
Does pvescheduler have anything to do with the login error, and does anyone have a fix for this?
 
Hi,

can you provide the output of systemctl status pvescheduler.service and pvecm status?
 
Sure. The first one:
Code:
● pvescheduler.service - Proxmox VE scheduler
     Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
The second one:
Code:
Cluster information
-------------------
Name:             MainServer
Config Version:   4
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu May  5 08:57:38 2022
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.77
Quorate:          No

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      2
Quorum:           3 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.178.129 (local)
0x00000004          1 192.168.178.123
Edit: In case that's important: two nodes are turned off at the moment.
 
● pvescheduler.service - Proxmox VE scheduler
Loaded: loaded (/lib/systemd/system/pvescheduler.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Please consider putting output like this between [code][/code] tags to format it properly :)
Edit: In case that's important: two nodes are turned off at the moment.
Yes, that is indeed important: the cluster is currently not quorate. For a cluster to be quorate, more than half of the expected votes must be present. Your cluster expects 4 votes, so it needs 3, but only 2 nodes are online. The offline nodes cannot vote, so every operation that requires quorum will fail. Currently, this includes being able to log in. So try bringing the nodes back online.

Just to be sure, can you also provide the output of journalctl --unit=pvescheduler.service? Thank you!
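
The quorum rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming one vote per node and no special votequorum options; the function names are made up for this example and are not part of any Proxmox tool.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


# Values taken from the pvecm status output in this thread:
# 4 expected votes, 2 nodes online.
print(quorum_threshold(4))  # 3, matching the "Quorum: 3" line
print(is_quorate(2, 4))     # False, matching "Quorate: No"
print(is_quorate(3, 4))     # True once a third node is back online
```

With 4 nodes, losing two is enough to block the cluster, which is why starting a single additional node restores quorum.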
 
Thank you so much, I was not aware of that. Everything works again now that I have started one of the nodes that was turned off.
 
On one of my hosts, pvescheduler does not start; the start command just hangs:
Code:
pve6:~# systemctl start pvescheduler.service

Code:
pve6:~# pvecm status
Cluster information
-------------------
Name:             rzn
Config Version:   20
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Jul 18 10:44:05 2022
Quorum provider:  corosync_votequorum
Nodes:            8
Node ID:          0x00000006
Ring ID:          1.18d2
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   8
Highest expected: 8
Total votes:      8
Quorum:           5 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.33.51
0x00000002          1 192.168.33.42
0x00000003          1 192.168.33.43
0x00000004          1 192.168.33.41
0x00000005          1 192.168.33.54
0x00000006          1 192.168.33.46 (local)
0x00000008          1 192.168.33.40
0x00000009          1 192.168.33.44
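
Unlike the first case in this thread, this output reports Quorate: Yes with all 8 votes present. A minimal sketch for pulling those fields out of saved `pvecm status` text is shown below; the helper name `parse_pvecm_status` is hypothetical, and the parser simply assumes the "Key: value" layout seen in the output above.

```python
def parse_pvecm_status(text: str) -> dict[str, str]:
    """Collect every 'Key: value' line of pvecm status output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip headings, separators and membership rows
            fields[key.strip()] = value.strip()
    return fields


# Excerpt copied from the output above.
SAMPLE = """\
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   8
Highest expected: 8
Total votes:      8
Quorum:           5
Flags:            Quorate
"""

status = parse_pvecm_status(SAMPLE)
quorate = status.get("Quorate") == "Yes"
missing = int(status["Expected votes"]) - int(status["Total votes"])
print(quorate, missing)
```

A non-zero `missing` value or a `Quorate` value other than "Yes" would point back to the quorum explanation given earlier in the thread.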

Code:
journalctl --unit=pvescheduler.service
...
-- Boot 98477cbeaf8142178b32afbf8d25f45a --
Jun 09 15:15:33 rzn-pve6 systemd[1]: Starting Proxmox VE scheduler...
Jun 09 15:15:33 rzn-pve6 pvescheduler[6398]: starting server
Jun 09 15:15:33 rzn-pve6 systemd[1]: Started Proxmox VE scheduler.
Jun 11 08:31:26 rzn-pve6 pvescheduler[2116638]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeo>
Jun 11 08:31:26 rzn-pve6 pvescheduler[2116639]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 11 08:40:17 rzn-pve6 pvescheduler[2125636]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 11 08:40:17 rzn-pve6 pvescheduler[2125635]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 14 16:51:23 rzn-pve6 systemd[1]: Stopping Proxmox VE scheduler...
Jun 14 16:51:24 rzn-pve6 pvescheduler[6398]: received signal TERM
Jun 14 16:51:24 rzn-pve6 pvescheduler[6398]: got shutdown request, signal running jobs to stop
Jun 14 16:51:24 rzn-pve6 pvescheduler[6398]: server stopped
Jun 14 16:51:25 rzn-pve6 systemd[1]: pvescheduler.service: Succeeded.
Jun 14 16:51:25 rzn-pve6 systemd[1]: Stopped Proxmox VE scheduler.
Jun 14 16:51:25 rzn-pve6 systemd[1]: pvescheduler.service: Consumed 3min 11.671s CPU time.
-- Boot 32d4e119a72640228f4d582c81bfd85d --
Jun 14 16:54:09 rzn-pve6 systemd[1]: Starting Proxmox VE scheduler...
Jun 14 16:54:10 rzn-pve6 pvescheduler[6086]: starting server
Jun 14 16:54:10 rzn-pve6 systemd[1]: Started Proxmox VE scheduler.
Jun 18 08:05:10 rzn-pve6 pvescheduler[299504]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:07:10 rzn-pve6 pvescheduler[304410]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:09:14 rzn-pve6 pvescheduler[304981]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:09:14 rzn-pve6 pvescheduler[304982]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 18 08:12:13 rzn-pve6 pvescheduler[308000]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:14:17 rzn-pve6 pvescheduler[308564]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:25:11 rzn-pve6 pvescheduler[318307]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:25:11 rzn-pve6 pvescheduler[318312]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 18 08:32:11 rzn-pve6 pvescheduler[324485]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 18 08:33:10 rzn-pve6 pvescheduler[324857]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:34:13 rzn-pve6 pvescheduler[325142]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 18 08:34:13 rzn-pve6 pvescheduler[325143]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 25 08:06:12 rzn-pve6 pvescheduler[565696]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 25 08:06:12 rzn-pve6 pvescheduler[565697]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 25 08:07:11 rzn-pve6 pvescheduler[565983]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 25 08:12:11 rzn-pve6 pvescheduler[569575]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 25 08:13:11 rzn-pve6 pvescheduler[572032]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 25 08:32:19 rzn-pve6 pvescheduler[588305]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 25 08:32:19 rzn-pve6 pvescheduler[588307]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jun 25 08:34:14 rzn-pve6 pvescheduler[588887]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jun 26 05:16:20 rzn-pve6 pvescheduler[1655438]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 02 08:08:14 rzn-pve6 pvescheduler[835586]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 02 08:09:12 rzn-pve6 pvescheduler[835862]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 02 08:09:14 rzn-pve6 pvescheduler[835863]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Jul 02 08:11:14 rzn-pve6 pvescheduler[838590]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 02 08:13:14 rzn-pve6 pvescheduler[841334]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Jul 02 08:27:24 rzn-pve6 pvescheduler[851552]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout

The current start attempt produces no log entries at all.
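
To get an overview of the lock timeouts in a journal like the one above, the lines can be tallied per lock and per day. The following is a rough sketch, not an official tool: the regular expression is an assumption based on the line format shown in this post, and it deliberately stops at "got lock request" so that lines truncated by the pager still match.

```python
import re
from collections import Counter

# Captures the date ("Jun 18") and the lock name ("file-jobs_cfg").
LOCK_ERROR = re.compile(
    r"^(\w{3} \d{2}) .*cfs-lock '([^']+)' error: got lock request"
)


def tally_lock_timeouts(lines):
    """Count cfs-lock timeout lines by lock name and by day."""
    by_lock, by_day = Counter(), Counter()
    for line in lines:
        match = LOCK_ERROR.search(line)
        if match:
            day, lock = match.groups()
            by_lock[lock] += 1
            by_day[day] += 1
    return by_lock, by_day


# A few lines copied from the journal output above.
SAMPLE = [
    "Jun 14 16:54:10 rzn-pve6 systemd[1]: Started Proxmox VE scheduler.",
    "Jun 18 08:09:14 rzn-pve6 pvescheduler[304981]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout",
    "Jun 18 08:09:14 rzn-pve6 pvescheduler[304982]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout",
    "Jun 25 08:06:12 rzn-pve6 pvescheduler[565696]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout",
]

by_lock, by_day = tally_lock_timeouts(SAMPLE)
print(dict(by_lock))
print(dict(by_day))
```

Feeding the full journal through the same function makes it easier to see whether the timeouts cluster on particular days.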
 
Code:
pve6:~# systemctl status pve-guests.service
● pve-guests.service - PVE guests
     Loaded: loaded (/lib/systemd/system/pve-guests.service; enabled; vendor preset: enabled)
     Active: activating (start) since Mon 2022-07-04 11:38:41 MSK; 2 weeks 0 days ago
   Main PID: 5837 (pvesh)
      Tasks: 2 (limit: 618971)
     Memory: 138.5M
        CPU: 2.523s
     CGroup: /system.slice/pve-guests.service
             └─5837 /usr/bin/perl /usr/bin/pvesh --nooutput create /nodes/localhost/startall