Hi!
I've just installed Proxmox 5 and performed upgrade & dist-upgrade (to 5.0-31).
I use Proxmox in such a way that I have dozens of VMs in a very frequent cycle of "start -> 5 minutes of running -> stop -> revert". A Python script on a different machine drives this through the public Proxmox API. After some time of running (not even 24h), API calls start timing out and generally fail.
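To make the workload concrete, here is a minimal sketch of one such cycle. It is not the actual script: the node name, VMID and snapshot name are placeholders, and the HTTP layer is passed in as a `post` callable (hypothetical, e.g. a thin wrapper around an authenticated session) so only the order of the API calls is shown.

```python
import time
from typing import Callable, List


def cycle_paths(node: str, vmid: int, snapshot: str) -> List[str]:
    """Return the API paths for one start -> stop -> revert cycle of a QEMU VM."""
    base = f"/api2/json/nodes/{node}/qemu/{vmid}"
    return [
        f"{base}/status/start",
        f"{base}/status/stop",
        f"{base}/snapshot/{snapshot}/rollback",
    ]


def run_cycle(
    post: Callable[[str], object],
    node: str,
    vmid: int,
    snapshot: str,
    run_seconds: float = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run one cycle; `post` is an assumed helper that sends an authenticated POST."""
    start, stop, rollback = cycle_paths(node, vmid, snapshot)
    post(start)         # each call creates a task on the node
    sleep(run_seconds)  # "5 minutes of running"
    post(stop)
    # A real script should wait for the stop task to finish before reverting.
    post(rollback)
```

Every one of these calls creates a task on the node, so with dozens of VMs the task list is updated constantly.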
/var/log/daemon.log looks like this:
Code:
Sep 17 11:25:58 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:25:59 hera pvestatd[3090]: status update time (10.955 seconds)
Sep 17 11:26:09 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:26:10 hera pvestatd[3090]: status update time (10.851 seconds)
Sep 17 11:26:20 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:26:21 hera pvestatd[3090]: status update time (10.838 seconds)
Sep 17 11:26:31 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:26:32 hera pvestatd[3090]: status update time (10.830 seconds)
Sep 17 11:26:42 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:26:43 hera pvestatd[3090]: status update time (10.797 seconds)
Sep 17 11:26:53 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:26:53 hera pvestatd[3090]: status update time (10.787 seconds)
Sep 17 11:27:03 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:27:04 hera pvestatd[3090]: status update time (10.862 seconds)
Sep 17 11:27:14 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:27:15 hera pvestatd[3090]: status update time (10.899 seconds)
Sep 17 11:27:25 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:27:26 hera pvestatd[3090]: status update time (10.824 seconds)
Sep 17 11:27:36 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:27:37 hera pvestatd[3090]: status update time (10.814 seconds)
Sep 17 11:27:47 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:27:48 hera pvestatd[3090]: status update time (10.800 seconds)
Sep 17 11:27:58 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:27:58 hera pvestatd[3090]: status update time (10.799 seconds)
Sep 17 11:28:08 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
Sep 17 11:28:09 hera pvestatd[3090]: status update time (10.891 seconds)
Sep 17 11:28:19 hera pvestatd[3090]: can't lock file '/var/log/pve/tasks/.active.lock' - got timeout
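To see how often this happens and which lock file is affected, the log can be summarised with a few lines of Python. This is only a sketch that assumes the syslog line format shown above; the function name is my own.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as:
# Sep 17 11:25:58 hera pvestatd[3090]: can't lock file '/path' - got timeout
LOCK_TIMEOUT = re.compile(
    r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ (?P<proc>[\w.-]+)\[\d+\]: "
    r"can't lock file '(?P<path>[^']+)' - got timeout$"
)


def count_lock_timeouts(lines: Iterable[str]) -> Counter:
    """Count lock timeouts per (process name, lock file path)."""
    counts: Counter = Counter()
    for line in lines:
        match = LOCK_TIMEOUT.match(line.strip())
        if match:
            counts[(match["proc"], match["path"])] += 1
    return counts
```

It can be fed the log directly, for example `count_lock_timeouts(open("/var/log/daemon.log"))`.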
Once these timeouts start, the API is completely unresponsive. The only thing that helps is "service pvedaemon restart".
Note that I had been using Proxmox 4.4-13/7ea56165 in this way for several months and this issue didn't occur.
I suspect that the problem lies in Tools.pm, in the function "lock_file_full" and its handling of "checkptr", but I'm not that fluent in Perl, so I may be mistaken.
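To illustrate the kind of failure I have in mind, here is a small Python sketch of the general technique (an exclusive flock with a timeout). It is not the Perl code from Tools.pm, and the function name and parameters are my own. If one holder never releases the lock, every other caller runs into the timeout, which would match the messages in the log.

```python
import fcntl
import os
import time


def lock_file_with_timeout(path: str, timeout: float = 10.0, poll: float = 0.1) -> int:
    """Take an exclusive flock on `path`, or raise TimeoutError after `timeout` seconds.

    Returns the open file descriptor; closing it releases the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                # Non-blocking attempt, so we can enforce our own deadline.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return fd
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"can't lock file '{path}' - got timeout")
                time.sleep(poll)
    except BaseException:
        os.close(fd)  # do not leak the descriptor on failure
        raise
```

As long as the first descriptor stays open and locked, a second call on the same path times out; closing the descriptor frees the lock again.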