This can happen if that node can't reach some storage, typically a network one. Try to execute
df -hl
and
df -h
on that server. The first one will only list local filesystems and should run correctly, but the second one may never finish if some remote filesystem is unreachable.

I did have an issue: our QNAP was unresponsive... It is now responsive again and I have rebooted this node, but it still will not come up in the GUI.
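
Since the second command may never return, it can help to put a time limit on both checks. The following is only a sketch: the helper name probe and the 10-second limit are made up for illustration, and it assumes GNU coreutils timeout is available on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with a time limit and report the result.
# Assumes GNU coreutils `timeout`; output of the command itself is discarded.
probe() {
    local limit="$1"
    shift
    if timeout "$limit" "$@" > /dev/null 2>&1; then
        echo "ok: $*"
    else
        echo "failed or timed out: $*"
    fi
}

# Usage on the affected node (not executed here):
#   probe 10 df -hl   # local filesystems only
#   probe 10 df -h    # includes network mounts
```

If the local check reports ok and the full one does not, a remote filesystem is the likely suspect. Note that a process blocked in uninterruptible I/O on a dead mount may not react to the signal that timeout sends, so the limit is not a guarantee.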
pvesm status
and paste its output.

pve-cluster.service
? Any error on one of the other nodes?

pvesh get /cluster/resources
(both on PVE-05 and on another node).

PVE-05
?

vgdisplay
displays the first volume group, then:
Giving up waiting for lock.
Can't get lock for pve.
Cannot process volume group pve
pvesm status
in your system.

lsof | grep "/var/lock/lvm/"
to see if any process is holding the lock file for the pve volume group. If there is any, check if it should/can be killed. Then remove the lock file, which should be /var/lock/lvm/pve

root@Proxmox03:~# lsof | grep "/var/lock/lvm/"
root@Proxmox03:~#
This returns nothing. I also looked in the LVM lock folder:
root@Proxmox03:/var/lock/lvm# ls
P_global V_pve V_pve:aux

Could you also try
lsof /var/lock/lvm
? (Since /var/lock/ is usually a symlink to /run/lock/, the grep command might not match.)
vgdisplay -ddd
(-d for debug)?
pveversion -v
?

So I noticed over the weekend that the locks seemed to drop, so I'm able to access the VG again. However, the machine still shows ? and the tasks are still showing as running, but I can't stop them. The VMs are also running as expected.

What is the output of
cat /var/log/pve/tasks/active
on the node with the stuck tasks? Are there any errors in the output of
journalctl -b0 -u pvestatd.service
? What about /var/log/syslog
?

root@Proxmox03:~# cat /var/log/pve/tasks/active
UPID:Proxmox03:000500C0:1ACAC2E1:61CC82EF:qmstart:25510:root@pam: 0
UPID:Proxmox03:00050B18:1ACB80C0:61CC84D5:qmstart:14023:root@pam: 0
UPID:Proxmox03:0005416E:1AD4D69A:61CC9CBB:imgdel:14023@local-lvm:root@pam: 0
UPID:Proxmox03:001D09BE:1E48B34B:61D5736B:aptupdate::root@pam: 1 61D5737B OK
The ones with status 0 seem to be the ones that are stuck. Can those be manually killed somehow?

The second part of the UPID is the PID in hexadecimal. You can use e.g.
echo "ibase=16; 000500C0" | bc
to convert it to decimal. And I'd check if the process is actually what you think it is (e.g. with
ps --pid 327872
) before sending a SIGTERM/SIGKILL.
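
The hexadecimal conversion can also be done in bash itself. This is a minimal sketch, assuming the UPID is colon-separated with the literal UPID first, the node name second and the hexadecimal PID third; the function name upid_pid is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the decimal PID embedded in a task UPID.
# Assumed layout: UPID:<node>:<pid in hex>:<further fields...>
upid_pid() {
    local upid="$1" tag node hex rest
    IFS=: read -r tag node hex rest <<< "$upid"
    if [[ "$tag" != "UPID" || ! "$hex" =~ ^[0-9A-Fa-f]+$ ]]; then
        echo "not a UPID: $upid" >&2
        return 1
    fi
    # 16#... is bash arithmetic syntax for a base-16 literal.
    echo $((16#$hex))
}

# Usage on the node (not executed here):
#   ps --pid "$(upid_pid 'UPID:Proxmox03:000500C0:1ACAC2E1:61CC82EF:qmstart:25510:root@pam:')"
```

For the first stuck task this prints 327872, the same PID as in the ps example above. Checking with ps first matters because a PID can be reused by an unrelated process after the original task has exited.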