[SOLVED] task vgs: blocked for more than xxxx seconds

albindy

New Member
Sep 20, 2019
Hi!

I have a problem that appears on 3 different PVE servers.
One of them was upgraded from PVE 5.4 to 6.1; the other 2 are freshly installed 6.1 systems.

During every backup, but also at random times, the WebUI becomes mostly unresponsive and shows question marks on the VMs and the storages.
In dmesg and the system log I get "task vgs: blocked for more than xxxx seconds" or "task lvs: blocked for more than xxxx seconds".
lvs, pvs and vgs hang on the command line and cannot be killed.
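
Commands that hang and ignore kill signals like this are usually stuck in uninterruptible sleep (state `D`), which is also the condition the kernel's "blocked for more than" message reports. A minimal sketch for listing such processes, assuming the standard Linux `/proc/<pid>/stat` layout; the function names are made up for illustration:

```python
import os

def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split at the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left].strip())
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state

def blocked_processes(proc_root="/proc"):
    """List (pid, comm) of processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

If lvs, pvs or vgs show up in this list, they are waiting on I/O that never completes, which would point at a block device rather than at LVM itself.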

The only thing that helps is rebooting the PVE host.
During the reboot, LVM hangs for up to 10 minutes before it stops, but then the host reboots normally.

The storages are local RAID disks (one used for the installation, 2 with LVM created via the WebUI), RDX via USB 3, and NFS on a Synology (based on NFS 4.0, also tested with 4.1).

Possibly important: the system that was upgraded ran with the same setup for over half a year, and other systems with the same setup that were not upgraded to 6.1 are working like a charm.

Maybe someone has a hint for me, because the systems are basically working but are not manageable in this state.
Vzdump backups also do not run while PVE is in this state.

Kind Regards
Alex
 
Additional Information:

Maybe related to
https://bugzilla.redhat.com/show_bug.cgi?id=1569431
Because on systemctl restart lvm2-monitor I get:
lvm2-monitor.service: Stopping timed out. Terminating.
lvm2-monitor.service: State 'stop-sigterm' timed out. Killing.
lvm2-monitor.service: Killing process 13015 (lvm) with signal SIGKILL.
...
lvm2-monitor.service: State 'stop-sigterm' timed out. Killing.
lvm2-monitor.service: Killing process 13015 (lvm) with signal SIGKILL.
...
lvm2-monitor.service: Processes still around after final SIGKILL. Entering failed mode.
lvm2-monitor.service: Failed with result 'timeout'.
Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
lvm2-monitor.service: Found left-over process 13015 (lvm) in control group while starting unit. Ignoring.
This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...

But still no working lvm2-monitor.
 
I have tested with pve-kernel-5.3.13-1-pve, pve-kernel-5.0.21-5-pve and pve-kernel-4.15.18-24-pve, but with 4.15 the Windows VMs do not start.
The other two kernel versions both show the same problem.
The main problem is that I have no backups of the systems, and this is slowly becoming a big problem.

Does no one have any advice or an idea?

KR
Alex
 
Hi All

In case someone has the same problem: in my case there were 2 block devices causing the issues.

First:
The HP servers have an internal card reader, which can be disabled in the BIOS. It causes the issues because it is recognized as a block device but has no partition and does not respond to any access. Disabling this device in the BIOS solved this one and stopped the hangs of the GUI and of the block-device-related programs.
This was the problem on the 2 freshly installed machines. It seems to be kernel related.
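
One way to look for candidates like this before touching the BIOS is to check which block devices report a size of 0 in sysfs, which card readers without media often do. A rough sketch, assuming the usual `/sys/block/<name>/size` files (size in 512-byte sectors); whether the HP reader reported 0 was not checked here, and the function names are made up:

```python
import os

def empty_devices(sizes):
    """Given {device_name: size_in_sectors}, return the names with size 0."""
    return sorted(name for name, sectors in sizes.items() if sectors == 0)

def read_sysfs_sizes(sys_block="/sys/block"):
    """Read the reported size of every block device listed in sysfs."""
    sizes = {}
    for name in os.listdir(sys_block):
        try:
            with open(os.path.join(sys_block, name, "size")) as fh:
                sizes[name] = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # entry vanished or has no readable size file
    return sizes

if __name__ == "__main__":
    for name in empty_devices(read_sysfs_sizes()):
        print("/dev/" + name, "reports size 0")
```

Reading sysfs does not open the device itself, so this check should not hang the way lvs or vgs do.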

Second:
The problem on the upgraded machine, besides the first one mentioned above, was the attached RDX drive, or more precisely the storage cartridge. The filesystem on the cartridge was damaged.
Excluding the block device in lvm.conf stopped the hangs of the GUI and the lockups of lvm2-monitor and the block-device-related programs.
Just repairing the filesystem on the RDX cartridge would have been the second option, but excluding the device seems to be the better idea in case a cartridge gets damaged again.
This problem was not related to the PVE version or the kernel. It was just a stupid coincidence, because the RDX cartridge was changed at the same time the upgrade was made.

Excluding all devices that are not PVs from LVM seems like a good idea to me and, I think, would also have helped with the first problem. Comments on this idea are welcome!
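
As a rough illustration of that idea, the small helper below builds a `global_filter` line for the `devices` section of /etc/lvm/lvm.conf that accepts only the listed PVs and rejects everything else. The device paths are hypothetical placeholders and the helper name is made up; this is a sketch, not the exact line used on these machines:

```python
import re

# Only allow characters that need no escaping inside an LVM filter regex.
_SAFE_PATH = re.compile(r"^/dev/[A-Za-z0-9_/-]+$")

def build_global_filter(pv_devices):
    """Return a global_filter line that accepts only the given devices."""
    patterns = []
    for dev in pv_devices:
        if not _SAFE_PATH.match(dev):
            raise ValueError("unexpected characters in device path: %r" % dev)
        # Anchor the path so /dev/sdb does not also match /dev/sdb1.
        patterns.append('"a|^%s$|"' % dev)
    # Reject every device that was not accepted above.
    patterns.append('"r|.*|"')
    return "global_filter = [ " + ", ".join(patterns) + " ]"

if __name__ == "__main__":
    # Hypothetical PVs; replace with the devices that `pvs` lists on your host.
    print(build_global_filter(["/dev/sda3", "/dev/sdb"]))
```

With the placeholder devices this prints `global_filter = [ "a|^/dev/sda3$|", "a|^/dev/sdb$|", "r|.*|" ]`. Make sure the PV that holds the root volume group is in the accept list before rebooting, otherwise the host may not find its root filesystem.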

Hope this helps someone in a similar situation.

KR
Alex
 