mdadm crash or kernel dump

NickH

I have a 3 x 18TB RAID5 array in Proxmox, which is configured as an NFS share and used by various VMs as a data disk. Earlier today I got a stack trace, but the two key VMs continued to run. However, their data folders became unresponsive. I managed to shut down one VM normally, although it took a long time. The other one would not shut down. I tried shutting down the VM, then lost control of it and had to do a hard reset. Everything has come back up OK and I am running a RAID check, but it will take 1.5 days to complete.

I am running Proxmox 7.4 - pve-manager/7.4-17/513c62be (running kernel: 5.15.143-1-pve) on an HPE ML110 Gen 9 with 72GB RAM. The O/S is on a separate HDD and the VMs on an NVMe drive.

The stack dump I got is attached.

Also from the syslog:
Code:
INFO: task md127_raid5:534 blocked for more than 120 seconds.

Can anyone give any insight into the issue or help me fix it? I think I saw the same sometime late last year.
 

Attachments

  • messages.txt
    23.3 KB