MegaRAID controller issues with kernel 5.13.19-4-pve

retrojp

After upgrading the kernel, I started having issues with my hardware RAID controller, an LSI MegaRAID SAS 2008 [Falcon] (rev 03). I hadn't had any problems with it in the several months I've been using it.

I normally pass this controller through to a VM, and I noticed this error message:


Code:
Error: Cannot bind 0000:01:00.0 to vfio
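
In case it helps anyone debugging the same thing, these are roughly the commands I used to check which driver the controller was bound to (01:00.0 is the PCI address from the error above; yours may differ):

Code:
lspci -nnk -s 01:00.0      # shows the kernel driver currently in use for the device
lsmod | grep -i megaraid   # check whether megaraid_sas is loaded on the host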

I ran some commands, and lsmod showed the module was not loaded, which I believe is the right thing, since I want it passed through. More weird, though: one of my drives was missing from fdisk -l and blkid, but I could still find it with lsblk and smartctl --scan. The problem was not localized to any one drive; after each boot it would be a different drive. There are no warning lights on the server, and the drives sound healthy. I was also having difficulty powering down: sometimes a black screen, and if I plugged in a keyboard it would remain stuck with just the message about a keyboard being plugged in.
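
Roughly the checks I ran to confirm a drive was visible to the SCSI layer but missing from fdisk/blkid (sdb is just an example; the affected device changed each boot):

Code:
fdisk -l              # affected drive missing here
blkid                 # and here
lsblk                 # but the drive shows up here
smartctl --scan       # and here
smartctl -H /dev/sdb  # quick SMART health check on the affected drive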

Anyway, I reverted to 5.13.19-3-pve and everything seems OK again.
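
In case it's useful to anyone wanting to do the same, this is roughly how to check and confirm the fallback (assuming your system uses proxmox-boot-tool; on a plain GRUB install the older kernel can be picked from the Advanced options boot menu instead):

Code:
proxmox-boot-tool kernel list   # list the kernels available to boot
uname -r                        # after rebooting into 5.13.19-3-pve, confirm it's the running kernel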

At the time I never checked dmesg. I have looked through the syslog and nothing jumps out at me (I'm no log specialist); I can include it if anyone wants. There might have been one thing, but it might have been me doing a hot swap while checking the drives. I didn't get these messages 24 hours ago when the problem first happened, so I'm guessing they came from the hot swapping. Messages below.

These messages never appeared with the 5.13.19-3-pve kernel, but then they never appeared 24 hours ago either.

Code:
Feb  8 23:08:46 jp kernel: [80782.942506] blk_update_request: I/O error, dev sdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Feb  8 23:08:46 jp kernel: [80782.942642] Buffer I/O error on dev sdb, logical block 0, async page read
Feb  8 23:08:46 jp kernel: [80782.942781] sd 0:0:14:0: [sdb] tag#22 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
Feb  8 23:08:46 jp kernel: [80782.942918] sd 0:0:14:0: [sdb] tag#22 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
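
As far as I understand, the DID_BAD_TARGET host byte means the SCSI layer considered the target gone or invalid, rather than a read failing on the media, which would fit a hot swap. Roughly what I'd check on the drive afterwards (sdb and the 0:0:14:0 target are from the log above):

Code:
smartctl -a /dev/sdb      # full SMART report; check the device error log section
dmesg | grep '0:0:14:0'   # any other kernel events for the same SCSI target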
 
Hi, can you ensure that you upgrade to pve-kernel-5.13.19-4-pve in version 5.13.19-9? That is the same kernel ABI as the broken one (which was version 5.13.19-8), but with a regression reverted that sounds like your issue here.
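
Roughly (assuming a repository carrying the newer package version is configured):

Code:
apt update
apt install pve-kernel-5.13.19-4-pve   # should pull in version 5.13.19-9
apt policy pve-kernel-5.13.19-4-pve    # verify the candidate/installed version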
 
Thanks for getting back to me about this. Well, I can confirm I'm on the broken one :)

Code:
root@jp:~# dpkg --list | egrep -i "5.13.19-|Architecture Description"
||/ Name                                 Version                        Architecture Description
ii  pve-kernel-5.13.19-2-pve             5.13.19-4                      amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.13.19-3-pve             5.13.19-7                      amd64        The Proxmox PVE Kernel Image
ri  pve-kernel-5.13.19-4-pve             5.13.19-8                      amd64        The Proxmox PVE Kernel Image
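
Side note in case it's useful: the ri status in the first column means the package is marked for removal but still installed (first letter is the desired state, second the current state), so reinstalling should bring in the fixed build, something like:

Code:
apt install --reinstall pve-kernel-5.13.19-4-pve
dpkg --list | grep 5.13.19-4-pve   # should now show ii with version 5.13.19-9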
 
