Hello everyone,
I installed a new LSI 9207-8e a couple of days ago, and everything was working well. The host was rebooted a number of times after the hardware install. The plan was eventually to pass the card through to a VM. After some reading, I found many people saying that I should add "mpt3sas.max_queue_depth=10000" to GRUB_CMDLINE_LINUX_DEFAULT:
Code:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
#GRUB_CMDLINE_LINX_DEFAULT="quiet"
#GRUB_CMDLINE_LINX_DEFAULT="quiet intel_iommu=on"
#GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream,multifunction video=efifb:off"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on mpt3sas.max_queue_depth=10000 pcie_acs_override=downstream,multifunction video=efifb:off"
GRUB_CMDLINE_LINUX=""
# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"
# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console
# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480
# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true
# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"
# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
I ran update-grub and everything seemed fine. When I eventually rebooted the host, I was surprised that it did not come back up after several minutes. I waited about 30 minutes, then shut the host down, grabbed my crash cart, and powered it back on. The screen output was pretty simple.
I scratched my head, grabbed my live CD, and managed a rescue boot. I went back to the GRUB file, reverted the change, ran update-grub, and rebooted: same issue. I used the rescue boot again and noticed that the system had loaded kernel 5.15.107-2-pve. That reminded me of something I had briefly seen in the last update-grub output:
Code:
root@pve:/etc/default# update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.15.108-1-pve
Found linux image: /boot/vmlinuz-5.15.107-2-pve
Found initrd image: /boot/initrd.img-5.15.107-2-pve
Found linux image: /boot/vmlinuz-5.13.19-6-pve
Found initrd image: /boot/initrd.img-5.13.19-6-pve
Found linux image: /boot/vmlinuz-5.13.19-2-pve
Found initrd image: /boot/initrd.img-5.13.19-2-pve
Found memtest86+ image: /boot/memtest86+.bin
Found memtest86+ multiboot image: /boot/memtest86+_multiboot.bin
Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
Adding boot menu entry for UEFI Firmware Settings ...
done
I noticed that vmlinuz-5.15.108-1-pve was present, but there was no matching initrd.img-5.15.108-1-pve. Fairly confident that this was part of the problem, I rebooted the host without the rescue boot; instead, I used the advanced options menu and selected 5.15.107-2-pve. The host booted happily.
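In case it helps anyone else spot the same thing, here is the quick ad-hoc loop I put together to flag kernels without a matching initrd. The version below is sketched against a throwaway temp dir (recreating my /boot situation) so it is safe to try anywhere; on a real system you would point `boot` at /boot instead:

```shell
# Recreate the situation in a temp dir: two kernel images, only one initrd.
boot=$(mktemp -d)
touch "$boot/vmlinuz-5.15.108-1-pve" "$boot/vmlinuz-5.15.107-2-pve" \
      "$boot/initrd.img-5.15.107-2-pve"

# Flag any vmlinuz-* that has no matching initrd.img-*.
for k in "$boot"/vmlinuz-*; do
  v="${k##*/vmlinuz-}"
  [ -e "$boot/initrd.img-$v" ] || echo "missing initrd for $v"
done
# prints: missing initrd for 5.15.108-1-pve

rm -rf "$boot"
```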
I'm hoping someone can advise me on how to proceed from here. I've read that I could run "proxmox-boot-tool kernel pin 5.15.107-2-pve" to work around the problem, but I would rather actually fix it so that 5.15.108-1-pve boots.
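From what I've read so far (and I may well be wrong, since I'm new to this), something along these lines might regenerate the missing initrd, but I'd appreciate confirmation before I run it on the host:

```shell
# Regenerate the missing initramfs for that kernel version
# (assuming the matching kernel package is still installed).
update-initramfs -c -k 5.15.108-1-pve

# Then refresh the boot configuration so GRUB picks it up.
update-grub
```

Would that be the right approach, or should the kernel package be reinstalled instead?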
Apologies for the long post; I try to give as much (hopefully relevant) information as possible when asking for help. I'm also pretty much a noob when it comes to GRUB and the Proxmox boot process.
Thank you very much for any advice and help that you can provide.