Boot Failure on upgrade to VE 7.2 with kernel 5.15.35-1-pve

rotor-head

Member
Jul 10, 2020
Encountered this issue today on upgrading a stand-alone install. This is on a Dell T310 with an MPTSAS card with 2 SSDs attached.

I've pinned the 5.13.19-6-pve kernel using the proxmox-boot-tool in the meantime. Interested to know if anyone else has had this issue or if there's a way to correct the issue.

Output:

... DMAR: [DMA Read NO_PASID] Request device [05:00.0]000 [fault reason 0x06] PTE Read access is not set
... 3.521802] mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
... ata5: COMRESET failed (errorno=-16)
repeats 3 times
ata5: reset failed, giving up

Fails to import rpool (presumably due to the drive failing to initialize)

dumps into BusyBox
(initramfs)
 
Hi,
I also have this issue with an HPE Smart Array P410 when using the latest kernel, 5.15.35-1-pve; with 5.13.19-6-pve it works without any issue.
May 07 03:00:51 kernel: DMAR: [DMA Read NO_PASID] Request device [01:00.2] fault addr 0xf363e000 [fault reason 0x06] PTE Read access is not set
The Ceph object storage daemon fails to start: all the SSDs are detected but inaccessible, so Ceph can't start, and the logs are full of the kernel fault above.
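
The DMAR lines quoted so far name the PCI device that triggered the fault ([05:00.0] and [01:00.2]). A minimal sketch for pulling those addresses out of a log, so they can be matched to a controller with lspci, might look like this. The function name dmar_fault_devices is made up for illustration, and the pattern only assumes the message format seen in this thread.

```shell
#!/bin/sh
# Print the unique PCI addresses named in DMAR fault lines read from stdin.
# The pattern matches the "Request device [bb:dd.f]" part of the messages
# quoted in this thread; other kernel versions may format it differently.
dmar_fault_devices() {
    sed -n 's/.*DMAR:.*Request device \[\([0-9a-fA-F:.]*\)\].*/\1/p' | sort -u
}

# Hypothetical usage on an affected host (run as root):
#   dmesg | dmar_fault_devices | while read -r dev; do lspci -nn -s "$dev"; done
```

If the address turns out to be the HBA/RAID controller, that would at least be consistent with the storage not coming up, though it does not prove the cause.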
 
I've seen the same issue on a Dell R340 - it was failing to find or load megaraid_sas. So it failed to boot further and went into busybox (unfortunately I didn't get a screenshot of the errors before rebooting).
 
That seems to be the common theme here. Either failed or non-existent drivers for the HBA/RAID PCI cards in the new kernel.
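
Whether the driver is actually missing can be checked rather than guessed. The following is only a sketch: listing_has_module is a made-up helper that looks for a module file in a listing, and the paths in the usage comments assume the usual Debian/Proxmox layout. Note that nothing in this thread confirms the module is absent; it may be present and simply fail to initialize.

```shell
#!/bin/sh
# Return success if a module file named $1 appears in a file listing on stdin.
# Matches ".../megaraid_sas.ko" as well as compressed names such as ".ko.xz".
listing_has_module() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}

# Hypothetical usage on the affected host:
#   find /lib/modules/5.15.35-1-pve -name '*.ko*' | listing_has_module megaraid_sas \
#       && echo "module is installed for this kernel"
#   lsinitramfs /boot/initrd.img-5.15.35-1-pve | listing_has_module megaraid_sas \
#       && echo "module is in the initramfs"
```

If both checks succeed, the problem is more likely a driver that loads but cannot initialize the card than a driver that was dropped.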
 
I had the same error with a T320. The solution to get the OS to boot was to enter the BIOS and turn off virtualization.
 
Same issue with Dell T420.

Wouldn't turning off virtualization in the BIOS have a negative impact on the CPU extensions?

Update: Pinned the previous version, 5.13.19-6, and it works fine for now.
 
Same here: two DL360 G6 machines with the HPE Smart Array P410. I pinned 5.13.19-6-pve, which works well. Another set of the same DL360 servers boots fine with 5.15.35-1-pve, but those use GlusterFS for their VMs. I did no further investigation, as I'm 1000 km away from these machines.
 
Same problem here...

megaraid_sas 0000:01:00:0: Failed to init firmware / Failed to do reset
/dev/mapper/pve-root on /root failed: input/output error [ 1011.642]

Any solution? Temporary is fine.
 
I would think disabling virtualization support is a temporary way to boot into the OS, but not a viable workaround, presuming Proxmox is being used to host VMs. It was faster to select the prior kernel at the GRUB boot screen and pin it. Perhaps that was the intent of the post.
 
For the HPE P410 (or rather, for somewhat older HP servers), please also try the suggestions from:
https://forum.proxmox.com/threads/kernel-5-15-30-2-break-hpe-smart-array-p222.109298/#post-469898

(As written there, I did not run into these issues on the one HP G8 we have here in our test lab.)
 
Otherwise, and in general, please try to update all firmware on the affected systems (Dell in particular provides updates for quite a long time, and quite comfortably, through the Lifecycle Controller). Some of these issues are resolved directly by updated firmware.
 
Screenshot_3.png

On an HP Gen8, IOPS keep jumping even though all VMs are turned off,

and the system intermittently drops. It has been like this since 5.15.


Screenshot_4.png


Resetting the server helps for a while, but the problem comes back after a day.

I'm back on 5.13 for now.
 
My Dell T320 has the latest BIOS (2.9.0) and the latest iDRAC firmware, and the boot problem with Proxmox 7.2.3 continues. The way to boot is to select a previous kernel at the GRUB screen.
 
As stated previously: run as root:

proxmox-boot-tool kernel pin 5.13.19-6-pve

Then you won't get a bad surprise after an outage.
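
For anyone who wants the whole sequence in one place, here is a hedged sketch around that command. The run wrapper and the DRY_RUN variable are my own additions for illustration (they only print the commands by default); the list, pin and unpin subcommands are the ones proxmox-boot-tool provides in 7.2.

```shell
#!/bin/sh
# Known-good kernel reported in this thread.
KNOWN_GOOD="5.13.19-6-pve"

# Run a command, or only print it when DRY_RUN=1 (the default here, so the
# sketch is safe to paste; set DRY_RUN=0 as root on the real host).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pin_known_good() {
    run proxmox-boot-tool kernel list       # list the kernels known to the tool
    run proxmox-boot-tool kernel pin "$1"   # boot this version by default from now on
}

unpin_kernel() {
    run proxmox-boot-tool kernel unpin      # return to booting the newest kernel
}

pin_known_good "$KNOWN_GOOD"
```

After the next reboot, uname -r should report the pinned version. Once a fixed kernel is available, unpin_kernel undoes the pin.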
 
Same issue on multiple Dell R340s. It got me a nice drive to the datacenter yesterday evening.
They boot fine under 5.13.19-6-pve.
Is there an update available for this?
Can I safely execute "proxmox-boot-tool kernel pin 5.13.19-6-pve" via shell or SSH?
kernel.jpg
 
The R340 should be new enough, but I'd suggest trying the following:
* Make sure the latest firmware updates are installed for all components (the Lifecycle Controller on Dell servers makes this quite comfortable).
* Boot into the 5.15 kernel. If the issue persists, follow the known-issues section about setting intel_iommu=off:
https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_7.2
* Boot into the 5.15 kernel again.

Let us know if any of this fixed your issues.
Regarding running "proxmox-boot-tool kernel pin 5.13.19-6-pve" via shell or SSH: yes, this should work remotely as well.
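
For the intel_iommu=off suggestion, the parameter goes on the kernel command line. A small sketch of adding it without duplicating it follows. add_kernel_param is a hypothetical helper that only edits a string; where that string lives depends on the bootloader, as noted in the comments.

```shell
#!/bin/sh
# Append a kernel parameter to a command-line string unless it is already there.
# It does not remove a conflicting setting such as intel_iommu=on.
add_kernel_param() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) printf '%s\n' "$cmdline" ;;
        *)            printf '%s\n' "${cmdline:+$cmdline }$param" ;;
    esac
}

# Where the resulting string belongs (edit by hand, as root):
#   GRUB:         GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then run update-grub
#   systemd-boot: the single line in /etc/kernel/cmdline, then run proxmox-boot-tool refresh
add_kernel_param "quiet" "intel_iommu=off"
```

After rebooting into 5.15, cat /proc/cmdline shows whether the parameter was picked up.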
 
Hi, I have a similar issue on a PowerEdge R340 server with PERC H330. Had to downgrade to 5.13.19-6 kernel.
Using iDRAC I updated to the latest firmware, the issue still persists. Tried rootdelay but the result is the same.

I confirm proxmox-boot-tool kernel pin 5.13.19-6-pve works, for now.
 