Proxmox 8.1 | Linux 6.5.11-6 | Strange NVMe error prevents boot.

sondzack

New Member
Dec 9, 2023
Hello everyone,
I've run into an issue on my Lenovo M720q Tiny where Proxmox no longer boots. The machine had been running for over a month when it stopped working overnight. It was unresponsive over SSH, and when I plugged in a monitor I found a completely frozen session and a wall of NVMe errors.

I assumed this was the notorious NVMe power-saving issue that has been haunting Linux kernels (and I still suspect it plays a part), but I had already taken the necessary precaution and set the
Code:
nvme_core.default_ps_max_latency_us=0
parameter on the kernel command line long ago.
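
For anyone comparing notes, it can help to confirm that the parameter was actually picked up at boot. The following is a minimal sketch, not an official tool: it compares what was requested in /proc/cmdline with the value exposed by the nvme_core module in sysfs. The helper names `cmdline_value` and `report` are made up for illustration, and the parsing is deliberately simple (it does not handle quoted values that contain spaces).

```python
from pathlib import Path
from typing import Optional

PARAM = "nvme_core.default_ps_max_latency_us"
PROC_CMDLINE = Path("/proc/cmdline")
SYSFS_PARAM = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")


def cmdline_value(cmdline: str, name: str = PARAM) -> Optional[str]:
    """Return the value given for `name` on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            # Assumption: if the parameter is repeated, the last one wins.
            value = val
    return value


def report() -> None:
    """Print the requested value and the value the module is using."""
    requested = None
    if PROC_CMDLINE.exists():
        requested = cmdline_value(PROC_CMDLINE.read_text())
    active = SYSFS_PARAM.read_text().strip() if SYSFS_PARAM.exists() else None
    print(f"requested on command line: {requested}")
    print(f"active in nvme_core:       {active}")
```

Calling `report()` on the affected host (from a rescue shell if necessary) prints both values. If the first one is `None`, the boot loader configuration was probably not regenerated after the edit.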

I rebooted, but the system always hangs with the following message:

Code:
lvm:242 blocked for more than 120 seconds.
Not tainted 6.5.11-6-pve #1
nvme nvme0: Device not ready: aborting reset, CSTS=0x1
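
To make the CSTS value in that message easier to read, here is a small sketch that decodes the NVMe Controller Status register into its named fields. The bit layout follows the NVMe base specification (RDY, CFS, SHST, NSSRO, PP); the function name `decode_csts` is just an illustration. My reading, which is an assumption rather than a confirmed diagnosis, is that during a reset the driver waits for the ready bit to clear, so CSTS=0x1 suggests the controller never acknowledged being disabled.

```python
def decode_csts(csts: int) -> dict:
    """Decode an NVMe Controller Status (CSTS) register value into its fields."""
    shutdown_status = {
        0: "normal operation",
        1: "shutdown in progress",
        2: "shutdown complete",
        3: "reserved",
    }
    return {
        "RDY": bool(csts & 0x01),    # bit 0: controller ready
        "CFS": bool(csts & 0x02),    # bit 1: controller fatal status
        "SHST": shutdown_status[(csts >> 2) & 0x3],  # bits 3:2: shutdown status
        "NSSRO": bool(csts & 0x10),  # bit 4: NVM subsystem reset occurred
        "PP": bool(csts & 0x20),     # bit 5: processing paused
    }


if __name__ == "__main__":
    for field, value in decode_csts(0x1).items():
        print(f"{field}: {value}")
```

For 0x1 only RDY is set and CFS is clear, so the controller itself is not reporting a fatal error.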

Odd. I booted into recovery mode and got the following errors (see attached images).
I checked the NVMe drive with Lenovo's built-in BIOS diagnostic and it passed. The BIOS also recognizes the drive without issue.

Any help would be extremely appreciated.
Kind regards.
 

Attachments

  • IMG_20240209_141540_117.jpg (666.2 KB)
  • IMG_20240209_142357_666.jpg (648.2 KB)
Hi,
I am currently encountering the same issue with Proxmox 7.4-17 (Linux 5.15.143-1).
It seems to have appeared with the last kernel/pve upgrade imo.
 
Read the Linux kernel documentation. The value 0 is not a real setting; specific drives require a specific latency. You will find the formula for working out the correct value for your system there.
 
Same problem here on kernel 6.5.13-5.
I also tried
Code:
nvme_core.default_ps_max_latency_us=0
but it didn't work for me. The system hangs with
Code:
nvme nvme0: Device not ready: aborting reset, CSTS=0x1
 
Did you figure out how to fix this problem?
 
