Crash with Intel 760p NVMe

HighwayStar

New Member
Nov 12, 2019
We have an ASUS RS700A-E9 platform with dual EPYC 7501 CPUs and Proxmox 6.0 installed on 4 HDDs. We wanted to upgrade the HDDs to 2x NVMe drives but hit a kernel panic.


Code:
[   13.738723] i40e 0000:21:00.0: PCI-Express: Speed 8.0GT/s Width x8
[   13.751726] i40e 0000:21:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 119 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[   13.753397] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 4
[   13.757378] {1}[Hardware Error]: event severity: fatal
[   13.757378] {1}[Hardware Error]:  Error 0, type: fatal
[   13.757378] {1}[Hardware Error]:  fru_text: PcieError
[   13.757378] {1}[Hardware Error]:   section_type: PCIe error
[   13.757378] {1}[Hardware Error]:   port_type: 4, root port
[   13.757378] {1}[Hardware Error]:   version: 0.2
[   13.757378] {1}[Hardware Error]:   command: 0x0407, status: 0x0010
[   13.757378] {1}[Hardware Error]:   device_id: 0000:40:01.2
[   13.757378] {1}[Hardware Error]:   slot: 238
[   13.757378] {1}[Hardware Error]:   secondary_bus: 0x41
[   13.757378] {1}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1453
[   13.757378] {1}[Hardware Error]:   class_code: 000406
[   13.757378] {1}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0010
[   13.757378] {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x04500000
[   13.757378] {1}[Hardware Error]:   aer_uncor_severity: 0x004e2030
[   13.757378] {1}[Hardware Error]:   TLP Header: 00000000 00000000 00000000 00000000
[   13.757378] Kernel panic - not syncing: Fatal hardware error!
[   13.757378] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.0.21-4-pve #1
[   13.757378] Hardware name: ASUSTeK COMPUTER INC. RS700A-E9-RS4/KNPP-D32 Series, BIOS 1301 06/17/2019
[   13.757378] Call Trace:
[   13.757378]  <IRQ>
[   13.757378]  dump_stack+0x63/0x8a
[   13.757378]  panic+0x101/0x2a7
[   13.757378]  __ghes_panic.cold.32+0x21/0x21
[   13.757378]  ? ghes_irq_func+0x50/0x50
[   13.757378]  ghes_proc+0xe0/0x140
[   13.757378]  ghes_poll_func+0x2c/0x60
[   13.757378]  call_timer_fn+0x30/0x130
[   13.757378]  run_timer_softirq+0x38a/0x420
[   13.757378]  ? ktime_get+0x40/0xa0
[   13.757378]  ? lapic_next_event+0x20/0x30
[   13.757378]  ? clockevents_program_event+0x93/0xf0
[   13.757378]  __do_softirq+0xdc/0x2f3
[   13.757378]  irq_exit+0xc0/0xd0
[   13.757378]  smp_apic_timer_interrupt+0x79/0x140
[   13.757378]  apic_timer_interrupt+0xf/0x20
[   13.757378]  </IRQ>
[   13.757378] RIP: 0010:cpuidle_enter_state+0xbd/0x450
[   13.757378] Code: ff e8 17 9d 85 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 63 03 00 00 31 ff e8 2a d2 8b ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 89 cf 01 00 00 41 c7 44 24 08 00 00 00 00 48 83 c4 18
[   13.757378] RSP: 0018:ffffafe8c0217e60 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[   13.757378] RAX: ffff91f00e9621c0 RBX: ffffffff893629c0 RCX: 0000000333c38c26
[   13.757378] RDX: 0000000333c38c26 RSI: 0000000333c38bfe RDI: 0000000000000000
[   13.757378] RBP: ffffafe8c0217ea0 R08: ffffffffffc2f714 R09: 0000000000021a80
[   13.757378] R10: 00000037e4dac2dc R11: ffff91f00e961044 R12: ffff91f000b3c000
[   13.757378] R13: 0000000000000002 R14: ffffffff89362a98 R15: ffffffff89362a80
[   13.757378]  cpuidle_enter+0x17/0x20
[   13.757378]  call_cpuidle+0x23/0x40
[   13.757378]  do_idle+0x22c/0x270
[   13.757378]  cpu_startup_entry+0x1d/0x20
[   13.757378]  start_secondary+0x1ab/0x200
[   13.757378]  secondary_startup_64+0xa4/0xb0
[   13.757378] Kernel Offset: 0x6c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   13.757378] Rebooting in 30 seconds..
[   14.707128] scsi 0:0:0:0: CD-ROM            AMI      Virtual CDROM0   1.00 PQ: 0 ANSI: 0 CCS
[   14.714782] scsi 1:0:0:0: Direct-Access     AMI      Virtual Floppy0  1.00 PQ: 0 ANSI: 0 CCS

Tried with kernel 5.0.21-4-pve and with the test kernel 5.3.7-1-pve. Full logs captured from the serial console are attached.
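For reference, the aer_uncor_status value 0x00100000 in the log above decodes to bit 20 of the PCIe Uncorrectable Error Status register, which per the PCIe spec is Unsupported Request (UR). A quick one-liner to list the set bits (the value is copied from the log):

Code:
# list which bits are set in aer_uncor_status; bit 20 = Unsupported Request
python3 -c 's = 0x00100000; print([b for b in range(32) if s >> b & 1])'
# prints: [20]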
 

Was it nvme_core.default_ps_max_latency_us=1500 or nvme_core.default_ps_max_latency_us=5500? I'm using 2x 1.2TB NVMe SSDs (INTEL SSDPE2MX012T7), but the details I found on disabling the unsupported lowest power-saving state refer to using 5500 vs. 1500 for the value.
 
I first checked the available power-saving states with nvme id-ctrl from the nvme-cli package.
Code:
# nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid       : 0x8086
ssvid     : 0x8086
sn        : EDITED
mn        : INTEL SSDPEKKW010T8                    
fr        : 004C  
rab       : 6
ieee      : 5cd2e4
cmic      : 0
mdts      : 6
cntlid    : 1
ver       : 10300
rtd3r     : 7a120
rtd3e     : 1e8480
oaes      : 0x200
ctratt    : 0
rrls      : 0
oacs      : 0x17
acl       : 4
aerl      : 7
frmw      : 0x14
lpa       : 0xf
elpe      : 255
npss      : 4
avscc     : 0
apsta     : 0x1
wctemp    : 348
cctemp    : 353
mtfa      : 50
hmpre     : 0
hmmin     : 0
tnvmcap   : 0
unvmcap   : 0
rpmbs     : 0
edstt     : 5
dsto      : 1
fwug      : 0
kas       : 0
hctma     : 0x1
mntmt     : 303
mxtmt     : 348
sanicap   : 0x3
hmminds   : 0
hmmaxd    : 0
nsetidmax : 0
anatt     : 0
anacap    : 0
anagrpmax : 0
nanagrpid : 0
sqes      : 0x66
cqes      : 0x44
maxcmd    : 0
nn        : 1
oncs      : 0x5f
fuses     : 0
fna       : 0x4
vwc       : 0x1
awun      : 0
awupf     : 0
nvscc     : 0
nwpc      : 0
acwu      : 0
sgls      : 0
mnan      : 0
subnqn    : nqn.2017-12.org.nvmexpress:uuid:11111111-2222-3333-4444-555555555555
ioccsz    : 0
iorcsz    : 0
icdoff    : 0
ctrattr   : 0
msdbd     : 0
ps    0 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    1 : mp:4.60W operational enlat:0 exlat:0 rrt:1 rrl:1
          rwt:1 rwl:1 idle_power:- active_power:-
ps    2 : mp:3.80W operational enlat:0 exlat:0 rrt:2 rrl:2
          rwt:2 rwl:2 idle_power:- active_power:-
ps    3 : mp:0.0450W non-operational enlat:2000 exlat:2000 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0040W non-operational enlat:6000 exlat:8000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

You need to set nvme_core.default_ps_max_latency_us to a value lower than the exit latency (exlat) of the state you want to exclude; the kernel only enables a power state whose exit latency is at or below this threshold. For the drive above, 5500 excludes ps4 (exlat 8000) while still allowing ps3 (exlat 2000), and 1500 would exclude both non-operational states. nvme_core.default_ps_max_latency_us=0 disables APST entirely, which should be fine for server use.
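For completeness, a minimal sketch of how to apply this on a GRUB-booted Proxmox install; the 5500 value assumes a drive with latencies like the one above, and on a ZFS-root system booted via systemd-boot the option goes into /etc/kernel/cmdline followed by pve-efiboot-tool refresh instead.

Code:
# /etc/default/grub: append the option to the default kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=5500"

# regenerate the GRUB config and reboot
update-grub
reboot

# after the reboot, verify that the value took effect
cat /sys/module/nvme_core/parameters/default_ps_max_latency_us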