Kernel panic related to storage?

jdruwe

Hey guys, I am yet again experiencing a kernel panic. This is what I was able to see on the screen that is directly connected to my NUC; I am not sure whether there is another way to get a full log of this event:

106552295_676548406229118_372823275267394857_n.jpg
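On the question of getting a fuller log than a screen photo: a minimal sketch, assuming systemd-journald is in use and can still write to disk at the time of the crash (with I/O errors on the root device it may not be able to). The function names are made up for illustration; `journalctl -k -b -1` is the standard way to show kernel messages from the previous boot.

```shell
# Succeeds if journald has a persistent store in the given directory
# (default: /var/log/journal). Without it, logs from earlier boots are
# not kept by journald.
journal_is_persistent() {
    [ -d "${1:-/var/log/journal}" ]
}

# Print the last 100 kernel messages (-k) of the previous boot (-b -1).
previous_boot_kernel_log() {
    journalctl -k -b -1 --no-pager | tail -n 100
}

# Example on the affected host:
#   journal_is_persistent && previous_boot_kernel_log
```

If the check fails, creating `/var/log/journal` and restarting systemd-journald should make later boots persistent; the classic `/var/log/kern.log` may also be worth a look if rsyslog is installed.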

Output from pveversion -v:

Bash:
root@pve:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
pve-manager: 6.2-6 (running version: 6.2-6/ee1d7754)
pve-kernel-5.4: 6.2-4
pve-kernel-helper: 6.2-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-3
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-8
pve-cluster: 6.1-8
pve-container: 3.1-8
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1

I am using a USB drive for backups, a Zigbee USB stick in pass-through mode for home automation, and a Kingston A2000 M.2 NVMe SSD. I thought it might be related to this post: https://forum.proxmox.com/threads/nvme-ssd-driver-or-kernel-problem.31845/ but that seems to have been solved 3 years ago already. Can anyone help me keep my NUC from freezing?
 
You have I/O errors on the dm-1 block device, which is probably your root device. The kernel panic is then probably just a result of those errors, not the underlying problem itself.

What are you using as the main disk? Is smartctl showing any errors or problems? Judging from the errors above, the disk seems pretty faulty.
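To check which volume dm-1 actually is, a small sketch that reads the device-mapper name from sysfs. The helper name `dm_name` and its optional second argument (an alternative sysfs root, only there so the function can be tried without a real device) are assumptions for illustration.

```shell
# Print the device-mapper name behind a dm-N kernel device, e.g. dm-1.
# $1: kernel device name, $2: sysfs root (defaults to /sys)
dm_name() {
    cat "${2:-/sys}/block/$1/dm/name"
}

# Example on the affected host:
#   dm_name dm-1
#   lsblk -o NAME,KNAME,TYPE,SIZE,MOUNTPOINT
```

The `lsblk` call lists the same mapping for all devices, including which physical disk each logical volume sits on.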
 
I have 2 disks:

1593766395182.png
/dev/nvme0n1 is used for my main storage:

Code:
root@pve:~# smartctl -a /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.44-2-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KINGSTON SA2000M8500G
Serial Number:                      50026B7282536DB4
Firmware Version:                   S5Z42105
PCI Vendor/Subsystem ID:            0x2646
IEEE OUI Identifier:                0x0026b7
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          500,107,862,016 [500 GB]
Namespace 1 Utilization:            44,674,641,920 [44.6 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            0026b7 282536db45
Local Time is:                      Fri Jul  3 10:54:24 2020 CEST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     75 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
0 +     9.00W       -        -    0  0  0  0        0       0
1 +     4.60W       -        -    1  1  1  1        0       0
2 +     3.80W       -        -    2  2  2  2        0       0
3 -   0.0450W       -        -    3  3  3  3     2000    2000
4 -   0.0040W       -        -    4  4  4  4    15000   15000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        23 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    4,545,172 [2.32 TB]
Data Units Written:                 1,149,848 [588 GB]
Host Read Commands:                 31,699,751
Host Write Commands:                50,846,125
Controller Busy Time:               2,074
Power Cycles:                       76
Power On Hours:                     2,387
Unsafe Shutdowns:                   40
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, max 256 entries)
No Errors Logged

There seem to be no errors for the disk.
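For comparing reports like the one above over time, a rough sketch that pulls a handful of health counters out of `smartctl -a` output. The function name is hypothetical and the field list is just a selection from the output shown here, not a complete health check.

```shell
# Read `smartctl -a` output on stdin and print selected NVMe health
# fields as name=value lines.
nvme_health_summary() {
    awk -F': *' '
        /^(Critical Warning|Available Spare|Percentage Used|Power Cycles|Unsafe Shutdowns|Media and Data Integrity Errors|Error Information Log Entries):/ {
            print $1 "=" $2
        }'
}

# Example on the affected host:
#   smartctl -a /dev/nvme0n1 | nvme_health_summary
```

In the report above, the error counters are all zero, but 40 of the 76 power cycles were unsafe shutdowns, which fits a host that keeps freezing.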

/dev/sda is used for VM/container backups.
 
Hello,

It seems to be a problem with the Kingston A2000 M.2 NVMe SSD. We would like to set up a homelab and are having the same issues...
Problem found on PVE 6.2-15.
 
Yes indeed, I replaced the SSD with another one from Crucial and haven't seen the issue since.