Kernel 5.4.44 causes system freeze on HP MicroServer Gen8

SnowFox
Jun 27, 2020
Last week pve-kernel 5.4.44 was released. I installed this and rebooted my machine (HP Microserver Gen8 running Proxmox 6.2).
A few hours later the system suddenly froze up completely and required a reboot. Nothing in the logs afterwards, so probably a kernel panic?

It seemed to work fine again for a while, but ~20 hours later it happened yet again. Full freeze.
I rebooted the machine and chose the old kernel (5.4.41), which has always worked properly and has kept working for the past 7 days since switching back from .44 to .41.

From git log it looks like this was changed between .41 and .44:
 
Hi,

same here after upgrading from 5.4.34-1-pve to 5.4.44-1-pve.
The system stalls after about 3 hours and freezes!
Running the "old" 5.4.34-1 is smooth, without problems.
Is there a way to select the former 5.4.34-1-pve kernel to auto-boot with UEFI again?

System is Intel(R) Pentium(R) CPU J4205 @ 1.50GHz, Motherboard J4205-ITX, 16GB RAM.
No problems before the kernel update!

Screenshot from the console attached: IMG_6775_small.jpg

Thanks in advance.

Regards,

Christian
 
I got the same issue on 2 Dell R710 servers. I rebooted them and loaded the old kernel, which seems better now, but I'm continuing to monitor.
 
Same here.
My home server suddenly freezes randomly.
I am running a memtest just to be sure.
I will try the old kernel once the memtest is done.

Asus P10S-i
Xeon 1230 v6
32 GB ECC Kingston
AMD WX 3100

edit: memtest success

edit2: actually no freeze with 5.4.41-1
 
In case it helps, I'm not seeing it on my Dell R310 testbed systems.
Intel(R) Xeon(R) CPU X3470 @ 2.93GHz
They have no significant load, though. Just CentOS 7 test VMs doing absolutely nothing.
I've backed them all up multiple times, rebooted them, suspended them, etc., just to see if I could trigger something, but no.

But as a precaution, I'm not booting into this kernel on my production systems.
 
This may be fallout from some more in-depth changes for some KVM/kernel issues. We're investigating this on some older HW
to see if we can pinpoint it. For now, please just boot the previous 5.4.41-1-pve kernel.
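To make the previous kernel the default across reboots (as asked above), here is a minimal sketch assuming a GRUB-based boot; UEFI installs that use systemd-boot have to pick the kernel from the boot menu instead. The edit is demonstrated on a temporary copy with sample content, and the menu-entry title is an assumption; look up the exact one in your /boot/grub/grub.cfg.

```shell
# Demonstrated on a temp file with sample content; on a real system edit
# /etc/default/grub directly and then run `update-grub`.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
EOF

# The exact submenu title is an assumption here; find yours with:
#   grep "menuentry '" /boot/grub/grub.cfg
entry="Advanced options for Proxmox VE GNU/Linux>Proxmox VE GNU/Linux, with Linux 5.4.41-1-pve"

# Point GRUB_DEFAULT at the known-good kernel's menu entry:
sed -i "s|^GRUB_DEFAULT=.*|GRUB_DEFAULT=\"$entry\"|" "$cfg"
grep '^GRUB_DEFAULT' "$cfg"
```

After editing the real file, `update-grub` regenerates the boot config so the pinned entry is used on the next reboot.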

From git log it looks like this was changed between .41 and .44:

Yes, but not directly, as this is just the packaging repo with the kernel sources checked in as a git submodule. In this update the 5.4.0-38.42 tag got in, so the regression was introduced between 5.4.0-32.36 and that tag:
https://git.proxmox.com/?p=mirror_ubuntu-focal-kernel.git;a=shortlog;h=refs/tags/Ubuntu-5.4.0-38.42
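The commits that landed between those two tags can be listed with git's two-dot range syntax; against the mirror repo that would be `git log --oneline Ubuntu-5.4.0-32.36..Ubuntu-5.4.0-38.42` after cloning. A self-contained demo of the syntax with a throwaway repo (tag names reused for illustration only):

```shell
# Throwaway repo demonstrating the two-dot range; the real command after
# cloning the mirror repo is:
#   git log --oneline Ubuntu-5.4.0-32.36..Ubuntu-5.4.0-38.42
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "state at the older tag"
git tag Ubuntu-5.4.0-32.36
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "change landed between the tags"
git tag Ubuntu-5.4.0-38.42
# Shows only commits reachable from the newer tag but not the older one:
git log --oneline Ubuntu-5.4.0-32.36..Ubuntu-5.4.0-38.42
```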
 
Does an NFS or Samba server run directly on this Proxmox VE host?
 
For me yes, both Samba and NFS run directly on the host


service --status-all
[ + ] apcupsd
[ + ] apparmor
[ + ] atd
[ - ] console-setup.sh
[ + ] corosync
[ + ] cpufrequtils
[ + ] cron
[ + ] dbus
[ + ] edac
[ + ] hddtemp
[ - ] hwclock.sh
[ + ] iscsid
[ - ] keyboard-setup.sh
[ + ] kmod
[ + ] lm-sensors
[ + ] loadcpufreq
[ - ] lvm2
[ - ] lvm2-lvmpolld
[ - ] mdadm
[ - ] mdadm-waitidle
[ + ] netdata
[ + ] networking
[ - ] nfs-common
[ + ] nfs-kernel-server
[ + ] nginx
[ - ] nmbd
[ + ] open-iscsi
[ + ] postfix
[ + ] procps
[ + ] rbdmap
[ + ] rpcbind
[ + ] rrdcached
[ - ] rsync
[ + ] rsyslog
[ - ] samba-ad-dc
[ + ] smartmontools
[ + ] smbd
[ + ] ssh
[ - ] sudo
[ + ] udev
[ - ] ups-monitor
[ - ] x11-common
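For a quicker answer to that question, the listing can be filtered down to the file-sharing daemons ('+' means running). Shown here on a short sample of the output above so it runs anywhere; on the host itself you would pipe `service --status-all` into the same grep:

```shell
# Sample lines copied from the listing above; on the host, pipe the real
# `service --status-all` output into this filter instead.
listing='[ + ] nfs-kernel-server
[ - ] nmbd
[ + ] smbd'
printf '%s\n' "$listing" | grep -E '\] (smbd|nmbd|nfs-kernel-server)$'
```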
 
Same problem here with 5.4.44 on only one machine of the cluster.

It's a machine in production, I can't really touch it.

I booted on 5.4.41 with no problem.

- On this machine SMB and NFS are activated.
- On the others that work with 5.4.44 just NFS is enabled.

CG
 
What disk models are in use?

Also, which IO scheduler do they use? Check with # cat /sys/block/BLOCKDEVICE/queue/scheduler, for example:
Bash:
# cat /sys/block/sda/queue/scheduler
# cat /sys/block/nvme0n1/queue/scheduler
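For reading the answers below: each scheduler file lists all available schedulers and brackets the active one. A self-contained sketch that builds a mock /sys/block tree (so it runs anywhere; the real paths need a Linux host), populated with the values reported in this thread, and prints each device next to its scheduler line:

```shell
# Mock sysfs tree standing in for /sys/block, filled with the scheduler
# values reported in this thread; on a real system loop over /sys/block.
sysblock=$(mktemp -d)
for d in sda sdb nvme0n1; do
    mkdir -p "$sysblock/$d/queue"
done
echo '[mq-deadline] none' > "$sysblock/sda/queue/scheduler"
echo '[mq-deadline] none' > "$sysblock/sdb/queue/scheduler"
echo '[none] mq-deadline' > "$sysblock/nvme0n1/queue/scheduler"

# The bracketed entry is the scheduler currently in use:
for f in "$sysblock"/*/queue/scheduler; do
    dev=${f#"$sysblock"/}
    printf '%s: %s\n' "${dev%%/*}" "$(cat "$f")"
done
```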
 
For me, two NVMe disks:

Code:
cat /sys/block/nvme0n1/queue/scheduler
[none] mq-deadline

cat /sys/block/nvme1n1/queue/scheduler
[none] mq-deadline
 
Two devices:
sda (SSD): [none] mq-deadline
Samsung SSD 860

sdb (disk): [mq-deadline] none
WDC WD10JFCX-68N

Filesystem: ZFS on both devices;
only device sdb (disk) is exported, as a ZFS dataset shared via SMB and NFS.

Regards,

Christian
 
OK, so this looks like a regression with the combination of "a bit older Intel CPU", ZFS and a Samba export. We could observe something like described here at least once on a system here. We're currently investigating to find a better/quicker reproducer (as of now, the earliest it happened was after 3-4 hours of uptime).
 
Not sure about ZFS, because I don't use it. But older Intel CPU with Samba matches.

disks:

disks.PNG

All disks use mq-deadline (all SATA, no NVMe).
Root is on ext4; everything else is LVM or thin-LVM.

cat /sys/block/sd*/queue/scheduler
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none

Tuesday, 30 June 2020, 22:29:19 (UTC+0200)
up 2 days, 11 hours, 56 minutes

- SYSTEM
CPU fan: 1000 RPM (min = 600 RPM)
HDD fan: 703 RPM (min = 450 RPM)
CPU temp: +41.5°C (high = +90.0°C, hyst = +91.0°C)
(crit = +100.0°C)

- HDD TEMP
/dev/sda: INTEL SSDSC2KB480G7: 33°C
/dev/sdb: INTEL SSDSC2KB960G7: 39°C
/dev/sdc: WDC WD30EFRX-68EUZN0: 40°C
/dev/sdd: WDC WD30EFRX-68EUZN0: 41°C
/dev/sde: ST2000LM015-2E8174: 38°C

- STORAGE
Filesystem Size Used Avail Use% Mounted on
udev 16G 0 16G 0% /dev
tmpfs 3,2G 11M 3,2G 1% /run
/dev/sda1 7,9G 4,4G 3,1G 59% /
tmpfs 16G 51M 16G 1% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/pve--backup-backup--vm 1,5T 920G 481G 66% /srv/backup_vm
/dev/mapper/pve--backup-backup--root 7,9G 627M 6,8G 9% /srv/backup_root
/dev/mapper/pve--root-iso 49G 30G 17G 64% /srv/iso
/dev/mapper/wd--raid1-download 98G 24G 70G 25% /srv/download
/dev/mapper/wd--raid1-medias 1,2T 1,1T 103G 92% /srv/mediacenter
/dev/mapper/wd--raid1-temp 196G 121G 66G 65% /srv/temp
/dev/mapper/wd--raid1-backup--pc 492G 249G 218G 54% /srv/backup_pc
/dev/fuse 30M 32K 30M 1% /etc/pve
tmpfs 3,2G 0 3,2G 0% /run/user/1000


md0 : active raid1 sdd1[1] sdc1[2]
2930134464 blocks super 1.2 [2/2] [UU]
bitmap: 0/22 pages [0KB], 65536KB chunk

(parted) p
Model: ATA INTEL SSDSC2KB48 (scsi)
Disk /dev/sda: 480GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number Start End Size File system Name Flags
1 1049kB 8591MB 8590MB ext4 root boot, esp
2 8591MB 12,9GB 4295MB linux-swap(v1) swap
3 12,9GB 13,4GB 537MB ext4 clonezilla
4 13,4GB 14,0GB 537MB ext4 gparted
5 14,0GB 480GB 466GB

pvdisplay
--- Physical volume ---
PV Name /dev/sde1
VG Name pve-backup
PV Size <1,82 TiB / not usable <4,07 MiB
Allocatable yes (but full)
PE Size 4,00 MiB
Total PE 476931
Free PE 0
Allocated PE 476931
PV UUID AO6DZo-9wmI-aEOh-4MNv-fnA5-AUXI-3Gz06W

--- Physical volume ---
PV Name /dev/sdb1
VG Name pve-data
PV Size 894,25 GiB / not usable <2,32 MiB
Allocatable yes (but full)
PE Size 4,00 MiB
Total PE 228928
Free PE 0
Allocated PE 228928
PV UUID fpl0L0-6BGO-mn8G-dZVx-qqqR-rufw-fM5epf

--- Physical volume ---
PV Name /dev/sda5
VG Name pve-root
PV Size 434,13 GiB / not usable <1,82 MiB
Allocatable yes (but full)
PE Size 4,00 MiB
Total PE 111137
Free PE 0
Allocated PE 111137
PV UUID ltziXh-vzLW-yb41-FEJT-kjew-DE3m-WceiMX

--- Physical volume ---
PV Name /dev/md0
VG Name wd-raid1
PV Size <2,73 TiB / not usable <3,44 MiB
Allocatable yes
PE Size 4,00 MiB
Total PE 715364
Free PE 195991
Allocated PE 519373
PV UUID 1im3Lg-SAJs-FohY-MOyn-gzoK-mDzX-9fLSGp

lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
backup-root pve-backup -wi-ao---- 8,00g
backup-vm pve-backup -wi-ao---- 1,46t
cromlvhom-backup pve-backup -wi-ao---- 355,01g
vm-101-disk-0 pve-data Vwi-a-tz-- 4,00m vm-data 12,50
vm-101-disk-2 pve-data Vwi-a-tz-- 300,00g vm-data 86,54
vm-103-disk-1 pve-data Vwi-a-tz-- 128,00g vm-data 12,58
vm-106-disk-1 pve-data Vwi-aotz-- 150,00g vm-data 77,30
vm-107-disk-1 pve-data Vwi-a-tz-- 32,00g vm-data 34,28
vm-109-disk-0 pve-data Vwi-a-tz-- 4,00m vm-data 12,50
vm-112-disk-0 pve-data Vwi-a-tz-- 120,00g vm-data 2,29
vm-data pve-data twi-aotz-- 894,03g 45,35 32,58
base-200-disk-0 pve-root Vri---tz-k 50,00g vm-os
base-200-disk-1 pve-root Vri---tz-k 4,00m vm-os
base-201-disk-0 pve-root Vri---tz-k 8,00g vm-os
iso pve-root -wi-ao---- 50,00g
vm-100-disk-0 pve-root Vwi-aotz-- 8,00g vm-os 87,74
vm-101-disk-0 pve-root Vwi-a-tz-- 60,00g vm-os 71,83
vm-102-disk-0 pve-root Vwi-a-tz-- 4,00g vm-os 79,94
vm-103-disk-0 pve-root Vwi-a-tz-- 16,00g vm-os 34,41
vm-104-disk-0 pve-root Vwi-a-tz-- 4,00g vm-os 96,58
vm-105-disk-0 pve-root Vwi-a-tz-- 6,00g vm-os 94,97
vm-106-disk-0 pve-root Vwi-aotz-- 8,00g vm-os 99,14
vm-107-disk-0 pve-root Vwi-a-tz-- 8,00g vm-os 57,20
vm-108-disk-0 pve-root Vwi-a-tz-- 8,00g vm-os 34,10
vm-109-disk-0 pve-root Vwi-a-tz-- 50,00g vm-os 60,53
vm-110-disk-0 pve-root Vwi-a-tz-- 4,00g vm-os 69,05
vm-110-disk-1 pve-root Vwi-aotz-- 4,00g vm-os 41,63
vm-111-disk-0 pve-root Vwi-a-tz-- 4,00m vm-os 6,25
vm-111-disk-1 pve-root Vwi-a-tz-- 100,00g vm-os 34,01
vm-112-disk-0 pve-root Vwi-a-tz-- 16,00g vm-os 20,47
vm-113-disk-0 pve-root Vwi-a-tz-- 8,00g vm-os 11,90
vm-113-disk-1 pve-root Vwi-a-tz-- 8,00g vm-os 16,04
vm-113-disk-2 pve-root Vwi-a-tz-- 8,00g vm-os 15,93
vm-113-disk-3 pve-root Vwi-a-tz-- 8,00g vm-os 15,93
vm-113-disk-4 pve-root Vwi-a-tz-- 4,00m vm-os 6,25
vm-os pve-root twi-aotz-- 383,93g 49,43 33,00
backup-pc wd-raid1 -wi-ao---- 500,00g
download wd-raid1 -wi-ao---- 100,00g
medias wd-raid1 -wi-ao---- 1,20t
temp wd-raid1 -wi-ao---- 200,00g