Kernel 5.4.44 causes system freeze on HP MicroServer Gen8

SnowFox

Member
Jun 27, 2020
Last week pve-kernel 5.4.44 was released. I installed this and rebooted my machine (HP Microserver Gen8 running Proxmox 6.2).
A few hours later the system suddenly froze completely and required a reboot. There was nothing in the logs afterwards, so probably a kernel panic?
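For anyone checking the same thing: with persistent journaling enabled, the previous boot's kernel messages can be read back after a hard reset. A minimal sketch, assuming journald's persistent storage is configured:

Bash:
# List recorded boots, then show the tail of the previous boot's kernel log
# (requires Storage=persistent in /etc/systemd/journald.conf):
journalctl --list-boots
journalctl -b -1 -k --no-pager | tail -n 50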

It then seemed to work fine for a while, but ~20 hours later it happened yet again: a full freeze.
I rebooted the machine and chose the old kernel (5.4.41), which has always worked properly and has kept doing so for the past 7 days (since switching back from .44 to .41).

From git log it looks like this was changed between .41 and .44:
 
Hi,

Same here after an apt-get update from 5.4.34-1-pve to 5.4.44-1-pve.
The system stalls after ~3 hours and freezes!
Running the "old" 5.4.34-1 is smooth, without problems.
Is there a way to make the former 5.4.34-1-pve kernel the default boot entry under UEFI again?

System is Intel(R) Pentium(R) CPU J4205 @ 1.50GHz, Motherboard J4205-ITX, 16GB RAM.
No problems before the kernel update!

Screenshot from the console is attached (IMG_6775_small.jpg).

Thanks in advance.

Regards,

Christian
 
I got the same issue on 2 Dell R710 servers. I rebooted them into the old kernel and they seem better now, but I'm continuing to monitor.
 
Same here.
My home server suddenly freezes at random.
I am running a memtest just to be sure.
I will try the old kernel once the memtest is done.

Asus P10S-i
Xeon 1230 v6
32 GB ECC Kingston
AMD WX 3100

edit: memtest passed

edit2: actually no freeze with 5.4.41-1
 
In case it helps, I'm not seeing it on my Dell R310 testbed systems.
Intel(R) Xeon(R) CPU X3470 @ 2.93GHz
They have no significant load though, just CentOS 7 test VMs doing absolutely nothing.
I've backed them all up multiple times, rebooted them, suspended them, etc., just to see if I could trigger something, but no.

But as a precaution, I'm not booting into this kernel on my production systems.
 
This may be fallout from some more in-depth changes for some KVM/kernel issues. We're investigating this on some older HW
to see if we can pinpoint it. For now, please just boot the previous 5.4.41-1-pve kernel.
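To make the previous kernel the default instead of picking it manually at each boot: on GRUB-based installs (most setups, including many UEFI ones; ZFS-on-root UEFI installs use systemd-boot instead, where the kernel can be picked from the boot menu) the default entry can be pinned. A sketch; the exact entry string below is an example and must match your own grub.cfg:

Bash:
# List the generated menu entries:
grep -E "menuentry '|submenu '" /boot/grub/grub.cfg | cut -d"'" -f2

# In /etc/default/grub set GRUB_DEFAULT to "submenu>entry", for example:
#   GRUB_DEFAULT="Advanced options for Proxmox VE GNU/Linux>Proxmox VE GNU/Linux, with Linux 5.4.41-1-pve"

# Regenerate the boot config and reboot:
update-grub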

From git log it looks like this was changed between .41 and .44:

Yes, but not directly: that is just the packaging repo, with the kernel sources checked in as a git submodule. In this update the Ubuntu-5.4.0-38.42 tag got in, so the regression was introduced between 5.4.0-32.36 and that tag.
https://git.proxmox.com/?p=mirror_ubuntu-focal-kernel.git;a=shortlog;h=refs/tags/Ubuntu-5.4.0-38.42
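To see which commits that covers, the two tags can be compared in a clone of that mirror; a sketch (clone URL inferred from the gitweb link above, and the older tag name assumed to follow the same Ubuntu-x.y.z-a.b scheme):

Bash:
# Clone the kernel mirror and list commits between the two tags:
git clone git://git.proxmox.com/git/mirror_ubuntu-focal-kernel.git
cd mirror_ubuntu-focal-kernel
git log --oneline Ubuntu-5.4.0-32.36..Ubuntu-5.4.0-38.42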
 
Does an NFS or Samba server run directly on this Proxmox VE host?
 
For me yes, both Samba and NFS run directly on the host.


Code:
service --status-all
[ + ] apcupsd
[ + ] apparmor
[ + ] atd
[ - ] console-setup.sh
[ + ] corosync
[ + ] cpufrequtils
[ + ] cron
[ + ] dbus
[ + ] edac
[ + ] hddtemp
[ - ] hwclock.sh
[ + ] iscsid
[ - ] keyboard-setup.sh
[ + ] kmod
[ + ] lm-sensors
[ + ] loadcpufreq
[ - ] lvm2
[ - ] lvm2-lvmpolld
[ - ] mdadm
[ - ] mdadm-waitidle
[ + ] netdata
[ + ] networking
[ - ] nfs-common
[ + ] nfs-kernel-server
[ + ] nginx
[ - ] nmbd
[ + ] open-iscsi
[ + ] postfix
[ + ] procps
[ + ] rbdmap
[ + ] rpcbind
[ + ] rrdcached
[ - ] rsync
[ + ] rsyslog
[ - ] samba-ad-dc
[ + ] smartmontools
[ + ] smbd
[ + ] ssh
[ - ] sudo
[ + ] udev
[ - ] ups-monitor
[ - ] x11-common
 
Same problem here with 5.4.44 on only one machine of the cluster.

It's a machine in production, I can't really touch it.

I booted on 5.4.41 with no problem.

- On this machine SMB and NFS are activated.
- On the others that work with 5.4.44 just NFS is enabled.

CG
 
What disk models are in use?

Also, which IO scheduler do they use? Check /sys/block/BLOCKDEVICE/queue/scheduler, for example:
Bash:
# cat /sys/block/sda/queue/scheduler
# cat /sys/block/nvme0n1/queue/scheduler
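If a particular scheduler turns out to be involved, it can also be switched at runtime for testing; a sketch (the device name is an example, and the change does not survive a reboot):

Bash:
# The active scheduler is shown in brackets:
cat /sys/block/sda/queue/scheduler

# Switch for a quick test, as root (reverts on reboot):
echo none > /sys/block/sda/queue/scheduler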
 
For me, two NVMe disks:

Code:
cat /sys/block/nvme0n1/queue/scheduler
[none] mq-deadline

cat /sys/block/nvme1n1/queue/scheduler
[none] mq-deadline
 
Two devices:
sda (SSD): [none] mq-deadline
Samsung SSD 860

sdb (disk): [mq-deadline] none
WDC WD10JFCX-68N

Filesystem: ZFS on both devices;
only device sdb (the disk) is exported, as a ZFS dataset via SMB and NFS.
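For reference, such a dataset export is often configured via the ZFS share properties (a plain smb.conf/exports entry works just as well); a sketch with hypothetical pool/dataset names:

Bash:
# Hypothetical names -- adjust to the actual pool/dataset:
zfs set sharenfs=on tank/export    # export over NFS (kernel NFS server)
zfs set sharesmb=on tank/export    # export over SMB (Samba usershares)
zfs get sharenfs,sharesmb tank/export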

Regards,

Christian
 
OK, so this looks like a regression with the combination of "a bit older Intel CPU", ZFS and a Samba export. We could observe something like described here at least once on a system of ours. We're currently investigating to find a better/quicker reproducer (as of now it happened at the earliest after 3-4 hours of uptime).
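In case anyone wants to keep the suspected ZFS+Samba path busy while waiting for the hang, a crude, hypothetical load loop (the mount point is an example of where the exported dataset is mounted on a client):

Bash:
# From an SMB client, write and delete a large file in a loop:
while true; do
    dd if=/dev/urandom of=/mnt/smbshare/stress.bin bs=1M count=1024 conv=fsync
    rm -f /mnt/smbshare/stress.bin
done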
 
Not sure about ZFS, because I don't use it, but older Intel CPU with Samba matches.

Disks: see attached screenshot (disks.PNG).

All disks use mq-deadline (all SATA, no NVMe).
Root is on ext4; everything else is LVM or thin-LVM.

Code:
cat /sys/block/sd*/queue/scheduler
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none
[mq-deadline] none

Tuesday, June 30, 2020, 22:29:19 (UTC+0200)
up 2 days, 11 hours, 56 minutes

- SYSTEM
CPU fan: 1000 RPM (min = 600 RPM)
HDD fan: 703 RPM (min = 450 RPM)
CPU temp: +41.5°C (high = +90.0°C, hyst = +91.0°C)
(crit = +100.0°C)

- HDD TEMP
/dev/sda: INTEL SSDSC2KB480G7: 33°C
/dev/sdb: INTEL SSDSC2KB960G7: 39°C
/dev/sdc: WDC WD30EFRX-68EUZN0: 40°C
/dev/sdd: WDC WD30EFRX-68EUZN0: 41°C
/dev/sde: ST2000LM015-2E8174: 38°C

- STORAGE
Filesystem Size Used Avail Use% Mounted on
udev 16G 0 16G 0% /dev
tmpfs 3,2G 11M 3,2G 1% /run
/dev/sda1 7,9G 4,4G 3,1G 59% /
tmpfs 16G 51M 16G 1% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/mapper/pve--backup-backup--vm 1,5T 920G 481G 66% /srv/backup_vm
/dev/mapper/pve--backup-backup--root 7,9G 627M 6,8G 9% /srv/backup_root
/dev/mapper/pve--root-iso 49G 30G 17G 64% /srv/iso
/dev/mapper/wd--raid1-download 98G 24G 70G 25% /srv/download
/dev/mapper/wd--raid1-medias 1,2T 1,1T 103G 92% /srv/mediacenter
/dev/mapper/wd--raid1-temp 196G 121G 66G 65% /srv/temp
/dev/mapper/wd--raid1-backup--pc 492G 249G 218G 54% /srv/backup_pc
/dev/fuse 30M 32K 30M 1% /etc/pve
tmpfs 3,2G 0 3,2G 0% /run/user/1000


cat /proc/mdstat
md0 : active raid1 sdd1[1] sdc1[2]
2930134464 blocks super 1.2 [2/2] [UU]
bitmap: 0/22 pages [0KB], 65536KB chunk

(parted) p
Model: ATA INTEL SSDSC2KB48 (scsi)
Disk /dev/sda: 480GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number Start End Size File system Name Flags
1 1049kB 8591MB 8590MB ext4 root boot, esp
2 8591MB 12,9GB 4295MB linux-swap(v1) swap
3 12,9GB 13,4GB 537MB ext4 clonezilla
4 13,4GB 14,0GB 537MB ext4 gparted
5 14,0GB 480GB 466GB

pvdisplay
--- Physical volume ---
PV Name /dev/sde1
VG Name pve-backup
PV Size <1,82 TiB / not usable <4,07 MiB
Allocatable yes (but full)
PE Size 4,00 MiB
Total PE 476931
Free PE 0
Allocated PE 476931
PV UUID AO6DZo-9wmI-aEOh-4MNv-fnA5-AUXI-3Gz06W

--- Physical volume ---
PV Name /dev/sdb1
VG Name pve-data
PV Size 894,25 GiB / not usable <2,32 MiB
Allocatable yes (but full)
PE Size 4,00 MiB
Total PE 228928
Free PE 0
Allocated PE 228928
PV UUID fpl0L0-6BGO-mn8G-dZVx-qqqR-rufw-fM5epf

--- Physical volume ---
PV Name /dev/sda5
VG Name pve-root
PV Size 434,13 GiB / not usable <1,82 MiB
Allocatable yes (but full)
PE Size 4,00 MiB
Total PE 111137
Free PE 0
Allocated PE 111137
PV UUID ltziXh-vzLW-yb41-FEJT-kjew-DE3m-WceiMX

--- Physical volume ---
PV Name /dev/md0
VG Name wd-raid1
PV Size <2,73 TiB / not usable <3,44 MiB
Allocatable yes
PE Size 4,00 MiB
Total PE 715364
Free PE 195991
Allocated PE 519373
PV UUID 1im3Lg-SAJs-FohY-MOyn-gzoK-mDzX-9fLSGp

lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
backup-root pve-backup -wi-ao---- 8,00g
backup-vm pve-backup -wi-ao---- 1,46t
cromlvhom-backup pve-backup -wi-ao---- 355,01g
vm-101-disk-0 pve-data Vwi-a-tz-- 4,00m vm-data 12,50
vm-101-disk-2 pve-data Vwi-a-tz-- 300,00g vm-data 86,54
vm-103-disk-1 pve-data Vwi-a-tz-- 128,00g vm-data 12,58
vm-106-disk-1 pve-data Vwi-aotz-- 150,00g vm-data 77,30
vm-107-disk-1 pve-data Vwi-a-tz-- 32,00g vm-data 34,28
vm-109-disk-0 pve-data Vwi-a-tz-- 4,00m vm-data 12,50
vm-112-disk-0 pve-data Vwi-a-tz-- 120,00g vm-data 2,29
vm-data pve-data twi-aotz-- 894,03g 45,35 32,58
base-200-disk-0 pve-root Vri---tz-k 50,00g vm-os
base-200-disk-1 pve-root Vri---tz-k 4,00m vm-os
base-201-disk-0 pve-root Vri---tz-k 8,00g vm-os
iso pve-root -wi-ao---- 50,00g
vm-100-disk-0 pve-root Vwi-aotz-- 8,00g vm-os 87,74
vm-101-disk-0 pve-root Vwi-a-tz-- 60,00g vm-os 71,83
vm-102-disk-0 pve-root Vwi-a-tz-- 4,00g vm-os 79,94
vm-103-disk-0 pve-root Vwi-a-tz-- 16,00g vm-os 34,41
vm-104-disk-0 pve-root Vwi-a-tz-- 4,00g vm-os 96,58
vm-105-disk-0 pve-root Vwi-a-tz-- 6,00g vm-os 94,97
vm-106-disk-0 pve-root Vwi-aotz-- 8,00g vm-os 99,14
vm-107-disk-0 pve-root Vwi-a-tz-- 8,00g vm-os 57,20
vm-108-disk-0 pve-root Vwi-a-tz-- 8,00g vm-os 34,10
vm-109-disk-0 pve-root Vwi-a-tz-- 50,00g vm-os 60,53
vm-110-disk-0 pve-root Vwi-a-tz-- 4,00g vm-os 69,05
vm-110-disk-1 pve-root Vwi-aotz-- 4,00g vm-os 41,63
vm-111-disk-0 pve-root Vwi-a-tz-- 4,00m vm-os 6,25
vm-111-disk-1 pve-root Vwi-a-tz-- 100,00g vm-os 34,01
vm-112-disk-0 pve-root Vwi-a-tz-- 16,00g vm-os 20,47
vm-113-disk-0 pve-root Vwi-a-tz-- 8,00g vm-os 11,90
vm-113-disk-1 pve-root Vwi-a-tz-- 8,00g vm-os 16,04
vm-113-disk-2 pve-root Vwi-a-tz-- 8,00g vm-os 15,93
vm-113-disk-3 pve-root Vwi-a-tz-- 8,00g vm-os 15,93
vm-113-disk-4 pve-root Vwi-a-tz-- 4,00m vm-os 6,25
vm-os pve-root twi-aotz-- 383,93g 49,43 33,00
backup-pc wd-raid1 -wi-ao---- 500,00g
download wd-raid1 -wi-ao---- 100,00g
medias wd-raid1 -wi-ao---- 1,20t
temp wd-raid1 -wi-ao---- 200,00g
 
