Kernel 5.4.44 causes system freeze on HP MicroServer Gen8

SnowFox · Jul 3, 2020

Glad to hear the issue appears to have been solved! I just installed pve-kernel-5.4.44-2-pve on my machine.

stephane de Labrusse · Jul 4, 2020

I had no luck with pve-kernel-5.4.44-2-pve; I needed to roll back to the previous stable kernel. Now I have no more freezes. With the latest kernel, it only took a couple of hours for a server to go down.

Not sure the story is over

AMD FX(tm)-8320 Eight-Core Processor

t.lamprecht · Jul 4, 2020

stephane de Labrusse said:
Not sure the story is over

As our test case here and all others mentioned are working fine, I'd say the story for the "cgroup_bpf_run_filter_skb NULL pointer dereference" is over, for now at least.

You can check whether you see a kernel oops/panic message somewhere (on tty1, or just over ssh/console with dmesg -wT open).

This specific issue would have at least the following lines included:

Code:

[Wed Jul  1 13:54:11 2020] BUG: kernel NULL pointer dereference, address: 0000000000000010
[Wed Jul  1 13:54:11 2020] #PF: supervisor read access in kernel mode
[Wed Jul  1 13:54:11 2020] #PF: error_code(0x0000) - not-present page
[ ...snip .. ]
[Wed Jul  1 13:54:11 2020] RIP: 0010:__cgroup_bpf_run_filter_skb+0x26d/0x3d0

If yours doesn't, it's most likely something completely different.
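
To make the comparison less error-prone, here is a minimal Python sketch that checks saved kernel log text for the two most distinctive lines of the trace above. The function name and the choice of patterns are illustrative assumptions, not part of any Proxmox tooling; the patterns only cover what is visible in the snippet quoted in this post.

```python
import re

# Patterns derived from the oops quoted above. Which lines count as the
# "signature" is an assumption made for this sketch.
SIGNATURE_PATTERNS = [
    re.compile(r"BUG: kernel NULL pointer dereference"),
    re.compile(r"RIP: [0-9a-f]+:__cgroup_bpf_run_filter_skb\+"),
]


def matches_cgroup_bpf_oops(log_text: str) -> bool:
    """Return True only if every signature pattern occurs in the log text."""
    return all(p.search(log_text) for p in SIGNATURE_PATTERNS)


# Sample input copied from the trace in this thread.
sample = """\
[Wed Jul  1 13:54:11 2020] BUG: kernel NULL pointer dereference, address: 0000000000000010
[Wed Jul  1 13:54:11 2020] #PF: supervisor read access in kernel mode
[Wed Jul  1 13:54:11 2020] #PF: error_code(0x0000) - not-present page
[Wed Jul  1 13:54:11 2020] RIP: 0010:__cgroup_bpf_run_filter_skb+0x26d/0x3d0
"""

if __name__ == "__main__":
    print(matches_cgroup_bpf_oops(sample))
```

The text to check could come from the output of dmesg -T saved to a file. A log that contains a NULL pointer dereference but a different RIP function does not match, which is the case where a separate thread makes sense.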

stephane de Labrusse said:
AMD FX(tm)-8320 Eight-Core Processor

Quite the old processor, could be another regression. So please:

1. Check that your motherboard's firmware is up to date
2. Ensure that the amd64-microcode package is installed
3. Get hands on that server to make sure you capture the kernel log if it crashes

If point 3 yields some info, please open another thread with it.

stephane de Labrusse · Jul 5, 2020

t.lamprecht said:
If yours doesn't, it's most likely something completely different.

Thanks for all of this. Indeed, I may have had something different; I will check.

Your input is highly appreciated.

H4R0 · Jul 5, 2020

pve-kernel-5.4.44-2-pve fixed the kernel panic for me; it has been running for 48 hours.

sahostking · Jul 6, 2020

OK, it looks like it is still happening for me when I resize LVM partitions or run anything against LVM.
If I don't, it's fine, but as soon as I try to make a change to an LVM partition it dies. I tried it on a few servers, same issue on pve-kernel-5.4.44-2-pve.

I see these stuck:
root 5938 0.0 0.0 15848 8724 ? D 08:40 0:00 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count
root 7077 0.0 0.0 15848 8572 ? D 08:45 0:00 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count
root 7985 0.0 0.0 6072 892 pts/0 S+ 08:49 0:00 grep vgs
root 15259 0.0 0.0 15848 8640 ? D 02:36 0:00 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count
root 19983 0.0 0.0 15848 8724 ? D 02:56 0:00 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count

pve-kernel-5.4.44-2-pve at least stops the VMs from going down randomly, so that's good, but it still caused the above for me.

Moved back to the 5.3 pve kernel, which has no issues, and I can resize etc. fine.
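
The stuck vgs processes above all show D in the STAT column, which ps uses for uninterruptible sleep (usually waiting on I/O). As a rough sketch, the following Python picks such processes out of saved ps aux output. The function name is made up for illustration, and it assumes the standard 11-column ps aux layout seen in the listing above.

```python
def uninterruptible_processes(ps_output: str):
    """Return (pid, command) pairs for processes whose STAT starts with 'D'.

    Assumes `ps aux` columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    results = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        if fields[7].startswith("D"):
            results.append((int(fields[1]), fields[10]))
    return results


# Two lines copied from the listing in this post.
sample = """\
root 5938 0.0 0.0 15848 8724 ? D 08:40 0:00 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count
root 7985 0.0 0.0 6072 892 pts/0 S+ 08:49 0:00 grep vgs
"""

if __name__ == "__main__":
    for pid, command in uninterruptible_processes(sample):
        print(pid, command)
```

Processes that stay in this state for hours, as with the 02:36 and 02:56 entries above, are worth cross-checking against the kernel log for hung-task or I/O error messages.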

t.lamprecht · Jul 6, 2020

sahostking said:
OK, it looks like it is still happening for me when I resize LVM partitions or run anything against LVM.

This seems like another issue; please open a new thread for it. Check dmesg -T for any error messages.
FYI: lvresize works fine with the 5.4 kernel here, so this may also be something more specific to your setup.

Killvearn · Jul 6, 2020

pve-kernel-5.4.44-2-pve fixed it on AMD too (Ryzen 7 1700).
