Opt-in Linux 6.2 Kernel for Proxmox VE 7.x available

What does "freeze" mean exactly? Anything in the logs (e.g., journalctl -b)?
Freeze = Proxmox web GUI not accessible, no ping reply, and no console ("DOS") screen on an attached monitor. But the physical PC is still running (I can hear the CPU fan spinning).
Logs only say "Reboot", no errors etc.
 
What network card is running there? Maybe one using the igb driver? (e.g., check with lspci -k)
Code:
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
        DeviceName: Onboard Lan
        Subsystem: Hewlett-Packard Company Ethernet Connection (2) I219-LM
        Kernel driver in use: e1000e
        Kernel modules: e1000e
 
By the way, the overall temperature of the unit/PC is rising even when it is doing almost nothing.
It is now at 65.5 °C (was ~50 °C).
This PC was running perfectly and cool before the kernel and BIOS updates.
The big question is: is it the new kernel or is it the BIOS update?
 
Kernel is probably easier to test by simply rebooting into the old one (select on boot or use kernel pin tool)
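
For reference, a minimal sketch of the pin approach, assuming the host ships a proxmox-boot-tool that has the "kernel" subcommands. The version string 6.1.15-1-pve and the DRY_RUN switch are only illustrative (not from the posts above); with DRY_RUN=1, which is the default here, the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch: boot an older kernel persistently via proxmox-boot-tool.
# OLD_KERNEL is an example value; pick a real one from the "kernel list" output.
OLD_KERNEL="${OLD_KERNEL:-6.1.15-1-pve}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pin_old_kernel() {
    run proxmox-boot-tool kernel list              # show installed kernels
    run proxmox-boot-tool kernel pin "$OLD_KERNEL" # make it the default boot entry
}

pin_old_kernel
```

The pin only takes effect on the next reboot, and "proxmox-boot-tool kernel unpin" should revert to booting the newest installed kernel.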

I will do a kernel downgrade when Proxmox freezes again.

I noticed with the 6.2.1-9 kernel that all my VMs could upgrade from 6.x to 7.2 (vm > Hardware > Machine > pc-i440fx-7.2).
Do I have to adjust this setting again when downgrading?
 
I noticed with the 6.2.1-9 kernel that all my VMs could upgrade from 6.x to 7.2 (vm > Hardware > Machine > pc-i440fx-7.2).
Do I have to adjust this setting again when downgrading?
The VM machine version has nothing to do with the kernel, only with the installed QEMU version.
Also note that by default Proxmox VE only pins the VM machine version for VMs with the Windows OS type, because Windows is very brittle and can break at the slightest change to the machine layout. So for important (production) Windows VMs, I'd recommend first testing whether the VM can handle the change.
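
To see whether a given VM has a machine version set at all, something like the following could help. It is a rough sketch that assumes the output of "qm config <vmid>" contains a "machine: ..." line only when a machine type is set; the function name machine_setting and the VMID 100 are made up for illustration.

```shell
# Read `qm config <vmid>` output on stdin and report the machine setting.
machine_setting() {
    awk -F': ' '
        $1 == "machine" { print $2; found = 1 }
        END { if (!found) print "unset" }
    '
}

# Example with captured output; on a real host: qm config 100 | machine_setting
printf 'cores: 2\nmachine: pc-i440fx-7.2\nmemory: 4096\n' | machine_setting
```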
 
What network card is running there? Maybe one using the igb driver? (e.g., check with lspci -k)

Hello again,

I am curious about your remark about "using the igb driver".
What exactly do you mean by that, and is the igb driver good or bad (performance wise)?

My other Proxmox host has 2 NICs, and this is the output of lspci -k:

Code:
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection I219-LM (rev 21)
        Subsystem: Intel Corporation Ethernet Connection I219-LM
        Kernel driver in use: e1000e
        Kernel modules: e1000e
01:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
        Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer I211 Gigabit Network Connection
        Kernel driver in use: igb
        Kernel modules: igb

You can see it uses both the e1000e AND igb drivers.
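
On hosts with many PCI devices, the driver per NIC can be pulled out of that output with a small filter. This is only a sketch, assuming the lspci -k layout shown above (device line starting with the PCI address, followed by indented detail lines); the function name nic_drivers is made up.

```shell
# Print "<PCI address> <driver>" for each Ethernet controller in `lspci -k` output.
nic_drivers() {
    awk '
        # A device line starts with the PCI address; remember it only for NICs.
        /^[0-9a-f]/ { dev = ""; if ($0 ~ /Ethernet controller/) dev = $1 }
        # Indented detail line naming the bound driver.
        /Kernel driver in use:/ && dev != "" { print dev, $NF }
    '
}

# Example using lines from the output above; on a real host: lspci -k | nic_drivers
sample='00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection I219-LM (rev 21)
        Kernel driver in use: e1000e
01:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
        Kernel driver in use: igb'
printf '%s\n' "$sample" | nic_drivers
```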
 
What exactly do you mean by that, and is the igb driver good or bad (performance wise)?
We identified a potential deadlock with the igb driver in the 5.15.104-1 kernel due to a backported patch, see this thread.
I mentioned it because the 6.2 kernel might also have been affected, and the symptoms sounded similar enough. The 6.2.11-1 kernel uploaded just today (available on pvetest) should also contain that fix (I'm on mobile, so I cannot check closely).
 
Quick update:
The Proxmox host with kernel 6.2.9-1 has now been running for 3 days without any problems!
(I went into the BIOS and switched off the TPM settings that I had switched on when I updated the BIOS firmware. No freezes since then.)
 
I installed kernel 6.2 as instructed and when it came back, it did not find the pve group, so it would not boot. I had to do a reset and boot in 6.1.15-1, and it came back. Do you have any idea what's happening?
How do I remove kernel 6.2? I did the upgrade because every two weeks I get a kernel soft lockup, and the machine needs to be rebooted. I only use containers, not VMs. How do I know which container is crashing the box?
 
I updated my 6.2.6-1-pve kernel to the latest 6.2.9-1-pve today and it broke intel_gpu_top with the error: Failed to initialize PMU! (No such file or directory)

I went back to 6.2.6-1 and it works again.
 
I updated my 6.2.6-1-pve kernel to the latest 6.2.9-1-pve today and it broke intel_gpu_top with the error: Failed to initialize PMU! (No such file or directory)

I went back to 6.2.6-1 and it works again.
It is fixed in the latest 6.2.11-1-pve.
 
It's still failing for me unfortunately:
Code:
# uname -a
Linux accod 6.2.11-1-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.11-1 (2023-04-20T09:59Z) x86_64 GNU/Linux
# intel_gpu_top
Failed to initialize PMU! (No such file or directory)

However, going back to 6.2.6-1 now doesn't resolve it. I noticed pve-firmware was updated along with the kernel, so I'm wondering if that's why. The GPU is working inside containers, so passthrough and the device are working correctly, just not intel_gpu_top on the host.
 
I upgraded from 6.2.9 to 6.2.11 yesterday and after some hours, the machine completely locked up (not reachable via network).
I could not reset it via IPMI so I had to hold the power button for a couple of seconds until it shut off.
Then I could not power it on anymore, not even after a CMOS reset.
But leaving the server without power for 10 minutes finally solved it so it booted again.

After setting my BIOS up again (CMOS reset did work apparently), I had the same issue a couple of hours later.
This time I did not reset CMOS and just waited 10-15min without power to the server, booted up fine afterwards.

To be safe, I reverted to 6.1.15, which has been working fine since 6 pm yesterday.

Board is Supermicro X11SSH-F (latest BIOS 2.7) with Intel Core i3-7100.
 
By the way, the overall temperature of the unit/PC is rising even when it is doing almost nothing.
It is now at 65.5 °C (was ~50 °C).
This PC was running perfectly and cool before the kernel and BIOS updates.
The big question is: is it the new kernel or is it the BIOS update?
I noticed the same behaviour: since the kernel 6.2 update, the CPU temperature is always about 4-10 degrees higher than before, although the power usage stayed the same. (Mind you, I also updated Proxmox to 7.4-3 at the same time, so I'm not sure whether it is the Proxmox update or the kernel update that is causing this.)
 
It's still failing for me unfortunately:
Code:
# uname -a
Linux accod 6.2.11-1-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.11-1 (2023-04-20T09:59Z) x86_64 GNU/Linux
# intel_gpu_top
Failed to initialize PMU! (No such file or directory)

However, going back to 6.2.6-1 now doesn't resolve it. I noticed pve-firmware was updated along with the kernel, so I'm wondering if that's why. The GPU is working inside containers, so passthrough and the device are working correctly, just not intel_gpu_top on the host.
Today I updated the held back packages proxmox-ve and pve-kernel-helper and this resolved the error with intel_gpu_top.
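
For anyone wanting to check for held-back packages, a rough sketch follows. It assumes the packages were kept back by a plain "apt upgrade" run (the post does not say how the update was done) and that apt-get prints its messages in English; the function name kept_back is made up.

```shell
# Extract package names from the "kept back" section of apt-get output on stdin.
kept_back() {
    awk '
        /^The following packages have been kept back:/ { in_list = 1; next }
        # Package names follow on indented lines, several per line.
        in_list && /^  / { for (i = 1; i <= NF; i++) print $i; next }
        { in_list = 0 }
    '
}

# Example with captured output; on a real host: LANG=C apt-get -s upgrade | kept_back
sample='Reading package lists...
The following packages have been kept back:
  proxmox-ve pve-kernel-helper
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.'
printf '%s\n' "$sample" | kept_back
```

If packages show up there, "apt full-upgrade" (or "apt dist-upgrade") is the usual way to let apt install them together with any new dependencies.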
 
