Opt-in Linux 6.1 Kernel for Proxmox VE 7.x available

Blacklisting ahci would get you into trouble (since all SATA controllers need it), but a softdep should do the trick. Something like softdep ahci pre: vfio-pci (added to /etc/modprobe.d/vfio.conf, followed by running update-initramfs -u) will simply load vfio-pci (just) before ahci loads. Check with lspci -nnk after a fresh reboot, without starting the VM, to make sure the driver in use is vfio-pci.
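For reference, a minimal sketch of the whole procedure; the vendor:device ID in the options line is a placeholder, take yours from lspci -nn:

Code:
# /etc/modprobe.d/vfio.conf
# load vfio-pci just before ahci, so it can claim the controller first
softdep ahci pre: vfio-pci
# bind the SATA controller by vendor:device ID (placeholder value)
options vfio-pci ids=8086:a352

# then, on the shell: rebuild the initramfs and reboot
update-initramfs -u
reboot
# after the reboot (without starting the VM), check the bound driver
lspci -nnk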
PS: Feel free to make a separate thread about this if you run into issues and need help.

This worked like a charm. Thank you very much! :)
Unbelievable how often I read the softdep option in your posts and never got the idea that I should use it in my own setup; stupid me. :D

Hopefully the stack trace came from this, maybe in combination with changes in that kernel version, or was at least only a one-time thing.
 
Hi @all

Yesterday I tested the kernel 6.1.10-1-pve.
The good news is that the systems boot again with HDD passthrough.
The bad news is that I sometimes see disk IO of over 400%, so no VM can be used properly.
So I'm back on kernel 6.1.2-1-pve, because that's the last one that works for me without errors.

Best regards,
Marcel
 
Hi @all

Yesterday I tested the kernel 6.1.10-1-pve.
The good news is that the systems boot again with HDD passthrough.
The bad news is that I sometimes see disk IO of over 400%, so no VM can be used properly.
So I'm back on kernel 6.1.2-1-pve, because that's the last one that works for me without errors.

Best regards,
Marcel

I noticed that right after the update too.
However, after multiple kernel changes up and down, I no longer notice it, even with 6.1.10-1. But it doesn't feel 100% right at the moment.
 
Hi, folks!
I have a PVE node installed on a 2018 Mac Mini (with the T2 chip). Since the Proxmox kernel is based on the Ubuntu kernel, I wonder if it would be possible to install the modified kernel from https://github.com/t2linux/T2-Ubuntu-Kernel . I would like to test it, because I can't make the fan work at all and this kernel already has the patches for the infamous T2.
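Not an officially supported path, but in principle an external kernel can be installed alongside the pve ones. A rough sketch, assuming that project publishes prebuilt linux-image/linux-headers .deb packages (file names below are placeholders), and keeping in mind that modules Proxmox builds into its kernels, like ZFS, would be missing:

Code:
# install the downloaded kernel packages (placeholder file names)
dpkg -i linux-image-*-t2*.deb linux-headers-*-t2*.deb
# make the bootloader pick up the new kernel
proxmox-boot-tool refresh   # or update-grub, depending on the setup
reboot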
 
Hi there!

I've read good news about using 6.* kernels with PVE. That's good, because I've got a fresh AMD AM5 Ryzen 5 7600X platform. Its integrated RDNA2 GPU has drivers only in 5.16.* and later kernels, so I hope to use Proxmox with the opt-in 6.1.* kernels.

But now the bad news: how do I install Proxmox VE from the official ISO image, whose older kernel lacks support for this new hardware?

Please help me with a link or a post where such a problem was solved... maybe a customized image?

https://linux-hardware.org/?probe=260d257a0c
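One possible workaround, sketched here under the assumption that the hardware at least boots a plain Debian Bullseye installer: install Debian first, then add Proxmox VE on top (repository lines as documented in the Proxmox wiki) and pull in the opt-in 6.1 kernel right away:

Code:
# on a minimal Debian 11 install, add the Proxmox VE no-subscription repo
echo "deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-install-repo.list
wget https://enterprise.proxmox.com/debian/proxmox-release-bullseye.gpg \
  -O /etc/apt/trusted.gpg.d/proxmox-release-bullseye.gpg
apt update && apt full-upgrade
# install Proxmox VE, then the opt-in 6.1 kernel
apt install proxmox-ve
apt install pve-kernel-6.1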
 
Wondering if there might be something odd going on since 6.1.6, seemingly in regard to storage?

With 6.1.6 there were the physical disk passthrough problems, which seem to be gone with 6.1.10:

With 6.1.6 and 6.1.10 (what about others?) a problem regarding (seemingly?) virtio-blk:

(With 6.1.6 host-crashes and) with 6.1.10 a problem regarding bad crc/signature and data crc errors on Ceph:

With 6.1.10 (partly already gone?) problems with high disk-IO:

With 6.1.10 a problem regarding PCIe passthrough of a SATA HBA (of course, it could have been my improperly set up early-binding; but it worked this way for almost 2 years without a single problem):

I have absolutely no clue if all of this could (theoretically) be related at all; but at least to my lay eyes, it looks like it all has to do with storage...
 
I noticed that with kernel pve-kernel-5.15.85-1-pve my system enters C-state C8, but with this 6.1 kernel it only goes to C3.

Has anyone else noticed this? I'm trying to figure out a way to debug this but haven't found anything yet.
 
I noticed that with kernel pve-kernel-5.15.85-1-pve my system enters C-state C8, but with this 6.1 kernel it only goes to C3.

Has anyone else noticed this? I'm trying to figure out a way to debug this but haven't found anything yet.

How can you tell? What is your CPU?
 
Wondering if there might be something odd going on since 6.1.6, seemingly in regard to storage?

....

With 6.1.10 (partly already gone?) problems with high disk-IO:

With 6.1.10 a problem regarding PCIe passthrough of a SATA HBA (of course, it could have been my improperly set up early-binding; but it worked this way for almost 2 years without a single problem):

I have absolutely no clue if all of this could (theoretically) be related at all; but at least to my lay eyes, it looks like it all has to do with storage...

Switched to 6.1.10 yesterday - all VMs using disk passthrough were hanging this morning (on 4 different hosts). Other machines without disk passthrough and LXC containers are not showing any issues so far.

What kind of info would be helpful to troubleshoot these issues? A normal shutdown did not work, but after killing the VM and restarting, it seems to work again (the question is for how long).
Bah, just as I wrote this, the first VM ran into issues again, showing 100% CPU and a load of 60+ (4 cores).

Code:
top - 09:41:34 up 3:10, 2 users, load average: 596.51, 573.15, 506.59
Tasks: 132 total, 1 running, 131 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.3 sy, 0.0 ni, 0.0 id, 99.7 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem :  5807.5 total,  2947.3 free,  1147.8 used,  1712.4 buff/cache
MiB Swap:  3227.0 total,  3226.7 free,     0.3 used.  4360.9 avail Mem

  PID USER     PR NI    VIRT   RES   SHR S %CPU %MEM   TIME+ COMMAND
  832 olaf-kr+ 20  0   17464  8088  5396 S  0.7  0.1 0:02.70 sshd

On the prompt you then regularly get hung-task timeouts:
[screenshot: hung task timeout messages]
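For a report, the kernel log is usually more useful than a screenshot; the hung-task warnings end up there too (exact wording may vary by kernel version):

Code:
# current boot
dmesg -T | grep -i 'blocked for more than'
# or via the journal, kernel messages of the current boot
journalctl -k -b | grep -i 'blocked for more than'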
 
Wondering if there might be something odd going on since 6.1.6, seemingly in regard to storage?

With 6.1.6 there were the physical disk passthrough problems, which seem to be gone with 6.1.10:

With 6.1.6 and 6.1.10 (what about others?) a problem regarding (seemingly?) virtio-blk:

(With 6.1.6 host-crashes and) with 6.1.10 a problem regarding bad crc/signature and data crc errors on Ceph:

With 6.1.10 (partly already gone?) problems with high disk-IO:

With 6.1.10 a problem regarding PCIe passthrough of a SATA HBA (of course, it could have been my improperly set up early-binding; but it worked this way for almost 2 years without a single problem):

I have absolutely no clue if all of this could (theoretically) be related at all; but at least to my lay eyes, it looks like it all has to do with storage...
I also reverted, because I have containers that wouldn't stop, and issues taking snapshots and backups on Ceph.
 
Switched to 6.1.10 yesterday - all VMs using disk passthrough were hanging this morning (on 4 different hosts). Other machines without disk passthrough and LXC containers are not showing any issues so far.

What kind of info would be helpful to troubleshoot these issues? A normal shutdown did not work, but after killing the VM and restarting, it seems to work again (the question is for how long).
Bah, just as I wrote this, the first VM ran into issues again, showing 100% CPU and a load of 60+ (4 cores).

Code:
top - 09:41:34 up 3:10, 2 users, load average: 596.51, 573.15, 506.59
Tasks: 132 total, 1 running, 131 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.3 sy, 0.0 ni, 0.0 id, 99.7 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem :  5807.5 total,  2947.3 free,  1147.8 used,  1712.4 buff/cache
MiB Swap:  3227.0 total,  3226.7 free,     0.3 used.  4360.9 avail Mem

  PID USER     PR NI    VIRT   RES   SHR S %CPU %MEM   TIME+ COMMAND
  832 olaf-kr+ 20  0   17464  8088  5396 S  0.7  0.1 0:02.70 sshd

On the prompt you then regularly get hung-task timeouts:
[screenshot: hung task timeout messages]
One night later on 6.1.2, everything is still working normally - so it is definitely an issue introduced after 6.1.2.
 
How can you tell? What is your CPU?
PowerTOP will give you a lot of information; it should be in the apt repositories.

Edit: My findings on a
Code:
Model name:                      Intel(R) Celeron(R) N5105 @ 2.00GHz
Stepping:                        0
CPU MHz:                         2000.000
CPU max MHz:                     2900.0000
CPU min MHz:                     800.0000

Code:
           Pkg(OS)  |            CPU(OS) 0
Powered On  0.0%    | POLL       64.4%    0.3 ms
C1_ACPI     0.3%    | C1_ACPI     0.2%    0.2 ms
C2_ACPI     2.1%    | C2_ACPI     2.0%    1.6 ms
C3_ACPI    32.5%    | C3_ACPI    31.7%   15.5 ms

                    |             GPU     |
                    | Powered On  1.7%    |
                    | RC6        98.3%    |
                    | RC6p        0.0%    |
                    | RC6pp       0.0%    |

Code:
            Package |            CPU 0
Idle        98.9%   | Idle        99.8%
 800 MHz     0.0%   | 2.90 GHz     0.2%
2.90 GHz     1.1%   |
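If you want to cross-check without PowerTOP, the cpuidle sysfs interface (assuming the intel_idle/ACPI cpuidle driver is active) lists the same states and the time spent in each:

Code:
# C-states the kernel exposes for CPU 0
cat /sys/devices/system/cpu/cpu0/cpuidle/state*/name
# cumulative residency per state, in microseconds
cat /sys/devices/system/cpu/cpu0/cpuidle/state*/time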
 
With 6.1.2-1-pve everything was fine for me.
Since 6.1.10-1-pve my extra USB 2.5 Gbit/s NICs are getting randomly disconnected.
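If someone wants to catch those disconnects as they happen, following the kernel log is a cheap way to do it; r8152 is only an assumption here, as the usual driver for Realtek-based 2.5G USB NICs:

Code:
# follow the kernel log and filter for USB / NIC driver events
dmesg -w | grep -iE 'usb|r8152'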
 
@t.lamprecht, are those issues tracked? I'm just contacting you since you opened this thread. Maybe there are already plenty of bug reports and tickets? If needed, we could also open a thread per issue area. In case this was already clarified (how to handle it): sorry, I have not read the complete thread.
 
@t.lamprecht, are those issues tracked? I'm just contacting you since you opened this thread. Maybe there are already plenty of bug reports and tickets? If needed, we could also open a thread per issue area. In case this was already clarified (how to handle it): sorry, I have not read the complete thread.
Yeah, we keep track of it, and most issues from a few weeks ago have already been solved. I don't think opening a separate thread for every one of those would help that much.
Please note that, with 6.1 being opt-in, we do not allocate as much time to it as to our default 5.15 kernel. But we still definitely keep a lookout for newer 6.1 point releases and move to them relatively quickly, at least if it seems like they bring fixes our users here could benefit from.
 
Will a 6.2 kernel be released in the same manner?
No, probably not in the near future. 6.1 is an LTS kernel and might be the one we base our next major release on, so in the Proxmox VE 7.x releases we probably won't ship any newer major kernel as opt-in than the already available 6.1 version.
 
it looks like it all has to do with storage...

Just for testing I tried to get a stack trace on 6.1.6, and with a passed-through physical disk (default options) I was able to get one like this:
- Add the disk: qm set 100 -scsi1 /dev/disk/by-id/<physicaldiskid>
- Boot the VM with a Parted live ISO, create a partition table, create an ext4 partition.
- As soon as I press apply, I get a stack trace on the host, and the host is not able to shut down.

This stack trace, and others I saw in this thread, has similar info to this bug report.
As that report hinted towards io_uring, I set the passed-through disk to aio=threads, and there was no stack trace when running the above scenario.
With kernel 6.1.10 I cannot reproduce this stack trace anymore, even with aio=io_uring.
So for physical disk passthrough it might be worth trying aio=threads and seeing how that behaves.
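For reference, that option can be set directly on the passed-through drive; same placeholder VM ID and disk as in the steps above:

Code:
# pass the disk through with the threads AIO backend instead of io_uring
qm set 100 -scsi1 /dev/disk/by-id/<physicaldiskid>,aio=threads
# double-check the resulting drive line
qm config 100 | grep scsi1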
 
Please note that, with 6.1 being opt-in, we do not allocate as much time to it as to our default 5.15 kernel. But we still definitely keep a lookout for newer 6.1 point releases and move to them relatively quickly, at least if it seems like they bring fixes our users here could benefit from.
That would be fine, but I had to switch to 6.1 as I got crashes when migrating VMs between machines. On 6.1 this works without issue.

As that report hinted towards io_uring, I set the passed-through disk to aio=threads, and there was no stack trace when running the above scenario.
With kernel 6.1.10 I cannot reproduce this stack trace anymore, even with aio=io_uring.
I will try changing this option with the next 6.1 kernel version.
 
