Opt-in Linux 6.5 Kernel with ZFS 2.2 for Proxmox VE 8 available on test & no-subscription

t.lamprecht

Proxmox Staff Member
We recently uploaded a 6.5 kernel to our repositories. It will become the new default kernel in the next Proxmox VE 8.1 point release (Q4 2023), following our tradition of upgrading the Proxmox VE kernel to match the current Ubuntu version until we reach an (Ubuntu) LTS release.

We have run this kernel on some parts of our test setups over the last few weeks without any notable issues, at least none that we had not already addressed before this first public release.

How to install:
  1. Ensure that either the pve-no-subscription or pvetest repository is set up correctly.
    You can do so via CLI text-editor or using the web UI under Node -> Repositories
  2. apt update
  3. apt full-upgrade # to get ZFS 2.2 user space packages
  4. apt install pve-kernel-6.5
  5. reboot
Future updates to the 6.5 kernel will now get installed automatically.
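
For anyone who prefers to script this, here is a minimal sketch of the steps above. It assumes the repository from step 1 is already configured and that it runs as root. The helper names (run, install_opt_in_kernel) and the DRY_RUN variable are made up for illustration. By default it only prints what it would do, and the reboot is left to you.

```shell
#!/bin/sh
# Sketch of the install steps from this post (helper names are hypothetical).
# Defaults to a dry run that only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

install_opt_in_kernel() {
    run apt update
    run apt full-upgrade            # pulls in the ZFS 2.2 user space packages
    run apt install pve-kernel-6.5
    # The reboot is deliberately not automated here.
    printf 'reboot when ready, then verify the running kernel with: uname -r\n'
}

install_opt_in_kernel
```

With DRY_RUN=0 the apt commands run as written, including their usual confirmation prompts.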

Please note:
  • We will provide this new kernel soon on the pve-no-subscription repository. While the new kernel is currently only opt-in, and newer ZFS user space packages are compatible with older kernel modules, we still wanted to avoid rushing the ZFS user-space out to no-subscription.
  • The current 6.2 kernel is still supported and will still receive updates until at least end of year, but it's unlikely that the ZFS module will be updated in the 6.2 kernel anymore.
  • There were many changes: improved hardware support, performance improvements in file systems (but really all over the place), and the Multi-Gen LRU is now enabled by default. For a more complete list of changes we recommend checking out the kernel-newbies sites for 6.3, 6.4 and 6.5.
  • The kernel is also available on the test repository of Proxmox Backup Server and Proxmox Mail Gateway.
  • If you're unsure, we recommend continuing to use the 6.2-based kernel for now.

Feedback about how the new kernel performs in any of your setups is welcome!
Please provide basic details like CPU model, storage types used, whether ZFS is the root file system, and the like, both for positive feedback and if you ran into issues where the 6.5 kernel seems to be the cause.
 
Big thanks. I will wait for the no-sub repo release. I wanna avoid changing the repos back and forth :)
 
So, I found some time and gave the test repo a try. I will change the boot parameters for active EPP, reboot and test a bit.

First finding (not sure if PVE related):
/sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference: Device or resource busy --> I can't change the setting here
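
To narrow down a "Device or resource busy" like this, it may help to post the current cpufreq state along with the report. Below is a small read-only sketch; the function name show_cpufreq_state is made up, and the optional directory argument only exists so it can be pointed at another CPU. One thing worth checking (not confirmed as the cause in this thread): the intel_pstate and amd_pstate drivers refuse EPP writes with this error while the governor is set to performance.

```shell
#!/bin/sh
# Print the cpufreq driver, governor and EPP settings for one CPU.
# Read-only sketch; the function name and argument are hypothetical.
show_cpufreq_state() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    for f in scaling_driver scaling_governor \
             energy_performance_preference \
             energy_performance_available_preferences; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        else
            printf '%s: (not available)\n' "$f"
        fi
    done
}

show_cpufreq_state
```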
 
We will provide this new kernel soon on the pve-no-subscription repository. While the new kernel is currently only opt-in, and newer ZFS user space packages are compatible with older kernel modules, we still wanted to avoid rushing the ZFS user-space out to no-subscription.
Reading this, I first wanted to test without kernel 6.5, so with the current 6.2.16-19-pve.

As instructed in the OP, I did a full-upgrade on an up-to-date system that previously ran the no-sub repo. The system runs ZFS as root fs and also has encrypted ZFS datasets that are mounted manually.
No issues detected, except one log entry, logged twice on every boot and every time I manually mount the encrypted datasets with zfs mount -l -a:
Code:
failed to lock /etc/exports.d/zfs.exports.lock: No such file or directory

I noticed that /etc/exports.d/ did not exist, so for testing I created it: mkdir /etc/exports.d/
Now the log entry is gone, both while booting and on manual mount. The lock file is created too.
After removing the exports.d folder, the log entry returns.
I don't know if this is the needed fix.
There is a similar bugreport

Will test with kernel 6.5 too soon. Thanks for providing this new opt-in kernel.
 
We just upgraded our non-critical PVE playground to pvetest with kernel 6.5 and ZFS 2.2, no issues so far.
It's a single-node PVE with two oldie-but-goldie Intel Xeon E5-2697 v2 CPUs and 512 GB RAM. / is on a single SSD (LVM), and there is a ZFS-based storage.
zpool status <poolname> shows "Some supported and requested features are not enabled on the pool...", and after zpool upgrade <poolname> the status changed to ONLINE.
 
This is expected, see `man zpool-features`. Between minor version upgrades (e.g. ZFS 2.1 -> 2.2) the on-disk format can change because new features are added.
Keep in mind that `zpool upgrade` is usually not reversible. If a feature has been enabled but not yet used (state enabled, but not active), you can still import the pool with an older kernel (with ZFS 2.1). Once a feature is active, the pool will only be importable with a kernel that ships with ZFS 2.2.
See also: https://pve.proxmox.com/wiki/ZFS:_Switch_Legacy-Boot_to_Proxmox_Boot_Tool#Background
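
Before running `zpool upgrade`, it can be useful to see how many features are in each state. A minimal sketch: the function name summarize_features is made up, and rpool in the usage comment is only an example pool name. It reads the tab-separated output of `zpool get -H` on stdin and counts the feature@ properties per state.

```shell
#!/bin/sh
# Count pool features per state (disabled / enabled / active).
# stdin: output of `zpool get -H -o property,value all <poolname>`
# The function name is hypothetical; only standard awk and sort are used.
summarize_features() {
    awk -F'\t' '
        $1 ~ /^feature@/ { count[$2]++ }
        END { for (s in count) printf "%s=%d\n", s, count[s] }
    ' | sort
}

# Usage (example pool name, adjust to your setup):
#   zpool get -H -o property,value all rpool | summarize_features
```

Features shown as disabled are the ones `zpool upgrade` would enable; only those in state active matter for importing the pool with an older ZFS.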

I hope this helps!
 
No issues seen on kernel 6.5 so far.
What I did notice on 6.2.16-19-pve with zfs-2.2.0-pve1:
zpool status shows (non-allocating) after each entry. When booting with 6.5 it doesn't.

I think the warning is due to a mismatch between the module version (2.1.13 for the 6.2.16-19-pve kernel) and the user-space utilities (`zpool status`, which is from 2.2.0). This particular message seems to have been introduced with the "VDEV property features" in commit 2a673e76a928cca4df7794cdcaa02e0be149c4da, which brings the ability to evacuate/remove certain types of vdevs in a pool.

I did not find an explicit policy about the expected compatibility between user space and kernel for a jump from 2.1.X to 2.2.Y, but I can imagine quite a few things that cause glitches like that.

I'll ask around if someone knows whether there is some kind of guarantee (but would not expect so).
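
A mismatch between user space and kernel module like the one described above can be spotted from the output of `zfs version`, which prints one line for the user-space tools (zfs-...) and one for the loaded module (zfs-kmod-...). A minimal sketch that compares the major.minor part of both; the function name check_zfs_versions is made up for illustration.

```shell
#!/bin/sh
# Compare user-space and kernel-module ZFS versions by major.minor.
# stdin: output of `zfs version`; the function name is hypothetical.
check_zfs_versions() {
    awk '
        /^zfs-kmod-/ { sub(/^zfs-kmod-/, ""); kmod = $0; next }
        /^zfs-/      { sub(/^zfs-/, "");      user = $0 }
        END {
            split(user, u, "."); split(kmod, k, ".")
            if ((u[1] "." u[2]) == (k[1] "." k[2]))
                print "match: " user " / " kmod
            else
                print "mismatch: userland " user ", kernel module " kmod
        }'
}

# Usage:
#   zfs version | check_zfs_versions
```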
 
No issues detected, except one log entry, logged twice every boot and every time I manually mount for the encrypted sets: zfs mount -l -a
Code:
failed to lock /etc/exports.d/zfs.exports.lock: No such file or directory
I noticed that /etc/exports.d/ did not exist, so for testing I created it: mkdir /etc/exports.d/
Now the log entry is gone, both while booting and on manual mount. The lock file is created too.
After removing the exports.d folder, the log entry returns.
I don't know if this is the needed fix.
There is a similar bugreport

Will test with kernel 6.5 too soon. Thanks for providing this new opt-in kernel.
I managed to reproduce the issue and would agree that the upstream bug report matches it.
I potentially found the cause and the commit that introduced it, and submitted a patch:
https://github.com/openzfs/zfs/pull/15468

I will try to post an update here if this gets applied (but check the pull request above for more details).
 
Kernel 6.5 works fine, except on the one server with an Adaptec controller, where I get the following error:

aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,17,0):
aacraid: Host bus reset request. SCSI hang ?
aacraid 0000:18:00.0: Controller reset type is 3

This is probably the same error as reported here:

https://bugzilla.kernel.org/show_bug.cgi?id=217599
 
I did not find an explicit policy about the expected compatibility between user space and kernel for a jump from 2.1.X to 2.2.Y, but I can imagine quite a few things that cause glitches like that.

I'll ask around if someone knows whether there is some kind of guarantee (but would not expect so).
OK, to be on the safe side I opened an issue upstream:
https://github.com/openzfs/zfs/issues/15472

see there for more information - but for the time being I consider this a cosmetic issue.
 
So, I found some time and gave the test repo a try. I will change the boot parameters for active EPP, reboot and test a bit.

First finding (not sure if PVE related):
/sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference: Device or resource busy --> I can't change the setting here
I am having this issue as well.
 
Thank you, the kernel is running fine here on my Threadripper 3960X.
The only problem is that, for some reason, I'm unable to get the amd_pstate driver working.
My kernel command line:
BOOT_IMAGE=/boot/vmlinuz-6.5.3-1-pve root=/dev/mapper/pve-root ro quiet iommu=pt amd_iommu=on kvm_amd.npt=1 kvm_amd.avic=1 nmi_watchdog=0 video=vesafb:off video=efifb:off video=simplefb:off nomodeset initcall_blacklist=sysfb_init modprobe.blacklist=nouveau modprobe.blacklist=amdgpu modprobe.blacklist=radeon modprobe.blacklist=nvidia hugepagesz=1G default_hugepagesz=2M mitigations=off amd_pstate=active amd_pstate.shared_mem=1

During boot time it seems to complain:
[ 0.479527] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled

Could you perhaps enable: CONFIG_X86_AMD_PSTATE_UT=m for the next release?
Or do you have a repository from where I can build it?
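
When debugging this kind of thing, it can help to confirm which value the running kernel actually received for a given parameter. A minimal sketch; the function name cmdline_param is made up. It reads a kernel command line on stdin and prints the value of every occurrence of the named parameter.

```shell
#!/bin/sh
# Print the value(s) of one kernel command line parameter.
# $1: parameter name; stdin: a kernel command line (e.g. /proc/cmdline)
# The function name is hypothetical.
cmdline_param() {
    tr ' ' '\n' | awk -v key="$1" '
        index($0, key "=") == 1 { print substr($0, length(key) + 2) }
    '
}

# Usage:
#   cmdline_param amd_pstate < /proc/cmdline
#   dmesg | grep amd_pstate
```

Reading /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver afterwards shows which cpufreq driver ended up in use.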
 
Bummer.

Tested 6.5 on one of my new SuperMicro front ends with 4x Intel Xeon Gold 6448H. VM locks up under load with CPUs stuck. I run ZFS on root with 2 Micron 5400 Pros.

Server:
https://www.supermicro.com/en/products/system/mp/2u/sys-241e-tnrttp

VM storage is on HPE Alletra NVMe.

Back to 5.15.x and no issues.

I will be looking to test KSM and other performance issues on older hardware next.
 
I followed the steps to install the 6.5 kernel and it seems to be working without any problems. The only anomalous situation is that I now receive an email every time a backup is performed. In the backup configuration I have the option to just warn when it fails.

EDIT: I have reconfigured the email option and it seems to be fixed now
 