Opt-in Linux 6.5 Kernel with ZFS 2.2 for Proxmox VE 8 available on test & no-subscription

This computer doesn't have a GPU, so this isn't a passthrough issue.
The above poster's issue was not directly related to passthrough; they just happened to use their GPU for passthrough.
Server has no GPU.
You mean no extra dedicated one, or are you just using a serial console? Otherwise there's always a GPU that provides the frame buffer shown in the IPMI/iKVM remote viewer or on any attached monitor.

Can you please also try adding the nomodeset parameter to the kernel command line on boot?
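
For anyone unsure how to add the parameter persistently, here is a minimal sketch. It assumes the host boots via GRUB and that /etc/default/grub has the usual GRUB_CMDLINE_LINUX_DEFAULT="..." line; the helper name add_kernel_param is made up for this example. Hosts that boot through systemd-boot (e.g. ZFS root on UEFI) keep the command line in /etc/kernel/cmdline instead and would need proxmox-boot-tool refresh rather than update-grub.

```shell
# Hypothetical helper: append PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE,
# unless it is already present. FILE defaults to /etc/default/grub.
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Do nothing if the parameter is already on that line.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote (GNU sed).
    sed -i -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file"
}

# Usage (as root), then regenerate the GRUB config and reboot:
#   add_kernel_param nomodeset
#   update-grub
```

For a one-off test it is usually enough to press "e" in the GRUB menu and append the parameter to the "linux" line for that boot only.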
 
One thing that came to mind: this could be a side effect of us no longer adding the simplefb module to the initramfs by default. We dropped it again because it caused some trouble on other HW while not really seeming to fix that many issues.

https://git.proxmox.com/?p=proxmox-...ff;h=9c41f9482666a392b80a3c4da3e695c4649d8ee1

To try that out you would do:
Code:
echo "simplefb" >> /etc/initramfs-tools/modules
update-initramfs -u -k 6.5.11-4-pve
# reboot
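
Since ">>" appends another copy of the line every time it is run, a slightly more careful variant could look like the following. It's only a sketch: add_initramfs_module is a made-up helper name, and the lsinitramfs check in the usage comment assumes the initrd lives at the usual /boot/initrd.img-<version> path.

```shell
# Hypothetical helper: add MODULE to an initramfs-tools modules file only if
# it is not listed there yet. FILE defaults to /etc/initramfs-tools/modules.
add_initramfs_module() {
    module=$1
    file=${2:-/etc/initramfs-tools/modules}
    # -x matches whole lines only, -F treats the module name literally.
    grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
}

# Usage (as root):
#   add_initramfs_module simplefb
#   update-initramfs -u -k 6.5.11-4-pve
#   # check that the module actually ended up in the image:
#   lsinitramfs /boot/initrd.img-6.5.11-4-pve | grep simplefb
```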

Any feedback would be appreciated, it could help to improve this in the long run.
 
Hi,
Yes, server IPMI frame buffer.
Adding the nomodeset parameter to the kernel command line doesn't change anything.

I will try the 'simplefb' suggestion.
 

Unfortunately, in my case neither of these fixes allows me to boot into the 6.5.11-4-pve kernel. I think it's safe to say that the kernel just crashes after the "Loading initial ramdisk" step.
 
Just to be sure: you did try to ping the Proxmox VE host's IP address, and it doesn't reply, even after waiting a few minutes (depending on your HW) to ensure it actually had enough time to boot? I ask because a crashing kernel is much noisier most of the time.
 
The kernel does indeed support loading compressed firmware; depending on the config, that can be zstd or xz.
It's also true that the loader checks for uncompressed files first, so if both an uncompressed file (downloaded by you) and a compressed one (installed by our current pve-firmware) exist, the uncompressed one will be loaded.

So in your case it would mean that the version downloaded from linux-firmware.git, whose tag matches the one we use, worked, but the one pve-firmware ships didn't. That seems a bit strange to me. Possibly it only breaks on some boots?
You could also extract the compressed firmware shipped by pve-firmware and compare each file (e.g. via sha256sum checksums).
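
Such a comparison could be sketched roughly like this. The helper names fw_sha256 and fw_same are made up for the example, the file names in the usage comment are placeholders, and it assumes the zstd and xz command line tools are installed for the respective suffixes.

```shell
# Hypothetical helper: print the SHA-256 of a firmware file's *uncompressed*
# content, so a plain file can be compared with a .zst or .xz copy of it.
fw_sha256() {
    case "$1" in
        *.zst) zstd -dc -- "$1" ;;
        *.xz)  xz -dc -- "$1" ;;
        *)     cat -- "$1" ;;
    esac | sha256sum | cut -d' ' -f1
}

# Hypothetical helper: succeed if both files carry the same firmware content.
fw_same() {
    [ "$(fw_sha256 "$1")" = "$(fw_sha256 "$2")" ]
}

# Usage (placeholder names, adapt to the firmware file in question):
#   fw_same /lib/firmware/<dir>/<name> /lib/firmware/<dir>/<name>.zst \
#       && echo identical || echo different
```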

I apologize for the late reply... I pulled the offending M.2 card out of the machine and started using a cheap USB Bluetooth dongle for my required BT. My best (sarcastic) guess is that the Intel firmware/driver has a built-in random number generator and uses that to determine whether it's going to work on any given boot.
 
The only remaining issue not fixed in our ZFS version is not specific to 2.2 at all, but a much older one that was only uncovered by a very synthetic reproducer, and only then in certain circumstances.
I.e., in the wild it normally only gets exposed by very new coreutils that use reflink by default; the coreutils version from Debian Bookworm, which Proxmox VE 8 is based on, doesn't do that. And even then, with the two bigger issues fixed, it's not really triggerable without synthetic benchmarks, but one can set /sys/module/zfs/parameters/zfs_dmu_offset_next_sync to 0 as a stop-gap for that issue (as per https://github.com/openzfs/zfs/issues/15526#issuecomment-1823737998 )
As the remaining issue is very unlikely to hit setups in the wild, and is also present in older releases, we see no benefit in rushing a downgrade. If you're concerned, you can set the tunable until the ZFS project and we have a better fix available.
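
Setting that tunable could look roughly like the sketch below. The sysfs path is the one named above; the helper names, the /etc/modprobe.d/zfs.conf file name and the update-initramfs step are assumptions for illustration, not an official recommendation.

```shell
# Parameter path as named above; it only exists while the zfs module is loaded.
ZFS_PARAM=/sys/module/zfs/parameters/zfs_dmu_offset_next_sync

# Hypothetical helper: write VALUE to a module parameter file and print what
# the file contains afterwards.
set_param() {
    echo "$2" > "$1" && cat "$1"
}

# Hypothetical helper: record the setting as a modprobe option so it is applied
# again the next time the module is loaded. FILE defaults to
# /etc/modprobe.d/zfs.conf (assumed name; any *.conf in that directory works).
persist_zfs_option() {
    line="options zfs zfs_dmu_offset_next_sync=$1"
    file=${2:-/etc/modprobe.d/zfs.conf}
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Usage (as root):
#   set_param "$ZFS_PARAM" 0
#   persist_zfs_option 0
#   update-initramfs -u   # assumption: needed if zfs is loaded from the initramfs
```

The runtime write takes effect immediately but is lost on reboot, which is why the sketch also records a modprobe option.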
 
I started a thread about this before I thought to check this one. However, it seems someone can reproduce it even with zfs_dmu_offset_next_sync=0:
https://github.com/openzfs/zfs/issues/15526#issuecomment-1826065538

Seems this might be the fix?
https://github.com/openzfs/zfs/pull/15571/files?diff=split&w=0

But wow, out of my depth. Interesting seeing this play out.
 
How did you add that? Did you try it like the following post suggests:
https://forum.proxmox.com/threads/pve-8-0-and-8-1-hangs-on-boot.137033/#post-609320

It worked there for a handful of setups.

Yes, I did exactly the same.
The server stops after "Loading initial ramdisk".
It does not respond to ping even after 10 minutes...

After reverting to 6.2.16-19-pve, all is OK.

When booting 6.2, the changes are present:

[Mon Nov 27 20:15:52 2023] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.2.16-19-pve root=/dev/mapper/pve-root ro quiet nomodeset intel_iommu=off
[Mon Nov 27 20:15:52 2023] Booted with the nomodeset parameter. Only the system framebuffer will be available
[Mon Nov 27 20:15:52 2023] DMAR: IOMMU disabled
...
[Mon Nov 27 20:16:00 2023] simple-framebuffer simple-framebuffer.0: fb0: simplefb registered!

Removing 'quiet' and adding 'earlyprintk=vga' to the command line doesn't give any more information with the 6.5.11-4-pve kernel.
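
On a kernel that does boot, a quick way to confirm which parameters were actually applied, without scrolling through dmesg, is to look at /proc/cmdline. A small sketch follows; the helper name has_kernel_param is made up, and the optional second argument exists only so it can be tried on a saved copy of a command line.

```shell
# Hypothetical helper: succeed if PARAM appears as a whole word on the kernel
# command line. FILE defaults to /proc/cmdline of the running kernel.
has_kernel_param() {
    file=${2:-/proc/cmdline}
    # One parameter per line, then match whole lines literally.
    tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

# Usage:
#   has_kernel_param nomodeset && echo "nomodeset is active"
#   has_kernel_param earlyprintk=vga || echo "earlyprintk=vga not set"
```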
 
@t.lamprecht:
Once all of those current ZFS issues are fully fixed, is it planned to release a new version of the 8.1 ISO with all the fixes / fixed versions included ootb?
 
Yes, I think it will still make sense to update the ISO to avoid any doubts, even though the ISO is currently not problematic: the remaining potential issue can only be reproduced synthetically and is even then flaky, which explains why it went unnoticed for so many years until it was found by accident while checking out the block cloning issue.

We're closely following the discussion and the fix upstream and will take it in once it has passed some testing here. At least the fix is very targeted and does not require bigger rework.
We're also looking into the ZFS versions of older releases and will update any affected version (e.g., from the 6.2 and 5.15 kernels).
 

Thank you very much, highly appreciated! :)
 
Same issue here. I tried the simplefb suggestion, but it did not fix the problem.
System: Dell PowerEdge T140 / 64GB / 2x1TB ZFS

I have applied the latest kernel update, 6.5.11-5-pve, but still do not get past "Loading initial ramdisk".
Booting kernel 6.2.16-19-pve works fine.
 
Same here as sl4vik.

Dell T140 / 64GB ECC

I tried adding nomodeset, removing it, adding simplefb, removing it, and updating everything to the latest versions (BIOS, firmware, PVE).

I posted in another thread but was linked to this one by fabian at Proxmox.

Booting the 6.2 kernel works fine.
 
I think the console not showing anything and the server not coming up might be two separate issues.

What NIC (model) is built into those? One thing we got some reports about is Realtek NICs, especially where users previously used an out-of-tree driver via DKMS, and one of those was also a Dell IIRC.

Do any of the affected Dells have a serial console that could provide some potentially more useful info?
Some provide that also via a virtual console viewer in the ILO/iDRAC/IPMI/...

If so, it would be great if you could try booting with that, i.e., add something like:
console=tty0 console=ttyS0,115200

as extra kernel command line parameters and check the serial console for any errors and/or full dmesg output.
 
Thanks a lot for the reply.

The NIC model in my Dell T140 is a Broadcom Gigabit Ethernet BCM5720,

so it should not be affected, I think?

I tried adding the serial console, but I didn't get any more information and the server was never reachable, so I'm guessing it did not boot into the OS.
 
