Opt-in Linux 6.8 Kernel for Proxmox VE 8 available on test & no-subscription

Thanks for your feedback, and yeah, changes in the kernel release, the systemd version, and moving HW around can unfortunately result in such name changes. IME the ones caused by kernel updates stabilize once all features of the HW are supported correctly and no new issues come up.

One way to avoid such changes is to pin the names of the interfaces manually. E.g., one could name a network interface net0 by matching its MAC address in a /etc/systemd/network/00-net0.link configuration like:
Code:
[Match]
MACAddress=aa:bb:cc:12:34:56
[Link]
Name=net0

This is something that might be worth exposing as an option in our installer.

Edit: changed the example from eth0 to net0 to avoid a potential race between kernel and udev naming.
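
To verify and apply such a .link file, something along these lines should work (a rough sketch; the interface name is just an example, and the commands are standard systemd-udev/Debian tooling):
Code:
# Dry run: show which .link file udev would apply to a given device
udevadm test-builtin net_setup_link /sys/class/net/enp1s0

# If udev renames interfaces from within the initramfs, regenerate it so the new .link file is included there too
update-initramfs -u -k all

# Reboot for the new name to take effect everywhere
reboot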

Hi, so in preparation for version 8.2 I tried to implement this on 8.1 now. It does seem to have worked at least partially (ip a shows the updated physical interface name, while the web portal still shows the old one), but there seem to be some issues with this, because the MAC is the same for the physical port (enp1s0, the one that should actually be renamed) and for the bridge/VLAN devices (vmbr0, vmbr0.10) that it is part of:

Code:
Apr 22 15:56:03 pve (udev-worker)[1427]: lan0: Failed to rename network interface 5 from 'vmbr0.10' to 'lan0': File exists
Apr 22 15:56:03 pve (udev-worker)[1427]: lan0: Failed to process device, ignoring: File exists
Apr 22 15:56:04 pve networking[1413]: error: >>> Full logs available in: /var/log/ifupdown2/network_config_ifupdown2_191_Apr-22-2024_15:56:02.352760 <<<
Apr 22 15:56:04 pve /usr/sbin/ifup[1413]: >>> Full logs available in: /var/log/ifupdown2/network_config_ifupdown2_191_Apr-22-2024_15:56:02.352760 <<<


Update:
as per https://utcc.utoronto.ca/~cks/space/blog/linux/NetworkdMACMatchesWidely , the link file should thus probably also include a "Type" match, e.g.:
Code:
[Match]
MACAddress=xxxx
Type=<one of: ether, wlan, wwan>
[Link]
Name=<your preferred name>
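
To see why the Type match matters, one can list every interface that carries the same MAC address; on a typical PVE setup the bridge and its VLAN sub-interfaces inherit the MAC of the physical port (the address below is just the placeholder from the example above):
Code:
ip -o link | grep -i 'aa:bb:cc:12:34:56'
# typically shows enp1s0, vmbr0 and vmbr0.10 all with the same MAC,
# so a MACAddress-only match would try to rename all three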
 
Damn, I did miss this one. I will update to 6.8 in the next days and will give some feedback.
 
Running into kernel issues about once a day on Intel 14th gen (i7-14700K). I have to physically reboot when this happens.
ZFS is not used on root storage, only mounted into some LXC containers.

Code:
Apr 23 14:34:50 pve kernel: Linux version 6.8.4-2-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.4-2 (2024-04-10T17:36Z) ()
-- Boot e4647796f31b4cd7af7e75b5cceec935 --
Apr 23 14:22:15 pve kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 121721s! [Plex Media Scan:89243]
Apr 23 14:22:15 pve kernel:  </TASK>
Apr 23 14:22:15 pve kernel: R13: 0000736b00b31310 R14: 0000000000000000 R15: 0000736b028adb7c
Apr 23 14:22:15 pve kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000736b028adb40
Apr 23 14:22:15 pve kernel: RBP: 0000000000008000 R08: 0000000000000000 R09: 0000000000000000
Apr 23 14:22:15 pve kernel: RDX: 0000000000008000 RSI: 0000736afd73ac70 RDI: 000000000000002f
Apr 23 14:22:15 pve kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000736b0299a739
Apr 23 14:22:15 pve kernel: RSP: 002b:00007ffd2a131878 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Apr 23 14:22:15 pve kernel: Code: c0 0f 85 24 00 00 00 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 d5 cb ff ff 41 57 41 56 53 48 81 >
Apr 23 14:22:15 pve kernel: RIP: 0033:0x736b0299a739
Apr 23 14:22:15 pve kernel:  entry_SYSCALL_64_after_hwframe+0x73/0x7b
Apr 23 14:22:15 pve kernel:  ? exc_page_fault+0x94/0x1b0
Apr 23 14:22:15 pve kernel:  ? irqentry_exit+0x43/0x50
Apr 23 14:22:15 pve kernel:  ? irqentry_exit_to_user_mode+0x7b/0x260
Apr 23 14:22:15 pve kernel:  ? do_user_addr_fault+0x343/0x6b0
Apr 23 14:22:15 pve kernel:  do_syscall_64+0x84/0x180
Apr 23 14:22:15 pve kernel:  __x64_sys_read+0x19/0x30
Apr 23 14:22:15 pve kernel:  ksys_read+0x73/0x100
Apr 23 14:22:15 pve kernel:  vfs_read+0x255/0x390
Apr 23 14:22:15 pve kernel:  zpl_iter_read+0xe6/0x1a0 [zfs]
Apr 23 14:22:15 pve kernel:  zfs_read+0x143/0x400 [zfs]
Apr 23 14:22:15 pve kernel:  dmu_read_uio_dbuf+0x48/0x70 [zfs]
Apr 23 14:22:15 pve kernel:  dmu_read_uio_dnode+0x5a/0x150 [zfs]
Apr 23 14:22:15 pve kernel:  dmu_buf_hold_array_by_dnode+0x333/0x6a0 [zfs]
Apr 23 14:22:15 pve kernel:  dmu_zfetch_run+0x18f/0x300 [zfs]
Apr 23 14:22:15 pve kernel:  ? __pfx_dmu_zfetch_done+0x10/0x10 [zfs]
Apr 23 14:22:15 pve kernel:  ? __pfx_dmu_zfetch_done+0x10/0x10 [zfs]
Apr 23 14:22:15 pve kernel:  dbuf_prefetch_impl+0x866/0x8e0 [zfs]
Apr 23 14:22:15 pve kernel:  dbuf_issue_final_prefetch+0xa7/0x100 [zfs]
Apr 23 14:22:15 pve kernel:  ? __pfx_dbuf_issue_final_prefetch_done+0x10/0x10 [zfs]
Apr 23 14:22:15 pve kernel:  arc_read+0xc87/0x17c0 [zfs]
Apr 23 14:22:15 pve kernel:  zio_nowait+0xd2/0x1c0 [zfs]
Apr 23 14:22:15 pve kernel:  zio_vdev_io_start+0x2a5/0x340 [zfs]
Apr 23 14:22:15 pve kernel:  vdev_mirror_io_start+0xa7/0x270 [zfs]
Apr 23 14:22:15 pve kernel:  zio_nowait+0xd2/0x1c0 [zfs]
Apr 23 14:22:15 pve kernel:  ? __pfx_vdev_mirror_child_done+0x10/0x10 [zfs]
Apr 23 14:22:15 pve kernel:  zio_vdev_io_start+0x14c/0x340 [zfs]
Apr 23 14:22:15 pve kernel:  ? zio_create+0x3e8/0x660 [zfs]
Apr 23 14:22:15 pve kernel:  vdev_raidz_io_start+0x17a/0x310 [zfs]
Apr 23 14:22:15 pve kernel:  zio_nowait+0xd2/0x1c0 [zfs]
Apr 23 14:22:15 pve kernel:  zio_vdev_io_start+0x14c/0x340 [zfs]
Apr 23 14:22:15 pve kernel:  vdev_disk_io_start+0x89/0x4d0 [zfs]
Apr 23 14:22:15 pve kernel:  vdev_classic_physio+0x325/0x4c0 [zfs]
Apr 23 14:22:15 pve kernel:  submit_bio+0xb2/0x110
Apr 23 14:22:15 pve kernel:  submit_bio_noacct+0x1f3/0x650
Apr 23 14:22:15 pve kernel:  ? abd_bio_map_off+0x20e/0x280 [zfs]
Apr 23 14:22:15 pve kernel:  submit_bio_noacct_nocheck+0x2b7/0x390
Apr 23 14:22:15 pve kernel:  __submit_bio+0xb3/0x1c0
Apr 23 14:22:15 pve kernel:  blk_mq_submit_bio+0x14f/0x750
Apr 23 14:22:15 pve kernel:  blk_mq_attempt_bio_merge+0x5f/0x70
Apr 23 14:22:15 pve kernel:  blk_mq_sched_bio_merge+0x36/0x120
Apr 23 14:22:15 pve kernel:  dd_bio_merge+0x49/0xb0
Apr 23 14:22:15 pve kernel:  _raw_spin_lock+0x3f/0x60
Apr 23 14:22:15 pve kernel:  ? native_queued_spin_lock_slowpath+0x7f/0x2d0
Apr 23 14:22:15 pve kernel:  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
Apr 23 14:22:15 pve kernel:  <TASK>
Apr 23 14:22:15 pve kernel:  </IRQ>
Apr 23 14:22:15 pve kernel:  ? sysvec_apic_timer_interrupt+0x8d/0xd0
Apr 23 14:22:15 pve kernel:  ? __sysvec_apic_timer_interrupt+0x4e/0x150
Apr 23 14:22:15 pve kernel:  ? hrtimer_interrupt+0xf6/0x250
Apr 23 14:22:15 pve kernel:  ? clockevents_program_event+0xb3/0x140
Apr 23 14:22:15 pve kernel:  ? __hrtimer_run_queues+0x105/0x280
Apr 23 14:22:15 pve kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
Apr 23 14:22:15 pve kernel:  ? watchdog_timer_fn+0x206/0x290
Apr 23 14:22:15 pve kernel:  ? show_regs+0x6d/0x80
Apr 23 14:22:15 pve kernel:  <IRQ>
Apr 23 14:22:15 pve kernel: Call Trace:
Apr 23 14:22:15 pve kernel: PKRU: 55555554
Apr 23 14:22:15 pve kernel: CR2: 0000736afc01c000 CR3: 00000003e65ac000 CR4: 0000000000f50ef0
Apr 23 14:22:15 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 23 14:22:15 pve kernel: FS:  0000736b028adb40(0000) GS:ffff9905cee80000(0000) knlGS:0000000000000000
Apr 23 14:22:15 pve kernel: R13: ffff98ff10055400 R14: 0000000000000001 R15: 0000000000000001
Apr 23 14:22:15 pve kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff98f69f2635e8
Apr 23 14:22:15 pve kernel: RBP: ffffbea74b167340 R08: 0000000000000000 R09: 0000000000000000
Apr 23 14:22:15 pve kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff98f69f706b48
Apr 23 14:22:15 pve kernel: RAX: 0000000000000001 RBX: ffff98f69f706b48 RCX: 0000000000000000
Apr 23 14:22:15 pve kernel: RSP: 0018:ffffbea74b167320 EFLAGS: 00000202
Apr 23 14:22:15 pve kernel: Code: 00 00 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 5f 85 c0 74 10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66 >
Apr 23 14:22:15 pve kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x7f/0x2d0
Apr 23 14:22:15 pve kernel: Hardware name: ASRock Z790 Pro RS/Z790 Pro RS, BIOS 11.03 02/07/2024
Apr 23 14:22:15 pve kernel: CPU: 9 PID: 89117 Comm: Plex Media Scan Tainted: P    B D W  O L     6.8.4-2-pve #1
Apr 23 14:22:15 pve kernel:  polyval_generic snd_intel_sdw_acpi ghash_clmulni_intel snd_hda_codec sha256_ssse3 sha1_ssse3 drm_buddy aesni_intel snd_hda_core ttm crypto_simd snd_hwdep cryptd mei_hdcp me>
Apr 23 14:22:15 pve kernel: Modules linked in: tcp_diag inet_diag xt_MASQUERADE xt_tcpudp xt_mark nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 cfg80211 veth ebtable_filter>
 
Running into kernel issues about once a day on Intel 14th gen (i7-14700K). I have to physically reboot when this happens.
ZFS is not used on root storage, only mounted into some LXC containers.

Code:
Apr 23 14:22:15 pve kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 121721s! [Plex Media Scan:89243]
[...]
Nice log. To me it looks like it's probably a bug in the Plex media scanner?
But no doubt it's also a bug in the kernel, because the CPU shouldn't freeze.

Did you try to Google whether others have similar issues with Plex on 6.8?

Another thing: once Ubuntu 24.04 finally gets released and people start using it, we should see almost everything fixed very fast, because we are using that kernel.
And 24.04 will get a lot of attention.

Cheers
 
Nice log. To me it looks like it's probably a bug in the Plex media scanner?
But no doubt it's also a bug in the kernel, because the CPU shouldn't freeze.

Did you try to Google whether others have similar issues with Plex on 6.8?

Another thing: once Ubuntu 24.04 finally gets released and people start using it, we should see almost everything fixed very fast, because we are using that kernel.
And 24.04 will get a lot of attention.

Cheers
I've gotten similar logs for other containers running into kernel issues when busy, but I'm still not able to confidently connect it to high I/O load or ZFS.

I haven't specifically checked for people running into Plex issues on 6.8 - I've been running into 14th-gen issues on 6.8 on most Linux distros, although it seemed fine on Ubuntu 23.10 starting with 6.8.4.

I'll check out Ubuntu 24.04 once it's out and repeat the same tests (mounting ZFS media storage and running Plex scans).
 
As of today OpenZFS doesn't support kernels greater than 6.7, so would it be safe to assume that if you use ZFS, upgrading to 6.8 is at your own risk?
 
The Proxmox kernel 6.8 is based on the Ubuntu LTS kernel, and Ubuntu supports ZFS officially.
It looks like ZFS only starts to fully support 6.8 with 2.2.4.
Both the latest Ubuntu and Proxmox are based on ZFS 2.2.3:
Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
[...]
zfsutils-linux: 2.2.3-pve2
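
To double-check which ZFS userland and kernel-module versions are actually in use, something like the following should work (assuming OpenZFS 0.8 or newer, which provides the version subcommand):
Code:
# userland tools and loaded kernel module
zfs version

# kernel module version only (once the zfs module is loaded)
cat /sys/module/zfs/version

# packaging view on PVE
pveversion -v | grep -E 'zfs|kernel'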
 
It looks like ZFS only starts to fully support 6.8 with 2.2.4.
Both the latest Ubuntu and Proxmox are based on ZFS 2.2.3:
Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
[...]
zfsutils-linux: 2.2.3-pve2
I don't have an authoritative answer for you on this specific issue, but I've seen Ubuntu and PVE both, at separate times, backport new ZFS features to an older version of ZFS, until ZFS catches up.
 
It looks like ZFS only starts to fully support 6.8 with 2.2.4.
Both the latest Ubuntu and Proxmox are based on ZFS 2.2.3.
We pulled in the changes that are relevant for support of kernel 6.8 in ZFS in: https://git.proxmox.com/?p=zfsonlinux.git;a=commit;h=68be554e71e0cac3144d19930898f95bf6616620

(which is included in the version shipped with the 6.8 kernels).
Our internal tests, and our users running the 6.8 kernel over the past weeks, also did not indicate anything problematic regarding kernel 6.8 + ZFS.

I hope this helps!
 
I'm running a Supermicro M11SDV-8C-LN4F with an Intel X710-DA2 card, using SR-IOV.
With kernel 6.5 it works well when compiling the i40e driver manually.
With kernel 6.8, the interfaces got renamed (I took care of that and added npX everywhere), but my VMs, which all have mapped VFs, will not start anymore:

Code:
[Thu Apr 25 08:18:46 2024] vfio-pci 0000:06:02.0: Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.
[Thu Apr 25 08:18:47 2024] vfio-pci 0000:06:02.1: Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.
[Thu Apr 25 08:18:48 2024] vfio-pci 0000:06:0a.2: Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.

I cannot compile the driver anymore, either.

Code:
root@epyc:~/intel-driver/i40e-2.24.6/src# make install
filtering include/net/flow_keys.h out
filtering include/linux/jump_label_type.h out
filtering include/linux/jump_label_type.h out
*** The target kernel has CONFIG_MODULE_SIG_ALL enabled, but
*** the signing key cannot be found. Module signing has been
*** disabled for this build.
make[1]: Entering directory '/usr/src/linux-headers-6.8.4-2-pve'
  CC [M]  /root/intel-driver/i40e-2.24.6/src/i40e_main.o
/root/intel-driver/i40e-2.24.6/src/i40e_main.c: In function ‘i40e_send_version’:
/root/intel-driver/i40e-2.24.6/src/i40e_main.c:11530:9: error: implicit declaration of function ‘strlcpy’; did you mean ‘strscpy’? [-Werror=implicit-function-declaration]
11530 |         strlcpy(dv.driver_string, DRV_VERSION, sizeof(dv.driver_string));
      |         ^~~~~~~
      |         strscpy
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:243: /root/intel-driver/i40e-2.24.6/src/i40e_main.o] Error 1
make[2]: *** [/usr/src/linux-headers-6.8.4-2-pve/Makefile:1926: /root/intel-driver/i40e-2.24.6/src] Error 2
make[1]: *** [Makefile:240: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.8.4-2-pve'
Had to revert to 6.5 for now. Any hints?
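
In case it helps others: if the only thing blocking the out-of-tree build is the removed strlcpy() (it was dropped from the kernel in 6.8), swapping it for strscpy() in the vendor sources may be enough to get it to compile. This is just a sketch, not a supported fix, and further incompatibilities may remain:
Code:
# strscpy() takes the same (dest, src, size) arguments for this call site
sed -i 's/strlcpy(/strscpy(/g' /root/intel-driver/i40e-2.24.6/src/i40e_main.c
make -C /root/intel-driver/i40e-2.24.6/src install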
 
Hi,

I had 2 crashes in 24h with kernel 6.8 on 2 different servers (same generation, same hardware):
no kernel panic, the server just froze
system installed on LVM
NIC: Mellanox ConnectX-4 Lx


Edit: The 2 crashing servers also have local encrypted Ceph OSD NVMe drives. (The 20 other servers that are not crashing only have an SSD LVM boot drive without VM storage.)

crash.png


Code:
server:

System Information
        Manufacturer: Lenovo
        Product Name: ThinkSystem SR645
        Version: 05

cpu:

model name    : AMD EPYC 7413 24-Core Processor
stepping    : 1
microcode    : 0xa0011d3



Kernel 6.8 has been working fine for 2 weeks on another, almost identical model, but with a newer EPYC generation:

Code:
System Information
        Manufacturer: Lenovo
        Product Name: ThinkSystem SR645
        Version: 06

model name    : AMD EPYC 7543 32-Core Processor
stepping    : 1
microcode    : 0xa0011d1
 
With kernel 6.5 it works well when compiling the i40e driver manually.
With kernel 6.8, the interfaces got renamed (I took care of that and added npX everywhere), but my VMs, which all have mapped VFs, will not start anymore:
have you tried using the i40e driver shipped in the 6.8 kernel? - i.e. are there any issues if you just remove the dkms-package?
see https://pve.proxmox.com/wiki/Roadmap#8.2-known-issues
 
have you tried using the i40e driver shipped in the 6.8 kernel? - i.e. are there any issues if you just remove the dkms-package?
see https://pve.proxmox.com/wiki/Roadmap#8.2-known-issues
Hi,
I usually build the driver by hand; I did not know about a dkms package.
So, I am (was) actually using the driver that is shipped with the kernel, because the latest Intel source will not compile.
Any ideas on how to get the system in a working state with kernel 6.8?
Thank you very much!
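
To confirm which i40e module the running 6.8 kernel actually loads (the in-tree one vs. a manually built one under /lib/modules/.../updates), something like this should tell (the interface name is just an example):
Code:
# which module file is picked up, and its version if it reports one
modinfo i40e | grep -E '^(filename|version)'

# driver/firmware as reported by a specific NIC
ethtool -i enp1s0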
 
I just upgraded to PVE 8.2 last night, and the new 6.8 kernel still has the ixgbe driver bug where the NIC won't connect. :( I can't compile it from source, either.

4x Intel X553 10 GbE SFP+ hardwired on the motherboard of my Qotom firewall appliance; I can't swap them out, so I'm still running the old PVE 7.x kernel 5.15.136.
 
After the update to 6.8, the bnxt_en driver also does not work for me.

In this case it is the onboard ports of a SuperMicro H13SSL-NT mainboard, and there seems to be no firmware update for them:

Code:
Apr 09 19:55:10 gcd-virthost3 kernel: bnxt_en 0000:c1:00.1: QPLIB: bnxt_re_is_fw_stalled: FW STALL Detected. cmdq[0xe]=0x3 waited (102364 > 100000) msec active 1
Apr 09 19:55:10 gcd-virthost3 kernel: bnxt_en 0000:c1:00.1 bnxt_re1: Failed to modify HW QP
Apr 09 19:55:10 gcd-virthost3 kernel: infiniband bnxt_re1: Couldn't change QP1 state to INIT: -110
Apr 09 19:55:10 gcd-virthost3 kernel: infiniband bnxt_re1: Couldn't start port
Apr 09 19:55:10 gcd-virthost3 kernel: bnxt_en 0000:c1:00.1 bnxt_re1: Failed to destroy HW QP
Apr 09 19:55:10 gcd-virthost3 kernel: bnxt_en 0000:c1:00.1 bnxt_re1: Free MW failed: 0xffffff92
Apr 09 19:55:10 gcd-virthost3 kernel: infiniband bnxt_re1: Couldn't open port 1
Apr 09 19:55:10 gcd-virthost3 kernel: infiniband bnxt_re1: Device registered with IB successfully
May I ask which BIOS version you have? I'm currently on 1.4 and SuperMicro offers 1.6b within their firmware bundle. Had to blacklist the driver.
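
For reference, assuming it is the bnxt_re RDMA/RoCE module (which produced the errors above) rather than the base bnxt_en Ethernet driver that needs to be kept from loading, a blacklist could look like this:
Code:
echo "blacklist bnxt_re" > /etc/modprobe.d/blacklist-bnxt_re.conf
update-initramfs -u -k all
reboot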
 
Hi,
I usually build the driver by hand; I did not know about a dkms package.
So, I am (was) actually using the driver that is shipped with the kernel, because the latest Intel source will not compile.
Any ideas on how to get the system in a working state with kernel 6.8?
Thank you very much!
First off.. :eek: you look familiar!

I found this thread by chance. I bumped up from 6.5 to 6.8 on the official release and I get the same issues as Smooky: it fails to find the PVE root on boot and falls back to initramfs. Dropping back to 6.5 in GRUB lets it boot normally. I'm ~1000 miles from the server, so I can't dig in at this time to double-check UUIDs. I had a remote assist on site to drop it back to 6.5 and get it back online until I fly back that direction.

Edit for more detail: I'll grab the deep specs when I fly back tomorrow, but this is on an older R320.

Code:
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  20
  On-line CPU(s) list:   0-19
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel
  Model name:            Intel(R) Xeon(R) CPU E5-2470 v2 @ 2.40GHz
    BIOS Model name:           Intel(R) Xeon(R) CPU E5-2470 v2 @ 2.
                         40GHz  CPU @ 2.4GHz
    BIOS CPU family:     179
    CPU family:          6
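
For anyone in a similar remote-hands situation, Proxmox's boot tooling can pin the known-good kernel so the box keeps booting 6.5 by default until the issue is understood (the exact version string below is only an example; check the list first):
Code:
# list installed kernels
proxmox-boot-tool kernel list

# pin a specific known-good kernel (version string is an example)
proxmox-boot-tool kernel pin 6.5.13-5-pve

# later, return to booting the newest kernel by default
proxmox-boot-tool kernel unpin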
 
I'm running a Supermicro M11SDV-8C-LN4F with an Intel X710-DA2 card, using SR-IOV.
With kernel 6.5 it works well when compiling the i40e driver manually.
With kernel 6.8, the interfaces got renamed (I took care of that and added npX everywhere), but my VMs, which all have mapped VFs, will not start anymore:

Code:
[Thu Apr 25 08:18:46 2024] vfio-pci 0000:06:02.0: Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.
[...]

I cannot compile the driver anymore, either. Had to revert to 6.5 for now. Any hints?

I'm using the DKMS module from StrongTZ that enables SR-IOV for i915-driver-based iGPUs (12th gen Alder Lake, in my case).

That module is completely unusable in kernel 6.8, which changes quite a bit in preparation for official mainline SR-IOV support in iGPUs (and other things?).

I think we're going to be stuck on 6.5.13 for a while. :)
 
