Opt-in Linux 6.2 Kernel for Proxmox VE 7.x available

t.lamprecht (Proxmox Staff Member)
We recently uploaded a 6.2 kernel into our repositories. The 5.15 kernel will stay the default on the Proxmox VE 7.x series; 6.2 is an option that replaces the previous 6.1 based opt-in kernel.
The 6.2 based kernel may be useful for some (especially newer) setups, for example if there is improved hardware support that has not yet been backported to 5.15.

How to install:
  • apt update
  • apt install pve-kernel-6.2
  • reboot
Future updates to the 6.2 kernel will now get installed automatically.
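After the reboot, a quick way to confirm the host actually booted the opt-in kernel (generic commands, safe on any Debian-based host):

```shell
# Show the running kernel; after a successful reboot into the opt-in kernel
# this should report a 6.2.x-pve version.
uname -r
# List installed Proxmox kernel packages (prints nothing if none match).
dpkg -l | grep pve-kernel || true
```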

Please note:
  • It's not required to enable the pvetest repository; the opt-in kernel package is also available on the pve-no-subscription repository.
    But, as a newer pve-firmware package is required to fully use the new opt-in kernel, we haven't yet uploaded it to the enterprise repository; that will be done in the coming days.
  • While we are trying to provide a stable experience with the opt-in 6.2 kernel, updates may appear less frequently than for the default 5.15 based kernel.
  • The previous 5.19 opt-in kernel is now fully EOL; the previous 6.1 one might get another update, but we won't guarantee that.
  • If unsure, we recommend continuing to use the 5.15 based kernel.
Feedback is welcome!
 
Thanks, installed and running without issues so far.

Though I spotted the following in my boot log:
systemd-modules-load[1031]: Failed to find module 'vfio_virqfd'

But I found this Arch bug report containing a link to a vfio kernel commit that was indeed introduced in kernel 6.2.
The commit message is:
vfio: Fold vfio_virqfd.ko into vfio.ko
I just tested GPU passthrough and it seems to work fine on kernel 6.2.2-1-pve regardless of the error. So the wiki and documentation will need an update once 6.2 becomes the default PVE kernel; for now the error can be ignored while running 6.2.y.
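Since the module is now folded into vfio.ko, the load failure comes from a stale entry in the module-load configuration. A minimal cleanup sketch, shown on a sample file (on a real host the entry usually lives in /etc/modules or a file under /etc/modules-load.d/, so adjust the path accordingly):

```shell
# Sample module list as commonly used for GPU passthrough setups.
printf 'vfio\nvfio_iommu_type1\nvfio_pci\nvfio_virqfd\n' > modules.sample
# Drop the vfio_virqfd line: on 6.2 it is part of vfio.ko and no longer
# exists as a separate module, so loading it can only fail.
sed -i '/^vfio_virqfd$/d' modules.sample
cat modules.sample
```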
 
Setup: Proxmox 7.3 (latest), opt-in kernel 6.2.2-1-pve, pfSense 2.6 as a virtual machine, standard bridge on both ports.
Symptoms: kernel 6.2.2: the pfSense interface never gets an IP address, not even on boot (port dead); kernel 5.15: the DHCP address disappears multiple times a day (port dead); kernel 5.13: the DHCP address disappears about every month (port dead). Fortunately, the 6.1.10 kernel seems to work.

Please do not upgrade to the 6.2.2 kernel if you have an Intel Ethernet E810 / E810-XXV / E810-XXVDA2 network card. Honestly, I would recommend buying something else if you can. This card worked up to about the 5.13 kernel. From that version on, it crashed anywhere from once a month to multiple times a day, and only a reboot could resolve that. With kernel 6.2.2, a reboot no longer helps: the port is dead on boot.

Furthermore, there is a complete freeze during or shortly after POST with Shuttle XH510G2 / XH510G barebones in conjunction with 11th-generation Intel CPUs. There seem to be errors either in the CPU or in the card. Not even Intel NUCs can boot with this card:
https://community.intel.com/t5/Intel-NUCs/NUC12DCMi9-with-Intel-X710-E810-NICs/m-p/1393398
Intel had to make a BIOS workaround for their NUC: "Fixed issue where system hang (CATERR = Catastrophic Error) when installing Intel X710/E810 LAN card on PEG (x16) slot."
And now probably every other mainboard manufacturer has to add workarounds to their BIOS because of problems in either the Gen 11 CPU or the E810 card, and Intel does not seem very forthcoming about informing all their mainboard partners of the mistakes they made.
On top of that, the E810 kernel driver is so buggy that it works in one kernel while later kernels are broken. It completely eludes me how you can f... things up so much.

If you have installed the 6.2 kernel, use
apt-get purge pve-kernel-6.2* pve-headers-6.2*
to get rid of it.
 
I cannot confirm this.
I have several Intel servers with onboard E810 25 Gbit and E810 100 Gbit cards in use.
Most of them run kernel 5.15 and a few also kernel 5.19.
The servers have never had a crash.

A little hint: these cards only run stably with the FEC91 mode set on the switch.
Direct connections have no problems.
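For reference, ethtool can inspect and pin the FEC mode on the host side as well; "FEC91" corresponds to Reed-Solomon (IEEE clause 91) FEC, selected with `encoding rs`. The interface name below is only an example, and this is host-side only; it still has to match what the switch port is configured for:

```shell
# Show the currently configured and active FEC encoding for a port.
ethtool --show-fec enp10s0f0
# Force Reed-Solomon (clause 91) FEC instead of auto-negotiated FEC.
ethtool --set-fec enp10s0f0 encoding rs
```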
 
Hi.
Do I need the /tmp settings in /etc/fstab with ZFS, like for kernel 6.1?
Thanks for your work, guys!
 
Hi,
Fixes for that should be included since ZFS version 2.1.7-pve3. See here for more information.
 
Thanks for the answer.
But I don't understand :(
Right now I'm on PVE with the 6.1 kernel and ZFS, with the /tmp line in /etc/fstab.
I want to upgrade to 6.2.
Do I have to delete the /tmp line in /etc/fstab before upgrading to 6.2?
 

This workaround was only needed for a short time.
So yes, assuming you are currently on the recent 6.1 kernel and, more importantly, a recent ZFS version (at least 2.1.7-pve3; the most recent in pve-no-subscription is 2.1.9-pve1), you do not need that workaround anymore.
So simply remove the workaround, install the 6.2 kernel, and reboot.
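The removal step can be sketched like this; the fstab contents and the exact form of the /tmp workaround line are assumptions for illustration, so check your own /etc/fstab for the line you actually added:

```shell
# Sample fstab standing in for /etc/fstab on the real host.
printf '%s\n' \
  'UUID=abcd-1234 / ext4 defaults 0 1' \
  'tmpfs /tmp tmpfs defaults 0 0' > fstab.sample
# Comment out the /tmp workaround line instead of deleting it outright,
# so it is easy to restore if it is ever needed again.
sed -i 's|^tmpfs /tmp |#&|' fstab.sample
cat fstab.sample
```

After editing the real /etc/fstab, `apt install pve-kernel-6.2` and a reboot complete the switch, as described in the opening post.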
 
Thank you :)
 
Tried the latest kernel on one of my Axiomtek NA362-DAMI-C3758-US nodes. None of the 4 built-in Intel X553 10 GbE SFP+ (rev 11) ports will establish a link, and all remain down. A pair is connected via DACs to a managed switch in an LACP bond. The other 2 are connected in an FRR routed-mesh setup (with fallback) for Ceph storage with my other 2 nodes. The 6 I210 ports are usable.

Code:
root@axiom1:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 6.2.6-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-6.2: 7.3-8
pve-kernel-helper: 7.3-8
pve-kernel-5.15: 7.3-3
pve-kernel-6.2.6-1-pve: 6.2.6-1
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
ceph: 17.2.5-pve1
ceph-fuse: 17.2.5-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-6
libpve-network-perl: 0.7.2
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
openvswitch-switch: 2.15.0+ds1-2+deb11u2.1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20221111-1
pve-firewall: 4.2-7
pve-firmware: 3.6-4
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

Code:
root@axiom1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.15.15.18/32 brd 10.15.15.18 scope global lo
       valid_lft forever preferred_lft forever
    inet6 2001:db8:1111::/128 scope global
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master br0 state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:97 brd ff:ff:ff:ff:ff:ff
3: enp3s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master br0 state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:98 brd ff:ff:ff:ff:ff:ff
4: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr2 state UP group default qlen 1000
    link/ether 00:60:e0:7a:ba:99 brd ff:ff:ff:ff:ff:ff
5: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master bat0 state UP group default qlen 1000
    link/ether 00:60:e0:7a:ba:9a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::260:e0ff:fe7a:ba9a/64 scope link
       valid_lft forever preferred_lft forever
6: enp6s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master bat0 state UP group default qlen 1000
    link/ether 00:60:e0:7a:ba:9b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::260:e0ff:fe7a:ba9b/64 scope link
       valid_lft forever preferred_lft forever
7: enp7s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:9c brd ff:ff:ff:ff:ff:ff
8: enp10s0f0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond0 state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:93 brd ff:ff:ff:ff:ff:ff
9: enp10s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:94 brd ff:ff:ff:ff:ff:ff
10: enp12s0f0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond0 state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:93 brd ff:ff:ff:ff:ff:ff permaddr 00:60:e0:7a:ba:95
11: enp12s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:96 brd ff:ff:ff:ff:ff:ff
12: br0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:97 brd ff:ff:ff:ff:ff:ff
13: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether be:8d:ac:3a:38:a1 brd ff:ff:ff:ff:ff:ff
    inet 10.15.10.18/24 scope global bat0
       valid_lft forever preferred_lft forever
    inet6 fe80::bc8d:acff:fe3a:38a1/64 scope link
       valid_lft forever preferred_lft forever
14: enp2s0.0@enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master vmbr1 state LOWERLAYERDOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:97 brd ff:ff:ff:ff:ff:ff
15: vmbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:97 brd ff:ff:ff:ff:ff:ff
16: vmbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:60:e0:7a:ba:99 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::260:e0ff:fe7a:ba99/64 scope link
       valid_lft forever preferred_lft forever
17: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 9000 qdisc noqueue master vmbr3 state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:93 brd ff:ff:ff:ff:ff:ff
18: vmbr3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:93 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.18/27 scope global vmbr3
       valid_lft forever preferred_lft forever
19: vmbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ee:f7:70:f7:2b:d6 brd ff:ff:ff:ff:ff:ff
20: enp7s0.20@enp7s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master vmbr6 state LOWERLAYERDOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:9c brd ff:ff:ff:ff:ff:ff
21: vmbr6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:60:e0:7a:ba:9c brd ff:ff:ff:ff:ff:ff

Code:
root@axiom1:~# lspci -v -s 0a:00.0
0a:00.0 Ethernet controller: Intel Corporation Ethernet Connection X553 10 GbE SFP+ (rev 11)
        Subsystem: Intel Corporation Ethernet Connection X553 10 GbE SFP+
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at dea00000 (64-bit, prefetchable) [size=2M]
        Memory at dec04000 (64-bit, prefetchable) [size=16K]
        Expansion ROM at df280000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-00-c9-ff-ff-00-00-00
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1b0] Access Control Services
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe

On an identical Axiomtek NA362 node running the latest 5.15 kernel with the exact same network config, speed and duplex come up correctly as Speed: 10000Mb/s and Duplex: Full. On the 6.2 node, however:
Code:
root@axiom1:~# ethtool enp12s0f1
Settings for enp12s0f1:
        Supported ports: [ FIBRE ]
        Supported link modes:   10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: off
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no

Let me know if I can provide anything else.
 
None of the 4 built-in Intel X553 10 GbE SFP+ (rev 11) SFP will establish a link and all remain down
Anything in the kernel log (journalctl or dmesg)? What happens if you set the link up manually?

Also, does the network config still match the right NIC names/IDs? Not that the newer kernel changed something on that front.
 
NIC names appear to be the same. Tried `ip link set ... up` on all 4; let me know if another command would be more appropriate:
Code:
root@axiom1:~# ip link set enp10s0f0 up
root@axiom1:~# ip link set enp10s0f1 up
root@axiom1:~# ip link set enp12s0f1 up
root@axiom1:~# ip link set enp12s0f0 up
Code:
root@axiom1:~# dmesg | grep ixgbe
[    6.009490] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[    6.015139] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[    6.020581] ixgbe 0000:0a:00.0: enabling device (0140 -> 0142)
[    6.381320] ixgbe 0000:0a:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[    6.507027] ixgbe 0000:0a:00.0: MAC: 6, PHY: 14, SFP+: 3, PBA No: 000500-000
[    6.514094] ixgbe 0000:0a:00.0: 00:60:e0:7a:ba:93
[    6.641031] ixgbe 0000:0a:00.0: Intel(R) 10 Gigabit Network Connection
[    6.641203] ixgbe 0000:0a:00.1: enabling device (0140 -> 0142)
[    7.017323] ixgbe 0000:0a:00.1: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[    7.153033] ixgbe 0000:0a:00.1: MAC: 6, PHY: 14, SFP+: 4, PBA No: 000500-000
[    7.160099] ixgbe 0000:0a:00.1: 00:60:e0:7a:ba:94
[    7.278330] ixgbe 0000:0a:00.1: Intel(R) 10 Gigabit Network Connection
[    7.284957] ixgbe 0000:0c:00.0: enabling device (0140 -> 0142)
[    7.297439] ixgbe 0000:0c:00.0: PCI INT A: not connected
[    7.677564] ixgbe 0000:0c:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[    7.813077] ixgbe 0000:0c:00.0: MAC: 6, PHY: 14, SFP+: 3, PBA No: 000500-000
[    7.820133] ixgbe 0000:0c:00.0: 00:60:e0:7a:ba:95
[    7.943047] ixgbe 0000:0c:00.0: Intel(R) 10 Gigabit Network Connection
[    7.949664] ixgbe 0000:0c:00.1: enabling device (0140 -> 0142)
[    8.329501] ixgbe 0000:0c:00.1: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[    8.465079] ixgbe 0000:0c:00.1: MAC: 6, PHY: 14, SFP+: 4, PBA No: 000500-000
[    8.472138] ixgbe 0000:0c:00.1: 00:60:e0:7a:ba:96
[    8.595047] ixgbe 0000:0c:00.1: Intel(R) 10 Gigabit Network Connection
[    8.603426] ixgbe 0000:0a:00.0 enp10s0f0: renamed from eth6
[    8.625216] ixgbe 0000:0a:00.1 enp10s0f1: renamed from eth0
[    8.661498] ixgbe 0000:0c:00.0 enp12s0f0: renamed from eth1
[    8.697188] ixgbe 0000:0c:00.1 enp12s0f1: renamed from eth2
[   15.962067] ixgbe 0000:0a:00.1: registered PHC device on enp10s0f1
[   16.059828] ixgbe 0000:0a:00.1 enp10s0f1: detected SFP+: 4
[   16.097537] ixgbe 0000:0c:00.1: registered PHC device on enp12s0f1
[   16.718589] ixgbe 0000:0a:00.0: registered PHC device on enp10s0f0
[   16.818561] ixgbe 0000:0c:00.0: registered PHC device on enp12s0f0
[   17.074950] ixgbe 0000:0c:00.1 enp12s0f1: detected SFP+: 4
[   18.087846] ixgbe 0000:0a:00.0 enp10s0f0: detected SFP+: 3
[   19.098894] ixgbe 0000:0c:00.0 enp12s0f0: detected SFP+: 3
Code:
root@axiom1:~# dmesg | grep enp
[    6.740213] igb 0000:02:00.0 enp2s0: renamed from eth0
[    6.773679] igb 0000:03:00.0 enp3s0: renamed from eth1
[    6.817317] igb 0000:05:00.0 enp5s0: renamed from eth3
[    6.837273] igb 0000:06:00.0 enp6s0: renamed from eth4
[    6.865226] igb 0000:04:00.0 enp4s0: renamed from eth2
[    6.905740] igb 0000:07:00.0 enp7s0: renamed from eth5
[    8.603426] ixgbe 0000:0a:00.0 enp10s0f0: renamed from eth6
[    8.625216] ixgbe 0000:0a:00.1 enp10s0f1: renamed from eth0
[    8.661498] ixgbe 0000:0c:00.0 enp12s0f0: renamed from eth1
[    8.697188] ixgbe 0000:0c:00.1 enp12s0f1: renamed from eth2
[   15.733241] br0: port 1(enp2s0) entered blocking state
[   15.738412] br0: port 1(enp2s0) entered disabled state
[   15.744246] device enp2s0 entered promiscuous mode
[   15.749472] br0: port 2(enp3s0) entered blocking state
[   15.754639] br0: port 2(enp3s0) entered disabled state
[   15.759912] device enp3s0 entered promiscuous mode
[   15.962067] ixgbe 0000:0a:00.1: registered PHC device on enp10s0f1
[   16.059828] ixgbe 0000:0a:00.1 enp10s0f1: detected SFP+: 4
[   16.097537] ixgbe 0000:0c:00.1: registered PHC device on enp12s0f1
[   16.244232] batman_adv: bat0: Adding interface: enp5s0
[   16.249411] batman_adv: bat0: The MTU of interface enp5s0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1560 would solve the problem.
[   16.273525] batman_adv: bat0: Interface activated: enp5s0
[   16.283669] batman_adv: bat0: Adding interface: enp6s0
[   16.288847] batman_adv: bat0: The MTU of interface enp6s0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1560 would solve the problem.
[   16.312947] batman_adv: bat0: Interface activated: enp6s0
[   16.355403] 8021q: adding VLAN 0 to HW filter on device enp2s0
[   16.361276] 8021q: adding VLAN 0 to HW filter on device enp3s0
[   16.367202] 8021q: adding VLAN 0 to HW filter on device enp5s0
[   16.373078] 8021q: adding VLAN 0 to HW filter on device enp6s0
[   16.379047] 8021q: adding VLAN 0 to HW filter on device enp10s0f1
[   16.385177] 8021q: adding VLAN 0 to HW filter on device enp12s0f1
[   16.431459] vmbr1: port 1(enp2s0.0) entered blocking state
[   16.437065] vmbr1: port 1(enp2s0.0) entered disabled state
[   16.443053] device enp2s0.0 entered promiscuous mode
[   16.511524] vmbr2: port 1(enp4s0) entered blocking state
[   16.516873] vmbr2: port 1(enp4s0) entered disabled state
[   16.522490] device enp4s0 entered promiscuous mode
[   16.558703] 8021q: adding VLAN 0 to HW filter on device enp4s0
[   16.718589] ixgbe 0000:0a:00.0: registered PHC device on enp10s0f0
[   16.726351] 8021q: adding VLAN 0 to HW filter on device enp10s0f0
[   16.732650] bond0: (slave enp10s0f0): Enslaving as a backup interface with a down link
[   16.818561] ixgbe 0000:0c:00.0: registered PHC device on enp12s0f0
[   16.826278] 8021q: adding VLAN 0 to HW filter on device enp12s0f0
[   16.832567] bond0: (slave enp12s0f0): Enslaving as a backup interface with a down link
[   17.074950] ixgbe 0000:0c:00.1 enp12s0f1: detected SFP+: 4
[   18.019547] 8021q: adding VLAN 0 to HW filter on device enp7s0
[   18.082166] vmbr6: port 1(enp7s0.20) entered blocking state
[   18.087835] vmbr6: port 1(enp7s0.20) entered disabled state
[   18.087846] ixgbe 0000:0a:00.0 enp10s0f0: detected SFP+: 3
[   18.099226] device enp7s0.20 entered promiscuous mode
[   18.109967] device enp7s0 entered promiscuous mode
[   19.098894] ixgbe 0000:0c:00.0 enp12s0f0: detected SFP+: 3
[   19.153555] igb 0000:05:00.0 enp5s0: igb: enp5s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   19.163515] IPv6: ADDRCONF(NETDEV_CHANGE): enp5s0: link becomes ready
[   19.217530] igb 0000:06:00.0 enp6s0: igb: enp6s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   19.333233] IPv6: ADDRCONF(NETDEV_CHANGE): enp6s0: link becomes ready
[   19.557547] igb 0000:04:00.0 enp4s0: igb: enp4s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   19.567509] vmbr2: port 1(enp4s0) entered blocking state
[   19.572842] vmbr2: port 1(enp4s0) entered forwarding state
Let me know if another journalctl command would be better.
Code:
root@axiom1:~# journalctl -b | grep -i error
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:09.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:0a.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:0b.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:0c.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:0e.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:0f.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:10.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: pcieport 0000:00:11.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
Mar 20 08:02:56 axiom1 kernel: ERST: Error Record Serialization Table (ERST) support is initialized.
Mar 20 08:02:56 axiom1 kernel: RAS: Correctable Errors collector initialized.
 
Hi,
I see strange corosync retransmits with the 6.1 and 6.2 kernels on my AMD EPYC v3 cluster.

The cluster has 8 nodes, no running VMs, 100% idle, 2 links on 2 different switches.

I don't have the problem with a bigger, more loaded Intel cluster on the same switches (same NICs too, Mellanox ConnectX-5).


From time to time (1-2 occurrences per day), a random node of the cluster is lagging (0 error logs), and corosync hits a timeout + retransmits.

On the bad node, "pvecm status" shows all nodes connected (but with a different ring ID than the other nodes; in daemon.log, the corosync logs show random leave/join events).

"ping -f" shows 0 latency and no packet loss on both links.

And restarting corosync on this node doesn't fix the problem; it needs a hard reboot.

It seems to work fine with 5.15 for now.

I wonder if it could come from the AMD fTPM stutter bug:
https://www.phoronix.com/news/AMD-Linux-Stuttering-Fix-fTPM
(but my firmware/microcode are up to date)

I'll try the 6.1/6.2 kernels again when the AMD patches are committed.
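When a node starts flapping like this, a few standard data points are worth capturing on the affected node before the hard reboot (standard corosync/PVE commands; the time window is just an example):

```shell
# Per-link knet status as corosync sees it (connected/disconnected per link).
corosync-cfgtool -s
# Quorum state and the ring ID this node currently reports.
pvecm status
# Retransmit, token-timeout and link up/down messages around the incident.
journalctl -u corosync --since "-1h"
```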
 

Attachments

  • corosync.log (88.8 KB)
Does anyone know of any recommended scheduler adjustments for a balanced power/performance ratio in a home environment when using this kernel with the latest-generation Intel processors (13th-gen Raptor Lake with P+E cores)?

Is enabling schedutil recommended or a bad idea? I'm trying to achieve low idle power usage without sacrificing too much performance, as this machine is mostly idle and only occasionally hit with a bursty load. Power is very expensive here.
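Not an answer, but for anyone experimenting: the governor situation can be inspected through standard sysfs paths. Note that with the intel_pstate driver in its default active mode only `performance` and `powersave` are offered; `schedutil` only becomes selectable after booting with `intel_pstate=passive` on the kernel command line. A sketch (the paths are standard; whether schedutil appears depends on the driver mode):

```shell
# Which cpufreq driver and governors are available on this machine?
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
# Current governor on core 0.
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Switch all cores to schedutil (requires intel_pstate=passive; run as root).
echo schedutil | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
```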
 
  • Like
Reactions: ITNiels and deniax
Hi @all,

Since I've been using kernel 6.2.6-1-pve, the problem that VMs with disk passthrough don't start or cause high disk IO has finally been solved.
Thanks for that.
Unfortunately, I now have a new problem.
After booting with the new 6.2 kernel, my host console looks very strange and can therefore no longer really be used.
Shortly before the host has fully started, it looks like this:
20230323_122707.jpg

Errors are thrown out.

But when the host is fully booted, it looks like this:
20230323_122736.jpg

Does anyone have any ideas what I can do about it?

Greetings
Marcel
 
