Proxmox VE 8.2 released!

Same issue with a Dell PowerEdge R730xd equipped with the Intel/Dell X710 4-port interface card. I am posting this so others with this Dell network card will not be surprised like I was when all the interfaces disappeared.

And yes, this absolutely needs to be mentioned in the release notes.
Yep, happened to me just now, too, with an Intel X710-T2L card. I was confused why it didn't come back up until I logged into the console and saw that the uptime was only a few minutes.

Updating the interface names in /etc/network/interfaces (including in the bridge definitions) got me back up and running, but it would have been nice to know ahead of time that I'd need to do that.
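
For anyone hitting the same thing, this is roughly what that edit looks like; the interface names below are just examples, use whatever "ip a" shows on your machine, and remember the bridge-ports line too:
Code:
# /etc/network/interfaces (excerpt)
auto enp2s0f0
iface enp2s0f0 inet manual
# was eno1 before the kernel update

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports enp2s0f0
        bridge-stp off
        bridge-fd 0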
 
Any special way to do this, or just something like

Code:
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="01:23:45:67:89:ab", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

in
Code:
/etc/udev/rules.d/70-persistent-net.rules
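
For reference, this is how I'd look up the MAC and attributes to put into the rule, and I assume the rule also needs to end up in the initramfs so it applies at early boot (not verified on 8.2):
Code:
# current names and MAC addresses
ip -br link
# full udev attribute view for one NIC
udevadm info /sys/class/net/eno1
# after editing the rules file, so it is also used in early boot
update-initramfs -u -k all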
## Interface Naming Fix

Code:
cp /usr/lib/systemd/network/99-default.link /etc/systemd/network/99-default.link
sed -i 's/NamePolicy=keep kernel database onboard slot path/NamePolicy=path/' /etc/systemd/network/99-default.link
update-initramfs -u

This changes the udev naming to the "path"-based policy, which means: as long as you don't move your network cards to a different PCI location, they will always keep the same names.

Edit: This will likely change all your interface names once more, but then never again, as long as you keep this configured ;-)
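
If you want to check in advance which path-based names you'll end up with (so /etc/network/interfaces can be adjusted before the reboot), something like this should do it; eno1 is just an example device:
Code:
# prints ID_NET_NAME_PATH=..., which is the name the "path" policy will use
udevadm test-builtin net_id /sys/class/net/eno1 2>/dev/null | grep ID_NET_NAME_PATH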
 
Last edited:
Hi,

I had 2 crashes in 24h with kernel 6.8 on 2 different servers (same generation, same hardware):
no kernel panic, the servers just froze.
Edit: one difference, the crashing servers have Ceph OSDs running with encrypted OSDs. The other servers don't have any OSDs or storage for VMs.


Code:
server:

System Information
        Manufacturer: Lenovo
        Product Name: ThinkSystem SR645
        Version: 05

cpu:

model name    : AMD EPYC 7413 24-Core Processor
stepping    : 1
microcode    : 0xa0011d3



Kernel 6.8 has been working fine for 2 weeks on another, almost identical model, but with a different EPYC CPU:

Code:
System Information
        Manufacturer: Lenovo
        Product Name: ThinkSystem SR645
        Version: 06

model name    : AMD EPYC 7543 32-Core Processor
stepping    : 1
microcode    : 0xa0011d1

Edit:

Got one crash log:


[attached screenshot of the crash log]
 
Last edited:
Everything's working fine on:

  • AMD EPYC 7282 (Supermicro) ZFS
  • Intel(R) Core(TM) i3-4170 (Supermicro) ZFS
  • Intel(R) Xeon(R) CPU E3-1246 v3 (Supermicro) ZFS
  • Intel(R) Celeron(R) CPU J345 (Intel NUC) LVM-Thin
On all of these devices there are also no real errors in dmesg.
 
Last edited:
Had to pin the 6.5 kernel too, because the r8168 module is not patched for 6.8 (yet?).
Without the module, the NIC in my Dell Wyse 5070 no longer works.
 
Had to pin the 6.5 kernel too, because the r8168 module is not patched for 6.8 (yet?).
Without the module, the NIC in my Dell Wyse 5070 no longer works.
are you sure you can't just use the in-tree r8169 driver?
 
Did someone already test whether ConnectX-3 and SAS2008 are working with the 6.8 kernel? Those are running fine here with the 6.5 kernel, but I've seen several threads about them not fully working with PVE 8.0/8.1.
 
are you sure you can't just use the in-tree r8169 driver?
I was used to using dkms-r8168, because with r8169 my NIC was no longer accessible after some time (hours to days), ever since PVE 8.0. Usually the r8168 module was recompiled with every kernel update, but not with 6.8.

I didn't have r8169 on any blacklist, but I'm not sure whether dkms-r8168 prevents the loading of r8169 anyway...? In any case, with kernel 6.8 only the virtual network interfaces appeared under "ip a".
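
If it doesn't, I guess it could be blacklisted explicitly, something along these lines (untested on 6.8):
Code:
echo "blacklist r8169" > /etc/modprobe.d/blacklist-r8169.conf
update-initramfs -u
# after a reboot, check which driver is actually bound
lspci -k | grep -iA3 ethernet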

I'll wait and see what others report, as I always have to rewire the test server first: I need to connect a display and keyboard directly whenever I boot the new kernel.
 
As part of the automated install, is there any way to configure bonds and VLANs directly in the answer file, or is that not taken into account, so a local shell is still needed to finish the network config?
This is currently not part of any installer (the automated installer uses the same backend as the GUI and TUI ones) - so this is something that is better handled after the initial installation (with Ansible and similar tools).
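
For reference, a minimal example of the kind of bond + VLAN setup you could then apply in /etc/network/interfaces after the install (all interface names, the bond mode and the address are just placeholders):
Code:
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr0.100
iface vmbr0.100 inet static
        address 192.168.100.10/24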
 
I needed to do a clean install of Proxmox 8.2 because I broke something.

I'd like to test a DKMS module under 8.2 with the 6.5.13 kernel, but I don't see it in the repos with the Enterprise repo enabled, or the pvetest repo.

Is there a way I can get it back?

EDIT: I was searching for "pve-kernel," but it shows up when searching for "proxmox-kernel" as:
Code:
proxmox-kernel-6.5.13-5-pve/stable 6.5.13-5 amd64
  Proxmox Kernel Image

Can I just install that without breaking something?
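
If it is safe, I assume the steps would roughly be the following (untested; the version string is just copied from the apt output above), with the pin only needed if the 6.5 kernel should also be the default at boot:
Code:
apt install proxmox-kernel-6.5.13-5-pve
# optional: boot this kernel by default
proxmox-boot-tool kernel pin 6.5.13-5-pve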
 
Last edited:
Yes, but this just happened after the kernel change, and wasn't an issue going from 6.2 to 6.5 in the past. I would expect this to be enabled by default, or at least come with a warning or information in the release notes... Just tried to update a second host, and again only the X710 interfaces are affected.

Same issue here with a Dell PowerEdge R730xd X710 card: after the update and reboot, Ceph didn't come back up, and I realized the interface names had changed. I fixed it by renaming them, but it's very annoying that there is no information about this anywhere.
 
Unrelated to the network interface renaming: for some reason, my OPNsense VM completely stopped working after I applied this update. I have eno1-eno4 devices, and only one of them is passed to OPNsense, with a cable connected, as the WAN port. It seems this bridge is no longer functioning at all; OPNsense is not even able to receive or send DHCP packets.

It was actually not caused by the update but by the reboot. I had removed some interfaces a few days ago, forgot about them, and now with the reboot OPNsense's FreeBSD reordered the devices. Solved by editing the interface names in /conf/config.xml (with vi). Ironically, this is a reminder of why we need persistent device names. :)
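
In case someone else needs it, the assignments live roughly here in /conf/config.xml (the device names are whatever FreeBSD assigned, e.g. vtnet0/vtnet1 for VirtIO NICs; the exact structure may differ between versions):
Code:
<interfaces>
  <wan>
    <if>vtnet0</if>
    <!-- remaining WAN settings unchanged -->
  </wan>
  <lan>
    <if>vtnet1</if>
    <!-- remaining LAN settings unchanged -->
  </lan>
</interfaces>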
 
Last edited:
After applying this update, one of my Proxmox systems would not boot. It would just hang when trying to load the kernel.

I booted into an older 6.5 kernel, and ran:
Bash:
update-initramfs -u -k 6.8.4-2-pve
update-grub

Afterwards, the system booted fine. I'm not sure what happened, but it appears the initrd image was not generated correctly.
 
Just updated our 3-node Ceph HCI test cluster from 7.4 to 8.2 - wow, that was easy!
Great work, I am really impressed by how well it went!
And although the 7to8 upgrade and repositories wiki pages could use some more polishing, everything was clear and easy to follow.

On a side note: we've set up our nodes with .link files for truly persistent network device names right from the start, so we were not affected by renames due to the kernel update. Proxmox should really tackle this as a default; as can be seen with every bigger kernel release, people constantly trip over non-working networking. I took the config from here: https://forum.proxmox.com/threads/netzwerkadapter-namen-wechseln-nach-reboot.123924/#post-539697
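
For reference, a sketch of what such a file looks like, one per NIC (the MAC and the name below are placeholders; the match should be on the NIC's own hardware MAC, or alternatively on the driver or PermanentMACAddress if a bridge clones the MAC):
Code:
# /etc/systemd/network/10-lan0.link
[Match]
MACAddress=aa:bb:cc:dd:ee:01

[Link]
Name=lan0
# afterwards: update-initramfs -u -k all, so the file is also applied in early boot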


Now the first backup with fleecing enabled is running (and running fine by the looks) - great!


We'll receive our hardware for 3 bigger production clusters next month; I'm very eager to finally start for real!
 
Last edited:
Sorry for the silly question: if I map physical interface names with systemd .link files, wouldn't that affect the vmbr interfaces? All my PVE networking is OVS-based, so, for example, I have eno1+eno2 in a bond and a vmbr bridge on top of it; the bridge has the same MAC as eno1.

2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether XXXX:9b:a0 brd ff:ff:ff:ff:ff:ff
altname enp5s0f0
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether XXXX:9b:a1 brd ff:ff:ff:ff:ff:ff
altname enp5s0f1
...
9: vmbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XXXX:9b:a0 brd ff:ff:ff:ff:ff:ff
 
Took the plunge and the upgrade itself went well. The issues I ran into were all related to the kernel update, so for anyone searching for this:

  • NVIDIA GRID drivers do NOT work with kernel 6.8; NVIDIA will need to update them. All we can do is pin to 6.5 for now.
  • The Google Coral driver had a fix pushed TODAY, a couple of hours ago. You'll need to pull current master and rebuild the gasket driver.
  • The previous SR-IOV workaround for iGPUs on kernel 6.5 does NOT work on 6.8 and likely never will, see here.
  • SR-IOV may be supported out of the box with the WIP xe driver. You can switch to it as outlined here in the Arch wiki. The Mesa driver would be mesa-utils for Debian-based distros. I have a 14th-gen iGPU where SR-IOV support shows up on the xe driver without any changes from stock Proxmox, however the actual libraries appear to be missing. More investigation will be needed. I'll update this post as I learn more.
Code:
00:02.0 VGA compatible controller [0300]: Intel Corporation Raptor Lake-S GT1 [UHD Graphics 770] [8086:a780] (rev 04) (prog-if 00 [VGA controller])
        DeviceName: Onboard - Video
        Subsystem: Micro-Star International Co., Ltd. [MSI] Raptor Lake-S GT1 [UHD Graphics 770] [1462:7e07]
        Flags: bus master, fast devsel, latency 0, IRQ 210, IOMMU group 0
        Memory at 60fa000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 4000 [size=64]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: xe
        Kernel modules: i915, xe

Code:
 dmesg | grep xe
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.4-2-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt xe.force_probe=a780 i915.force_probe=!a780
[    0.000000] NX (Execute Disable) protection: active
[    0.000815] MTRR map: 5 entries (3 fixed + 2 variable; max 23), built from 10 variable MTRRs
[    0.118991] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.8.4-2-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt xe.force_probe=a780 i915.force_probe=!a780
[    0.287392] ... fixed-purpose events:   4
[    0.005407] ... fixed-purpose events:   3
[    0.415066] pci 0000:00:1f.4: BAR 4 [io  0xefa0-0xefbf]
[    2.534542] systemd[1]: Set up automount proc-sys-fs-binfmt_misc.automount - Arbitrary Executable File Formats File System Automount Point.
[    2.889587] RAPL PMU: API unit is 2^-32 Joules, 2 fixed counters, 655360 ms ovfl timer
[    3.328283] xe 0000:00:02.0: vgaarb: deactivate vga console
[    3.329022] xe 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=io+mem
[    3.330688] xe 0000:00:02.0: [drm] HDCP support not yet implemented
[    3.331067] xe 0000:00:02.0: [drm] Finished loading DMC firmware i915/adls_dmc_ver2_01.bin (v2.1)
[    3.416986] xe 0000:00:02.0: [drm] Using GuC firmware from i915/tgl_guc_70.bin version 70.20.0
[    3.418369] xe 0000:00:02.0: [drm] Using HuC firmware from i915/tgl_huc.bin version 7.9.3
[    3.420405] xe REG[0x2340-0x235f]: allow read access
[    3.420409] xe REG[0x7010-0x7017]: allow rw access
[    3.420410] xe REG[0x7018-0x701f]: allow rw access
[    3.420433] xe REG[0x223a8-0x223af]: allow read access
[    3.420450] xe REG[0x1c03a8-0x1c03af]: allow read access
[    3.420468] xe REG[0x1d03a8-0x1d03af]: allow read access
[    3.420486] xe REG[0x1c83a8-0x1c83af]: allow read access
[    3.426859] [drm] Initialized xe 1.1.0 20201103 for 0000:00:02.0 on minor 0
[    3.427931] xe 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    3.427987] xe 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    3.428218] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [xe])
[    5.612276] xe 0000:00:02.0: [drm] GT0: suspended

I'll update this post if I figure out the xe driver.

Edit 1: Proxmox doesn't include the xe firmware. Downloaded/placed it from here, but it didn't appear to even be loaded in my case.
Edit 2: Despite it reporting SR-IOV capability, I'm not seeing the multiple devices in lspci like before; the previous i915 GRUB flags, unsurprisingly, didn't change anything.
Edit 3: Seems like there's stuff missing:
Code:
error: can't connect to X server!
libva info: VA-API version 1.17.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/xe_drv_video.so
libva info: va_openDriver() returns -1
vaInitialize failed with error code -1 (unknown libva error),exit
root@pve:~# ls /usr/lib/x86_64-linux-gnu/dri/
crocus_dri.so  i915_dri.so       iris_dri.so        nouveau_dri.so  r600_dri.so      swrast_dri.so      vmwgfx_dri.so
d3d12_dri.so   iHD_drv_video.so  kms_swrast_dri.so  r300_dri.so     radeonsi_dri.so  virtio_gpu_dri.so  zink_dri.so
Perhaps a missing driver? I can't find anything via web search either, about an xe video driver or which package it would be in. I checked intel-media-va-driver in Debian sid and it doesn't appear to be present there.
 
Last edited:
