[SOLVED] After updating the kernel from 6.2.16-19 to 6.5.11-3, the network no longer works

JF62

Hi,
Syslog shows:
pve (udev-worker)[398]: could not find module by name='r8168'

Yes, it's the Realtek r8168 card, which worked fine with the fixes described here. But now, after updating the kernel to 6.5, the system doesn't work unless I boot into 6.2.16-19.

Any ideas how I can fix this?
 
Have you tried using the r8169 driver that is built into the Linux kernel? It is my understanding that it should support r8168 chips, and it is generally considered higher quality than the out-of-tree r8168 driver provided by the vendor. The latter would need to be installed as a DKMS module, which takes a few extra steps and occasionally breaks with kernel updates.
 
How can I do this if the automated process has left me without network access and only the console works?
 
This exact problem occurred on my PVE after this kernel update. Yes, I needed to install the r8168-dkms package beforehand to actually get this server to work with the built-in NIC.

So I would be very interested in a process for replacing the r8168-dkms package with the native kernel driver! Could someone explain this process step by step?

My workaround to let the system work again in the meantime: Boot with the old kernel
 
Can you manually load r8169 (e.g. by typing "sudo modprobe r8169" from the console)? Does it find your network card? I'd check the output of "dmesg" after doing this.

It is my understanding that the presence of the out-of-tree r8168 driver that you seem to have used previously prevents the in-kernel driver from getting activated. But I am not quite sure how this is configured. So, you might need to find an answer to that question first and then fix your configuration.
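To narrow down where that configuration lives, something like the following might help. It is only a sketch: the function name find_r8168_overrides is made up for illustration, and it assumes the override sits in a modprobe.d file as either an alias line pointing at r8168 or a blacklist entry for r8169. It only reads files and changes nothing.

```shell
#!/bin/sh
# Sketch: list active modprobe.d lines that could keep the in-kernel r8169
# driver from binding. The helper name is hypothetical, not part of any tool.
find_r8168_overrides() {
    dir="${1:-/etc/modprobe.d}"
    # Match non-comment alias lines that map a device to r8168,
    # and non-comment blacklist entries for r8169.
    grep -rnE '^[[:space:]]*(alias[[:space:]].*[[:space:]]r8168|blacklist[[:space:]]+r8169)[[:space:]]*$' "$dir"
}

# Example (read-only):
#   find_r8168_overrides
#   lspci -k | grep -A3 -i ethernet   # "Kernel driver in use" names the bound driver
```

If the function prints nothing, the override is probably configured somewhere else, and the lspci output is the better starting point.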
 
If you can still boot into the old kernel, you should be able to use DKMS to build the out-of-tree module ... assuming RealTek has updated their code to work with the newer kernel. If they haven't, you need to figure out how to do that yourself. That's the cost of relying on out-of-tree drivers, unfortunately. It can be quite a pain, and it's the reason why I usually discourage people from buying that type of hardware (cough, NVidia, cough).

As for your NIC, I'd recommend you try whether it works with the in-kernel driver. It is my understanding that it is supposed to be able to work with r8168 cards. But I don't have any first-hand experience with that. Alternatively, the pragmatic solution would be to spend $10 - $20 and buy a supported networking card. Right now is a great time to shop for hardware.
 
I tried to build the driver with:
$ dkms build -m r8168 -v 8.051.02 -k 6.5.11-3-pve
Sign command: /lib/modules/6.5.11-3-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Building module:
Cleaning build area...
make -j4 KERNELRELEASE=6.5.11-3-pve -C /lib/modules/6.5.11-3-pve/build M=/var/lib/dkms/r8168/8.051.02/build...............(bad exit status: 2)
Error! Bad return status for module build on kernel: 6.5.11-3-pve (x86_64)
Consult /var/lib/dkms/r8168/8.051.02/build/make.log for more information.


this is the log file:

DKMS make.log for r8168-8.051.02 for kernel 6.5.11-3-pve (x86_64)
Wed Nov 22 12:57:26 AM CET 2023
make: Entering directory '/usr/src/linux-headers-6.5.11-3-pve'
CC [M] /var/lib/dkms/r8168/8.051.02/build/r8168_n.o
CC [M] /var/lib/dkms/r8168/8.051.02/build/r8168_asf.o
CC [M] /var/lib/dkms/r8168/8.051.02/build/rtl_eeprom.o
CC [M] /var/lib/dkms/r8168/8.051.02/build/rtltool.o
/var/lib/dkms/r8168/8.051.02/build/r8168_n.c: In function ‘r8168_csum_workaround’:
/var/lib/dkms/r8168/8.051.02/build/r8168_n.c:29208:24: error: implicit declaration of function ‘skb_gso_segment’; did you mean ‘skb_gso_reset’? [-Werror=implicit-function-declaration]
29208 | segs = skb_gso_segment(skb, features);
| ^~~~~~~~~~~~~~~
| skb_gso_reset
/var/lib/dkms/r8168/8.051.02/build/r8168_n.c:29208:22: warning: assignment to ‘struct sk_buff *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
29208 | segs = skb_gso_segment(skb, features);
| ^
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:251: /var/lib/dkms/r8168/8.051.02/build/r8168_n.o] Error 1
make[1]: *** [/usr/src/linux-headers-6.5.11-3-pve/Makefile:2039: /var/lib/dkms/r8168/8.051.02/build] Error 2
make: *** [Makefile:234: __sub-make] Error 2
make: Leaving directory '/usr/src/linux-headers-6.5.11-3-pve'

any ideas?
 
You could check whether RealTek has released updated sources. Otherwise, you are at the mercy of friendly users posting their suggested patches. Keep searching. Usually, they show up within a couple of weeks. Just be prepared that you have to keep doing this whenever the kernel changes, unless you can convince the r8169 driver to work for your particular NIC.

Or you can read the Linux kernel patch history and figure out how the internal ABI has changed. If you can read C code, then the required changes are usually not all that difficult. It's just tedious to keep up with things.
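For a build error like the one above, a quicker first step than reading the patch history is to search the installed kernel headers for the symbol the compiler could not find. A rough sketch, assuming the headers live under /usr/src/linux-headers-6.5.11-3-pve as the make.log suggests; the helper name find_symbol_headers is invented for this example:

```shell
#!/bin/sh
# Sketch: list header files below a kernel include directory that mention
# a symbol as a whole word. The helper name is hypothetical.
find_symbol_headers() {
    symbol="$1"
    include_dir="$2"
    # -w keeps __skb_gso_segment from matching a search for skb_gso_segment
    grep -rlw --include='*.h' -- "$symbol" "$include_dir"
}

# Example (read-only):
#   find_symbol_headers skb_gso_segment /usr/src/linux-headers-6.5.11-3-pve/include
```

Any header it lists that the driver source does not include is a candidate for the missing #include.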

I used to play these games, but I am getting too old for it. Personally, I no longer buy any hardware that doesn't come with in-kernel drivers. Fortunately, as Linux has become more mainstream, it's really not that difficult nor expensive to swap out poorly supported hardware or -- even better -- avoid buying it in the first place.

Good luck. I remember how much it sucked having to chase driver updates from unresponsive hardware manufacturers.
 
You are right, but it's a Dell Wyse thin client running several LXCs as a HA server. It's only a three-year-old system, and the driver has worked just fine with all kernels until now. The problem is in Linux and not on Realtek's side. Anyway, I can't change to another NIC, so I'll stay on kernel 6.2 for now and keep searching.
 
The problem is that RealTek refuses to work with the kernel developers on this. They have written a driver that by all accounts has very low code quality. And they don't work with the kernel developers to bring it up to the required standards to have it officially supported by Linux. They also fail to actively track changes to the kernel, which inevitably happen with every single kernel release. That's how the Linux kernel improves over time.

For in-kernel drivers, this is no big deal. Any changes to the networking internals will be made in coordination with all the drivers that are built into the kernel. But it doesn't work as well for third-party drivers that only receive occasional updates.

In the case of your thin client, I'd recommend looking into a USB-based NIC. I suspect your device has USB 3.2 ports which should be good for at least 1 Gigabit Ethernet. You might even be able to push 2.5 GigE through it, but that's more iffy and you won't necessarily be able to saturate the link.

Also, as I keep saying, try the driver that is built into the kernel. It is rewritten from scratch and supported by the kernel developers. If it can talk to your hardware, then it avoids all of the headaches that you are running into here.
 
In general, USB network adapters don't require out-of-tree drivers, because the common chipsets are supported by drivers that ship with the kernel. So, I'd just buy a random one off of Amazon. You can easily buy one that is built into a USB hub, so that you get to keep connectivity for other devices. Just be aware that some hubs make awkward tradeoffs when sharing bandwidth between different connected devices. So, make sure you confirm that you get the bandwidth that you expect.

And since this is Amazon, if the first device doesn't work, return it and order a different one. Shouldn't be very hard to find something that works for you and shouldn't cost a lot of money either.

And yes, as the comment in the other thread says, you can always download the sources from RealTek's website and see if you can make them work. That's the part that is tedious, but certainly doable.
 
I made it work by booting the new kernel and installing the driver from the Realtek website. So everything is up and running, and DKMS is no longer needed.
 
The driver on the RealTek website either invokes DKMS or does something really similar to it using its own infrastructure. I am glad to know that you got things to work.
 
That is an orthogonal issue. It affects how devices are named, but it doesn't explain OP's difficulties with compiling an out-of-tree NIC driver. Of course, either problem would prevent the server from successfully bringing up the network.
 
To add a datapoint, I had issues with r8169 in the past and I had switched to r8168.
I haven't tested it for long enough to give you a definitive answer but it seems that r8169 fixed the issues I was seeing and it now works better with my hardware.

On my machine, /etc/modprobe.d/r8168-dkms.conf was preventing r8169 from loading at boot.
Code:
# settings for r8168-dkms

# map the specific PCI IDs instead of blacklisting the whole r8169 module
alias    pci:v00001186d00004300sv00001186sd00004B10bc*sc*i*    r8168
alias    pci:v000010ECd00008168sv*sd*bc*sc*i*            r8168

# if the aliases above do not work, uncomment the following line
# to blacklist the whole r8169 module
#blacklist r8169

Commenting out the aliases allows r8169 to load, and so far it's working fine for me, so that might be an option for you as well.
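If you'd rather not edit the file by hand, the same change can be sketched with sed. This assumes GNU sed (as shipped with Debian/Proxmox) and a file laid out like the one quoted above; the function name disable_r8168_aliases is made up for the example. It leaves a .bak copy next to the original, which modprobe should ignore because it does not end in .conf.

```shell
#!/bin/sh
# Sketch: comment out active "alias ... r8168" lines in a modprobe.d file.
# Hypothetical helper; assumes GNU sed for the -i and -E options.
disable_r8168_aliases() {
    conf="$1"
    # Keep a backup copy before editing in place
    cp -p -- "$conf" "$conf.bak"
    # Prefix every uncommented alias line that ends in r8168 with '#'
    sed -i -E 's/^([[:space:]]*alias[[:space:]].*[[:space:]]r8168[[:space:]]*)$/#\1/' "$conf"
}

# Example (run as root):
#   disable_r8168_aliases /etc/modprobe.d/r8168-dkms.conf
```

Lines that are already commented, including the optional blacklist entry, are left untouched.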
 
To be able to update the kernel to 6.5.11-4-pve WITH the r8168 kernel module working, you need to build the kernel module for the new version of the kernel. The module was previously built for the kernel version you were running.

To fix it you'll need to get the kernel headers for the new version:

Code:
apt-get install linux-headers-6.5.11-4-pve

Second, you'll need to fix the kernel module source code, as Realtek has not updated it to build correctly against kernel 6.5. Luckily, fixing it is simple. Check my notes at the end for a technical explanation of the issue.

Edit the following file with your favourite editor:

Code:
nano /usr/src/r8168-8.051.02/r8168_n.c

Then add the following lines near the beginning of it:
Code:
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,10)
#include <net/gso.h>
#endif

My suggestion is to add them between the two lines below:

Code:
#include <linux/netdevice.h>
#include <linux/etherdevice.h>

So it looks like this:

Code:
#include <linux/netdevice.h>
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,10)
#include <net/gso.h>
#endif
#include <linux/etherdevice.h>
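The same edit can also be scripted, which is handy if you have to repeat it after reinstalling the package. This is only a sketch: it assumes GNU sed and that the source file contains the #include <linux/netdevice.h> line exactly once; the function name patch_gso_include is made up for this example.

```shell
#!/bin/sh
# Sketch: insert the guarded net/gso.h include right after the
# linux/netdevice.h include. Hypothetical helper; assumes GNU sed.
patch_gso_include() {
    src="$1"
    # Do nothing if the include is already there, so reruns are harmless
    if grep -q '^#include <net/gso.h>' "$src"; then
        return 0
    fi
    # Append the three preprocessor lines after the netdevice.h include
    sed -i '/^#include <linux\/netdevice\.h>/a\
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,10)\
#include <net/gso.h>\
#endif' "$src"
}

# Example (run as root):
#   patch_gso_include /usr/src/r8168-8.051.02/r8168_n.c
```

It is worth looking at the top of the file afterwards to confirm it matches the block above.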

Then you can try building the kernel module:
Code:
dkms build -m r8168 -v 8.051.02 -k 6.5.11-4-pve

You can check if anything went wrong this way:
Code:
cat /var/lib/dkms/r8168/8.051.02/build/make.log
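The log can get long, so it may be easier to filter it down to the lines that matter. A small sketch, assuming the log looks like the make.log posted earlier in this thread (compiler lines containing "error:" and make lines of the form "*** ... Error N"); the function name show_build_errors is invented here:

```shell
#!/bin/sh
# Sketch: print only compiler errors and failing make targets from a
# DKMS make.log, with line numbers. Hypothetical helper.
show_build_errors() {
    log="$1"
    grep -nE 'error:|\*\*\*.*Error [0-9]+' "$log"
}

# Example (read-only):
#   show_build_errors /var/lib/dkms/r8168/8.051.02/build/make.log
```

No output means the pattern found nothing, which does not by itself prove the build succeeded, so check the exit status of dkms build as well.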

You MIGHT need to install grub-efi-amd64 if you have related errors while building the module. If so, do this:

Code:
apt install grub-efi-amd64

Then retry building the kernel module.

After that you should be able to upgrade the kernel to 6.5.11-4 and also upgrade Proxmox to 8.1:

Code:
apt upgrade


ADDENDUM - What's wrong with the kernel module source code?

The declaration of the skb_gso_segment C function was moved from the netdevice.h header file to net/gso.h in kernel 6.4.10 (if I'm not wrong) [1].

Hence the error when building the kernel module: the compiler couldn't find a declaration for the function.
The fix is just a matter of adding the net/gso.h header include.
However, that's only necessary from kernel 6.4.10 onward. Hence the
Code:
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,10)
conditional statement and its matching
Code:
#endif
statement around the include.

References:
[1] - https://lore.kernel.org/netfilter-devel/1367259744-8922-16-git-send-email-pablo@netfilter.org/
 
Keep in mind r8169 is working on Kernel 6.5.11-4. I just tested.

So you can uninstall r8168-dkms completely and move on.
Just remember to remove r8169 from the modprobe blacklist; otherwise it will not be loaded at boot time.
 
Just to make sure I'm doing the right things. I remove r8168-dkms with: sudo apt-get remove --auto-remove r8168-dkms
And to enable r8169 I remove r8168-dkms.conf from /etc/modprobe.d/

After a restart the r8169 driver should work again?
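For reference, those steps could be written down roughly like this. It is a sketch, not a tested procedure: the run and switch_to_r8169 helpers are made up, and the update-initramfs step is my own assumption (modprobe.d settings can be copied into the initramfs, so regenerating it seems prudent). By default it only prints the commands; set DRY_RUN=0 to actually execute them, ideally from the local console and with the old kernel still installed as a fallback.

```shell
#!/bin/sh
# Sketch: switch from r8168-dkms back to the in-kernel r8169 driver.
# Helper names are hypothetical. DRY_RUN=1 (the default) only prints commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

switch_to_r8169() {
    # Remove the out-of-tree driver package
    run apt-get remove --auto-remove r8168-dkms
    # Drop the config that maps the NIC to r8168 (a plain remove can leave it behind)
    run rm -f /etc/modprobe.d/r8168-dkms.conf
    # Assumption: rebuild the initramfs so no stale modprobe settings remain in it
    run update-initramfs -u -k all
}

# Example:
#   switch_to_r8169             # print the commands only
#   DRY_RUN=0 switch_to_r8169   # execute as root, then reboot
```

After the reboot, "lspci -k" should show which driver is bound to the NIC.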
 