Diagnosing hardware vs. software fault: RTL8125 NIC issues in PVE 7.4 and PVE 8.2

munkiemagik

Jun 14, 2024
One Realtek 4-port RTL8125 NIC (rev 05), used in an HP ProDesk 600 G4 (Intel i5-8500) with an M.2 PCIe MediaTek MT7916 6GHz AP module.
The only specific thing I can say about this NIC is that it appears to use a Pericom Semiconductor PI7C9X2G608GP PCIe bridge. I think I have seen some others that use an ASMedia PCIe bridge.


It shows two different behaviours under PVE 7.4 and PVE 8.2, but in both cases it ends up non-functional with no working networking. I'm not sure how to determine conclusively whether this is a driver/firmware bug I can't identify, a fault in the hardware itself, or a step I am missing when reconfiguring the network after the driver update.

My knowledge of Linux and of network hardware and configuration is very limited. I'm learning as I go along.

At this point there are no other VMs or containers. It is just a fresh Proxmox installation each time. The ProDesk 600 G4 has 2 PCIe slots plus 1x M-key and 1x A+E-key M.2 slot. I've tried the NIC in both PCIe slots and get the same result.



A) Under PVE 7.4 the RTL8125 works as expected at full speed with no issues, but when I enable intel_iommu, which I need in order to pass through the M.2 PCIe device, I get hit with:
r8169 (RTL8125B): rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100)
When I remove "intel_iommu=on" the system goes back to normal. But without IOMMU it's no good to me, as I need to pass the M.2 PCIe card through to a VM. And why would IOMMU break the RTL8125 NIC in Proxmox?
I dug around the subject and learnt that this is said to be a driver bug, and that the solution is to install pve-headers and an alternative driver. I have tried two different drivers: the DKMS one from github/awesometic and the latest driver downloaded from Realtek. Each time, on a fresh PVE install, I successfully install the driver and then blacklist the default r8169. After that my entire NIC dies: no data goes in or out of anything and I lose communication with Proxmox, even though /etc/network/interfaces looks to be spot on.



B) Under PVE 8.2, immediately after a fresh install, the NIC spams this error:
NETDEV WATCHDOG: (r8169): transmit queue 0 timed out
I go through a similar process of installing the appropriate pve-headers and drivers and blacklisting r8169, and again I end up with everything looking like it should be working but an absolutely dead network with no data moving at all.




How do I dig deeper into this to find out what is happening, and to determine whether there is something physically wrong with the NIC or whether I have just broken Proxmox? I read so many people saying they have working systems with this card. Is it possible I have a malfunctioning NIC?

It's highly unlikely that, after all these years of this being a budget-friendly 2.5GbE option, no one has used it in a system where intel_iommu is on, or with my particular combination of hardware. So is it much more likely that there is something wrong in the way I am installing the drivers, or that my silicon is bad? Or am I missing a step in reconfiguring the network after installing the newer drivers?

The vendor has offered to send me a replacement, but I don't know if this is likely to solve the problem.
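One low-risk first step for the "hardware or driver?" question is to check which kernel driver is actually bound to each port after the blacklist, since a port with no driver bound looks just as dead as a broken card. A minimal sketch, assuming the usual layout of lspci -nnk output; the function name nic_drivers is made up for illustration:

```shell
# nic_drivers: read `lspci -nnk` output on stdin and print, for every
# Ethernet controller, its PCI address and the bound kernel driver
# ("none" if no driver is bound). The name is illustrative only.
nic_drivers() {
    awk '
        # Device header lines start at column 0 with a hex PCI address.
        /^[0-9a-f]/ {
            if (eth) print dev, (drv == "" ? "none" : drv)
            eth = ($0 ~ /Ethernet controller/); dev = $1; drv = ""
            next
        }
        # Indented detail line naming the bound driver.
        eth && /Kernel driver in use:/ { drv = $NF }
        END { if (eth) print dev, (drv == "" ? "none" : drv) }
    '
}

# Typical use on the Proxmox host:
#   lspci -nnk | nic_drivers
```

If a port reports none, or still reports r8169 after the blacklist, the problem is more likely in module loading than in the card. Comparing the interface names from ip -br link against the bridge-ports entry in /etc/network/interfaces is another cheap check.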
 
> The vendor has offered to send me a replacement

If you have the option, have them send you one with the RTL8125B chip. Word on the street is that rev works better with Linux.

Or switch to an Intel-based chip. Have seen no issues with those.
 
Proxmox VE 8.2 with kernel version 6.8 enables intel_iommu=on by default, which was a surprise for many users. If this is problematic for your hardware (the cause could be the CPU, motherboard, BIOS or the device itself), add intel_iommu=off to the kernel parameters: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline
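For a GRUB-booted install, the edit described in that guide comes down to appending the parameter to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and running update-grub. A rough sketch under that assumption; the helper name add_kernel_param is hypothetical and not a Proxmox tool. Installs that boot via systemd-boot use /etc/kernel/cmdline and proxmox-boot-tool refresh instead, which this sketch does not cover.

```shell
# add_kernel_param FILE PARAM  (hypothetical helper, not a Proxmox tool)
# Appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT="..." line in FILE,
# unless that line already contains it.
add_kernel_param() {
    file=$1
    param=$2
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0    # already present, nothing to do
    fi
    # Re-emit the line with PARAM inserted before the closing quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Usage on the host (as root), followed by a reboot:
#   add_kernel_param /etc/default/grub intel_iommu=off && update-grub
# After the reboot, confirm the parameter is active with:
#   cat /proc/cmdline
```

The helper is idempotent, so running it twice leaves a single copy of the parameter. It does not remove an existing intel_iommu=on from the same line; that would need to be deleted by hand first.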
I did not know that, thanks. It would be interesting to confirm whether enabling IOMMU is what triggers the RTL8125 to glitch under both kernels. I don't think I'm capable of figuring out a solution for it, but it's still better to at least know what the problem is. Bit of a bugger though, as I really need PCIe passthrough for the M.2 slot. The workaround would be the more commonly recommended option of purchasing a separate Ubiquiti 6GHz AP, which removes the need for PCIe passthrough but adds another £200 (U7 Pro and PoE injector) to the total system build cost.

(PS: this is all just a learning and experimenting project, so no critical data or services. The plan is to run an OpenWrt VM as the router with the M.2 PCIe 6GHz AP module from MediaTek (MT7916). I do understand that everyone advises, with good reason, against virtualising everything into one box.)

How can I investigate more deeply what is happening internally? Is it just a case of scouring through dmesg? I've recently discovered that ethtool gives me access to a whole host of things I can tinker with. For example, Virgin Media currently have a botched firmware that kills the 2.5GbE port on their Hub 5, but at least that's an easy fix: ethtool --set-eee 'enxxxx' eee off for whichever physical port connects to the Virgin Media hub's port.
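Scouring dmesg by eye works, but counting the relevant messages makes it easier to compare a boot with intel_iommu against one without. A small sketch, assuming the two message texts quoted in this thread plus the "DMAR ... fault" lines that, as far as I know, the Intel IOMMU code logs when it blocks a DMA access; nic_log_summary is a made-up name:

```shell
# nic_log_summary: read kernel log text on stdin and count the messages
# relevant to this thread. The name and output format are illustrative.
nic_log_summary() {
    awk '
        /NETDEV WATCHDOG/ && /timed out/ { wd++ }    # TX queue stalls
        /rtl_rxtx_empty_cond/            { rx++ }    # r8169 FIFO wait loop
        /DMAR/ && /fault/                { dmar++ }  # IOMMU-blocked DMA
        END {
            printf "watchdog_timeouts=%d rxtx_empty=%d dmar_faults=%d\n",
                   wd, rx, dmar
        }
    '
}

# Typical use on the Proxmox host:
#   dmesg | nic_log_summary
#   journalctl -k -b -1 | nic_log_summary   # previous boot, if the journal is persistent
```

A non-zero dmar_faults count appearing together with the NIC errors would point towards the IOMMU interaction and away from a dead card, though it would not prove it.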
 
> > The vendor has offered to send me a replacement
>
> If you have the option, have them send you one with the RTL8125B chip. Word on the street is that rev works better with Linux.
>
> Or switch to an Intel-based chip. Have seen no issues with those.

Sorry, which 'rev' are you referring to? Do you mean that the rev 05 I have is the one that is supposed to work well with Linux? I should have been more accurate: I do have the RTL8125B, one of those generic 4-port versions. The only notable difference I see on mine is that somewhere online I saw them advertised with an ASMedia PCIe bridge, but mine has Pericom PCIe bridges.

I'll get back to the vendor and ask them to kindly send another one through then.

I specifically want 2.5GbE throughout the LAN. So when you say Intel, the only 'cheap enough to experiment with' Intel cards I could find with enough ports for what I want are the i225 cards. I've not seen 4-port i226 NICs.

I've stumbled upon frequent mentions online that certain revisions of the i225 are more prone to bugs than others in Proxmox, and I've come across a lot of 'help, i225 not working' posts as well. Any tips please?

Or should I really drop the 4-port requirement and just grab a 2-port NIC (i.e. X540/X550-T2) and a cheap 2.5GbE/10G SFP+ switch? Are there any potential drawbacks to those cheap AliExpress 4x 2.5GbE / 2x 10G SFP+ switches (alleged 60Gbps backplane throughput and 44Mpps forwarding rate)? That would mean additional cost for the NIC, the switch and all the SFP+ modules.

Disclaimer: I am building all this just to mess around with wireless Quest 3 PCVR (local) game streaming at 120fps and higher than default bitrate encode streams and concurrently mess around with some self-hosted cloud storage solutions ie nextcloud or equivalent.

The other solution/workaround is to write off the £90 I spent on the M.2 PCIe AsiaRF AW7916 (MediaTek MT7916) 6GHz AP module, so that I won't need intel_iommu for PCIe passthrough, and spend another £200 on a U7 Pro and PoE injector.

Honestly, at the moment, as I am just messing around and learning stuff that is not related to any job, profession or need, I'm a little hesitant to drop another £200 when I've already spent £200 on the current hardware (SFF PC, NIC, AP module and accessories), just to mess around for no good reason other than to occupy a bit of time and learn something I find interesting. If I do finally get this up and running smoothly, I will happily drop some more money on some Seagate Exos drives and drive caddies to set up a ZFS mirror to play with the self-hosted cloud storage solution. I just really need this M.2 PCIe AP and the 2.5GbE NIC playing nicely in Proxmox first. The only possible temptation to go the Ubiquiti U7 Pro route is that on bare-metal OpenWrt the MT7916 6GHz AP can be temperamental about achieving full bandwidth to the Quest 3 at 6GHz. The Quest drivers/firmware are also pretty spotty in this regard.

Apologies for the textual overload!
 
> Proxmox VE 8.2 with kernel version 6.8 enables intel_iommu=on by default, which was a surprise for many users. If this is problematic for your hardware (which could be caused by CPU, motherboard or BIOS or the device itself) add intel_iommu=off to the kernel parameters: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline

I can confirm that after adding intel_iommu=off on a fresh install of PVE 8.2, the RTL8125 no longer throws up
NETDEV WATCHDOG: (r8169): transmit queue 0 timed out

I still don't know how to fix the r8125 drivers not working, though. I have ordered two more NICs to test, both from IOCrest: one is an Intel i225 and the other is an RTL8125B again.
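One possible reason for a "dead" NIC after blacklisting r8169 is that the DKMS module was built for a different kernel than the one actually running, so no driver loads at all. A hedged sketch that checks dkms status output for the running kernel; the helper name dkms_built_for is hypothetical, and because the dkms status line format differs between DKMS versions the match is deliberately loose:

```shell
# dkms_built_for MODULE KERNEL  (hypothetical helper)
# Reads `dkms status` output on stdin and succeeds only if MODULE is
# reported as installed for KERNEL. Accepts both the "r8125, <ver>, ..."
# and the "r8125/<ver>, ..." line styles.
dkms_built_for() {
    grep -E "^$1[,/]" | grep -F -- "$2" | grep -q 'installed'
}

# Typical use on the Proxmox host:
#   dkms status | dkms_built_for r8125 "$(uname -r)" \
#       && echo "r8125 is built for the running kernel" \
#       || echo "r8125 is NOT built for $(uname -r)"
#   lsmod | grep -E '^r81(25|69)'    # which of the two modules is loaded now
```

If the module turns out to be missing for the running kernel, installing the headers package that matches uname -r and rebuilding the DKMS module would be the next thing to try, before concluding that the card is faulty.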


Update: nope, it is still giving NETDEV WATCHDOG: (r8169): transmit queue 0 timed out even with intel_iommu=off in PVE 8.2.
 