Interesting Realtek Speed Issue with PVE5

adamb

We are running into a pretty interesting/odd Realtek NIC speed issue on Proxmox hosts.

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0e)

The only thing we use these hosts for is small CentOS7 VMs which act as OpenVPN routers.

- When set to 1000Mb/s full we are seeing extremely slow speeds out to the internet (1-2MB/s). The client has a 1G pipe in both directions.
- Local LAN speeds are fast and can saturate 1G
- If we force 100Mb/s full with ethtool we have no issues maxing that out at 11MB/s
- So for whatever reason, 100Mb/s full is roughly 5-10x faster than 1000Mb/s full going out to the internet, but not on the LAN
- I know it's odd and it doesn't make much sense to me
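For anyone wanting to reproduce the comparison, the speed change can be sketched roughly like this. The interface name `eth0` is an assumption (the post does not name it), and the `DRY_RUN` wrapper is a hypothetical helper so the commands can be previewed before touching a live link.

```shell
#!/bin/sh
# Sketch: switch a NIC between forced 100 Mb/s full duplex and autonegotiation.
IFACE="eth0"              # hypothetical name; substitute the real interface
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

force_100_full() {
    # Disable autonegotiation and pin the link at 100 Mb/s full duplex.
    run ethtool -s "$IFACE" speed 100 duplex full autoneg off
}

restore_autoneg() {
    # Re-enable autonegotiation so the link can negotiate 1000 Mb/s again.
    run ethtool -s "$IFACE" autoneg on
}

# With DRY_RUN=1 this only prints the two commands.
force_100_full
restore_autoneg
```

As a sanity check on the numbers: 100 Mb/s is 12.5 MB/s before protocol overhead, so 11MB/s is essentially line rate for the forced link.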

At the same site, we also have other hardware with the same RTL8111 NIC, but it's running CentOS7 bare metal and doesn't have this issue at all. Both C7 and PVE5 appear to use the same driver as well.

PVE5
driver: r8169
version: 2.3LK-NAPI

C7
driver: r8169
version: 2.3LK-NAPI
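The driver and version lines above look like what `ethtool -i` prints. A minimal sketch for pulling just those two fields on each host, so they can be compared side by side, could look like this (the interface name in the comment is an assumption):

```shell
#!/bin/sh
# Sketch: print "driver version" from `ethtool -i` style output on stdin.
driver_info() {
    awk -F': ' '
        $1 == "driver"  { d = $2 }
        $1 == "version" { v = $2 }
        END { print d, v }
    '
}

# Sample input using the values quoted in this post.
printf 'driver: r8169\nversion: 2.3LK-NAPI\n' | driver_info

# On a live host (eth0 is a hypothetical interface name):
#   ethtool -i eth0 | driver_info
```

An identical version string does not necessarily mean identical driver code, since the in-kernel driver may keep the same string across kernel releases. Comparing `uname -r` on both hosts is probably the more telling check.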

This specific client has roughly 40 sites with this hardware, all having the same issues on PVE5. They have a range of network gear from Dell/Cisco/Netgear so there is no common denominator on that side.

It's a stretch, but does anyone have any ideas? I've tried the latest 4.15.18-21-pve kernel as well.
 
Hi, this might be related to the same issue described here:
https://forum.proxmox.com/threads/e1000-driver-hang.58284/

So far that thread has been discussing e1000 or e1000e driver issues, but it seems to narrow down to routed setups.
A bit of duckduck-ing (yeah, there's more than Google) turns up this article, which talks about the driver you are using:
https://wiki.hetzner.de/index.php/Hang_up_with_Realtek_r8169/r8168_NIC

If disabling TSO on your host for this NIC fixes the issue, you could also append your NIC data to that thread, as mentioned by @spirit:
https://forum.proxmox.com/threads/e1000-driver-hang.58284/post-285530
 
I disabled TCP offloading. It didn't make any difference. I also looked at the e1000 thread but I'm running VirtIO drivers on some of the VMs and still have the same issue. I didn't make any changes to the physical card on the Proxmox machine however.
 
But the threads talk about the NIC being set to VirtIO or E1000. Those would be VM guest settings, not the Proxmox host.

You're saying it's the physical NIC on the Proxmox host that needs to have TSO disabled?
 
Yep, on the physical one.
That's also the reason why users in the thread have been asked to give output for:
# lspci -nn
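For context, `lspci` lists the PCI devices in the machine, and `-nn` makes it print the numeric vendor:device IDs in square brackets next to the names. Those IDs identify the exact chip regardless of how the name is worded. A small sketch for extracting them, using the Ethernet line that gets posted further down in this thread as sample input (`pci_ids` is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch: extract the [vendor:device] ID from `lspci -nn` output lines.
pci_ids() {
    # Print the last four-hex-digit pair of the form vvvv:dddd found in brackets.
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p'
}

# Sample line as posted later in this thread.
printf '%s\n' '01:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]' | pci_ids

# On a live host:
#   lspci -nn | grep Ethernet | pci_ids
```

The command only reads information, so it is safe to run on a production host.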

I don't know Linux well so I don't even know what that request means. I'll have to go find the commands to turn off TSO.

I sort of hate to just start throwing commands at it that I see in the threads because I don't really understand what I'm doing. It means I can't really discern between good advice and bad advice and putting in the wrong command could cause more damage. Make sense?
 
Don't worry, I've been there too, but don't let the command line interface scare you.
Most commands are well documented in Linux, as are the commands for Proxmox.
The "man" command is always a good start, so if you want to know what lspci means, just type in your Proxmox terminal:
man lspci
And it's indeed wise to not just copy and paste stuff you find on the internet.
If you want to learn and understand what commands actually do, it's always nice to have a separate test system running (a Debian VM inside your Proxmox host, or a VM inside VirtualBox or virt-manager on another PC or laptop) and try things out there first before running them on your Proxmox host.

If you want to find out whether disabling TSO will help, I'm willing to guide you.
If so, first we need some information about your network config, so could you post the output of these commands?
They should be run on the Proxmox host and only print information.
Code:
cat /etc/network/interfaces
and
Code:
lspci -nn | grep Ethernet
 
I'm all about learning. My biggest problem is that I learn something and then don't use it for five years and forget all about it and have to start over. I'm a two-man shop so I'm into everything, but nothing for very long. Things were fine with my old Proxmox but with Microsoft putting an end to 2008R2 updates it was time to update the servers and that seems like a good time to update Proxmox too. New hardware, new Proxmox, new Windows. Of course I soon found Windows 2019 has changed so much since 2008R2 that I'm starting over there too. We run RemoteApps, which Microsoft always said was the future. But in 2019 they've made it impossible to run RemoteApps easily and require two servers to run one server. More money for them. More work for everyone else. I dislike Microsoft but not enough to leave it... yet.

I would happily accept your offer of assistance so I can learn. I'm on the live box. Right now all I have are my old live box and the new one.

I've looked at the Interfaces before but the warning "Do NOT modify this file unless you know what you're doing" was enough to cause me to leave it alone. I clearly don't know what I'm doing. <g>
Code:
root@pve:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto vmbr0
iface vmbr0 inet static
        address  192.168.xxx.162
        netmask  24
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
#LAN-inside port

auto vmbr1
iface vmbr1 inet static
        address  xxx.xxx.xxx.xxx
        netmask  24
        gateway  xxx.xxx.xxx.xxx
        bridge-ports eno2
        bridge-stp off
        bridge-fd 0
#WAN port outside edge

Code:
root@pve:~# lspci -nn | grep Ethernet
01:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
01:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]

So I don't have the Realtek NIC cards. As I'd said before, I thought we were talking about the virtual cards on the VM rather than the physical machine card, so I was already going in the wrong direction. I see a ton of articles related to slow performance on the BCM5720 card. Most, of course, are five years old or so. The problem seems to be that Broadcom has VMQ enabled by default, but there is a registry value that needs to be added. (Registry??? So this author is clearly hosting his VMs on Windows instead of Proxmox, so it may not apply?) I don't know how to do this with a Proxmox host. Below are the steps he used:

STEPS
  1. Since Broadcom has VMQ enabled by default, I disable it in the configuration properties of all my physical adapters assigned to my guests, in the advanced tab. Intel NIC owners need not do this step, as Intel has it disabled by default.
  2. On my Hyper-V host, I open Regedit and drill down to HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters
  3. I then add to Parameters a DWORD value and name it BelowTenGigVmqEnabled (since I have a 1Gb adapter. 10Gb owners need TenGigVmqEnabled) and give it a value of 1.
  4. Finally, I go back to the physical adapters and enable Virtual Machine Queues. Instantaneously, network performance issues are solved and my pings are all <1ms. This also actually sped up the OS in my VMs and they are no longer sluggish. Queries to AD now return in a snap. My world is now beautiful.
 
So I'm finding a ton of instructions on how to disable VMQ when it's on a Windows based machine. But since Proxmox isn't Windows based and it controls the physical NIC, how do I turn VMQ off?
 
It might also be some bad auto-negotiation on your switch. Try hard-setting the port speed / duplex on the switch to match the NIC.
 
The switch shows 1GB connection in Auto. Changing it to fixed at 1GB makes no difference, but I did try it.
 
Is there a way to access the NIC properties in either Proxmox or the command line? I don't see any way to do it in Linux posted online anywhere. Since this problem has been around for at least five years it's reasonable that someone here knows how to turn off VMQ for Proxmox.
 
Now that it's clear you have Broadcom NICs, we may be hijacking this thread, as it should be about the Realtek issue the OP describes.
But... I saw you already had another thread running here about your issue:
https://forum.proxmox.com/threads/really-slow-nic-speeds.62563/#post-285860
Perhaps it's best to continue there, although a lot of info is already posted here :(, @adamb might complain.

So I'm finding a ton of instructions on how to disable VMQ when it's on a Windows based machine. But since Proxmox isn't Windows based and it controls the physical NIC, how do I turn VMQ off?
VMQ seems to be a Windows and Hyper-V related feature (sigh).
https://docs.broadcom.com/docs-and-...hernet_nic/Broadcom_NetXtreme_Server_17.0.pdf
https://docs.microsoft.com/en-us/pr...ows-server-2008-R2-and-2008/gg162704(v=ws.10)
And here is another interesting article that, although Windows based, gives some insight into VMQ:
https://www.petri.com/hyper-v-network-issues-1-gbe-nics

So good luck disabling that in Linux; I didn't find any Linux-related articles on how to do that.
Some features of VMQ, however, include hardware offloading, so it may be best to keep trying to disable offloading with ethtool. No guarantees, however.

Ethtool can be installed from the Debian repositories, which you are already using in Proxmox.
Code:
apt install ethtool

Then you can disable TSO on eno1 and eno2; those are your physical NICs.
I will describe it for eno1; after that you can do the same for eno2.
First print out the current TSO status:
Code:
ethtool -k eno1 | grep offload
Then disable TSO:
Code:
ethtool -K eno1 tso off
And check the status again with the previous command to confirm it is off now.

Now follow the same steps for eno2.
Note that this setting only changes the NIC properties for the current boot. As soon as you reboot your Proxmox host, the settings are back as they were before (but we can make it permanent if this gives positive results).
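The steps above, applied to both NICs in one go, can be sketched as a small script. The `DRY_RUN` switch and the function names are hypothetical helpers, added so the commands can be previewed first. The interface names are the ones from the posted config.

```shell
#!/bin/sh
# Sketch: disable TSO on both physical NICs and show the resulting state.
NICS="eno1 eno2"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

apply_tso_off() {
    for nic in $NICS; do
        run ethtool -K "$nic" tso off
        if [ "$DRY_RUN" != "1" ]; then
            # In `ethtool -k` output the feature is listed as tcp-segmentation-offload.
            ethtool -k "$nic" | grep 'tcp-segmentation-offload'
        fi
    done
}

apply_tso_off
```

Run it once as-is to see what would be executed, then with `DRY_RUN=0` to apply. If the change helps, one commonly used way to reapply it at boot with ifupdown is a `post-up ethtool -K eno1 tso off` line in /etc/network/interfaces. Whether that is the right place for this particular setup is an assumption to verify.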
 
I have the same issue: speed maxes out at 11MB/s when copying files from Windows to the server. But I think it is a bridge issue, because my NIC is connected to a bridge and my VMs are in turn connected to that bridge. On the switch I see it has a 100Mbit connection, so I think the bridges are only 100Mbit. My switch is autosensing 10/100/1000. My network cards are Intel 350 Gigabit. I have 3 of those on the mainboard. I did a test with the second NIC, which is not connected to a bridge, and when copying files I get full speed. I can see on the switch that it has gigabit, so that makes sense.
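A quick way to confirm what the host side negotiated is to read the Speed line from `ethtool <iface>`. The sketch below parses that output. The sample text and the interface name `eno1` are illustrative assumptions, not taken from this system.

```shell
#!/bin/sh
# Sketch: print the negotiated link speed from `ethtool <iface>` output on stdin.
link_speed() {
    awk -F': ' '/^[[:space:]]*Speed:/ { print $2 }'
}

# Illustrative sample in the layout ethtool uses.
printf 'Settings for eno1:\n\tSpeed: 100Mb/s\n\tDuplex: Full\n' | link_speed

# On a live host (interface name is an assumption):
#   ethtool eno1 | link_speed
#   cat /sys/class/net/eno1/speed   # same value in Mb/s
```

Since 100 Mb/s works out to 12.5 MB/s before overhead, an 11MB/s ceiling fits a port that linked at 100Mbit. That would point at the physical link (cable, port, negotiation) on the bridged NIC rather than at the bridge itself, though that is a guess without seeing the actual output.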
 
Well, actually, at first I thought it had to do with ZFS because I get a general protection error in the logs. I posted about this today under:
Gernal protection error on new install 6.1

But maybe this is separate from the general protection error I'm having, given the test I did copying files to the server with and without the bridge. Then again it is a bit confusing, because when I look in the log for the general protection error I see a lot of modules linked in the error. The general protection error only occurs when I reboot the server, BTW.
 
