CPU KVM64 much slower networking than Host

debsque

Hi,

I'm setting up a new Proxmox machine and I've noticed something weird.

Proxmox 6.1-3 fresh install + openvswitch
Host: Dell R720 | 2 x Intel Xeon E5-2660 (updated Microcode) | 4 port I350 network card;
VMs: Windows and Debian, virtio everything;

Network speed tests in both VMs with CPU type kvm64 top out at ~500 Mbps. With CPU type host they run at full speed, ~800 Mbps (the Netgate SG-2440 is the bottleneck).

I didn't notice this in my previous setups, though I haven't tested network speeds in a while.
 
The default kvm64 CPU is very basic in the instruction sets it supports.

Once you change that to something newer or even host you will have the latest SSE, AES-NI and so forth instruction sets which can speed up certain things significantly.
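A quick way to see what the guest actually gets is to compare the CPU flags inside the VM under both CPU types. A minimal sketch (the flag list here is just a sample of performance-relevant ones):

```shell
# Inside the guest: check which performance-relevant instruction-set flags
# the vCPU exposes (aes = AES-NI). Under the default kvm64 model these are
# typically missing; under "host" they mirror the physical CPU.
FLAGS=$(grep -m1 '^flags' /proc/cpuinfo || true)
for f in aes sse4_2 avx; do
    case " $FLAGS " in
        *" $f "*) echo "$f: present" ;;
        *)        echo "$f: missing" ;;
    esac
done
```

On the Proxmox side the CPU type can be changed per VM with something like `qm set <vmid> --cpu host` (followed by a stop/start so the new CPU model takes effect).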
 
Well, AES and Spectre mitigation flags were already set.

Anyway, I tried different options and something doesn't seem right. I've virtualized pfSense aiming for gigabit throughput, but I have to allocate more than 6 cores. Also, using a Linux bridge instead of OVS seemed to improve throughput.

I have another host with an E3-1230 @ 3.5 GHz that performs better but has fewer cores. Could it be a CPU problem?
 
E5-2660 = 2.2 GHz
E3-1230 = 3.2 GHz
That is 1 GHz more clock rate, roughly 30%.

You are hitting the situation where something runs only single-threaded, and that is an issue when clock speeds are low.
Many cores can help if you use technologies like multi-queueing, but in the end, more clock frequency speeds things up (until you burden it with too many threads).
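For reference, virtio multiqueue can be enabled per NIC in Proxmox. A sketch, where the VM ID, bridge name and queue count are just examples (the queue count should not exceed the number of vCPUs, and the change needs a VM stop/start):

```shell
# Hypothetical VM 100: give net0 four virtio queues so packet processing
# can spread across four vCPUs instead of one.
qm set 100 --net0 virtio,bridge=vmbr0,queues=4

# Inside a Linux guest, activate the additional channels (eth0 is an example):
ethtool -L eth0 combined 4
```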
 

I understand, but
1. The SG-2440 has a 2-core Intel Atom @ 1.8 GHz, and it can do higher speeds than ~6 cores in a VM on the E5-2660
2. Linux bridge seemed to yield higher performance

That multi-queueing tip is good. I forgot I had that enabled on the E3.

I'll run some more tests.
 
Totally overlooked the SG-2440 :rolleyes:

Likely this is because virtual NICs (even para-virtualized ones) are not 100% comparable to physical NICs.
My guess is that the Netgate appliance uses decent NIC chips that provide a lot of offloading, while in a virtual machine more of the work goes through the CPU (multiple times, since the hypervisor has to deal with the traffic as well).
 
Spectre mitigation flags were already set
Well, that doesn't really help to speed things up ;)

AFAIR, hardware offloading should be disabled in a virtualized pfSense if the NIC is not passed through as a PCI device. Look it up; it has been discussed in the forum as well.
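For reference, those settings live under System > Advanced > Networking in the pfSense GUI. The FreeBSD-level equivalent, sketched for a hypothetical virtio interface named vtnet0:

```shell
# pfSense/FreeBSD: turn off hardware offloads on a virtio NIC (vtnet0 is
# an example name). These correspond to the "Disable hardware checksum
# offload", "...TCP segmentation offload" and "...large receive offload"
# checkboxes in the GUI.
ifconfig vtnet0 -rxcsum -txcsum -tso -lro
```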
 
@tburger I did a passthrough of one of the R720's Intel NICs to the pfSense VM and the results were about the same. Passing it through directly to a Debian/Win 10 VM yielded higher speeds.

@aaron I've disabled offloading in the pfSense VM. Shouldn't Spectre flags help VMs? I've tried with and without the flags and didn't see a performance impact.

In my old setup on the E3-1230, I'm pretty sure it did about 950 Mbps in VMs when I virtualized pfSense about a year ago. That was one of the reasons I virtualized it - the SG-2440 was a bottleneck at ~700 Mbps. Now I can't seem to get more than 800 Mbps from it either. But there are other variables in that setup, and maybe my memory isn't that good.

I've tried something else now on the E5, set up like this: SG-2440 -> pfSense VM -> Debian VM. The VMs are new, on virtio, with 6 cores each (anything between 2 and 6 works about the same), CPU type host, 6 GB RAM each. The pfSense VM has LRO, TSO and PowerD disabled.

I consistently get ~100 Mbps less through the pfSense VM than directly on the SG. I expected some overhead, but this looks like a lot.
 
Shouldn't Spectre flags help VMs? I've tried with and without the flags and didn't see a performance impact.
Okay, AFAIK these mitigations mean that some feature or instruction of the CPU is no longer used, because its speed was usually bought with security flaws. Mitigating them therefore usually means less performance.

I looked up the specs for both CPUs, the E3-1230 and the E5-2660. The base clock of the older CPU (the E3) is 1 GHz higher than the newer one's. When it comes to how many packets it can process per second, that alone might make the difference.
Additionally, the older CPU might not even be affected by Spectre, Meltdown and the like, while the newer one probably is. With the latest microcode that mitigates them, you are losing performance on that CPU.

If you search a bit you will find reports that these mitigations can cost up to 30% in performance, depending on the workload.
 
Okay, AFAIK these mitigations mean that some feature or instruction of the CPU is no longer used, because its speed was usually bought with security flaws. Mitigating them therefore usually means less performance.
That's true. Intel specified "up to 25%" if I recall correctly. We measured up to 40% impact in certain scenarios.

Additionally, the older CPU might not even be affected by Spectre, Meltdown and the like, while the newer one probably is
I doubt that, because the affected CPUs go way back. The difference might be that Intel has not released patches for older CPUs, and neither have many of the mainboard vendors. So those patches might just not be in place.
 
I doubt that, because the affected CPUs go way back. The difference might be that Intel has not released patches for older CPUs, and neither have many of the mainboard vendors. So those patches might just not be in place.
Very well possible; honestly, I lost track after Spectre and Meltdown. I just remember that at least one of the problems discovered in the years since did not affect the older Intel CPU in my personal desktop.

The core takeaway, though, is that with these security problems discovered in Intel CPUs and the resulting mitigations, it is very much possible that a newer CPU is in fact slower than an older model that either is not affected or simply no longer gets the mitigations.
 
I ran all sorts of tests and I can't get to the bottom of this. No matter what I do, I can't get more than 600 Mbps when pfSense is virtualized. Traffic on the bridge runs at about 400 MB/s on the same VLAN and at ~120 MB/s across VLANs (when it goes through the pfSense VM).

  • Incidentally, the network card is the same as in the other host that works at 800 Mbps, an Intel I350;
  • The ISPs are the same. Host and VM settings are about the same;
  • NIC passthrough or running on a bridge: same result.
What's interesting, though, is that the pfSense VM doesn't go much past 50% CPU usage during WAN tests. I don't think it's Proxmox or pfSense; I think it's something about the NICs, the R720 and how it works with Proxmox.

The NICs are not exactly the same after all. The R720's subsystem is listed as Dell Gigabit 4P I350-t rNDC, whereas the other is Intel Corporation Ethernet Server Adapter I350-T4. I found a reddit post mentioning SR-IOV, which isn't enabled on the slower host. Maybe that's the issue?
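For anyone wanting to compare NIC variants and check SR-IOV support from the Proxmox shell, something along these lines works (the PCI address is an example; substitute your own from the first command):

```shell
# List Ethernet devices with vendor/device IDs and subsystem names,
# which is where the "Dell Gigabit 4P I350-t rNDC" vs
# "Intel Ethernet Server Adapter I350-T4" difference shows up.
lspci -nn | grep -i ethernet

# Inspect one port (01:00.0 is an example address) for the SR-IOV
# capability and its virtual-function counts.
lspci -vvs 01:00.0 | grep -A2 'SR-IOV'
```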
 
I've improved the speeds on the R720, but my issue isn't fully solved. I'd probably have to spend a lot more time on this to get the full 1 Gbps, so I'm putting it to rest for now.

First off, the network card's firmware wasn't fully up to date. It was the only firmware update that didn't apply properly, and I had to apply it several times to get it current: patch 16.5.20 first, then the latest 18.5.18. Maybe it had something to do with iDRAC and 64-bit DUPs. Then I enabled SR-IOV, but that didn't help.

Two things improved performance significantly: (1) enabling NUMA on the pfSense VM and (2) setting the Proxmox bridge MTU to the ISP's (1492 for PPPoE).
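The 1492 figure comes from the 8-byte PPPoE/PPP header eating into the standard 1500-byte Ethernet MTU. A quick sanity check of the derived values (the 40-byte figure assumes plain IPv4 + TCP headers):

```shell
# Standard Ethernet payload MTU and PPPoE overhead (6 bytes PPPoE + 2 bytes PPP).
ETH_MTU=1500
PPPOE_OVERHEAD=8
PPP_MTU=$((ETH_MTU - PPPOE_OVERHEAD))   # MTU to set on the bridge/WAN

# Matching TCP MSS = MTU minus IPv4 (20 bytes) and TCP (20 bytes) headers.
MSS=$((PPP_MTU - 40))

echo "MTU=$PPP_MTU MSS=$MSS"   # prints: MTU=1492 MSS=1452
```

Setting the bridge MTU to match avoids fragmentation of full-size packets on the PPPoE link.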

Now I'm hitting ~800 Mbps, same as the other host.

Some other details.

I know the R720 can do more, because I asked the ISP to set their equipment to routed mode and passed a NIC through to a Debian VM, which resulted in ~915 Mbps. But then I'd be NATed, and I don't want that. Since PPPoE in pfSense runs single-threaded (AFAIK), I believe my performance won't get better due to CPU limitations.
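On the single-threaded PPPoE point: by default FreeBSD dispatches PPPoE traffic through a single netisr path, and a commonly cited workaround is the following loader tunable (set it under System > Advanced > System Tunables in pfSense, or in /boot/loader.conf.local; a reboot is required, and whether it helps varies by version, so treat it as something to test rather than a guaranteed fix):

```shell
# /boot/loader.conf.local on pfSense (sketch): let netisr defer packet
# processing to worker threads instead of direct dispatch on one CPU.
net.isr.dispatch=deferred
```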
 
