VM CPU Performance

joabe

Hi, I've noticed that the CPU performance of the VMs is well below what it should be.

I have an older node whose CPU is slower than this newer one, and a VM on it scores what it should in Geekbench. A VPS on the newer node performs worse than one on the previous-generation node.
This CPU should score at least 1600 on average, but in the VM it only reaches 1031.

The previous node's CPU usage is 10%, while the new one averages 50% but peaks at 70%.

The VM's CPU configuration is the default; only the number of cores was changed. (Both VMs that were compared have the same CPU configuration in Proxmox.)

Does anyone have any suggestions?

Previously, the expected performance was delivered. Could the drop be due to the increased CPU usage? Still, 50% CPU usage on the node is not much, right?
 
I have retried this now because the node processor is only at 20% usage.

The geekbench result now was 1301 singlecore, still not the 1600 I was looking for. However, it has increased from the test when the node CPU was at 50% (1031).

This node has 6 more VMs than the other node, but I don't understand why it doesn't deliver maximum singlecore performance if the node CPU is not overloaded...
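
To put these numbers in proportion, here is a minimal sketch that computes how far a measured score sits below the expected one. The function name `shortfall_pct` is made up for illustration, and it only does arithmetic on the scores quoted above.

```shell
# shortfall_pct EXPECTED ACTUAL
# Prints how far ACTUAL is below EXPECTED, in percent (one decimal place).
# Hypothetical helper; it only does arithmetic on the scores quoted above.
shortfall_pct() {
    awk -v e="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (e - a) * 100 / e }'
}

# Example calls with the single-core scores from this thread:
#   shortfall_pct 1600 1031   # node at ~50% CPU usage
#   shortfall_pct 1600 1301   # node at ~20% CPU usage
```

With the scores above, the VM is roughly 36% below the target on the busier node and roughly 19% below it on the quieter one.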
 
what is your current cpu scaling governor?

cat /sys/devices/system/cpu/cpufreq/policy0/scaling_governor

maybe some power saving features of the cpu are active?

also, what does the vm config look like?
did you set your cpu type to 'host'?
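
The command above only looks at policy0. As a rough sketch, the same check can be run over every cpufreq policy at once. The function name `count_governors` and its optional directory argument are assumptions made for illustration; the sysfs path is the same one used above.

```shell
# count_governors [CPUFREQ_DIR]
# Prints how many cpufreq policies use each scaling governor.
# CPUFREQ_DIR defaults to the standard sysfs location; the parameter exists
# only so the function can be pointed at a copy of the tree.
count_governors() {
    local dir="${1:-/sys/devices/system/cpu/cpufreq}"
    cat "$dir"/policy*/scaling_governor 2>/dev/null | sort | uniq -c
}

# Example call on the host:
#   count_governors
```

If more than one governor shows up in the output, some cores are not running with the same frequency policy as policy0.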
 
Output:

Code:
root@r7-5-1-us:~# cat /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
performance

What power saving features, for example?

Config:


Code:
root@r7-5-1-us:~# qm config 110
agent: 1
boot: order=scsi0;ide2
cipassword: **********
ciuser: root
cores: 2
cpu: host
description: ...
ide0: local:110/vm-110-cloudinit.qcow2,media=cdrom,size=4M
ide2: none,media=cdrom
ipconfig0: ...
kvm: 1
memory: 4096
name: 2200
nameserver: 8.8.8.8
net0: virtio=...,bridge=vmbr0
numa: 0
ostype: l26
scsi0: local:100/base-100-disk-0.qcow2/110/vm-110-disk-0.qcow2,cache=none,discard=on,format=qcow2,size=40G
scsihw: virtio-scsi-pci
smbios1: uuid=2a9dd846-3202-44b9-b148-e2c9d04e57be
sockets: 1
vcpus: 2
vga: std
vmgenid: 62adb63c-7b9e-4524-98c3-10191eadc41e

Yes, the cpu type is set to 'host'.
 
What power saving features, for example?
for example C-states, Cool'n'Quiet, etc.
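
One way to see which idle states (C-states) the host kernel is using is to read the cpuidle entries in sysfs. This is only a sketch: `list_cstates` is a made-up name, and it assumes the kernel exposes the usual `cpuidle/state*/` files (`name`, `latency`, `disable`) for the CPU.

```shell
# list_cstates [CPU_DIR]
# Prints each idle state of one CPU as "name exit-latency-us disabled-flag".
# CPU_DIR defaults to cpu0 in sysfs; nothing is printed if cpuidle is absent.
list_cstates() {
    local cpu_dir="${1:-/sys/devices/system/cpu/cpu0}"
    local state
    for state in "$cpu_dir"/cpuidle/state*; do
        [ -r "$state/name" ] || continue
        printf '%s %s %s\n' "$(cat "$state/name")" \
            "$(cat "$state/latency")" "$(cat "$state/disable")"
    done
}

# Example call on the host:
#   list_cstates
```

Deep states with a high exit latency are the ones most likely to matter for short bursts of single-core work; whether they matter here is something only a test can show.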

i just wanted to rule out the obvious, but:

I have retried this now because the node processor is only at 20% usage.

The geekbench result now was 1301 singlecore, still not the 1600 I was looking for. However, it has increased from the test when the node CPU was at 50% (1031).

This node has 6 more VMs than the other node, but I don't understand why it doesn't deliver maximum singlecore performance if the node CPU is not overloaded...
this indicates that the vm just does not get scheduled in time to get the 'full' single core performance

if you have configured more than or nearly all cores (across all vms), then it can happen that the host schedules the virtual cpu cores 'badly'. how many cores/threads do you have and how many did you configure?
 
Well, the node has 8 cores and 16 threads, single processor.

Adding up all the VMs, about 60 cores are configured. So that would be the reason for not getting maximum performance? Even if the node's CPU averages 50%?
Is there any way around this?
Also, in this case, is it better to set only "Cores" and leave the vCPU setting unset in all VMs to get better single-core performance?
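
The overcommit described here (about 60 configured cores on 16 host threads) can be estimated with a short sketch. The helper names are made up. It assumes the VM configs live in /etc/pve/qemu-server/<vmid>.conf on the node, that `cores` and `sockets` both default to 1 when absent, and it counts stopped VMs too.

```shell
# total_vcpus [CONF_DIR]
# Sums sockets * cores over all VM config files in CONF_DIR.
# Assumption: Proxmox VE layout /etc/pve/qemu-server/<vmid>.conf, with both
# values defaulting to 1 when the key is absent. Reading stops at the first
# section header ("[...]") so snapshot sections are not counted.
total_vcpus() {
    local dir="${1:-/etc/pve/qemu-server}"
    local f n total=0
    for f in "$dir"/*.conf; do
        [ -r "$f" ] || continue
        n=$(awk -F': *' '
            /^\[/           { exit }
            $1 == "cores"   { c = $2 }
            $1 == "sockets" { s = $2 }
            END { n = (c ? c : 1) * (s ? s : 1); print n }' "$f")
        total=$((total + n))
    done
    echo "$total"
}

# overcommit_ratio VCPUS THREADS
# Prints configured vCPUs per host thread, two decimal places.
overcommit_ratio() {
    awk -v v="$1" -v t="$2" 'BEGIN { printf "%.2f\n", v / t }'
}

# Example calls on the host:
#   overcommit_ratio "$(total_vcpus)" "$(nproc)"
#   overcommit_ratio 60 16    # the figures from this post
```

For the figures in this post the ratio is 3.75 configured cores per host thread, so a busy neighbour can delay a VM's vCPU even when the average node usage looks moderate.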

Geekbench:

https://browser.geekbench.com/v5/cpu/9011499
https://browser.geekbench.com/v5/cpu/9011637
 
I ran the test on a VM and on the node to compare single-core performance.
Before starting the test, the node had 20% average total CPU usage.

The node's single-core result did not look so good compared to the average of other Ryzen 7 5800X systems: about 10% lower. (Lower performance is to be expected because 18 VMs were running at the time of the test, but I expected it only in multi-core and not in single-core. Maybe I was wrong.)
But the VM still gets about 20% less single-core performance than the node.

Well, I don't know what to do anymore. On the Ryzen 7 3800X node, I gave some VMs more cores for testing, and the Geekbench result is still the best possible. I also reduced the number of cores in some VMs to try to improve performance, but it had no effect. Even the multi-core performance of the Ryzen 7 5800X VMs is inferior to the 3800X. The only difference is that the average CPU usage of the node with the 3800X is 10%.

Benchmark nodes 5800X and 3800X:

https://browser.geekbench.com/v4/cpu/16280058
https://browser.geekbench.com/v4/cpu/16280078

Benchmark VMs 5800X and 3800X:

https://browser.geekbench.com/v4/cpu/16280054
https://browser.geekbench.com/v4/cpu/16280099
 
a few things:

please do not run benchmarks on loaded hosts... the results will be useless and unpredictable.
how much of the processor's resources a vm consumes depends heavily on its workload.
for example, a processor can only execute a limited number of avx instructions at a time. when another vm uses exactly those resources, overall utilization can still be low, yet an avx benchmark will tank

do benchmarks like for like:
you have different guest os/kernel/config
the kernel difference between ubuntu 16.04 and 18.04 alone will have an effect (think spectre/meltdown mitigations etc.)
aside from that, you use i440fx for one and q35 for the other. qemu needs to emulate different hardware for each, which may use more or fewer resources

check the host performance:
as a baseline, i would recommend running your benchmark on the (not loaded) host.
this will probably be a bit higher than the fastest result from a vm

EDIT:

also two things i just realized:

check your bios for updates for newer processors, these can have a massive effect on power tables etc.
also check your cooling/temperature. if it is not adequate for e.g. your 5800x, it will thermal throttle and reduce performance
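
A quick way to watch temperatures while the benchmark runs is to read the hwmon sensors from sysfs. This is a sketch only: `max_temp_c` is a made-up name, it takes the highest value across all sensors rather than picking the CPU sensor specifically, and which sensors are present depends on the drivers loaded on the host.

```shell
# max_temp_c [HWMON_ROOT]
# Prints the highest temperature (deg C, one decimal) found in any
# temp*_input file under HWMON_ROOT. The kernel reports millidegrees.
# Prints nothing if no sensor file is readable.
max_temp_c() {
    local root="${1:-/sys/class/hwmon}"
    cat "$root"/hwmon*/temp*_input 2>/dev/null |
        awk '$1 > max { max = $1 } END { if (NR) printf "%.1f\n", max / 1000 }'
}

# Example: poll once per second in a second terminal during the benchmark
#   while sleep 1; do max_temp_c; done
```

If the value climbs to the throttling range within seconds of starting the benchmark, cooling is the first thing to rule out.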
 
also check your cooling/temperature. if it is not adequate for e.g. your 5800x, it will thermal throttle and reduce performance
The 5800X's thermals are really bad. I have one here with a big tower heatsink (Mugen 5) with two 120mm fans; it runs at 40-50 deg C idle, climbs up to 90 deg C, and starts thermal throttling within 1 second if I start some heavy load.
I first thought I didn't have a good thermal connection between the heatsink and the CPU, but I remounted it several times and always got the same result. If you search the web, there are people everywhere complaining about temperatures that are way too high.

The CPU has a 105W TDP and boosts (without time limit) to 138W out of the box. That is the same power as the 5950X, but on one chiplet instead of two. So the CPU is twice as hard to cool, because you have one 138W chiplet instead of two 69W chiplets.
My heatsink stays quite cool while the chiplet reaches 90 deg C, because the CPU's heatspreader just isn't able to spread the heat fast enough.
 
Well, I ran a benchmark with the node idle and it gave the expected result. This is the test:
https://browser.geekbench.com/v4/cpu/16289575

However, in VMs I can't get even close to 7000, which would be the ideal single-core score, even with the node otherwise unused. Look:
https://browser.geekbench.com/v4/cpu/16289602

I tried a fresh install from an Ubuntu 16.04 ISO, but still no luck:
https://browser.geekbench.com/v4/cpu/16289689

Is there any way to get more single-core performance out of VMs? Both were tested with the "host" CPU type and no extra flags.
 
one thing to try next would be e.g. using the 5.11 kernel instead of the 5.4 kernel, it may contain some improvements for those cpus (5.4 was first released in 2019, way before there was the 5000 series)
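
To check which kernel the host is actually running before and after such a change, a small sketch using version sort can help. `kernel_at_least` is a made-up helper, and the version strings in the example comments are illustrative, not taken from this node.

```shell
# kernel_at_least WANTED [RUNNING]
# Succeeds (exit 0) if RUNNING (default: output of `uname -r`) is at least
# WANTED, using GNU sort's version ordering.
kernel_at_least() {
    local wanted="$1" running="${2:-$(uname -r)}"
    [ "$(printf '%s\n%s\n' "$wanted" "$running" | sort -V | head -n1)" = "$wanted" ]
}

# Example calls (version strings below are illustrative):
#   kernel_at_least 5.11 && echo "running 5.11 or newer"
#   kernel_at_least 5.11 5.4.106-1-pve || echo "still on an older kernel"
```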
 
well the v5 benchmark seems now to be much better? (~1450 vs ~1050) ?
also some losses are to be expected, and things like power saving features can not be ruled out. also the security features in the kernel are probably on by default in the guest and the host, so they have probably a double impact

check out the 'meltdown/spectre' part of the docs and turn on the relevant flags and try that
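
To see which mitigations are active, the kernel lists them under sysfs. Running a sketch like the following on both the host and the guest and comparing the output shows whether both layers apply them. `show_mitigations` is a made-up name; the directory is the standard kernel location.

```shell
# show_mitigations [VULN_DIR]
# Prints "name: status" for every vulnerability entry the kernel exposes.
show_mitigations() {
    local dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    local f
    for f in "$dir"/*; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Example: run on the host and inside the VM, then compare the two outputs
#   show_mitigations
```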
 
Everything was at the defaults; I changed it, but it made no difference. What puzzles me most about all of this is that it used to work normally, without any problems. Over time, without any changes on my side, it ended up like this: even if I have only one VM running and do nothing but benchmark it, it delivers about 20% less than it should.
In Geekbench 4, every Ryzen 7 5800X scores more than 7000, but a VM here doesn't get past 5700 :/
 
Did you check the thermals while Geekbench was running? Like I already said, I have a big bulky tower heatsink actively cooled by 2x 120mm fans spinning at 100%, and I still can't get the full performance out of the 5800X: it can't reach the power limit of 138W because the CPU is always at 90 deg C and so it is thermal throttling.
 
Yes, everything is fine with the cooling part and the node can get maximum performance.
 
Have you set the CPU type to host and used virtio devices? Proxmox has some horrible defaults: LSI SCSI, kvm CPU, e1000 NIC. All of these defaults will chew up CPU cycles. Also, the virtio RNG device is slower than RDRAND if the CPU has RDRAND support.
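
As a rough sketch, the output of `qm config <vmid>` can be scanned for the settings mentioned here. `audit_vm_config` is a made-up helper, and its three checks (CPU type, SCSI controller, NIC model) are illustrative, not an exhaustive or official list.

```shell
# audit_vm_config
# Reads `qm config <vmid>` output on stdin and prints one line per setting
# that matches the patterns named above. Prints nothing if all checks pass.
# Hypothetical helper; it only inspects lines that are present in the config.
audit_vm_config() {
    awk -F': *' '
        $1 == "cpu" { seen_cpu = 1; if ($2 !~ /^host/) print "cpu type is " $2 ", not host" }
        $1 == "scsihw" && $2 !~ /^virtio/ { print "scsihw is " $2 ", not virtio" }
        $1 ~ /^net[0-9]+$/ && $2 !~ /^virtio/ { print $1 " does not use virtio" }
        END { if (!seen_cpu) print "no cpu line: default CPU type in use" }
    '
}

# Example call on the host (110 is the VM ID used earlier in this thread):
#   qm config 110 | audit_vm_config
```

For the VM 110 config posted earlier (cpu: host, virtio-scsi-pci, virtio NIC), none of these checks would fire.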
 