Low performance on AMD 5800X in Ubuntu VM

ps2pspgood

New Member
Nov 1, 2021
Hi,

Recently, I finally managed to set up GPU passthrough and an Ubuntu Server VM.
However, the VM performs worse than expected.
I benchmarked the VM with Geekbench 5. The single-core score is around 1330 and the multi-core score is around 8000.
I expected a 5800X to score around 1600 single-core and 10300 multi-core. I cannot read the CPU temperature with lm-sensors, so I don't know whether there is thermal throttling. I also tested native Windows, where performance is normal.
Does anyone know how to improve performance on AMD CPUs in Proxmox VMs?
Currently, I run only a single VM on my rig.
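
To put the gap in numbers, here is a minimal sketch that computes how far the measured Geekbench 5 scores fall below the expected ones. The function name `shortfall_percent` is only illustrative; the scores are the ones quoted above.

```python
def shortfall_percent(measured: float, expected: float) -> float:
    """Return how far `measured` is below `expected`, in percent."""
    return (expected - measured) / expected * 100.0


# (measured in the VM, expected for a 5800X), as quoted in the post
scores = {
    "single-core": (1330, 1600),
    "multi-core": (8000, 10300),
}

for name, (measured, expected) in scores.items():
    print(f"{name}: {shortfall_percent(measured, expected):.1f}% below expected")
```

That works out to roughly 17% below expectation for single-core and 22% for multi-core.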

The config is here:
CPU: AMD 5800X with PBO2 enabled (Auto).
MB: Asus X570 tuf gaming plus.
RAM: single DDR4-3200 16 GB
GPU: RTX 3070 8 GB
SSD: OCZ Trion SATA 250 GB

Here are the settings of my VM:
Code:
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 16
cpu: host
efidisk0: local-lvm:vm-100-disk-1,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:09:00,pcie=1
ide2: none,media=cdrom
machine: q35
memory: 12288
name: Ubuntu-server
net0: virtio=92:4A:1E:87:DC:35,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-0,cache=writeback,size=128G
scsihw: virtio-scsi-pci
smbios1: uuid=45cd629d-9465-4c26-84fd-cb635efea3e4
sockets: 1
vmgenid: 7f83967a-a577-4060-a19e-fb3687eb5a59
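
As a quick sanity check on the config above, the following sketch parses `key: value` lines and compares the number of vCPUs given to the guest with the number of host threads. The helper names `parse_vm_config` and `total_vcpus` are hypothetical, and `HOST_THREADS = 16` assumes a 5800X with SMT enabled (8 cores, 16 threads). It ignores other options that can change the effective count, so treat it as an illustration rather than how Proxmox itself computes it.

```python
def parse_vm_config(text: str) -> dict[str, str]:
    """Parse 'key: value' lines as they appear in the VM config above."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            config[key.strip()] = value.strip()
    return config


def total_vcpus(config: dict[str, str]) -> int:
    """vCPUs presented to the guest, taken as sockets * cores."""
    return int(config.get("sockets", "1")) * int(config.get("cores", "1"))


sample = """\
cores: 16
cpu: host
memory: 12288
sockets: 1
"""

HOST_THREADS = 16  # assumption: 5800X with SMT enabled (8 cores / 16 threads)

config = parse_vm_config(sample)
vcpus = total_vcpus(config)
print(f"guest vCPUs: {vcpus}, host threads: {HOST_THREADS}")
if vcpus >= HOST_THREADS:
    print("the guest is given every host thread, leaving none free for the host")
```

With `cores: 16` and `sockets: 1`, the guest gets as many vCPUs as the host has threads, which may be worth keeping in mind when comparing multi-core scores.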


Thank you very much!
 
I wouldn't be surprised if it thermal throttles. I also have a 5800X, and it is configured from the factory to push its single chiplet to 138 W. That's the same power limit as the bigger 5900X and 5950X, but those have 6+6 and 8+8 cores (not 8+0), so they draw only 69 W per chiplet. Because the heat is spread across two chiplets, those CPUs are much easier to cool. I already have a massive tower heatsink with dual 120 mm fans (Scythe Mugen 5), but even at 100% PWM I can't keep the CPU below 90 °C, so it always throttles after a few seconds.
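
If lm-sensors shows nothing, one way to check temperatures is to read the kernel's hwmon files directly. The sketch below is a rough example under two assumptions: it runs on the Proxmox host (a guest normally does not see the host's sensors), and a driver such as `k10temp` is loaded and exposes `temp*_input` files. The function name `read_hwmon_temps` is made up for this example.

```python
from pathlib import Path


def read_hwmon_temps(root: str = "/sys/class/hwmon") -> dict[str, float]:
    """Return {'<chip>/<sensor file>': degrees Celsius} for readable temp inputs."""
    temps = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for temp_file in sorted(chip_dir.glob("temp*_input")):
            try:
                # hwmon reports temperatures in millidegrees Celsius
                millidegrees = int(temp_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor not readable; skip it
            temps[f"{chip}/{temp_file.name}"] = millidegrees / 1000.0
    return temps


if __name__ == "__main__":
    for sensor, celsius in read_hwmon_temps().items():
        print(f"{sensor}: {celsius:.1f} C")
```

Running this in a loop while the benchmark is active in the VM would show whether the package temperature climbs toward the 90 °C range mentioned above.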
 
I found an interesting thing.
If I install Windows 10 in the VM and run Cinebench R23, the multi-core score is around 14500, which is similar to native Windows 10. However, Geekbench on the Ubuntu guest still scores lower...
 
Replying to watch the thread and to add my observations, which might be related.
I recently ran some benchmarks in PVE guests and noticed that guest memory performance was really quite poor.

Arch Linux PVE guest: Geekbench 5 scores were 1292 single-core and 6043 multi-core.

Guest under test: Arch Linux PVE guest, AMD Ryzen 5 5600X 6-Core Processor
Comparison machine (results below): Ubuntu bare metal, Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz

YAML:
root@pve:~# cat /etc/pve/qemu-server/100.conf
agent: 1
args: -device ivshmem-plain,memdev=ivshmem,bus=pcie.0 -object memory-backend-file,id=ivshmem,share=on,mem-path=/dev/shm/looking-glass,size=128M
bios: ovmf
boot: order=ide2;scsi0;net0
cores: 6
cpu: host,flags=+ibpb;+virt-ssbd;+amd-ssbd
efidisk0: local-lvm:vm-100-disk-1,size=4M
hostpci1: 0d:00,pcie=1,x-vga=1,romfile=hd7750.rom
hugepages: 1024
ide2: none,media=cdrom
keephugepages: 1
machine: q35
memory: 32768
name: arch-7750-32gb
net0: virtio=9A:DA:41:C0:91:C2,bridge=vmbr0
net1: virtio=E6:01:81:B2:AB:7C,bridge=vmbr1
numa: 1
onboot: 1
ostype: l26
scsi0: local-lvm:vm-100-disk-0,backup=0,cache=writeback,iothread=1,replicate=0,size=116G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=f8e80882-14aa-4a5b-9a90-5a41b94ccd7e
sockets: 1
startup: order=1
usb0: host=046d:c07e
usb1: host=24f0:0140
vcpus: 6
vmgenid: 60759329-eff4-445f-b15e-aaf6af80f78a

Archlinux guest sysbench cpu and memory. CPU is fast (enough), Memory is REALLY slow.
Code:
[q35-arch: ~]$ sysbench cpu --num-threads=6  --time=30 run
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.0.20 (using system LuaJIT 2.0.5)

Running the test with following options:
Number of threads: 6
Initializing random number generator from current time


Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second: 28747.81

General statistics:
    total time:                          30.0002s
    total number of events:              862451

Latency (ms):
         min:                                    0.18
         avg:                                    0.20
         max:                                   28.43
         95th percentile:                        0.21
         sum:                               174709.50

Threads fairness:
    events (avg/stddev):           143741.8333/571.84
    execution time (avg/stddev):   29.1183/0.01

Memory is REALLY slow.
Code:
[q35-arch: ~]$ sysbench memory --num-threads=6  --time=30 run
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.0.20 (using system LuaJIT 2.0.5)

Running the test with following options:
Number of threads: 6
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 20514600 (683805.69 per second)

20033.79 MiB transferred (667.78 MiB/sec)


General statistics:
    total time:                          30.0001s
    total number of events:              20514600

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                   16.69
         95th percentile:                        0.00
         sum:                                60231.77

Threads fairness:
    events (avg/stddev):           3419100.0000/14370.83
    execution time (avg/stddev):   10.0386/0.03

Hypervisor CPU performance is comparable, but the difference in memory performance is ridiculous.
Code:
pve:~# sysbench cpu --num-threads=6  --time=30 run
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 6
Initializing random number generator from current time


Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second: 30583.01

General statistics:
    total time:                          30.0002s
    total number of events:              917507

Latency (ms):
         min:                                    0.19
         avg:                                    0.20
         max:                                    0.36
         95th percentile:                        0.20
         sum:                               179895.48

Threads fairness:
    events (avg/stddev):           152917.8333/125.13
    execution time (avg/stddev):   29.9826/0.00

Memory is about 41x faster on the hypervisor (27390.97 vs. 667.78 MiB/sec).
Code:
pve:~# sysbench memory --num-threads=6  --time=30 run
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 6
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 104857596 (28048351.94 per second)

102400.00 MiB transferred (27390.97 MiB/sec)


General statistics:
    total time:                          3.7380s
    total number of events:              104857596

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.02
         95th percentile:                        0.00
         sum:                                14630.60

Threads fairness:
    events (avg/stddev):           17476266.0000/0.00
    execution time (avg/stddev):   2.4384/0.01


Bare metal Intel machine
CPU is slow (expected, older CPU)
Code:
[ubuntu:~$ sysbench cpu --num-threads=6  --time=30 run
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.0.18 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 6
Initializing random number generator from current time


Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  6082.71

General statistics:
    total time:                          30.0010s
    total number of events:              182496

Latency (ms):
         min:                                    0.90
         avg:                                    0.99
         max:                                    3.76
         95th percentile:                        1.03
         sum:                               179974.46

Threads fairness:
    events (avg/stddev):           30416.0000/1495.00
    execution time (avg/stddev):   29.9957/0.00

Memory is REALLY fast in comparison to the guest VM, but still slower than the hypervisor running on bare metal (expected)
Code:
[ubuntu:~$ sysbench memory --num-threads=6  --time=30 run
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.0.18 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 6
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 104857596 (16689582.88 per second)

102400.00 MiB transferred (16298.42 MiB/sec)


General statistics:
    total time:                          6.2813s
    total number of events:              104857596

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    0.13
         95th percentile:                        0.00
         sum:                                27059.73

Threads fairness:
    events (avg/stddev):           17476266.0000/0.00
    execution time (avg/stddev):   4.5100/0.16
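
To compare the three memory runs above side by side, here is a small sketch that pulls the MiB/sec figure out of each `sysbench memory` summary line and expresses it relative to the guest. The regular expression and the function name `throughput_mib_per_sec` are my own; the three summary lines are copied from the outputs above.

```python
import re

THROUGHPUT_RE = re.compile(r"MiB transferred \(([\d.]+) MiB/sec\)")


def throughput_mib_per_sec(sysbench_output: str) -> float:
    """Extract the MiB/sec figure from `sysbench memory` output."""
    match = THROUGHPUT_RE.search(sysbench_output)
    if match is None:
        raise ValueError("no 'MiB transferred' line found")
    return float(match.group(1))


# Summary lines copied from the three runs above
results = {
    "Arch guest (5600X)": "20033.79 MiB transferred (667.78 MiB/sec)",
    "PVE host (5600X)": "102400.00 MiB transferred (27390.97 MiB/sec)",
    "Ubuntu bare metal (i7-4790)": "102400.00 MiB transferred (16298.42 MiB/sec)",
}

guest = throughput_mib_per_sec(results["Arch guest (5600X)"])
for name, line in results.items():
    rate = throughput_mib_per_sec(line)
    print(f"{name}: {rate:.2f} MiB/sec ({rate / guest:.1f}x the guest)")
```

By these figures the host is about 41x and the older Intel machine about 24x faster than the guest. The guest also hit the 30-second time limit after 20033.79 MiB, while the other two finished the full 102400 MiB in a few seconds.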
 
Thank you very much for the information.
In my case, only the Linux guest performs poorly. The Windows 10 guest performed as expected.
Is the poor performance on Linux guests a bug, or is there any way to fix it?
 
