Proxmox slower in CPU/Memory than VMware

JesperAP

New Member
Jun 18, 2024
Hi,

I've done some performance tests with PassMark's benchmark tool. What I notice is that my CPU and memory scores are lower than in VMware, even though the Proxmox host has a better CPU than the VMware host.

From my understanding, Proxmox should perform better?

Is there something I can fine-tune? Am I doing something wrong?

Here are some screenshots:
[screenshots: PassMark CPU and memory results]
 
What PVE version is this?
Why do you add flags and have CPU type host? That does not make sense to me.
Is the CPU socket layout identical to your host's? If not, change it.
If you use large pages in your VM, are you also using large pages on your PVE host? (See the quick check below.)
Why do you redact generated, hypervisor-internal MAC addresses?
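Regarding large pages, a rough way to check would be something like this (just a sketch; the "hugepages" VM option and the page sizes shown are examples, check the PVE docs for your version):

# on the PVE host: are any huge pages reserved/in use at all?
grep -i huge /proc/meminfo
# on the VM side, large pages are the "hugepages" option in the VM config,
# e.g. "qm set <vmid> --hugepages 2" for 2 MiB pages or "--hugepages 1024" for 1 GiB pages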
 
Hi,

Version 8.2.2
These are the flags we added; is this not necessary?
[screenshot of the enabled CPU flags]
What CPU type do I need to choose? I thought host was the best option?

What do you mean by "Is the CPU socket layout identical to your host? If not, change it."?
The VM has 4 cores and 1 socket, and the Proxmox host has 36 cores across 2 sockets.

I also don't know what you mean by large pages.

About the MAC address, I have no clue why I did that.
 
Did you thoroughly research your CPU's features before toggling those flags? Does it have IBRS? Is the required microcode installed? Explicitly turning spec-ctrl on and PCID (presumably on by default) off is going to destroy performance for loads that profit the most from ILP.
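A quick way to see what the host actually reports (a sketch; flag names vary a bit between CPU vendors and kernel versions):

# which of the relevant CPU flags does the host expose?
lscpu | grep -o -w -e pcid -e invpcid -e ibrs -e ibpb -e ssbd
# current mitigation state per vulnerability
grep . /sys/devices/system/cpu/vulnerabilities/*
# loaded microcode revision (compare it against the latest from your vendor)
grep -m1 microcode /proc/cpuinfo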
 
Even with all of the flags set back to default, the VM is as slow as it was with the flags.
 
NUMA doesn't really work on Proxmox.
Or let's say there is no logic, like assigning the 4 cores of a VM to the same socket to benefit from faster (local) RAM.

As long as there is no NUMA support, ESXi will always win on dual-socket systems or single-socket AMD Bergamo/Genoa servers.
ESXi handles NUMA pretty well; there are even options to assign CPU cores based on L3 cache/hyperthreading.

On a single-socket server, as long as there is no need for NUMA, Proxmox will always outperform ESXi.

PS: The option in the VM settings to enable NUMA in Proxmox is only beneficial in some rare scenarios; in 99% of use cases the option is pointless. It just passes NUMA capabilities through to the VM, so the VM could handle NUMA itself. But the issue is that the VM's CPU tasks on the Proxmox host itself "rotate" between cores, so it achieves little, to put it shortly.

You can pin your VM's vCPUs to the physical cores of your Proxmox host yourself; that's the only way at the moment.
Simply pin all 4 of your VM cores to the same socket on the host and you will outperform ESXi.
But you have to do it for every VM, and that can get confusing very fast, since balancing the cores yourself with 20 VMs is not that easy.
I mean balancing the VMs out between both sockets.

You can get even more performance, on AMD systems at least (Genoa/Milan/Bergamo), if you assign the VM cores to CPUs that all have access to the same L3 cache. On Intel systems this has no benefit, since they are still mostly monolithic.
Monolithic CPUs are great, because you don't have to mess with NUMA (compared to AMD) and L3 cache, which makes everything a lot easier on the Proxmox side; you basically only have to pin the CPUs of a VM to the same socket on Intel and that's it.
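As a minimal sketch (the VM ID 100 and the core range are placeholders; the affinity option needs a reasonably recent PVE version):

# pin all vCPUs of VM 100 to host cores 0-3 (assuming they sit on the same socket)
qm set 100 --affinity 0-3
# the same setting lives in /etc/pve/qemu-server/100.conf as "affinity: 0-3"
# and in the GUI under Hardware -> Processors -> Advanced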

Cheers
 
So you're saying my server will not perform better in Proxmox because it has 2 sockets and VMware handles that better?

Sorry, just saw your edit. How would I pin VM CPUs to physical cores?
 
https://bs.fri.stoss-medica.int:8006/pve-docs/chapter-qm.html#qm_cpu
Read the "affinity" section.

In the VM settings under CPU, tick "Advanced" and set Affinity.

To find out which processors belong to which socket on the Proxmox host, install numactl (apt install numactl) and run numactl --hardware.
Then you see something like node 0 cpus: 0 1 2 3 ... and node 1 cpus: ...
Those are your sockets.
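On a 2-socket host like the OP's, the output would look roughly like this (the numbers here are made up for illustration; use your own output, since CPU numbering differs between systems):

# numactl --hardware (abridged, hypothetical)
#   node 0 cpus: 0 1 2 3 ... 17
#   node 1 cpus: 18 19 ... 35
# then pin the 4-core VM to cores of one node only, e.g.:
qm set <vmid> --affinity 0-3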

You have to restart your VM; if you don't, some of its memory will still be on the old socket.
Or better, shut it down and start it again, not just a simple reboot.

PS: And what you did with enabling all the feature flags in the VM settings under "Processors" is absolutely pointless; I think it's even worse in combination with "host".
If you set the CPU type to "host", your CPU is simply passed through with all the features it supports. All other options set something like a mask, or fake the CPU, so the feature flags make sense there in some scenarios, but not for "host".
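For illustration, the difference in the VM config would look something like this (a sketch; the masked model and the flag list are just examples, and the exact syntax should be checked against the PVE docs):

# /etc/pve/qemu-server/<vmid>.conf
cpu: host
# vs. a masked/emulated model, where adding individual flags can make sense:
# cpu: x86-64-v2-AES,flags=+pcid;+aes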

Cheers
 
That's a really good tip, it saved me a ton of trouble.


To the OP: I have not used VMware, but on the PassMark test the system I'm running scores in the top 2 out of 120 samples.
 
What happens if you always set 1 socket? Shouldn't that take care of the issue without having to pin vCPUs manually?
 
No, it makes absolutely no difference. The socket option only makes sense if you enable the NUMA option.
But that just tells the OS inside the VM itself to use NUMA, which, as I said previously, is pointless in 99% of use cases either way.
(I don't know of any use case or workload that benefits from it myself.)

The whole issue is: whether you enable NUMA, use more than one socket in the VM settings, or disable NUMA and use only one socket, it makes absolutely no difference to how Proxmox handles the VM's CPU threads (tasks).

Each core you give a VM is a "task" on the Proxmox host itself, and those tasks rotate randomly (details below) between all physical cores without any logic.
This is because QEMU doesn't tell the Proxmox kernel which tasks belong together. So if you give a VM 4 cores, the kernel sees those 4 "tasks" as separate tasks on the host that have nothing in common.

Thank god this is not a really dramatic issue (let's say it could be much worse), because the kernel still seems to have some clever logic, which leads to a smaller performance penalty than you would actually expect.

Rotate randomly:
The kernel will not move the task to another core while the task is still busy, i.e. something inside the VM is sticking to the CPU, like a running program that is currently busy. But as soon as the program or whatever is inside the VM finishes or enters a waiting loop, the task on the host will usually rotate to another CPU core.
That is how I understand it; probably not fully correct in detail, but at a high level that is definitely how it works.
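You can watch this yourself on the host (a quick sketch; the exact thread names depend on your QEMU version):

# list all threads with the processor (PSR column) they are currently running on,
# filtered down to the KVM vCPU threads
ps -eLo pid,tid,psr,comm | grep -i kvm
# run it a few times while the VM is working/idle and watch the PSR values move around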

As long as QEMU doesn't somehow tell the kernel that those 4 tasks (4 vCPUs in the VM) belong together, there is no NUMA support on Proxmox at all, no matter what anyone says.
So CPU pinning is the only solution at the moment.

It's possible to create some hook scripts that are really clever and at least balance the VMs between sockets or NUMA nodes (at startup, at least), which would make it much easier. Pinning 20 VMs yourself is very challenging. (A rough sketch of such a script follows below.)
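Something like this could serve as a starting point (an untested sketch: the snippet path and the core list 0-17 are placeholders; register it with qm set <vmid> --hookscript local:snippets/pin-vm.sh):

#!/bin/bash
# /var/lib/vz/snippets/pin-vm.sh - example hookscript: pin a VM to one socket on start
# (take your own core list for socket 0 from "numactl --hardware")
vmid="$1"
phase="$2"
if [ "$phase" = "post-start" ]; then
    pid=$(cat "/var/run/qemu-server/${vmid}.pid")
    # -a pins the whole QEMU process including all of its vCPU/IO threads
    taskset -a -c -p 0-17 "$pid"
fi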

But real NUMA support also means that the kernel can rotate the VM's tasks, while the VMs are running, between the cores of a NUMA node.
Moving to another NUMA node (still with near memory) but another L3 cache should be supported as well.

NUMA is not only about near and far memory, it's about the L3 cache as well. The L3 cache is actually the bigger performance factor. The reason is that an application that uses multiple CPUs (multithreaded apps) can use the L3 cache to share data between tasks/cores, which is insanely fast.
If the cores of a VM are spread around (not on the same CCD on AMD server systems), the application cannot use the L3 cache and has to go through memory, which is about 3x slower.
I benchmarked that with iperf3 and multithreaded archiving: with pinning I get 3x more performance. iperf3 without pinning: 14-15 GB/s; with pinning to the same CCD (same L3 cache): over 50 GB/s.
The benchmarks are here on this forum in another thread.
But the L3 cache on Intel server systems is not that big of a deal, because of the monolithic design.
On AMD servers (Milan/Rome/Genoa) it's an extremely big issue, so big that Proxmox makes absolutely no sense on those systems without CPU pinning.
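To see which host CPUs share an L3 cache (i.e. sit on the same CCD on AMD), something like this helps (a sketch; the available columns can differ slightly between util-linux versions):

# CPU number, NUMA node, socket, core and the cache IDs (last column is L1d:L1i:L2:L3)
lscpu --extended=CPU,NODE,SOCKET,CORE,CACHE
# pick an affinity set whose CPUs all report the same L3 ID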

So the conclusion is: the Sockets setting (in the VM settings) in Proxmox is pointless, and in my opinion the NUMA option in the VM settings is pointless as well.
Cheers
 

Thanks for taking the time to thoroughly explain it; as a newbie this really helps.

That must be even more of a mess with I/O threads, disk/NIC/GPU passthrough etc.

Luckily some other enlightened people like you have written about workarounds.

But yeah, a real PITA.
 
That's all easy. In that department there is actually nothing missing.

There are only some weaknesses:
- ZFS zvols being 5x slower than they should be (but that is a ZFS issue, nothing the Proxmox devs can do about it). Just mentioning this because that's the "default storage option" for most users on Proxmox.
- No NUMA support: that's mentioned above already, but it's actually more QEMU-related; every Linux distro (apart from SLES/RHEL, I believe) has that issue with QEMU. But I believe on this point the Proxmox devs could actually improve the situation, at least with hook scripts.
- Backup Server backup speed hard-limited to 1-1.2 GB/s, no matter what server/storage. But this backup solution is still a lot faster than every backup offering for ESXi. Just mentioning this because this point would be easy for the dev team to improve with some config-based tuning parameters.

Apart from that, Proxmox has a ton of strengths that are far better than ESXi: no stupid vCenter (VCSA) VM that breaks every 2 years, no password expirations where you have to hack into GRUB to change the password, native storage that is a lot faster than VMFS, easy updates through the package manager instead of the stupid update/patch-management plugin (or, on older ESXi, a Windows VM for that), and LXC containers, which are just amazing (though not amazing to back up, it basically just takes long compared to VMs xD).
There are a ton of things that are better compared to ESXi.
The other alternatives are Hyper-V, which makes no sense for anything other than hosting Windows VMs, and Nutanix AHV, which is not on par with Proxmox/ESXi or even Hyper-V and is basically just QEMU as well.
And that's it; there is maybe Unraid, but hell, I don't like anything about that, it's a completely different solution targeting home users only.

So in the end we are still here complaining about Proxmox xD
But it's still the best overall hypervisor in my opinion, even if it weren't free.
And I have all 3 issues above with my Genoa servers and a Backup Server with NVMe storage... still loving Proxmox xD

Cheers

PS: Hyper-V has one extreme strength, and that's HA. I don't know in detail how it works, but if a host fails in our Hyper-V cluster, the VM is unreachable for at most 1 second. It's just insane. But everything else is not great xD
 