Correct configuration for VM Processors

Hello,

I have Proxmox VE 6.3-3 running on a Dell Server with an Intel® Xeon® Processor E5-2680 v3 (2x 12 core). Hyperthreading is enabled, so a maximum of 48 cores are available.

What is the correct configuration to allow a VM to use all 48 cores? I tried setting Sockets to 2 and Cores to 24, but it actually resulted in worse performance than when Cores was set to 12. Also, what is the impact of enabling NUMA?

Thanks for your time!
 
What is the correct configuration to allow a VM to use all 48 cores
This is an invalid configuration, because you don't have 48 cores. You have 48 threads.
You have 24 cores and even those should not be assigned to one VM.
Size to your workload, not to your hardware!
 
In that case, let's say I wanted to have 44 "threads" assigned to one VM. What is the proper configuration for that?
 
You have 24 cores and even those should not be assigned to one VM.
Whether you want to hear it or not, that's the way it is. Recommended practice: stay within one NUMA node (12 vCPUs in your case).
You are not doing the system any good with what you want to do.
It may look pretty or impressive, but it doesn't perform the way you expect.
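For a concrete starting point, a VM sized to one NUMA node on that box would look something like this (VM ID 100 is just an example):

Code:
# keep the VM inside one NUMA node: 1 socket, 12 cores, NUMA enabled for the guest
qm set 100 --sockets 1 --cores 12 --numa 1

That ends up as "sockets: 1", "cores: 12" and "numa: 1" in /etc/pve/qemu-server/100.conf; you can always grow it later if the workload actually needs more.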
 
I don't think the main point is being addressed here. I'm trying to get answers on how to actually configure the CPU settings. The way it's configured now only allows 24 cores to show up in the VM, which isn't enough for what the VM is doing.
 
which isn't enough for what the VM is doing.
Then get yourself other gear.
What you are trying to achieve does not make sense from a technical perspective.
Reason: if 24 cores are not enough, more threads won't help. You only have 24 cores, physically.

Threads won't help; you only create contention on the host, which will slow things down further.

If you really want to go with 48 threads: go physical. Install your OS on the hardware itself. Skip virtualization.
 
Then get yourself other gear.
What you are trying to achieve does not make sense from a technical perspective.
Reason: if 24 cores are not enough, more threads won't help. You only have 24 cores, physically.

Threads won't help; you only create contention on the host, which will slow things down further.

If you really want to go with 48 threads: go physical. Install your OS on the hardware itself. Skip virtualization.

It sounds like you're suggesting that hyperthreading has no benefit for VMs? Why is that?
 
It has a potential benefit to the host,
but the host needs to schedule all these cores.
Assigning all the cores puts pressure on the host; it can't breathe.
Hence it will pause the VM from time to time to get its own stuff done.
It is a misconception (a huge one) that vCPUs come for free. Surprise! They don't.
Hence my recommendation.
I would even go so far as to say that the VM you are trying to build will likely perform BETTER with FEWER cores.
Why? Because the host can actually get those scheduled fairly easily.
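If you want to see that contention for yourself, a rough way (assuming a reasonably recent kernel with pressure stall information enabled) is:

Code:
# on the Proxmox host: non-zero "avg" values mean tasks are waiting for a CPU
cat /proc/pressure/cpu
# inside a Linux guest: watch the "st" (steal) column; a rising value means the
# hypervisor is taking CPU time away from the VM
vmstat 1 5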
 
It has a potential benefit to the host,
but the host needs to schedule all these cores.
Assigning all the cores puts pressure on the host; it can't breathe.
Hence it will pause the VM from time to time to get its own stuff done.
It is a misconception (a huge one) that vCPUs come for free. Surprise! They don't.
Hence my recommendation.
I would even go so far as to say that the VM you are trying to build will likely perform BETTER with FEWER cores.
Why? Because the host can actually get those scheduled fairly easily.

I appreciate the info!
 
It has a potential benefit to the host,
but the host needs to schedule all these cores.
Assigning all the cores puts pressure on the host; it can't breathe.
Hence it will pause the VM from time to time to get its own stuff done.
It is a misconception (a huge one) that vCPUs come for free. Surprise! They don't.
Hence my recommendation.
I would even go so far as to say that the VM you are trying to build will likely perform BETTER with FEWER cores.
Why? Because the host can actually get those scheduled fairly easily.
@apoc
Sorry for resurrecting this old thread - but a couple of follow-up questions in regards to this:

1) Does the system load affect the host's ability to schedule the threads when VMs are using it?

(i.e. my system has dual Xeon E5-2697A v4 CPUs: 16 cores / 32 threads per socket, dual-socket system, so 32 cores / 64 threads in total.)

Out of that, about 58 threads ("cores", i.e. "vCPUs") have been allocated to a variety of VMs. (I have 10 VMs running right now: one has 16 vCPUs allocated to it, 5 have 2 vCPUs each, and 4 have 8 vCPUs each.)

The system load average is typically between 5.00 and 10.00 (the vast majority of the time) and the overall host CPU usage is between 5-15%.

Given this information, would the host still sometimes need to temporarily pause the VMs to be able to manage and/or schedule its own tasks/processes/threads?

My thinking was that because the average load is relatively low, the host shouldn't run into CPU contention issues, but please correct me if my thinking is inaccurate.

2) Would disabling HyperThreading and then cutting the number of vCPUs allocated to each VM in half help? Or would this still potentially be a problem?

3) If I were to manually assign the CPU affinity for each of the VMs, would that also take some of the "pressure" off the host's responsibility of scheduling the host threads and/or tasks/processes?

My original thought with NOT manually setting the CPU affinity was that it might lead to an overall higher level of utilisation, since not all VMs are running at "full load" all the time; vCPUs that have been allocated can therefore be shared between VMs when one VM has more "work" to do whilst another VM has less.

Again, I am new to Proxmox, so I appreciate you educating me on best practices for vCPU allocation.

Thank you.
 
@apoc
Sorry for resurrecting this old thread - but a couple of follow-up questions in regards to this:

1) Does the system load affect the host's ability to schedule the threads when VMs are using it?

(i.e. my system has dual Xeon E5-2697A v4 CPUs: 16 cores / 32 threads per socket, dual-socket system, so 32 cores / 64 threads in total.)
More than one socket, especially with NUMA, complicates matters a bit. Often, not all memory is local to each socket, and you might want to give each VM two sockets as well and enable NUMA for it. This way you spread the non-uniform memory access over each VM.
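As a sketch of what that could look like (the VM ID and core counts here are only examples, adjust to your workload):

Code:
# show the host's NUMA layout: which CPUs and how much memory belong to each node
numactl --hardware
# mirror the two host sockets in the guest and enable NUMA for it
qm set 101 --sockets 2 --cores 8 --numa 1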
Out of that, about 58 threads ("cores", i.e. "vCPUs") have been allocated to a variety of VMs. (I have 10 VMs running right now: one has 16 vCPUs allocated to it, 5 have 2 vCPUs each, and 4 have 8 vCPUs each.)

The system load average is typically between 5.00 and 10.00 (the vast majority of the time) and the overall host CPU usage is between 5-15%.

Given this information, would the host still sometimes need to temporarily pause the VMs to be able to manage and/or schedule its own tasks/processes/threads?

My thinking was that because the average load is relatively low, the host shouldn't run into CPU contention issues, but please correct me if my thinking is inaccurate.
I agree, the average load is probably very low and you have threads in reserve.
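A quick sanity check is to compare the load average against the number of host threads; as long as the load stays well below the thread count, there is normally no queueing for CPU:

Code:
# number of host threads (64 on this box)
nproc
# 1/5/15 minute load averages; compare against the nproc output
uptime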
2) Would disabling HyperThreading and then cutting the number of vCPUs allocated to each VM in half help? Or would this still potentially be a problem?
Hyperthreads don't double performance (a factor of about 1.3 is more realistic). Disabling hyperthreading will give a little more of the CPU's resources to the remaining threads and double the cache per thread, if I'm not mistaken. This might increase performance per core, which might benefit you. Try and measure for your workloads yourself.
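If you want to quantify that on your own hardware, one rough way (assuming sysbench is installed) is to run the same CPU test with HT on and off and compare the "events per second" numbers:

Code:
# confirm whether SMT/HT is currently active (1 = on, 0 = off)
cat /sys/devices/system/cpu/smt/active
# run with the same thread count before and after toggling HT in the BIOS
sysbench cpu --threads=32 --time=30 run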
3) If I were to manually assign the CPU affinity for each of the VMs, would that also take some of the "pressure" off the host's responsibility of scheduling the host threads and/or tasks/processes?
Not really. Since not all VMs will be 100% busy all the time, it's better not to make the scheduling more complex by pinning cores to VMs. Trust the Linux process scheduler and don't handicap it.
My original thought with NOT manually setting the CPU affinity was that it might lead to an overall higher level of utilisation, since not all VMs are running at "full load" all the time; vCPUs that have been allocated can therefore be shared between VMs when one VM has more "work" to do whilst another VM has less.
I agree. If you want the best performance for a particular VM, you might want to assign specific cores to it, but each VM also needs additional threads for emulation and I/O. By not assigning cores, you allow for more flexibility. Also, by not pushing everything to 100%, you keep the system as a whole responsive and you experience less latency (though you won't achieve maximum throughput).
Again, I am new to Proxmox, so I appreciate you educating me on best practices for vCPU allocation.
Don't worry too much about it. Remember that Proxmox is an enterprise hypervisor, tuned towards many VMs that are not very busy, in an HA environment. On this forum, you'll find lots of people running Proxmox with only a few VMs obsessing over performance, GPU passthrough and pinning cores, etc. But I don't see many people with a subscription, and running many VMs in production, fussing about that.
 
Often, not all memory is local to each socket, and you might want to give each VM two sockets as well and enable NUMA for it.
That is a good point.

Hyperthreads don't double performance (a factor of about 1.3 is more realistic).
Thank you.

In my experience with HPC/CAE/CFD/FEA workloads, HyperThreading (or SMT) is, AT BEST, only about a 3-7% performance improvement, and can otherwise result in a performance degradation due to oversubscription of the FPU on the CPU cores.

Try and measure for your workloads yourself.
My workload is predominantly just web browsing/watching YouTube videos, so it's generally not very taxing at all.

(The server consolidation project was to migrate from 5 NAS servers down to a single system, and then virtualise any other outstanding systems/towers where and whenever I can, driven predominantly by my desire to cut my overall power consumption down from 1200-ish W to ~600 W.)

The cost of electricity here isn't very high, but if I can save a buck, why not?

Trust the Linux process scheduler and don't handicap it.
I ask this only because sometimes, when I am rebooting my Linux VMs, the screen showing the progress of the system services shutting down will also show that there were times when threads were stuck and/or waiting for the CPU.

That suggests to me that there are some hardware contention issues happening, despite the generally low overall system load/CPU usage.

Not sure if that is storage related, though, as sometimes on the Proxmox dashboard my IO delay can be as high as 45% (not very often, but it can spike that high). My storage consists of three 8-wide raidz2 vdevs: two of them use 10 TB 7200 rpm drives (one vdev of eight SAS 12 Gbps drives, the other of eight SATA 6 Gbps drives), and the third vdev is eight 6 TB SATA 6 Gbps drives.
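I'll probably check whether a particular vdev is the bottleneck during those spikes with something like the following (the pool name "tank" is just a placeholder):

Code:
# per-vdev read/write ops and bandwidth, refreshed every 5 seconds
zpool iostat -v tank 5
# latency breakdown, if your OpenZFS version supports the -l flag
zpool iostat -l tank 5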

(I am using existing hardware vs. buying new stuff when the old stuff works perfectly fine.)

Also, by not pushing everything to 100%, you make the system as a whole not responsive and you experience less latency (and you won't achieve maximum throughput).
Yeah, I am having mixed results with this.

Audio, when playing YouTube videos from a Windows 11 VM (using the SPICE audio driver, with the Windows virtio drivers 0.1.229 installed), sometimes stutters or gets a little garbled when the host reports higher IO delay and/or higher load/CPU usage.

Not the worst thing in the world, but it is certainly annoying though, when you're watching a YouTube video and having to rewind every so often as a result of said audio issues.

On this forum, you'll find lots of people running Proxmox with only a few VMs obsessing over performance, GPU passthrough and pinning cores, etc.
I do have GPU passthrough (because I have virtualised my gaming system now).

And I am also using the virtio-fs capabilities of Proxmox because rather than building out and expanding my network, I am actually contracting/shrinking my homelab (again - to cut power consumption). Don't need 10 GbE NICs/switches/cables if I can just use virtio-fs. :) (Which is an AWESOME feature for Proxmox to support BTW as neither TrueNAS nor xcp-ng/XOA supports virtio-fs.)
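For anyone finding this thread later, the guest-side mount for a virtio-fs share is roughly just this (the tag "sharename" being whatever was configured on the host side):

Code:
# one-off mount inside a Linux guest
mount -t virtiofs sharename /mnt/share
# or persistently via /etc/fstab:
# sharename  /mnt/share  virtiofs  defaults  0  0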

That being said, I am running into what appear to be some hardware contention issues (with the host scheduler), so I've set the CPU affinities but kept HyperThreading enabled, to see if that might help with some of that.

(Keeping 2 cores/4 threads "clear" for the host.)
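(For reference, the pinning itself can be done per VM roughly like the example below; the --affinity option needs a reasonably recent Proxmox release, and the exact CPU numbers depend on how the host enumerates cores and hyperthread siblings, which "lscpu --extended" will show.)

Code:
# example: keep a few host threads free and restrict VM 102 to the rest
qm set 102 --affinity 4-63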

Thank you.
 
