Finding out the culprit behind high CPU usage

mikiudon · Jun 8, 2023

Hello,

Is there a way to find out what the CPU usage is for each VM in a list?

I'm trying to narrow down which VM is driving the server's CPU load. It's not at 100% CPU, but it keeps hovering at 75-90%. RAM is OK.

I'm running top -o +%CPU, but I'm unsure how to map what it shows to the actual VM.

bbgeek17 · Jun 8, 2023

you could use "htop" or "top -c"

Code:

 -c  :Command-line/Program-name toggle
     Starts top with the last remembered `c' state reversed. Thus, if top was displaying command lines, now that field will show program names, and vice versa. See the `c' interactive command for additional information.

In the GUI, you can click "Virtual Machine" in the left panel and sort by CPU usage.
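If you would rather script it, here is a minimal Python sketch of the same idea as "top -c": read the full command line of each process and pull the VM ID out of it. It assumes that each guest runs as a "kvm" process started with "-id <vmid>", which is how Proxmox normally launches VMs; check a line of your own "top -c" output before relying on it. The function name vm_cpu_from_ps is made up for this example. Note that ps reports CPU averaged over the lifetime of the process, so it is a rougher picture than the live numbers in top.

```python
import os
import re

# Assumption: Proxmox starts each guest as ".../kvm -id <vmid> ...".
VMID_FLAG = re.compile(r"\s-id\s+(\d+)")


def vm_cpu_from_ps(ps_output):
    """Parse `ps -eo pcpu,args` text into [(vmid, pcpu)], busiest VM first."""
    usage = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        pcpu_text, args = parts
        argv = args.split()
        # Only keep processes whose executable is named "kvm".
        if not argv or os.path.basename(argv[0]) != "kvm":
            continue
        match = VMID_FLAG.search(args)
        if match is None:
            continue
        try:
            pcpu = float(pcpu_text)
        except ValueError:
            continue
        usage.append((int(match.group(1)), pcpu))
    return sorted(usage, key=lambda item: item[1], reverse=True)
```

To try it on the host, feed it the output of subprocess.run(["ps", "-eo", "pcpu,args"], capture_output=True, text=True).stdout and print the pairs it returns.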

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

mikiudon · Jun 8, 2023

Yeah, thanks. I'm not sure, but I may have assigned too many cores to each VM.

What's the best practice typically?

For example, I've got a Dell R730 with 40 x Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz (2 Sockets) and 256GB RAM.

I'm typically assigning 2 x 8 cores to normal Linux VMs and 2 x 12 cores to Windows VMs. I've been able to get away with fewer on VMware, but it seems that with KVM I need more CPU power to get things to work. RAM seems to be fine.

bbgeek17 · Jun 9, 2023

Best practices are going to be dependent on your hardware configuration and what you are trying to achieve inside the VM. Let's dig into your processor to see what that tells us.

The E5-2670 v2 is a 10-core processor from 2013. The processor supports simultaneous multithreading (i.e., SMT). This means that each processor has 20 "threads." Note that a thread is NOT a core. A thread does not contribute additional execution resources. Instead, threads allow the core to "context-switch" between different execution pipelines to better use execution resources when the pipeline stalls on memory references. The impact of SMT on performance efficiency varies. I would assume no more than a 15% improvement vs. a single core on raw CPU throughput.
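To put numbers on that, here is a small Python sketch of the arithmetic for this box. It reads the 15% figure as "a core with SMT delivers at most about 1.15x the throughput of a core without it", which is an interpretation of the estimate above, not a measurement.

```python
SOCKETS = 2            # dual-package server
CORES_PER_SOCKET = 10  # E5-2670 v2
THREADS_PER_CORE = 2   # SMT enabled
SMT_GAIN = 0.15        # assumed upper bound on the SMT benefit

physical_cores = SOCKETS * CORES_PER_SOCKET
logical_cpus = physical_cores * THREADS_PER_CORE
# Rough throughput expressed in "whole core" units under the assumption above.
core_equivalents = physical_cores * (1 + SMT_GAIN)

print(f"physical cores:   {physical_cores}")
print(f"logical CPUs:     {logical_cpus}")
print(f"core equivalents: {core_equivalents:.0f}")
```

So the "40 x" shown for the host counts threads, while the raw throughput is closer to that of 23 cores than 40.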

You have an R730. This is a dual CPU package system. You have two 10-core processors. Each processor is its own NUMA domain. You likely have 128 GB of RAM attached to each processor. Generally speaking, you want to do everything possible to constrain a workload to a single NUMA node, since "remote" memory accesses over the inter-processor interconnect are slow relative to local NUMA access. Regarding virtual machine configuration, we should strive to ensure that all of a VM's VCPUs execute on the same CPU package rather than spanning NUMA domains. In some cases, the Linux scheduler will do a decent job of re-arranging resources to optimize for NUMA. But it's not perfect by any stretch.
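To see which logical CPUs belong to which NUMA node on the host, you can read the node directories that Linux exposes under /sys/devices/system/node. The following Python sketch does that; the helper names parse_cpulist and numa_nodes are made up for this example, and the CPU numbering on your host may differ from anything shown here.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-9,20-29" into a list of ints."""
    cpus = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        first, _, last = chunk.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def numa_nodes(root="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]} for every NUMA node found under root."""
    nodes = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*", "cpulist")):
        node_name = os.path.basename(os.path.dirname(path))  # e.g. "node0"
        with open(path) as handle:
            nodes[int(node_name[4:])] = parse_cpulist(handle.read())
    return nodes
```

Calling numa_nodes() on a two-socket host should return two entries, one list of logical CPUs per package.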

Given your hardware configuration, if you are allocating individual VMs with 12 cores, one of two things is guaranteed to happen. You are either spanning NUMA domains, or your VCPUs are competing with one another via threads. Remember, you really only have 10 cores on a package.
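The same reasoning as a quick Python check, using 10 cores and 2 threads per core per package as defaults. The function name placement and its wording are made up for this sketch; it only classifies by VCPU count and does not know where the scheduler actually places anything.

```python
def placement(vcpus, cores_per_socket=10, threads_per_core=2):
    """Classify a VM size against one CPU package (one NUMA domain)."""
    if vcpus <= cores_per_socket:
        return "fits on the physical cores of one package"
    if vcpus <= cores_per_socket * threads_per_core:
        return "shares cores via SMT threads or spans NUMA domains"
    return "must span NUMA domains"


for size in (8, 12, 16, 24):
    print(f"{size:2d} VCPUs: {placement(size)}")
```

If "2 x 12" means 2 virtual sockets with 12 cores each, that is 24 VCPUs, which is more than the 20 threads on a single package.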

Further, if performance is your primary concern, you should look at how much you are over-committing the CPU resources. Specifically, minimize the ratio of VCPU to actual CPU cores. I would recommend that you provision no more than 40 VCPUs total on that hardware, for enterprise applications. And reduce the number of VCPUs in a VM to ensure a good fit from a NUMA perspective.
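As a rough way to check where you stand, this Python sketch adds up the VCPUs of all VMs and compares the total with the host. The VM list is purely hypothetical (the thread does not say how many VMs are running), and the function name overcommit is made up for the example.

```python
def overcommit(vcpus_per_vm, physical_cores=20, logical_cpus=40):
    """Return (total VCPUs, VCPUs per physical core, VCPUs per logical CPU)."""
    total = sum(vcpus_per_vm)
    return total, total / physical_cores, total / logical_cpus


# Hypothetical mix: three Linux VMs at 16 VCPUs and two Windows VMs at 24 VCPUs.
vms = [16, 16, 16, 24, 24]
total, per_core, per_thread = overcommit(vms)

print(f"total VCPUs: {total}")
print(f"per physical core: {per_core:.1f}, per logical CPU: {per_thread:.1f}")
print("within the 40 VCPU budget" if total <= 40 else "over the 40 VCPU budget")
```

With numbers like these, the host is far beyond the suggested 40 VCPUs, so shrinking each VM is likely to help more than anything else.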

