Is 12+36GB on a dual socket server clever or stupid regarding NUMA optimization?

ramaza

I'm not a virtualization newbie, but I still have a lot to learn about Proxmox/KVM/QEMU, so please bear with me. After searching all kinds of wikis, forums and mailing lists, I still can't tell whether the following advanced configuration doesn't work because it's nonsense or because I'm simply doing it the wrong way.

So here's the story:

My hardware setup is a dual socket Xeon E5620 (4 cores / 8 threads per socket) server with 48GB RAM, running 3 VMs: two smaller ones with 4GB RAM and 4 cores each, and one larger one with 32GB RAM and 4 cores.

My (slightly limited) knowledge tells me it's a good idea to pin the two smaller VMs to the 1st CPU and the bigger VM to the 2nd CPU. But this only works if more than 32GB of RAM is available to the 2nd CPU; otherwise the VM's 32GB of virtual RAM has to span two NUMA nodes. To avoid that, I equipped this dual socket server with 12+36GB RAM instead of the traditional 24+24GB setup.
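For context, here's roughly what I'm aiming for in the VM config, if I understand the NUMA options correctly (VMID 100 is just a placeholder, and I'm not 100% sure about the exact syntax):

Code:
# /etc/pve/qemu-server/100.conf -- VMID 100 is only an example
# Expose one virtual NUMA node to the guest and bind it to host
# node 1 (the socket with the 36GB pool).
numa: 1
numa0: cpus=0-3,hostnodes=1,memory=32768,policy=bind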

My problem now is that the output of "numactl -H" always looks like this, no matter which numactl or taskset commands I use.

Code:
root@pve1:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 8 9 10 11
node 0 size: 12005 MB
node 0 free: 5802 MB
node 1 cpus: 4 5 6 7 12 13 14 15
node 1 size: 36286 MB
node 1 free: 9318 MB
node distances:
node   0   1
  0:  10  15
  1:  15  10

I don't understand why the one big VM isn't consuming all of its 32GB RAM on NUMA node 1. If it did, I guess I could solve another problem more easily: keeping all of its vCPU threads computing on NUMA node 1 instead of letting them hop between node 0 and node 1.
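By the way, if I understand the tooling correctly, per-process numastat output should show where the VM's memory actually lands. Something like this (the PID file path is what I assume Proxmox uses; 100 is again a placeholder VMID):

Code:
# Show per-NUMA-node memory usage of the QEMU process behind VMID 100
# (adjust the pid file path if your host stores it elsewhere).
numastat -p $(cat /var/run/qemu-server/100.pid)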

I hope that you guys can help me clarify this. The topic has been bothering me for days.
 
hello ramaza,
I am sort of a virtualization newbie, well, not truly, but not a guru either, and I do not understand why you are doing what you are doing here.

unless these are high-demand VMs, why not just let the host OS manage the resource distribution as needed?

also, you may need more than 36GB on the second CPU, since the host OS may take some memory for itself, and hence your NUMA allocation overflows to the second pool. i.e. you have a VM taking up 32GB of RAM on a single CPU socket, but the OS may also want to grab memory from the bigger RAM pool (36GB).

this is only a guess, as I am not overly fluent in how resources are managed in this setup.
 
You might be right that the difference between 32GB and 36GB is just too small. The kernel/scheduler might not know that this VM process won't spawn any more threads and will never allocate memory beyond 32GB. I will try to verify this in more detail.

The reason for this exercise is that the bigger VM could use some performance tuning. Or let's say I don't believe that it can't run better than it currently does. To verify my assumption, I would like to finish this optimization and compare the results against the status quo.

And there's another reason: I tend to learn the most about how stuff works by aiming high, at the risk of not succeeding. In the worst case I've learned a lot and can account for the time as education/training.
 
I got that, but to my understanding you can pin CPUs to a VM.
I did not see anywhere, though, that you can actually pin RAM to a VM, or force the system to use the RAM from one socket exclusively. It seems to me that the RAM is simply dumped into a single pool on the host and distributed from there.
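I vaguely remember that plain numactl can bind memory as well as CPUs when starting a process, something like the line below, but I have no idea whether Proxmox lets you hook that into its managed VMs:

Code:
# Generic numactl usage (not Proxmox-specific): run a command with
# both its CPUs and all of its memory allocations bound to node 1.
numactl --cpunodebind=1 --membind=1 <command>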
 
