NUMA questions again

cwiggs

I've read the NUMA wiki here: https://pve.proxmox.com/wiki/NUMA and the admin guide regarding NUMA here: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_virtual_machines_settings, but I still have some questions.

* Is NUMA only useful if your PVE *host* has more than 1 physical CPU?
* The documentation says that if you enable NUMA, you should set a VM with NUMA enabled to have the same number of vCPUs as the host has CPUs. That seems problematic if I don't want a VM to use all the CPU resources of the host. E.g. if I have 4 physical CPUs and give the VM 4 vCPUs, and the VM uses 100% of its CPU, it could negatively affect the host and/or other VMs.
* The documentation says that NUMA is required for hot-pluggable CPU and memory. If your hardware doesn't support NUMA, does that mean you cannot have hot-pluggable CPU and memory, or should you still enable NUMA to get hot-pluggable CPU and memory, but not get the other benefits of NUMA?

Thank you.
 
* Is NUMA only useful if your PVE *host* has more than 1 physical CPU?
You might want to use NUMA even with just a single socket if you are running a multi-chiplet CPU like a Ryzen 5900X.
 
It says that it is recommended to set the number of sockets for the VM to the number of NUMA nodes (hardware CPU sockets) you have.
This is not the same as vCPUs.
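As a rough sketch of what that means in practice (VMID 100 is just a placeholder), on a 2-socket host that could be:

Code:
# Match the VM topology to a 2-socket host: 2 virtual sockets,
# 4 cores per socket, with the NUMA option enabled
qm set 100 --numa 1 --sockets 2 --cores 4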

Ah, you are right, I misunderstood. So if the host has 2 physical CPUs, I should enable NUMA on the VM and give it 2 sockets as well, but the number of vCPUs can be anything I want?

You might want to use NUMA even with just a single socket in case you are running a multi chiplet CPU like a Ryzen 5900X.
Ah interesting. Does numactl show more than 1 "node" on a 5900x platform then?
 
Ah interesting. Does numactl show more than 1 "node" on a 5900x platform then?
Usually it's seen as 1 node, but I (5950X) still have to experiment with the BIOS setting that exposes both CCDs as separate nodes. The memory latency is the same but the L3 cache is per CCD. The Linux scheduler gets better about this with newer kernel versions, so I don't know if it's worth the trouble. Also, one CCD is typically from a fast bin and the other from a slow bin, so pinning cores might not be the best option overall. Someone please correct me if I'm wrong about any of this.
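To check what the kernel currently reports on a given box (assuming the numactl package is installed), something like this is enough:

Code:
# Number of NUMA nodes and which CPUs/memory belong to each
lscpu | grep -i numa
numactl --hardware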
 
The only question I don't think has been answered yet is:

* The documentation says that NUMA is required for hot-pluggable CPU and memory. If your hardware doesn't support NUMA, does that mean you cannot have hot-pluggable CPU and memory, or should you still enable NUMA to get hot-pluggable CPU and memory, but not get the other benefits of NUMA?

Anyone know about this? I don't have multiple CPU sockets, but I've been enabling NUMA and I'm not sure whether I should or not.
 
It's fine. There are no issues with enabling the VM NUMA option on a single socket (or other UMA system).
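If you want the hotplug part as well, a minimal sketch could look like this (VMID 100 is just a placeholder; memory/CPU hotplug also needs support inside the guest):

Code:
# Enable the NUMA option (required for memory/CPU hotplug)
qm set 100 --numa 1
# Allow hot-plugging memory and CPUs in addition to the defaults
qm set 100 --hotplug disk,network,usb,memory,cpu
# 1 socket with 4 cores maximum, start the VM with 2 hot-pluggable vCPUs
qm set 100 --sockets 1 --cores 4 --vcpus 2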
 
I have another question.

Let's say I give a VM 4 vCPUs and the host has 4 NUMA domains, each with 16 cores.
Will Proxmox then favor/pin the 4 vCPU threads to one NUMA domain of the host?

If yes, that would be amazing, because the VM can be old/outdated etc. and the multithreaded tasks inside the VM don't need to care about NUMA. Basically, no NUMA support inside the VM would be required at all.

If no, what is the point of NUMA? Say I enable NUMA for that VM and set 4 sockets with 1 core each, as the Proxmox documentation suggests (instead of 1 socket with 4 vCPUs).
The cores inside the VM then won't have shared access to an L3 cache (or only an extremely slow one), so the CPUs have to go through memory for everything, which will be much slower...
At least from my point of view, that does exactly the opposite and slows everything down.

Maybe someone can explain that to me better, for the case of a Milan/Rome/Genoa CPU with 4 chiplets per socket.
Cheers
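For reference, the explicit NUMA settings from the admin guide look roughly like this in the VM config (a sketch only; VMID 100 and the sizes are placeholders, and hostnodes/policy only control where the guest node's memory is allocated, not where the vCPU threads get scheduled):

Code:
# /etc/pve/qemu-server/100.conf (excerpt)
# Guest sees one NUMA node with 4 vCPUs and 8 GiB, allocated on host node 0
sockets: 1
cores: 4
memory: 8192
numa: 1
numa0: cpus=0-3,memory=8192,hostnodes=0,policy=bind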
 
For reference, the host topology (numactl --hardware):

Code:
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
node 0 size: 193205 MB
node 0 free: 133719 MB
node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
node 1 size: 193530 MB
node 1 free: 184164 MB
node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
node 2 size: 193489 MB
node 2 free: 181114 MB
node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
node 3 size: 193487 MB
node 3 free: 174892 MB
node distances:
node   0   1   2   3
  0:  10  12  12  12
  1:  12  10  12  12
  2:  12  12  10  12
  3:  12  12  12  10

Well, after a bit of testing, I can only say that NUMA works perfectly with LXC containers.
And it doesn't work at all with VMs, no matter whether NUMA is enabled or disabled in the VM settings.

Proxmox or KVM will simply take random cores from any NUMA node and won't try to take the available cores for a VM from a single NUMA node.
So the whole concept seems pretty pointless. Just passing NUMA capabilities to a VM, so that the VM knows which node a core belongs to, is useless for almost any application.
Except for the few that can handle NUMA, and even those will run a lot slower than if they were all on the same NUMA node.

LXC containers on the other hand, for whatever reason, at least the Ubuntu 23.04/23.10/24.04 based ones (I only tested those), run for example "stress --cpu 4/6/8" always exactly on one NUMA node, which is pretty nice.
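In case anyone wants to reproduce that, this is roughly how the placement can be checked from the host (CTID 101 is just an example, and stress has to be installed in the container):

Code:
# In one shell: start 4 busy threads inside the container
pct exec 101 -- stress --cpu 4
# In a second shell on the host: PSR shows which core each thread runs on
ps -eLo pid,psr,comm | grep stress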

I'm just wondering why no one complains...
I definitely have to dig further into how I can accomplish that dynamically, without pinning CPU cores.
Pinning CPU cores is not really feasible with something like 30 VMs on a cluster, especially if you migrate them around. Even if you balance those 4 NUMA nodes out perfectly, after some migrations node 1 will sit at 100% CPU utilisation and node 4 will idle at 0%...
So there must be a dynamic way somehow for Proxmox to assign the cores of one VM to the same NUMA node.
The kernel can't do that on its own, since the kernel doesn't know which tasks belong to which VM (vCPUs).

Cheers
 
Check my thread.
https://forum.proxmox.com/threads/iperf3-speed-same-node-vs-2-nodes-found-a-bug.146805

Especially my last 2 posts.

iperf without NUMA = 14 GB/s
iperf with NUMA = 40 GB/s

That is anything but "negligible".
Right now I have no other option than CPU pinning, but hopefully at some point Proxmox will handle NUMA placement itself.
Cheers

PS: I can even split per L3 cache, so I get 8 NUMA nodes (one for every L3 cache) instead of 4 (one for every CCD).
I tested that as well; it gives even more performance, but the difference is not worth the headache of 8 nodes, since every NUMA node then only has 8 cores to work with, and that is definitely too few. 4 NUMA nodes give at least 16 cores each, a good balance to work with.
The CPU pinning you have to do for 8 NUMA nodes is far too much. 4 is at least somewhat easier to manage.

If Proxmox could dynamically assign the cores of one VM to the same NUMA node, I would definitely go with 8 NUMA nodes for even more performance.
And the performance difference when a multithreaded application inside your VM runs on one NUMA node versus spread across different CCDs is insane, in most cases 2-3x the speed. If the multithreaded app shares data between the cores (any multithreaded compression, for example), the gain is even bigger.
Cheers
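For reference, the manual pinning mentioned above can be done per VM with the affinity option (assuming PVE 7.3 or later; VMID 100 and the node 0 core list from the numactl output above are just examples):

Code:
# Pin all threads of VM 100 to the cores of host NUMA node 0
qm set 100 --affinity 0-7,32-39
# Note: this only constrains the threads; memory is not strictly bound,
# although the kernel will usually allocate it on the node the threads run on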
 
