Feature Request: Add NUMA-Aware CPU Core Assignment for LXC Containers in Proxmox VE

AuroraSKy
Feb 16, 2024



Summary


Proxmox currently allows administrators to specify a number of CPU cores for LXC containers (e.g., “5 cores”), but the system does not assign those cores in a NUMA-aware or topology-aware manner. The chosen CPUs are distributed arbitrarily across all available NUMA nodes, leading to performance degradation for many latency-sensitive or memory-locality-sensitive workloads.


This proposal introduces an optional, advanced “NUMA-optimized CPU assignment” mode that ensures LXC containers are assigned logical CPU cores intelligently, grouped by physical NUMA node, improving locality and reducing cross-node latency.




Problem Description


When an LXC container is assigned a specific number of cores (for example: 5, 8, or 10), Proxmox:


  • Allocates exactly that many logical CPUs
  • Distributes them effectively at random across the host’s entire CPU topology
  • Ignores NUMA boundaries
  • Does not group cores by socket
  • Does not pair SMT siblings (hyperthreads)
  • Does not respect memory locality

Example from a real system with 4 NUMA nodes:


Assigning 5 cores results in a CPU selection such as:


1, 46, 7, 30, 68

These cores span multiple NUMA nodes, often all of them.
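
Anyone can verify the scattering on their own host. Here is a minimal sketch (assuming a Linux host with sysfs mounted, and reusing the example CPU list above) that maps each assigned CPU back to its NUMA node:

# Minimal sketch: map logical CPUs to NUMA nodes by reading sysfs (Linux).
import glob, re

def parse_cpulist(s):
    # Expand a kernel cpulist string such as "0-3,8,10-11" into integers.
    cpus = set()
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

def cpu_to_node():
    mapping = {}
    for path in glob.glob("/sys/devices/system/node/node[0-9]*/cpulist"):
        node = int(re.search(r"node(\d+)", path).group(1))
        with open(path) as f:
            for cpu in parse_cpulist(f.read()):
                mapping[cpu] = node
    return mapping

nodes = cpu_to_node()
for cpu in (1, 46, 7, 30, 68):  # the example assignment from above
    print(f"CPU {cpu} -> NUMA node {nodes.get(cpu, '?')}")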


Impact


For real-time or latency-sensitive workloads (e.g., Rust game servers), this behavior causes:


  • High CPU pressure, as reported by PSI (pressure stall information)
  • NUMA thrashing
  • Cross-socket memory access latency
  • Poor tick stability
  • Performance inconsistencies
  • Thread migration penalties
  • Cache invalidation and reduced efficiency

Conversely, when administrators manually pin containers to cores within a single NUMA node and assign matching memory affinity, PSI drops dramatically (e.g., from 10% to nearly 0%), and workload performance becomes stable and predictable.
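
For reference, the PSI figures quoted here can be read straight from the kernel. A quick sketch for sampling them (assuming a kernel built with PSI support, so /proc/pressure/cpu exists):

# Sketch: sample CPU pressure stall information (PSI) from the kernel.
def read_cpu_psi(path="/proc/pressure/cpu"):
    with open(path) as f:
        for line in f:
            kind, *pairs = line.split()   # "some" or "full", then key=value pairs
            fields = dict(kv.split("=") for kv in pairs)
            print(f"{kind}: avg10={fields['avg10']}% avg60={fields['avg60']}%")

read_cpu_psi()
# Per-container pressure is exposed the same way via the container's
# cgroup v2 file, e.g. .../cpu.pressure under /sys/fs/cgroup.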




Why This Matters


NUMA locality is essential for modern systems:


  • Multi-socket Xeon and EPYC hosts
  • High-density container clusters
  • Memory-intensive workloads
  • Game servers
  • Databases
  • Web hosting stacks
  • Virtualized or containerized HPC workloads

As hardware grows more NUMA-complex, topology-aware CPU assignment becomes increasingly important.




⭐ Proposed Solution: Add a NUMA-Optimized CPU Assignment Option in Advanced Settings


UI Proposal (Advanced tab):


[ ] Enable NUMA-optimized CPU assignment


NUMA Node Preference:
[x] Node 0 [ ] Node 1 [ ] Node 2 [ ] Node 3


Core Grouping Strategy:
( ) Contiguous cores
( ) SMT sibling pairs
( ) Auto-select optimal local cores


Memory Affinity:
[x] Bind memory to selected NUMA node(s)
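
The topology data behind the “SMT sibling pairs” strategy is already exposed by the kernel. A rough sketch of how the grouping could be enumerated (Linux sysfs paths; illustrative only, not a proposed implementation):

# Sketch: enumerate SMT sibling groups from sysfs topology files.
import glob

def parse_cpulist(s):
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        elif part:
            cpus.append(int(part))
    return cpus

def smt_sibling_groups():
    groups = set()
    for path in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"):
        with open(path) as f:
            groups.add(tuple(sorted(parse_cpulist(f.read()))))
    return sorted(groups)

for group in smt_sibling_groups():
    print(group)  # e.g. (2, 34): assigning both keeps a physical core intact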

Behavior:


If “NUMA-optimized CPU assignment” is enabled:


  1. Proxmox automatically chooses cores within the specified NUMA node(s).
  2. The selected CPUs are contiguous, topology-aware, and local.
  3. Memory affinity (cpuset.mems) is set automatically to match.
  4. No random scattering across NUMA nodes occurs.
  5. Linux scheduler migrations stay within the NUMA-local core group.

Optional:
If multiple NUMA nodes are selected, Proxmox divides cores proportionally.
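
The proportional rule is not pinned down above; one simple interpretation is an even split with the remainder going to the first nodes selected (a sketch with an illustrative helper, not the proposed algorithm):

# Sketch: divide a requested core count across the selected NUMA nodes.
def split_cores(requested, nodes):
    base, rem = divmod(requested, len(nodes))
    return {node: base + (1 if i < rem else 0)
            for i, node in enumerate(nodes)}

print(split_cores(5, [0, 1]))  # -> {0: 3, 1: 2}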




Implementation Outline


The system would modify the container’s LXC config using:


lxc.cgroup2.cpuset.cpus = X,Y,Z...
lxc.cgroup2.cpuset.mems = N

Proxmox already supports cpuset generation internally but does not expose NUMA grouping.


Implementing this feature only requires:


  • Reading host CPU topology
  • Selecting NUMA-local CPUs
  • Writing cpuset + mems entries

This is a lightweight enhancement with a large performance impact.
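
To make “lightweight” concrete, here is a rough end-to-end sketch of those three steps (illustrative Python, not the Perl that pve-container actually uses; numa_local_assignment is a made-up name):

# Sketch: read topology, pick NUMA-local CPUs, emit the two config entries.
import glob, re

def parse_cpulist(s):
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        elif part:
            cpus.append(int(part))
    return cpus

def node_cpus():
    topo = {}
    for path in glob.glob("/sys/devices/system/node/node[0-9]*/cpulist"):
        node = int(re.search(r"node(\d+)", path).group(1))
        with open(path) as f:
            topo[node] = sorted(parse_cpulist(f.read()))
    return topo

def numa_local_assignment(cores, node):
    local = node_cpus()[node]
    if cores > len(local):
        raise ValueError(f"node {node} has only {len(local)} CPUs")
    picked = local[:cores]  # contiguous and NUMA-local by construction
    print("lxc.cgroup2.cpuset.cpus = " + ",".join(map(str, picked)))
    print(f"lxc.cgroup2.cpuset.mems = {node}")

numa_local_assignment(5, node=0)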




Real-World Example Result


On a 4-socket machine with 4 NUMA nodes:


Before NUMA pinning (using default Proxmox assignment):


  • 10–15% CPU pressure stall (PSI “some”)
  • Rust game server lagging/stuttering
  • High latency spikes

After manual NUMA pinning:


  • 0–1% PSI
  • Completely stable tickrate
  • Significantly lower load
  • Better performance and player experience
  • No cross-socket thrashing

This is a dramatic improvement enabled purely by proper CPU locality.




Why This Should Be Added to Proxmox


  • Easier for users than manual config editing
  • Provides correct behavior on multi-socket systems
  • Brings LXC CPU assignment to parity with VM NUMA features
  • Reduces support burden (users often misinterpret “cores”)
  • Improves performance for a huge class of workloads
  • Makes Proxmox competitive with NUMA-aware orchestrators

This feature would be optional, non-breaking, and highly beneficial.




Conclusion


Proxmox currently assigns LXC CPUs without considering NUMA topology, which hinders performance on modern multi-socket systems. Adding an advanced setting for NUMA-optimized CPU assignment would:


  • Improve performance
  • Reduce scheduler overhead
  • Lower PSI
  • Make workloads more predictable
  • Empower administrators
  • Modernize Proxmox for NUMA-heavy hardware

This proposal is both technically feasible and highly valuable for real users.