pvestatd.pm/rebalance_lxc_containers - NUMA awareness?

bindi

Member
Jul 5, 2021
Hey,

Is it possible to make the rebalance_lxc_containers function NUMA-aware? Currently it can assign LXCs across CCDs, which is not optimal. I have a Zen 3 processor with two CCDs (NPS2 enabled in the BIOS), so the OS is aware of the topology:
Code:
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
 
I also use a "fake NUMA" per L3 cache (or CCD). I tend to give each container unlimited cores (but limit it via Advanced > CPU limit) because I don't want Proxmox to pin the container to certain cores(/SMT) as I run VMs as well.
 
I'm also running with unlimited cores, but I don't think it works properly: some programs spawn as many threads (or processes?) as there are CPUs, so they end up spread across both CCDs. And even if a program uses fewer threads, rebalance_lxc_containers has still pinned the container to all CPUs. Or will the Linux scheduler keep them on a single CCD anyway? Not sure, really :).
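One way to separate "allowed on all CPUs" from "actually running on both CCDs" is to look at the affinity mask rather than at where a thread last ran. A minimal sketch, assuming the two-CCD layout from the first post; `CCD_CPUS` and `ccds_allowed` are hypothetical names used for illustration, and `os.sched_getaffinity` is only available on Linux:

```python
import os

# CCD layout copied from the NUMA output in the first post (assumption: two CCDs).
CCD_CPUS = {
    "CCD0": set(range(0, 8)) | set(range(16, 24)),
    "CCD1": set(range(8, 16)) | set(range(24, 32)),
}

def ccds_allowed(allowed_cpus):
    """Return the sorted names of the CCDs that a set of allowed CPUs touches."""
    allowed = set(allowed_cpus)
    return sorted(name for name, cpus in CCD_CPUS.items() if cpus & allowed)

if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # 0 means "the calling process"; pass a real PID to inspect a container process.
    allowed = os.sched_getaffinity(0)
    print(f"allowed CPUs: {sorted(allowed)} -> {ccds_allowed(allowed)}")
```

If this reports both CCDs for a container process, its threads are free to migrate, and which CCD they end up on is the scheduler's decision.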
 
I asked ChatGPT for help with some Python, and it seems that some processes do indeed have threads on both CCDs:

Code:
Process 'pve-lxc-syscall' (PID 5813) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  5813  14  CCD1 pve-lxc-syscall
  5820  19  CCD0 pve-lxc-syscall
  5821   2  CCD0 pve-lxc-syscall
  5822  24  CCD1 pve-lxc-syscall
  5823  22  CCD0 pve-lxc-syscall
Process 'lxcfs' (PID 5828) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  5828  25  CCD1 lxcfs
  5855  12  CCD1 lxcfs
  5857  12  CCD1 lxcfs
  8117  14  CCD1 lxcfs
 21894  31  CCD1 lxcfs
 21909  28  CCD1 lxcfs
 21918  31  CCD1 lxcfs
 21929   4  CCD0 lxcfs
 21941  31  CCD1 lxcfs
 21945  14  CCD1 lxcfs
 21947   0  CCD0 lxcfs
Process 'rrdcached' (PID 6613) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  6613   2  CCD0 rrdcached
  6614  21  CCD0 rrdcached
  6695  13  CCD1 rrdcached
  6696   0  CCD0 rrdcached
  6697  24  CCD1 rrdcached
  6699  14  CCD1 rrdcached
  6700   1  CCD0 rrdcached
  8358  10  CCD1 rrdcached
 16074   7  CCD0 rrdcached
532365   4  CCD0 rrdcached
2441049  17  CCD0 rrdcached
2448802   6  CCD0 rrdcached
3684331   2  CCD0 rrdcached
Process 'rsyslogd' (PID 7800) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  7800   5  CCD0 rsyslogd
  7808  28  CCD1 rsyslogd
Process 'mariadbd' (PID 8115) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  8115  29  CCD1 mariadbd
  8129  13  CCD1 mariadbd
  8137  10  CCD1 mariadbd
  8138  29  CCD1 mariadbd
  8139  29  CCD1 mariadbd
  8149  19  CCD0 mariadbd
  8150  15  CCD1 mariadbd
3963609  15  CCD1 mariadbd
Process 'dhclient' (PID 8308) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  8308  19  CCD0 dhclient
  8309  10  CCD1 dhclient
  8310  12  CCD1 dhclient
  8311  12  CCD1 dhclient
Process 'squid' (PID 8699) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  8699   4  CCD0 squid
375834  25  CCD1 squid
375835  12  CCD1 squid
375836  12  CCD1 squid
375837  31  CCD1 squid
375838  13  CCD1 squid
375839  13  CCD1 squid
375840  13  CCD1 squid
375841  12  CCD1 squid
375842  24  CCD1 squid
375843  19  CCD0 squid
375844  14  CCD1 squid
375845  25  CCD1 squid
375846  31  CCD1 squid
375847  25  CCD1 squid
375848  29  CCD1 squid
375849  13  CCD1 squid
Process 'dhclient' (PID 8980) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  8980   1  CCD0 dhclient
  8985  30  CCD1 dhclient
  8986  26  CCD1 dhclient
  8987  10  CCD1 dhclient
Process 'unattended-upgr' (PID 9261) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  9261   1  CCD0 unattended-upgr
  9395  29  CCD1 unattended-upgr
Process 'java' (PID 9342) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  9342  19  CCD0 java
  9384  18  CCD0 java
  9413   5  CCD0 java
  9427   2  CCD0 java
  9430   0  CCD0 java
  9598   0  CCD0 java
  9674   7  CCD0 java
  9675   2  CCD0 java
  9680   4  CCD0 java
  9681  20  CCD0 java
  9684  18  CCD0 java
  9685  23  CCD0 java
  9686   7  CCD0 java
  9687   1  CCD0 java
  9689  20  CCD0 java
  9690  29  CCD1 java
  9794   1  CCD0 java
 10372   1  CCD0 java
 10387   4  CCD0 java
 10831   3  CCD0 java
 10832   0  CCD0 java
 10835  15  CCD1 java
 10857   0  CCD0 java
 11211   4  CCD0 java
 11449   1  CCD0 java
 11451   2  CCD0 java
 11455   0  CCD0 java
 11456  16  CCD0 java
 11457   7  CCD0 java
 11458  22  CCD0 java
 11459   1  CCD0 java
 11460   2  CCD0 java
 11461  18  CCD0 java
 11462   2  CCD0 java
 11463   1  CCD0 java
 11464  18  CCD0 java
 11465  17  CCD0 java
 11466  21  CCD0 java
 11832   6  CCD0 java
 11836   5  CCD0 java
 11837   4  CCD0 java
 11840   6  CCD0 java
 11841   0  CCD0 java
 11842  17  CCD0 java
 11843  16  CCD0 java
 11844   3  CCD0 java
 11846  18  CCD0 java
 11847   5  CCD0 java
383450  16  CCD0 java
401736   2  CCD0 java
1214868  16  CCD0 java
1316114  26  CCD1 java
1614645   1  CCD0 java
Process 'dhclient' (PID 9840) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  9840  29  CCD1 dhclient
  9842  10  CCD1 dhclient
  9843  12  CCD1 dhclient
  9844   4  CCD0 dhclient
Process 'rsyslogd' (PID 9856) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  9856   1  CCD0 rsyslogd
  9869  29  CCD1 rsyslogd
Process 'python3' (PID 9898) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
  9898  24  CCD1 python3
  9903   0  CCD0 python3
  9904   4  CCD0 python3
Process 'unattended-upgr' (PID 10238) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 10238  29  CCD1 unattended-upgr
 10409  19  CCD0 unattended-upgr
Process 'dhclient' (PID 10391) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 10391   1  CCD0 dhclient
 10394  27  CCD1 dhclient
 10395   8  CCD1 dhclient
 10396  10  CCD1 dhclient
Process 'influxd' (PID 10573) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 10573   1  CCD0 influxd
 10627  13  CCD1 influxd
 10632  29  CCD1 influxd
 10634  19  CCD0 influxd
 10636   3  CCD0 influxd
 10637  29  CCD1 influxd
 10648  12  CCD1 influxd
 10731   4  CCD0 influxd
 10747  19  CCD0 influxd
 10748  29  CCD1 influxd
 10749   2  CCD0 influxd
 10750   0  CCD0 influxd
 10751  21  CCD0 influxd
 10752   1  CCD0 influxd
 10753  31  CCD1 influxd
 10762  29  CCD1 influxd
 10828  19  CCD0 influxd
 10829  12  CCD1 influxd
 10830   2  CCD0 influxd
Process 'mongod' (PID 10584) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 10584  29  CCD1 mongod
 10591  19  CCD0 mongod
 10592   3  CCD0 mongod
 10732  15  CCD1 mongod
 10733   4  CCD0 mongod
 10734  15  CCD1 mongod
 10735  10  CCD1 mongod
 10736  31  CCD1 mongod
 10737  30  CCD1 mongod
 10738  30  CCD1 mongod
 10739   1  CCD0 mongod
 10740  24  CCD1 mongod
 10741  31  CCD1 mongod
 10742  30  CCD1 mongod
 10743  12  CCD1 mongod
 10744  15  CCD1 mongod
 10754   8  CCD1 mongod
 10755  10  CCD1 mongod
 10756  30  CCD1 mongod
 10757  11  CCD1 mongod
 10758  29  CCD1 mongod
 10759  14  CCD1 mongod
 10760  10  CCD1 mongod
 10761  16  CCD0 mongod
 10833   2  CCD0 mongod
 10834  20  CCD0 mongod
3094816   7  CCD0 mongod
Process 'dhclient' (PID 11422) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 11422  15  CCD1 dhclient
 11423  22  CCD0 dhclient
 11424  21  CCD0 dhclient
 11425  12  CCD1 dhclient
Process 'grafana' (PID 11513) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 11513  19  CCD0 grafana
 11580  16  CCD0 grafana
 11581   4  CCD0 grafana
 11582   7  CCD0 grafana
 11583   1  CCD0 grafana
 11584  19  CCD0 grafana
 11632  17  CCD0 grafana
 11637  15  CCD1 grafana
 11835   4  CCD0 grafana
 30290   6  CCD0 grafana
 30799  20  CCD0 grafana
 96062   4  CCD0 grafana
1328508   4  CCD0 grafana
3503496  18  CCD0 grafana
Process 'unattended-upgr' (PID 11516) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 11516   1  CCD0 unattended-upgr
 11598  29  CCD1 unattended-upgr
Process 'unattended-upgr' (PID 11928) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 11928  29  CCD1 unattended-upgr
 12019  19  CCD0 unattended-upgr
Process 'node' (PID 11930) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 11930  14  CCD1 node
 11986  19  CCD0 node
 11987  15  CCD1 node
 11988  10  CCD1 node
 11989  12  CCD1 node
 11990  15  CCD1 node
 12004  29  CCD1 node
 12230  15  CCD1 node
 12231  30  CCD1 node
 12232  11  CCD1 node
 12233  13  CCD1 node
Process 'unattended-upgr' (PID 12319) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 12319  19  CCD0 unattended-upgr
 12392  29  CCD1 unattended-upgr
Process 'java' (PID 12661) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 12661   1  CCD0 java
 12676  19  CCD0 java
 12702   0  CCD0 java
 12733   0  CCD0 java
 12736  15  CCD1 java
 12750  15  CCD1 java
 13421   0  CCD0 java
 13422   0  CCD0 java
 13444   0  CCD0 java
1224713   0  CCD0 java
3901494   0  CCD0 java
3901495   0  CCD0 java
3901654   0  CCD0 java
3901656   0  CCD0 java
3901817   0  CCD0 java
3901818   0  CCD0 java
3901823   0  CCD0 java
3337431   0  CCD0 java
3337441   0  CCD0 java
3337442   0  CCD0 java
Process 'transmission-da' (PID 18255) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 18255  17  CCD0 transmission-da
 18270  14  CCD1 transmission-da
 18271  22  CCD0 transmission-da
 21880  16  CCD0 transmission-da
Process 'Radarr' (PID 19048) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 19048  26  CCD1 Radarr
 19059  29  CCD1 Radarr
 19077   0  CCD0 Radarr
 19079   0  CCD0 Radarr
 19080  11  CCD1 Radarr
 19146  12  CCD1 Radarr
 19411  15  CCD1 Radarr
 19412  10  CCD1 Radarr
 19413  14  CCD1 Radarr
Process 'dhclient' (PID 19335) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 19335   0  CCD0 dhclient
 19336  10  CCD1 dhclient
 19337  29  CCD1 dhclient
 19338  14  CCD1 dhclient
Process 'rsyslogd' (PID 20006) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 20006  24  CCD1 rsyslogd
 20011   6  CCD0 rsyslogd
Process 'qbittorrent-nox' (PID 20867) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 20867  20  CCD0 qbittorrent-nox
 20907   0  CCD0 qbittorrent-nox
 20911   8  CCD1 qbittorrent-nox
 20922  19  CCD0 qbittorrent-nox
 20923   1  CCD0 qbittorrent-nox
 20925  16  CCD0 qbittorrent-nox
 21272  12  CCD1 qbittorrent-nox
 21273  14  CCD1 qbittorrent-nox
 21274  14  CCD1 qbittorrent-nox
 21276   4  CCD0 qbittorrent-nox
3880953   7  CCD0 qbittorrent-nox
3880955  22  CCD0 qbittorrent-nox
3883932  20  CCD0 qbittorrent-nox
3884632  16  CCD0 qbittorrent-nox
3892748   2  CCD0 qbittorrent-nox
3892749   0  CCD0 qbittorrent-nox
3899313   5  CCD0 qbittorrent-nox
3899927   4  CCD0 qbittorrent-nox
3899928  21  CCD0 qbittorrent-nox
3902567  23  CCD0 qbittorrent-nox
Process 'qbittorrent-nox' (PID 21280) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 21280  17  CCD0 qbittorrent-nox
 21303   0  CCD0 qbittorrent-nox
 21308  13  CCD1 qbittorrent-nox
 21335   4  CCD0 qbittorrent-nox
 21336   1  CCD0 qbittorrent-nox
 21338  31  CCD1 qbittorrent-nox
 21356  11  CCD1 qbittorrent-nox
 21357  24  CCD1 qbittorrent-nox
 21358  24  CCD1 qbittorrent-nox
 21451   7  CCD0 qbittorrent-nox
Process 'kvm' (PID 27191) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
 27191   7  CCD0 kvm
 27192  14  CCD1 kvm
 27213  17  CCD0 kvm
 27220  13  CCD1 kvm
 27222  19  CCD0 kvm
 27223  14  CCD1 kvm
3945737  21  CCD0 kvm
Process 'polkitd' (PID 405870) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
405870   3  CCD0 polkitd
405871   2  CCD0 polkitd
405873  12  CCD1 polkitd
Process 'python3' (PID 1283071) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
1283071  15  CCD1 python3
1288243   2  CCD0 python3
1288244   0  CCD0 python3
1288245  15  CCD1 python3
1288246  29  CCD1 python3
1288276   1  CCD0 python3
1288277   4  CCD0 python3
1288278   4  CCD0 python3
1288279  14  CCD1 python3
1288282  16  CCD0 python3
1288283  10  CCD1 python3
1288284   0  CCD0 python3
1288285   4  CCD0 python3
1288286  16  CCD0 python3
1288287  16  CCD0 python3
1288288   6  CCD0 python3
1288305   4  CCD0 python3
1288881   2  CCD0 python3
1291384   6  CCD0 python3
1744488   0  CCD0 python3
2128223  25  CCD1 python3
2128464  27  CCD1 python3
3041530  16  CCD0 python3
3606910  10  CCD1 python3
Process 'Sonarr' (PID 2474577) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
2474577  11  CCD1 Sonarr
2474578   9  CCD1 Sonarr
2474580  10  CCD1 Sonarr
2474581  13  CCD1 Sonarr
2474582  12  CCD1 Sonarr
2474589  10  CCD1 Sonarr
2474606  15  CCD1 Sonarr
2474607  14  CCD1 Sonarr
2474608  19  CCD0 Sonarr
Process 'kvm' (PID 3661027) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
3661027  26  CCD1 kvm
3661028  13  CCD1 kvm
3661052  21  CCD0 kvm
3661243  12  CCD1 kvm
3661248   6  CCD0 kvm
3687463  24  CCD1 kvm
Process 'apache2' (PID 3982649) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
3982649  25  CCD1 apache2
3982652  14  CCD1 apache2
3982654  10  CCD1 apache2
3982656  12  CCD1 apache2
3982658  15  CCD1 apache2
3982660  11  CCD1 apache2
3982662   6  CCD0 apache2
3982664  25  CCD1 apache2
3982667  10  CCD1 apache2
3982668  23  CCD0 apache2
3982671  29  CCD1 apache2
3982673  12  CCD1 apache2
3982676  25  CCD1 apache2
3982678  14  CCD1 apache2
3982680  15  CCD1 apache2
3982683  25  CCD1 apache2
3982685  15  CCD1 apache2
3982687  10  CCD1 apache2
3982689  12  CCD1 apache2
3982692  24  CCD1 apache2
3982694  24  CCD1 apache2
3982696  24  CCD1 apache2
3982698  24  CCD1 apache2
3982700  14  CCD1 apache2
3982702  24  CCD1 apache2
3982703  25  CCD1 apache2
3982704  14  CCD1 apache2
Process 'apache2' (PID 3982650) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
3982650   5  CCD0 apache2
3982655  14  CCD1 apache2
3982657   1  CCD0 apache2
3982659  19  CCD0 apache2
3982661  15  CCD1 apache2
3982663  11  CCD1 apache2
3982665  23  CCD0 apache2
3982666  11  CCD1 apache2
3982669   1  CCD0 apache2
3982670  10  CCD1 apache2
3982672   3  CCD0 apache2
3982674   1  CCD0 apache2
3982675  12  CCD1 apache2
3982677   6  CCD0 apache2
3982679   1  CCD0 apache2
3982681   5  CCD0 apache2
3982682  14  CCD1 apache2
3982684  25  CCD1 apache2
3982686   1  CCD0 apache2
3982688   6  CCD0 apache2
3982690  15  CCD1 apache2
3982691  23  CCD0 apache2
3982693   1  CCD0 apache2
3982695  15  CCD1 apache2
3982697  15  CCD1 apache2
3982699   1  CCD0 apache2
3982701   6  CCD0 apache2
Process 'pvefw-logger' (PID 3982878) has threads active on both CCDs:

   TID CPU   CCD Process
-----------------------------------
3982878  14  CCD1 pvefw-logger
3982880   3  CCD0 pvefw-logger
 
Couldn't fit into one post, here's the code if anyone is interested:
Python:
#!/usr/bin/env python3
import os
import sys

# ---- CCD mapping, taken from the NUMA node layout above ----
CCD0_CPUS = set([0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23])
CCD1_CPUS = set([8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31])

def get_ccd(cpu_id):
    if cpu_id in CCD0_CPUS:
        return "CCD0"
    elif cpu_id in CCD1_CPUS:
        return "CCD1"
    else:
        return "Unknown"

def get_process_name(pid):
    try:
        with open(f"/proc/{pid}/comm") as f:
            return f.read().strip()
    except Exception:
        return "Unknown"

def main():
    if len(sys.argv) != 2:
        print(f"Usage: {sys.argv[0]} <PID>")
        sys.exit(1)
    pid = sys.argv[1]

    task_dir = f"/proc/{pid}/task"
    if not os.path.exists(task_dir):
        #print(f"PID {pid} does not exist.")
        sys.exit(1)

    process_name = get_process_name(pid)
    threads_info = []
    ccds_present = set()

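    for tid in os.listdir(task_dir):
        stat_file = f"{task_dir}/{tid}/stat"
        try:
            with open(stat_file) as f:
                # comm (field 2) may contain spaces, so split after its closing ')'
                stat = f.read().rsplit(")", 1)[1].split()
                # stat[0] is now field 3 (state), so field 39 (processor) is stat[36]:
                # the CPU the thread last ran on, not its allowed affinity
                cpu_id = int(stat[36])
                ccd = get_ccd(cpu_id)
                threads_info.append((tid, cpu_id, ccd))
                ccds_present.add(ccd)
        except (OSError, IndexError, ValueError):
            continue  # thread exited or its stat line could not be parsed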

    # Only print if threads exist on both CCDs
    if "CCD0" in ccds_present and "CCD1" in ccds_present:
        print(f"Process '{process_name}' (PID {pid}) has threads active on both CCDs:\n")
        print(f"{'TID':>6} {'CPU':>3} {'CCD':>5} {'Process'}")
        print("-"*35)
        for tid, cpu, ccd in threads_info:
            print(f"{tid:>6} {cpu:>3} {ccd:>5} {process_name}")

if __name__ == "__main__":
    main()

the command I used:
Code:
ps aux | grep -v grep | awk '{print $2}' | xargs -n 1 python3 pid.py
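The hardcoded CPU sets in the script only fit this one machine. A minimal sketch of deriving the mapping from sysfs instead, assuming the kernel exposes NUMA nodes under `/sys/devices/system/node` (with NPS2 each node corresponds to one CCD here); `parse_cpulist` and `numa_node_cpus` are hypothetical helper names:

```python
from pathlib import Path

def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-7,16-23' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist (node without CPUs) yields an empty set
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus

def numa_node_cpus(base="/sys/devices/system/node"):
    """Map node name -> set of CPU ids; returns {} if the directory is missing."""
    nodes = {}
    for entry in sorted(Path(base).glob("node[0-9]*")):
        cpulist = entry / "cpulist"
        if cpulist.is_file():
            nodes[entry.name] = parse_cpulist(cpulist.read_text())
    return nodes

if __name__ == "__main__":
    for name, cpus in numa_node_cpus().items():
        print(name, sorted(cpus))
```

On the machine from the first post this should reproduce the two node lines shown there, and it would also cover systems with four or more nodes without editing the script.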

What's interesting is that I'm seeing kvm across CCDs as well, even though I've enabled "NUMA aware" in the VM settings. Unless the code is not analyzing things right?
 
Is it possible to make the rebalance_lxc_containers function NUMA-aware? Currently it can assign LXCs across CCDs, which is not optimal.
It's not clear that scheduling over multiple CCDs is actually suboptimal. Spreading heat over different CCDs allows higher clock speeds for each thread. The Linux scheduler is NUMA-aware and getting better at it. As long as memory is allocated per CCD (or per L3 cache in the case of Ryzen/EPYC) and work is scheduled accordingly, it might be fine or even optimal, since the containers share the kernel.
The question remains: is Proxmox container thread pinning better than the Linux scheduler (with NUMA information and container NUMA-aware memory allocation) or not, when also running VMs? Your AI did not really give an answer to that. My Linux containers (PBS, PDM, MythTV) don't have enough of a workload to answer this question either.
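One way to test this empirically would be to pin a busy process to a single CCD, benchmark it, and compare against the unpinned run. A minimal sketch, assuming Linux and sufficient privileges; `pin_process` is a hypothetical helper, not something Proxmox provides. It walks `/proc/<pid>/task` because `os.sched_setaffinity` only changes the one thread whose ID it is given. A later cpuset change on the container may override the mask, which is not verified here.

```python
import os

def pin_process(pid, wanted_cpus):
    """Pin every thread of `pid` to the wanted CPUs it is already allowed on.

    Returns {tid: applied CPU set}. The mask is only ever narrowed, and
    threads that exit while we iterate are skipped.
    """
    applied = {}
    for name in os.listdir(f"/proc/{pid}/task"):
        tid = int(name)
        try:
            target = set(wanted_cpus) & os.sched_getaffinity(tid)
            if not target:
                continue  # never hand the kernel an empty mask
            os.sched_setaffinity(tid, target)
            applied[tid] = os.sched_getaffinity(tid)
        except ProcessLookupError:
            continue  # thread ended while we were iterating
    return applied
```

With the layout from the first post, `pin_process(pid, set(range(0, 8)) | set(range(16, 24)))` would confine a process to node 0. Threads created afterwards inherit the mask of the thread that spawned them.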
 
[Attachment: CC5950X.png]
I'm just min-maxing out of boredom :-) But I think it would be a nice feature to have, letting users choose their preference, especially on Threadrippers/EPYCs with 4 CCDs or even dual-CPU machines.