NUMA = 1 on single socket systems?

jsterr

Well-Known Member
Jul 24, 2020
Hello, what is the reason for setting NUMA=1 on a VM that runs on a single-socket system? I have seen this on a 3-node full-mesh Ceph setup. Is there any reason why numa=1 makes sense on a single-socket system? I thought this was only relevant for multi-socket machines.

Can this cause issues on the host or in the VM?
 
My experience has been that NUMA doesn't help here and rather costs performance. It only becomes relevant for VMs with significantly more RAM allocated, e.g. 32-64 GB, and usually only when more than two CPUs are installed. However, it requires that you know your topology and that the applications are NUMA-aware as well. NUMA can also be used on single-socket systems; the larger AMD CPUs use it internally between their chiplets. In that respect it could also help with a single socket, but as I said, you have to set it up correctly and the application has to support it for it to really work.

You can check the topology on the host and decide for yourself whether it is potentially relevant or not.
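For example, assuming the numactl package is installed (lscpu is available out of the box), the following shows how many NUMA nodes the host actually exposes:

Code:
# NUMA node count plus CPU and memory layout per node
numactl --hardware
# quick overview
lscpu | grep -i numa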
 
Do you think numa=1 in VMs can cause an unwanted reboot of the hypervisor host?
 
It depends on the NUMA nodes that are available.
If your system has more than one NUMA node, activate it on all your VMs.
"Single socket" says nothing about the NUMA architecture; for example, a single Xeon Gold 5218 presents 2 NUMA nodes in an HP ProLiant DL380 Gen10.
 
Do you think numa=1 in VMs can cause an unwanted reboot of the hypervisor host?
In principle, I don't think this flag is at fault. However, NUMA may have caused memory to be allocated from a region in which an ECC error occurred. Have you checked for that? Do you have any more logs for this period, or can you say more about your hardware/setup?
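A rough sketch of how such a check could look, assuming the EDAC modules are loaded or rasdaemon is installed:

Code:
# corrected-error counters exposed by EDAC
grep . /sys/devices/system/edac/mc/mc*/ce_count 2>/dev/null
# per-DIMM error counts via rasdaemon
ras-mc-ctl --error-count
# kernel messages about memory/machine-check errors
journalctl -k | grep -iE 'edac|mce|hardware error'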
 
If somebody has activated numa=1 on a UMA system (only one NUMA node), chances are that the same person has also played with CPU pinning.
Check that as well.
There is a lot of misunderstanding out there when it comes to NUMA architecture.
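To see whether numa=1 or CPU pinning is actually set on a VM (VM ID 100 is only a placeholder), one could check roughly like this:

Code:
# does the VM config contain NUMA or CPU-affinity settings?
qm config 100 | grep -Ei 'numa|affinity|cores|sockets'
# how many NUMA nodes does the host really have?
numactl --hardware | grep available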
 
In principle, I don't think this flag is at fault. However, NUMA may have caused memory to be allocated from a region in which an ECC error occurred. Have you checked for that? Do you have any more logs for this period, or can you say more about your hardware/setup?

Here are some logs from the customer:
Code:
Nov 23 07:00:39 proxmox03 kernel: libceph: osd11 (1)172.29.13.23:6803 socket closed (con state OPEN)
Nov 23 07:00:57 proxmox03 kernel: libceph: osd11 (1)172.29.13.23:6803 socket closed (con state OPEN)
Nov 23 07:01:08 proxmox03 systemd[1]: Starting Checkmk agent (172.29.3.241:39190)...
Nov 23 07:01:10 proxmox03 systemd[1]: check-mk-agent@65963-172.29.11.23:6556-172.29.3.241:39190.service: Succeeded.
Nov 23 07:01:10 proxmox03 systemd[1]: Finished Checkmk agent (172.29.3.241:39190).
Nov 23 07:01:10 proxmox03 systemd[1]: check-mk-agent@65963-172.29.11.23:6556-172.29.3.241:39190.service: Consumed 1.618s CPU time.
Nov 23 07:01:30 proxmox03 systemd[1]: Starting Checkmk agent (172.29.3.241:44194)...

-- Reboot --

Nov 23 07:03:49 proxmox03 kernel: Linux version 5.15.107-2-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z) ()
Nov 23 07:03:49 proxmox03 kernel: Command line: initrd=\EFI\proxmox\5.15.107-2-pve\initrd.img-5.15.107-2-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs amd_iommu=on iommu=pt
Nov 23 07:03:49 proxmox03 kernel: KERNEL supported cpus:
Nov 23 07:03:49 proxmox03 kernel:   Intel GenuineIntel
Nov 23 07:03:49 proxmox03 kernel:   AMD AuthenticAMD
Nov 23 07:03:49 proxmox03 kernel:   Hygon HygonGenuine
Nov 23 07:03:49 proxmox03 kernel:   Centaur CentaurHauls
Nov 23 07:03:49 proxmox03 kernel:   zhaoxin   Shanghai

I checked for corosync errors (it could have been fencing) and grepped the log for other "Errors"/"errors". The server rebooted at 07:01:30; before that there are only log entries from the Checkmk agent, nothing special.
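For anyone wanting to reproduce that check, something along these lines (standard PVE unit names; adjust date and time window to your logs) covers corosync and the HA services:

Code:
# corosync and HA daemons around the reboot
journalctl -u corosync -u pve-ha-lrm -u pve-ha-crm --since "2023-11-23 06:55" --until "2023-11-23 07:05"
# everything of priority error and above from the previous boot
journalctl -b -1 -p err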

There are no logged errors in the IPMI event log. I also thought about the watchdog and softdog; there is only one log line for the watchdog:

Code:
NMI watchdog: Enabled. Permanently consumes one hw-PMU counter

softdog only logs after the unwanted reboot has occurred:

Code:
Nov 23 07:03:50 proxmox03 kernel: softdog: initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
Nov 23 07:03:50 proxmox03 kernel: softdog: soft_reboot_cmd=<not set> soft_active_on_boot=0

Edit: We also see a BERT error on that node, but the server skipped it. Funnily enough, I found a post in our own wiki: https://www.thomas-krenn.com/de/wiki/Random_Reboots_AMD_EPYC_Server - I will try it out and report back.
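A hedged sketch for digging the BERT/MCE details out of the journal (requires a persistent journal):

Code:
# BERT records are reported by the kernel on the boot after the crash
journalctl -k -b | grep -iE 'bert|hardware error'
# machine-check messages from the boot before the reboot
journalctl -k -b -1 | grep -iE 'mce|machine check'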
 
Hmmm... we have two different clusters with AMD Milan 7443 (Supermicro/HP) and no such issues (PVE 7.3 / 8.0.x).
Why is amd_iommu=on iommu=pt active?
Which Ceph version?
Also take a look at the MDSs.

Edit: Update PVE / Ceph to the latest 7.x version.
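For reference, on a ZFS-root/UEFI install like the one in the log above, the kernel command line normally lives in /etc/kernel/cmdline and is applied via proxmox-boot-tool; a sketch for checking it and, if the flags are not needed for passthrough, removing them:

Code:
# what the running kernel was actually booted with
cat /proc/cmdline
# where it is configured on a proxmox-boot-tool managed ZFS/UEFI install
cat /etc/kernel/cmdline
# after editing the file, write the boot entries again
proxmox-boot-tool refresh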
 
Is the server in a controlled and secure environment? Or is it possible that someone pressed the physical NMI/Power Off? Are there any indications in the IPMI that the server was shut down hard, was temporarily without power, etc.?

In fact, it doesn't even have to have been triggered by PVE or a VM.
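One way to check that, assuming ipmitool is installed and the BMC is reachable:

Code:
# IPMI system event log
ipmitool sel elist
# current chassis power state / last power event
ipmitool chassis status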
 