What happens if you always set 1 socket? Shouldn't that take care of the issue without having to pin vCPUs manually?
No, it makes absolutely no difference. The sockets option only makes sense if you enable the NUMA option.
But that just tells the OS inside the VM itself to use NUMA, which, as I said previously, is pointless in 99% of use cases.
(I don't know of any use case myself that benefits from it.)
The whole issue is: whether you enable NUMA and use more than one socket in the VM settings, or don't enable NUMA and use only one socket, makes absolutely no difference to how Proxmox handles the VM's CPU threads (tasks).
Each core you give a VM is a "task" on the Proxmox host itself, and those tasks rotate randomly (details below) between all physical cores without any NUMA logic.
This is because QEMU doesn't tell the Proxmox kernel which tasks belong together. So if you give a VM 4 cores, the kernel sees those 4 "tasks" as separate tasks on the host that have nothing in common.
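You can see this for yourself on the host: every vCPU is just an ordinary KVM thread under the QEMU process. Here is a minimal sketch (assuming the Proxmox-style pidfile under /var/run/qemu-server/<vmid>.pid and QEMU's "CPU n/KVM" thread naming; adjust for your setup) that prints which physical core each vCPU thread is currently sitting on:

```python
#!/usr/bin/env python3
"""Sketch: show which physical core each vCPU thread of a VM currently runs on."""
import sys
from pathlib import Path

def vcpu_threads(qemu_pid: int):
    """Yield (tid, thread_name, current_cpu) for every vCPU thread of the QEMU process."""
    for task in Path(f"/proc/{qemu_pid}/task").iterdir():
        name = (task / "comm").read_text().strip()
        if "KVM" not in name:          # vCPU threads are named "CPU n/KVM"
            continue
        stat = (task / "stat").read_text()
        # Split after the closing ')' so thread names with spaces don't break parsing;
        # fields[36] is overall field 39 of /proc/<tid>/stat = CPU the thread last ran on.
        fields = stat.rsplit(")", 1)[1].split()
        yield int(task.name), name, int(fields[36])

def main(vmid: str):
    pid = int(Path(f"/var/run/qemu-server/{vmid}.pid").read_text())
    for tid, name, cpu in sorted(vcpu_threads(pid)):
        print(f"tid={tid:<8} {name:<12} running on physical core {cpu}")

if __name__ == "__main__":
    main(sys.argv[1])   # e.g. ./vcpu_map.py 101
```

Run it a few times and you will see the vCPU threads jumping around the physical cores.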
Thank god this is not a really dramatic issue (let's say it could be much worse), because the kernel still seems to have some clever logic, so the performance penalty is not as big as it otherwise would be.
Rotate randomly:
The kernel will not move a task to another core while the task is busy, i.e. while something inside the VM is keeping the CPU occupied, like a program that is currently running flat out. But as soon as that program (or whatever it is inside the VM) finishes or enters a wait loop, the task on the host will usually migrate to another CPU core.
That's how I understand it; probably not fully correct in every detail, but at a high level that is definitely how it works.
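If you want to watch this happening, here is a tiny sketch that polls one host task (for example one of the vCPU thread IDs found with the snippet above; the TID is whatever you pass in) and prints a line every time the scheduler moves it to another core:

```python
#!/usr/bin/env python3
"""Sketch: report every time a given task is migrated to a different core."""
import sys
import time

def current_cpu(tid: int) -> int:
    """Return the core the task last ran on (field 39 of /proc/<tid>/stat)."""
    with open(f"/proc/{tid}/stat") as f:
        fields = f.read().rsplit(")", 1)[1].split()
    return int(fields[36])

def watch(tid: int, interval: float = 0.5):
    last = current_cpu(tid)
    print(f"task {tid} starts on core {last}")
    while True:
        time.sleep(interval)
        cpu = current_cpu(tid)
        if cpu != last:
            print(f"task {tid} migrated: core {last} -> {cpu}")
            last = cpu

if __name__ == "__main__":
    watch(int(sys.argv[1]))
```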
As long as QEMU doesn't tell the kernel that those 4 tasks (the VM's 4 vCPUs) belong together somehow, there is no real NUMA support on Proxmox at all, no matter what anyone says.
So CPU pinning is the only solution at the moment.
It's possible to create some really clever hook scripts that at least balance the VMs between sockets or NUMA nodes (on startup at least), which makes things much easier; pinning 20 VMs by hand is very challenging. (There's a minimal sketch of such a hook script below.)
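For illustration, here is a stripped-down sketch of such a hook script. It assumes the Proxmox pidfile path /var/run/qemu-server/<vmid>.pid and uses a hard-coded example core set; a real script would pick the emptiest NUMA node or CCD instead. Proxmox invokes a hook script with the VM ID and a phase argument:

```python
#!/usr/bin/env python3
"""Sketch of a Proxmox hookscript: pin a VM's vCPU threads to a fixed core set
right after the VM starts. CORESET is an example value, adjust it to one of
your CCDs / NUMA nodes."""
import os
import sys
from pathlib import Path

CORESET = {0, 1, 2, 3, 4, 5, 6, 7}   # example: the cores of one CCD / NUMA node

def pin_vm(vmid: str, cores: set) -> None:
    pid = int(Path(f"/var/run/qemu-server/{vmid}.pid").read_text())
    for task in Path(f"/proc/{pid}/task").iterdir():
        name = (task / "comm").read_text().strip()
        if "KVM" in name:                        # only the vCPU threads ("CPU n/KVM")
            os.sched_setaffinity(int(task.name), cores)
            print(f"pinned {name} (tid {task.name}) to {sorted(cores)}")

def main() -> None:
    vmid, phase = sys.argv[1], sys.argv[2]       # Proxmox passes: <vmid> <phase>
    if phase == "post-start":                    # QEMU is up, vCPU threads exist now
        pin_vm(vmid, CORESET)

if __name__ == "__main__":
    main()
```

You would attach it with something like `qm set <vmid> --hookscript local:snippets/pin-vcpus.py` (the storage has to allow the snippets content type). Newer Proxmox versions also have a CPU affinity field in the VM settings that does the pinning part for you, but you still have to choose sensible core sets per VM yourself.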
But real NUMA support also means that the kernel can migrate the VM's tasks between the cores of a NUMA node while the VMs are running.
Moving to another NUMA node (still with near memory) but a different L3 cache should be supported as well.
NUMA is not only about near and far memory, it's about L3 cache as well. The L3 cache is actually the bigger performance factor. The reason is that an application that uses multiple CPUs (a multithreaded app) can use the L3 cache to share data between tasks/cores, which is insanely fast.
If the cores of a VM are spread around (not on the same CCD on AMD server systems), the application cannot share data through the L3 cache and has to go through memory, which is 3x slower.
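If you want to see which host cores actually share an L3 cache (on AMD Rome/Milan/Genoa each group corresponds to one CCD), the kernel exposes that in sysfs. Here is a small sketch that groups the cores accordingly; the groups are exactly the core sets you'd want to pin VMs to:

```python
#!/usr/bin/env python3
"""Sketch: group host cores by the L3 cache they share, using sysfs."""
from pathlib import Path

def l3_groups():
    groups = {}
    for cpu in Path("/sys/devices/system/cpu").glob("cpu[0-9]*"):
        cache_dir = cpu / "cache"
        if not cache_dir.is_dir():
            continue
        for cache in cache_dir.glob("index*"):
            if (cache / "level").read_text().strip() == "3":
                shared = (cache / "shared_cpu_list").read_text().strip()
                groups.setdefault(shared, set()).add(int(cpu.name[3:]))
    return groups

if __name__ == "__main__":
    for shared, cpus in sorted(l3_groups().items(), key=lambda kv: min(kv[1])):
        print(f"L3 shared by CPUs {shared}: {sorted(cpus)}")
```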
I even benchmarked this with iperf3 and multithreaded archiving; with pinning I get 3x more performance. iperf3 without pinning: 14-15 GB/s, with pinning to the same CCD (same L3 cache): over 50 GB/s.
The benchmarks are here on this forum in another thread.
But the L3 cache on Intel server systems is not that big of a deal, because of the monolithic die design.
On AMD servers (Rome/Milan/Genoa) it's an extremely big issue, so big that Proxmox makes absolutely no sense on those systems without CPU pinning.
So the conclusion is: the sockets setting in Proxmox (in the VM settings) is absolutely pointless, and in my opinion the NUMA option in the VM settings is just as pointless.
Cheers