Hello Proxmox Community,
I'm having trouble getting full performance out of a VM on my dual EPYC 7713 host running PVE 8.3.4. Specifically, SMT does not appear to be passed through to the guest, and numactl binding inside the guest also appears to fail.
Host Hardware:
- CPU: 2 x AMD EPYC 7713 (64-Core Processor)
- RAM: 1 TB DDR4
- Server Model: Gooxi SR201-D12R-NV (the motherboard is reported as Gooxi G1DLRO-B, which seems incorrect because that is a single-socket board; the system is definitely dual-socket).
- Host SMT: Confirmed ON (Host OS sees 256 threads, 2 threads/core - see lscpu below)
- Host Microcode: 0x0a001119 (Possibly outdated, SRSO warning in dmesg, waiting for manufacturer BIOS update)
- Proxmox VE Version (output of pveversion):
    pve-manager/8.3.4/65224a0f9cd294a3 (running kernel: 6.8.12-8-pve)
- Host lscpu confirms 256 threads / SMT ON (key lines):
    Architecture: x86_64
    CPU(s): 256
    Thread(s) per core: 2
    Core(s) per socket: 64
    Socket(s): 2
    NUMA node(s): 2
    NUMA node0 CPU(s): 0-63,128-191
    NUMA node1 CPU(s): 64-127,192-255
    Model name: AMD EPYC 7713 64-Core Processor
    Virtualization: AMD-V
- Host numactl --hardware confirms 2 nodes, 256 threads correctly mapped.
VM Configuration:
    bios: ovmf
    boot: order=scsi0;ide2;net0
    cores: 64
    cpu: host
    # efidisk0: ...
    # ide2: ...
    machine: q35
    memory: 946045
    # meta: ...
    name: Aibase
    net0: e1000=BC:24:11[...]9:5B:0F,bridge=vmbr0,tag=20 # Using e1000 temporarily, aware virtio-net is better
    numa: 1
    ostype: l26 # Linux 2.6 - 6.x kernel
    scsi0: Datapool:vm-101-disk-1,backup=0,iothread=1,size=6900G
    scsihw: virtio-scsi-single
    # smbios1: ...
    sockets: 2
    # vmgenid: ...
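For reference, here is a minimal sketch of the vCPU arithmetic implied by the posted config. It assumes that, with no explicit vcpus line, the guest receives sockets x cores vCPUs; the helper names and SAMPLE_CONFIG are illustrative only and are not part of any Proxmox tooling.

```python
# Sketch: derive the vCPU count implied by a Proxmox-style VM config.
# Assumption: without an explicit "vcpus" line, vCPUs = sockets * cores.

def parse_vm_config(text: str) -> dict:
    """Parse 'key: value' lines, skipping blank lines and '#' comment lines."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        # Drop trailing inline annotations such as "l26 # Linux ..."
        options[key.strip()] = value.split(" #", 1)[0].strip()
    return options


def implied_vcpus(options: dict) -> int:
    """Multiply sockets by cores; both default to 1 when absent."""
    sockets = int(options.get("sockets", "1"))
    cores = int(options.get("cores", "1"))
    return sockets * cores


# Only the topology-related lines from the config in this post.
SAMPLE_CONFIG = """\
cores: 64
cpu: host
numa: 1
sockets: 2
"""

HOST_THREADS = 256  # taken from the host lscpu output in this post

vcpus = implied_vcpus(parse_vm_config(SAMPLE_CONFIG))
print(f"vCPUs implied by config: {vcpus}")
print(f"Host threads not represented in the guest: {HOST_THREADS - vcpus}")
```

Under that assumption the config describes 128 vCPUs, which is the CPU count the guest reports, so the guest may simply be reflecting the configured topology rather than dropping threads.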
Guest VM OS:
- OS: Ubuntu 24.04.2 LTS (Fresh install)
- Kernel: 6.8.0-55-generic (uname -r)
- Python/apt/netplan issues previously encountered were fixed.
Despite cpu: host and host SMT being ON, the guest VM only sees 1 thread per core:
- Guest lscpu:
    Thread(s) per core: 1
    CPU(s): 128
    Socket(s): 2
    Core(s) per socket: 64
    NUMA node(s): 2
    NUMA node0 CPU(s): 0-63
    NUMA node1 CPU(s): 64-127
- Guest numactl --hardware: Shows 2 nodes, CPUs 0-63 on node 0, 64-127 on node 1.
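To make the host/guest comparison concrete, the following sketch parses the lscpu key lines quoted in this post and checks that sockets x cores x threads matches the reported CPU count, and that each NUMA CPU list expands to the expected size. The function names are illustrative; the sample text is copied from the outputs above.

```python
# Sketch: sanity-check lscpu topology fields and NUMA CPU lists.

def parse_lscpu(text: str) -> dict:
    """Turn 'Key: value' lines from lscpu into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def expand_cpu_list(spec: str) -> list:
    """Expand a list such as '0-63,128-191' into individual CPU numbers."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def topology_product(fields: dict) -> int:
    """Sockets x cores per socket x threads per core."""
    return (int(fields["Socket(s)"])
            * int(fields["Core(s) per socket"])
            * int(fields["Thread(s) per core"]))


HOST = """\
CPU(s): 256
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node0 CPU(s): 0-63,128-191
NUMA node1 CPU(s): 64-127,192-255
"""

GUEST = """\
Thread(s) per core: 1
CPU(s): 128
Socket(s): 2
Core(s) per socket: 64
NUMA node0 CPU(s): 0-63
NUMA node1 CPU(s): 64-127
"""

for name, text in (("host", HOST), ("guest", GUEST)):
    fields = parse_lscpu(text)
    node0 = expand_cpu_list(fields["NUMA node0 CPU(s)"])
    print(name, "CPUs:", fields["CPU(s)"],
          "product:", topology_product(fields),
          "node0 size:", len(node0))
```

Both outputs are internally consistent: the host lists two CPU ranges per node (128 CPUs each), while the guest lists a single range of 64 CPUs per node.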
Attempts to bind processes to a NUMA node using numactl inside the guest fail. The binding options are ignored:
- Command run in guest: sudo numactl --cpunodebind=0 --membind=0 sleep 600 &
- Result from numactl -s -p <pid>:
    policy: default
    cpubind: 0 1 # <-- Should be just 0
    nodebind: 0 1 # <-- Should be just 0
    membind: 0 1 # <-- Should be just 0
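One way to cross-check the binding independently of numactl's own output is to read the target process's entries under /proc. This matters because, according to the numactl man page, --show reports the policy of the current process and -p is the short form of --preferred, so the output above may not describe the backgrounded sleep at all. The sketch below reads Cpus_allowed_list from /proc/<pid>/status and the per-mapping policy field from /proc/<pid>/numa_maps. The function names and the sample text are hypothetical, and the numa_maps layout (address, then policy, then details) is assumed from the kernel documentation.

```python
# Sketch: inspect CPU affinity and memory policy of another process via /proc.
from pathlib import Path
from typing import Optional, Set, Tuple


def cpus_allowed_list(status_text: str) -> Optional[str]:
    """Return the Cpus_allowed_list value from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return line.split(":", 1)[1].strip()
    return None


def memory_policies(numa_maps_text: str) -> Set[str]:
    """Collect the policy field (second column) of each numa_maps line."""
    policies = set()
    for line in numa_maps_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            policies.add(parts[1])
    return policies


def inspect_pid(pid: int) -> Tuple[Optional[str], Set[str]]:
    """Read both files for a live process (Linux only, may need root)."""
    proc = Path("/proc") / str(pid)
    status = (proc / "status").read_text()
    numa_maps = (proc / "numa_maps").read_text()
    return cpus_allowed_list(status), memory_policies(numa_maps)


# Hypothetical sample text, NOT captured from the system in this post.
SAMPLE_STATUS = "Name:\tsleep\nCpus_allowed_list:\t0-63\nMems_allowed_list:\t0-1\n"
SAMPLE_NUMA_MAPS = (
    "55d0c0000000 bind:0 file=/usr/bin/sleep mapped=4 N0=4\n"
    "7ffd10000000 bind:0 stack anon=3 dirty=3 N0=3\n"
)

print("CPU affinity:", cpus_allowed_list(SAMPLE_STATUS))
print("Memory policies:", memory_policies(SAMPLE_NUMA_MAPS))
```

In the guest, inspect_pid(<pid of the sleep process>) would be the call to try. Note that Mems_allowed_list in status reflects cpuset limits rather than the policy set by --membind, which is why the sketch looks at numa_maps for the memory side.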
Troubleshooting Steps Taken:
- Confirmed SMT ON in Host OS (lscpu). BIOS setting assumed ON but pending double-check/update.
- Using standard recommended VM settings (q35, cpu: host, numa: 1, sockets: 2, cores: 64).
- Performed full VM Shutdown/Start after config changes.
- Tried explicit cpu: EPYC-Milan -> Same result (128 threads, SMT OFF).
- Tried forcing topology via args: -smp 256,... -> Failed VM start due to config conflicts. Reverted.
- Updated Proxmox Host (apt update && apt dist-upgrade on PVE 8.3.4) and rebooted -> No change in guest SMT status.
- Checked guest kernel parameters (/proc/cmdline) -> No nosmt.
- Fixed unrelated Python/apt/netplan issues within the guest.
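The kernel command-line check in the list above can be sketched as follows, together with a read of the kernel's SMT control file. The sysfs path /sys/devices/system/cpu/smt/control is a standard Linux interface; the function names and the sample command line are hypothetical and are not taken from the guest in this post.

```python
# Sketch: check whether SMT is disabled via the kernel command line,
# and read the kernel's SMT control state when the sysfs file exists.
from pathlib import Path
from typing import Optional


def cmdline_disables_smt(cmdline: str) -> bool:
    """True if 'nosmt' or 'nosmt=<value>' appears as a boot parameter."""
    return any(token == "nosmt" or token.startswith("nosmt=")
               for token in cmdline.split())


def read_smt_control(path: str = "/sys/devices/system/cpu/smt/control") -> Optional[str]:
    """Return values such as 'on', 'off', 'forceoff', or 'notsupported';
    None if the file is not present on this system."""
    control = Path(path)
    return control.read_text().strip() if control.exists() else None


# Hypothetical command line for illustration only.
SAMPLE_CMDLINE = "BOOT_IMAGE=/vmlinuz-6.8.0-55-generic root=/dev/sda2 ro quiet"

print("nosmt on command line:", cmdline_disables_smt(SAMPLE_CMDLINE))
```

On a real system, cmdline_disables_smt(Path("/proc/cmdline").read_text()) and read_smt_control() can be run on both host and guest to compare what each kernel reports.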
Is this a known issue or bug with Proxmox VE 8.3.4 (Kernel 6.8.x) and AMD EPYC 7003 (Milan) CPUs regarding SMT passthrough when using cpu: host? Why might SMT fail to pass through, and why might numactl binding fail within the guest despite the guest seeing the NUMA structure (for the limited threads it detects)?
Are there any specific host kernel parameters, KVM options (args:?), or other settings known to resolve this? I am currently waiting for a potential BIOS/Firmware update from the manufacturer (Gooxi) for the potentially outdated microcode (0x0a001119).
Any insights or suggestions would be greatly appreciated!