[SOLVED] Split Lock Detected on Intel Xeon Silver 4310 CPUs Under Load in Proxmox – Troubleshooting Steps?

The Inc.

Member
Feb 4, 2021
Hi,

I’m new to the forum and running into a problem with one of my servers. We are actively moving from ESXi to Proxmox, and we have successfully converted Dell 7920 and R740 servers, which have been running perfectly. However, our latest batch of R750 servers is experiencing issues.

I’m running a Proxmox cluster with multiple nodes, and I’ve encountered an issue with one of the servers that has Intel Xeon Silver 4310 CPUs. Specifically, when the system is under load (e.g., hosting Windows VMs), I see “split lock” messages in the logs. This causes noticeable performance degradation and sluggishness, particularly in Windows 11 and Server 2022 VMs.

Here are some details about the affected node:
• Server Model: Dell R750 with Intel Xeon Silver 4310 CPUs (2 sockets, 12 cores per socket, 48 threads total)
• Proxmox Version: 8.3.2
• Kernel: Linux 6.8.12-5-pve
• CPU Flags: Split lock detection enabled (split_lock_detect flag present)
• Virtualization: Intel VT-x enabled, KVM hypervisor used
• Operating System: Windows 11, Windows Server 2022 VMs
• Vulnerabilities: Microcode update mitigations in place

Interestingly, no split lock messages are detected when starting a Linux VM on the same node. This makes it seem like the issue is specifically related to Windows VMs.

Additional Information:
• The configuration of this server is exactly the same as another node in the cluster, which uses Intel Xeon Silver 4210R CPUs (40 threads) and is not showing any split lock warnings. Performance is stable across both nodes.
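
One way to compare the two nodes is to check whether each CPU advertises the split_lock_detect flag at all. The sketch below is only an illustration: the helper name has_cpu_flag and its optional file argument are made up for this example and are not part of Proxmox or any standard tool.

```shell
#!/usr/bin/env bash
# has_cpu_flag FLAG [CPUINFO_FILE]   (hypothetical helper, for illustration only)
# Exits 0 if FLAG appears as a whole word on a "flags" line, non-zero otherwise.
# CPUINFO_FILE defaults to /proc/cpuinfo; the argument exists so the function
# can also be run against a copy of cpuinfo saved from another node.
has_cpu_flag() {
    local flag="$1" file="${2:-/proc/cpuinfo}"
    grep -E '^flags[[:space:]]*:' "$file" | grep -qw -- "$flag"
}
```

Running `has_cpu_flag split_lock_detect && echo present || echo absent` on each node shows whether the kernel can raise these traps there in the first place.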

Troubleshooting I’ve tried:
• Ensured that the server and Proxmox are fully updated, including microcode updates.
• Checked kernel logs for any related errors.
• Attempted to adjust CPU pinning and virtualization options for the VMs.
• Disabled hyper-threading, but the issue persists.

Does anyone have experience with similar issues on Intel Xeon 4310 CPUs under Proxmox? Specifically, is there a way to mitigate or fix split lock errors? Could this be related to the microcode or specific kernel settings? Any suggestions on further diagnostics or configuration changes would be greatly appreciated.

Thanks in advance for your help!

Bash:
Jan 20 12:13:30 server1 kernel: x86/split lock detection: #AC: CPU 1/KVM/1938821 took a split_lock trap at address: 0x7eedd050
Jan 20 12:13:30 server1 kernel: x86/split lock detection: #AC: CPU 6/KVM/1938826 took a split_lock trap at address: 0x7eedd050
Jan 20 12:13:30 server1 kernel: x86/split lock detection: #AC: CPU 5/KVM/1938825 took a split_lock trap at address: 0x7eedd050
Jan 20 12:13:30 server1 kernel: x86/split lock detection: #AC: CPU 4/KVM/1938824 took a split_lock trap at address: 0x7eedd050
Jan 20 12:13:30 server1 kernel: x86/split lock detection: #AC: CPU 7/KVM/1938827 took a split_lock trap at address: 0x7eedd050
Jan 20 12:13:30 server1 kernel: x86/split lock detection: #AC: CPU 3/KVM/1938823 took a split_lock trap at address: 0x7eedd050
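
To get a feel for how often the traps occur and whether they all point at the same address (as in the excerpt above), the log lines can be tallied. This is a rough sketch that assumes the message format shown above; the function name count_split_lock_traps is hypothetical.

```shell
#!/usr/bin/env bash
# count_split_lock_traps   (hypothetical helper, for illustration only)
# Reads kernel log lines on stdin and prints "<count> <address>" for every
# address that took a split_lock trap, most frequent address first.
# Lines that do not match the split lock message format are ignored.
count_split_lock_traps() {
    sed -nE 's@.*#AC: .* took a split_lock trap at address: (0x[0-9a-fA-F]+).*@\1@p' \
        | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}
```

A possible invocation is `journalctl -k | count_split_lock_traps`. A single dominant address suggests that one piece of guest code is responsible for most of the traps.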

Bash:
With Split lock messages:

root@server1:~# lscpu
CPU(s):                   48
  On-line CPU(s) list:    0-47
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         Intel
  Model name:             Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz
Virtualization features: 
  Virtualization:         VT-x
Caches (sum of all):     
  L1d:                    1.1 MiB (24 instances)
  L1i:                    768 KiB (24 instances)
  L2:                     30 MiB (24 instances)
  L3:                     36 MiB (2 instances)
NUMA:                     
  NUMA node(s):           2

Bash:
Without Split lock messages:

root@server2:~# lscpu
CPU(s):                   40
  On-line CPU(s) list:    0-39
Vendor ID:                GenuineIntel
  BIOS Vendor ID:         Intel
  Model name:             Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz
Virtualization features: 
  Virtualization:         VT-x
Caches (sum of all):     
  L1d:                    640 KiB (20 instances)
  L1i:                    640 KiB (20 instances)
  L2:                     20 MiB (20 instances)
  L3:                     27.5 MiB (2 instances)
NUMA:                     
  NUMA node(s):           2
 
I actually did, but unfortunately no solution yet.

But...

I've applied this advice again:
"If you are aware of the risks, you can disable split lock mitigation temporarily by running sysctl -w kernel.split_lock_mitigate=0 (see the kernel docs). You can disable it permanently by creating a file /etc/sysctl.d/50-split-lock.conf with contents kernel.split_lock_mitigate=0, and running sysctl -p /etc/sysctl.d/50-split-lock.conf. You will still get the warnings in the journal, but the bpftrace script should report no more artificial slowdowns."
Afterwards, bpftrace showed no more traces of the CPUs being slowed down, so hopefully this is the fix. We will test it tomorrow!
 
After the applied solution, the system seems to be stable and performing as it should!

Solution applied:
Bash:
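# Persist the setting so it survives reboots
echo "kernel.split_lock_mitigate=0" > /etc/sysctl.d/50-split-lock.conf
# Check that the file was created with the expected contents
cat /etc/sysctl.d/50-split-lock.conf
# Apply the setting to the running kernel immediately
sysctl -p /etc/sysctl.d/50-split-lock.conf
# Optional: sysctl -p already applies the setting, so a reboot only confirms that it persists
reboot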
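
To confirm that the setting is active after the steps above, the current value can be read back from /proc/sys. The following is a minimal sketch: the function name split_lock_mitigate_state and its optional file argument are hypothetical, and the meaning of the values (0 = warn only, 1 = slow down the offending task) follows the kernel sysctl documentation.

```shell
#!/usr/bin/env bash
# split_lock_mitigate_state [FILE]   (hypothetical helper, for illustration only)
# Prints a human-readable description of kernel.split_lock_mitigate.
# FILE defaults to the real sysctl path; the argument only exists so the
# function can be tried against a test file.
split_lock_mitigate_state() {
    local file="${1:-/proc/sys/kernel/split_lock_mitigate}"
    if [ ! -r "$file" ]; then
        # The knob is not exposed, for example on an older kernel
        echo "unavailable"
        return 1
    fi
    case "$(cat "$file")" in
        0) echo "mitigation disabled (warnings only)" ;;
        1) echo "mitigation enabled (split-locking tasks are slowed down)" ;;
        *) echo "unexpected value"; return 1 ;;
    esac
}
```

Calling `split_lock_mitigate_state` without arguments on the host should report the mitigation as disabled once the sysctl file has been applied; `sysctl kernel.split_lock_mitigate` shows the same raw value.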
 
I would call it a workaround ;)

The cause of the split locks is still there, most likely some application in a VM; you have only disabled the penalty slowdown. Then again, that is most likely the only option unless you build and maintain that software yourself.
 