Wrong CPU cache stats in all VMs

Feni

Well-Known Member
Jun 22, 2017
Hi there, I've recently been dealing with some strange performance issues in cache-heavy applications. My host never gets fully loaded (50-70% CPU at most) unless I run purely CPU-bound tasks. While looking around I found that all my VMs report oversized L2 and L3 caches, which is a problem when applications count on that cache being available.

My current setup is this:
Host:
Code:
HP Proliant DL380 G6
2x Intel Xeon X5670 (6 cores @ 2.93 GHz each)
48GB RAM
Proxmox 5.1

VM1:
Code:
net0: virtio=A2:D8:63:2A:E1:E4,bridge=vmbr0,queues=8
numa: 1
onboot: 1
ostype: l26
scsi0: nvme:vm-100-disk-1,discard=on,size=200G
scsi1: data:vm-100-disk-1,discard=on,iothread=1,size=1181G
scsihw: virtio-scsi-single
smbios1: uuid=6e5d88e1-cd1e-457f-b89c-dec092e875cc
sockets: 2
agent: 1
balloon: 0
boot: cdn
bootdisk: scsi0
cores: 6
cpu: host
hotplug: disk,network,usb
ide2: local:iso/debian-9.0.0-amd64-DVD-1.iso,media=cdrom
memory: 8192
name: debian

VM2:
Code:
agent: 1
balloon: 0
bios: ovmf
bootdisk: scsi1
cores: 6
cpu: host
efidisk0: local-zfs:vm-101-disk-2,size=128K
hostpci0: 10:00,pcie=1
ide2: local:iso/virtio-win-0.1.141.iso,media=cdrom,size=309208K
machine: q35
memory: 12292
name: win10
net0: virtio=4E:2E:B3:E2:70:64,bridge=vmbr0
numa: 1
ostype: win10
parent: B20171121
scsi0: data:vm-101-disk-1,discard=on,size=500G
scsi1: local-zfs:vm-101-disk-1,discard=on,size=64G
scsihw: virtio-scsi-pci

When looking at lscpu output, I noticed that the guest reports 16M of L3 cache instead of the correct 12M. The L2 cache is also wrong (4096K instead of 256K):
Host:
Code:
root@pve:~# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    1
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
Stepping:              2
CPU MHz:               2933.000
CPU max MHz:           2933.0000
CPU min MHz:           1600.0000
BogoMIPS:              5864.76
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,2,4,6,8,10
NUMA node1 CPU(s):     1,3,5,7,9,11
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm epb tpr_shadow vnmi flexpriority ept vpid dtherm ida arat

VM1:
Code:
root@debian:~$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    1
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
Stepping:              2
CPU MHz:               2932.498
BogoMIPS:              5864.99
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
L3 cache:              16384K
NUMA node0 CPU(s):     0-5
NUMA node1 CPU(s):     6-11
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes hypervisor lahf_lm tsc_adjust arat
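
For anyone who wants to repeat this comparison, here is a minimal Python sketch that pulls the cache lines out of two lscpu dumps and lists the ones that differ. The function names `parse_cache_sizes` and `cache_mismatches` are made up for this example, and the regex assumes the older lscpu format shown above (`256K`, `12288K`); newer lscpu versions print sizes differently and would need a different pattern.

```python
import re


def parse_cache_sizes(lscpu_text):
    """Return {cache name: size in KiB} parsed from lscpu text output."""
    sizes = {}
    for line in lscpu_text.splitlines():
        # Matches lines such as "L2 cache:              256K"
        match = re.match(r"\s*(L\d[di]?) cache:\s+(\d+)([KM])\b", line)
        if match:
            name, value, unit = match.groups()
            sizes[name] = int(value) * (1024 if unit == "M" else 1)
    return sizes


def cache_mismatches(host_text, guest_text):
    """Return {cache name: (host KiB, guest KiB)} for caches that differ."""
    host = parse_cache_sizes(host_text)
    guest = parse_cache_sizes(guest_text)
    return {
        name: (host[name], guest[name])
        for name in host
        if name in guest and host[name] != guest[name]
    }


# Cache lines copied from the host and VM1 outputs above.
HOST = """\
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
"""
GUEST = """\
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
L3 cache:              16384K
"""

if __name__ == "__main__":
    for name, (host_kib, guest_kib) in cache_mismatches(HOST, GUEST).items():
        print(f"{name}: host {host_kib} KiB, guest {guest_kib} KiB")
```

With the values from this thread, only L2 and L3 are flagged; L1d and L1i match between host and guest.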

VM2 (Win10) behaves identically: it reports 16M of L3 cache that isn't there; it should be 12M. The CPU type set in Proxmox (host, kvm64, Westmere or qemu64) has no effect on the L2 and L3 cache sizes reported in my VMs.

I've dug through many threads on QEMU, KVM and Proxmox VE related forums and found very little to explain this behaviour. Can anyone point me towards a way to have my host report the correct L2 and L3 cache sizes to my VMs, and explain how the host can pass on wrong cache values in the first place?
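
To rule out lscpu itself as the source of the numbers, the cache sizes can also be read straight from the Linux sysfs cache directory of a CPU. This is only a rough sketch: `read_cache_info` is a name invented for this example, it looks at a single CPU (cpu0 by default), and it assumes a Linux system that exposes `/sys/devices/system/cpu/cpu0/cache/index*/`.

```python
from pathlib import Path


def read_cache_info(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return sorted (level, type, size) tuples for one CPU as seen in sysfs."""
    entries = []
    for index_dir in sorted(Path(cpu_dir, "cache").glob("index*")):
        try:
            level = int((index_dir / "level").read_text().strip())
            cache_type = (index_dir / "type").read_text().strip()
            size = (index_dir / "size").read_text().strip()
        except (OSError, ValueError):
            # Skip entries that are missing or unreadable.
            continue
        entries.append((level, cache_type, size))
    return sorted(entries)


if __name__ == "__main__":
    for level, cache_type, size in read_cache_info():
        print(f"L{level} {cache_type}: {size}")
```

Running it on both the host and inside a guest gives a second view of the same cache sizes that lscpu prints.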
 
Aha, thanks a lot! That means I should be able to patch it to something else. It's not a production system anyway so I'll look into changing these values.

Edit:
Well, cpu.c contains many interdependent variables, which makes predicting the effect of changing those settings difficult at best. It's probably best to file a bug report with QEMU first and see what they have to say.
 