kvm/kvm_intel module causes CPU hogging

faluyt

Active Member
Oct 15, 2016
Hi,

I have a server running Proxmox VE 5.3-11. A kworker process constantly uses too much CPU, which causes the server to hang. While debugging, I found a lot of "kvm Tainted" entries in the dmesg log. Some examples:

Code:
....
[661233.727943] NMI backtrace for cpu 4
[661233.727945] CPU: 4 PID: 807 Comm: rs:main Q:Reg Tainted: G      D          4.15.18-11-pve #1
[661233.727945] Hardware name: MSI MS-7816/H87-G43 (MS-7816), BIOS V2.14B14 07/13/2018
[661233.727946] RIP: 0033:0x7f4b9aae98ef
[661233.727947] RSP: 002b:00007f4b9941dab0 EFLAGS: 00000202
[661233.727948] RAX: 0000000000000085 RBX: 00007f4b94002070 RCX: 0000000000003987
[661233.727949] RDX: fffffffffffffd90 RSI: 0000000000000000 RDI: 00007f4b94000020
[661233.727949] RBP: 00007f4b94000020 R08: 000055b74ab7f510 R09: 0000000000000000
[661233.727950] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[661233.727951] R13: 0000000000000270 R14: 00007f4b940022e0 R15: 0000000000000080
[661233.727952] FS:  00007f4b9941e700(0000) GS:ffff9c29deb00000(0000) knlGS:0000000000000000
[661233.727953] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[661233.727953] CR2: ffffb001ac5b877b CR3: 00000007f5d08001 CR4: 00000000001626e0
[661233.727954] NMI backtrace for cpu 0
[661233.727956] CPU: 0 PID: 24709 Comm: kvm Tainted: G      D          4.15.18-11-pve #1
[661233.727956] Hardware name: MSI MS-7816/H87-G43 (MS-7816), BIOS V2.14B14 07/13/2018
[661233.727978] RIP: 0010:paging64_walk_addr_generic+0x1ca/0x7d0 [kvm]
[661233.727978] RSP: 0018:ffffaf4b43ddb960 EFLAGS: 00000246
[661233.727979] RAX: 0000000000000003 RBX: 0000000000000004 RCX: 0000000000804063
[661233.727980] RDX: 0000000000000003 RSI: 8000000000804063 RDI: 0000000000000004
[661233.727981] RBP: ffffaf4b43ddb9f8 R08: 0000000000000003 R09: 0000000000000001
[661233.727981] R10: 0000000000000000 R11: ffff9c2925c20000 R12: 00000000000001aa
[661233.727982] R13: 0000000000804063 R14: ffff9c2947400380 R15: ffffaf4b43ddba08
[661233.727983] FS:  00007f6d173ff700(0000) GS:ffff9c29dea00000(0000) knlGS:000000007f3ec000
[661233.727983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[661233.727984] CR2: ffffb001ac5b877b CR3: 00000007f2c12006 CR4: 00000000001626f0
[661233.727984] Call Trace:
[661233.727998]  ? kvm_irq_delivery_to_apic+0x1e8/0x2b0 [kvm]
[661233.728010]  paging64_gva_to_gpa+0x44/0xb0 [kvm]
[661233.728020]  ? kvm_vcpu_mmap+0x20/0x20 [kvm]
[661233.728024]  ? vmx_vcpu_load+0x6a/0x300 [kvm_intel]
[661233.728027]  ? vmx_vcpu_load+0x248/0x300 [kvm_intel]
[661233.728035]  ? kvm_io_bus_get_first_dev+0x58/0x110 [kvm]
[661233.728045]  ? kvm_io_bus_read+0x48/0x110 [kvm]
[661233.728047]  ? __update_load_avg_se.isra.38+0x1bc/0x1d0
[661233.728048]  ? __update_load_avg_se.isra.38+0x1bc/0x1d0
[661233.728059]  kvm_fetch_guest_virt+0x6f/0xe0 [kvm]
[661233.728070]  __do_insn_fetch_bytes+0x1ae/0x220 [kvm]
[661233.728078]  x86_decode_insn+0x79d/0x1310 [kvm]
[661233.728080]  ? vmx_get_cs_db_l_bits+0x1c/0x40 [kvm_intel]
[661233.728087]  x86_emulate_instruction+0x13c/0x6e0 [kvm]
[661233.728089]  ? vmexit_fill_RSB+0x10/0x40 [kvm_intel]
[661233.728090]  ? vmexit_fill_RSB+0x10/0x40 [kvm_intel]
[661233.728092]  handle_apic_access+0x4e/0x70 [kvm_intel]
[661233.728094]  vmx_handle_exit+0xb5/0x1560 [kvm_intel]
[661233.728096]  ? vmexit_fill_RSB+0x10/0x40 [kvm_intel]
[661233.728098]  ? vmx_vcpu_run+0x418/0x5e0 [kvm_intel]
[661233.728105]  kvm_arch_vcpu_ioctl_run+0x95c/0x16d0 [kvm]
[661233.728107]  ? futex_wake+0x90/0x170
[661233.728114]  ? kvm_arch_vcpu_load+0x68/0x250 [kvm]
[661233.728120]  kvm_vcpu_ioctl+0x339/0x620 [kvm]
[661233.728128]  ? kvm_vcpu_ioctl+0x339/0x620 [kvm]
[661233.728130]  do_vfs_ioctl+0xa6/0x620
[661233.728141]  ? kvm_on_user_return+0x70/0xa0 [kvm]
[661233.728142]  SyS_ioctl+0x79/0x90
[661233.728144]  ? exit_to_usermode_loop+0xa5/0xd0
[661233.728146]  do_syscall_64+0x73/0x130
[661233.728147]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[661233.728148] RIP: 0033:0x7f6e6140c017
[661233.728149] RSP: 002b:00007f6d173fc538 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[661233.728150] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6e6140c017
[661233.728150] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000032
[661233.728151] RBP: 0000000000000000 R08: 00007f6e53861ef0 R09: 000001a64a020188
[661233.728152] R10: 000000003b9aca00 R11: 0000000000000246 R12: 00007f6e53be1000
[661233.728152] R13: 00007f6e79d28000 R14: 0000000000000000 R15: 00007f6e53be1000
[661233.728153] Code: 00 00 00 41 8b 3f 4c 89 e8 44 89 e9 48 c1 e8 07 83 e0 01 8d 57 ff 4c 63 c2 4d 8d 14 80 4c 01 d0 4d 89 ea 4d 23 94 c6 e0 00 00 00 <49> 8b 86 30 01 00 00 48 d3 e8 83 e0 01 49 09 c2 0f 85 d6 01 00
[661233.728174] NMI backtrace for cpu 2 skipped: idling at intel_idle+0x7b/0x130
[661233.728175] NMI backtrace for cpu 6
[661233.728177] CPU: 6 PID: 3873 Comm: kvm Tainted: G      D          4.15.18-11-pve #1
[661233.728177] Hardware name: MSI MS-7816/H87-G43 (MS-7816), BIOS V2.14B14 07/13/2018
[661233.728180] RIP: 0010:intel_guest_get_msrs+0x18/0x70
[661233.728181] RSP: 0018:ffffaf4b4811fc78 EFLAGS: 00000086
[661233.728182] RAX: ffff9c29deb8f3a0 RBX: ffff9c29b27dd628 RCX: ffff9c29b27d8000
[661233.728183] RDX: 0000000000000004 RSI: 000063ddc0000000 RDI: ffffaf4b4811fc9c
[661233.728183] RBP: ffffaf4b4811fc78 R08: 0000000000000000 R09: 0000000000000000
[661233.728184] R10: ffffaf4b4811fdd8 R11: 0000000000000000 R12: 0000000000000000
[661233.728185] R13: ffff9c29b27d8000 R14: ffff9c29b2f952b0 R15: 0000000000000001
[661233.728186] FS:  00007f8195fff700(0000) GS:ffff9c29deb80000(0000) knlGS:00000000fe520000
[661233.728186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[661233.728187] CR2: 0000000009da3764 CR3: 00000007f2ec4004 CR4: 00000000001626e0
[661233.728188] Call Trace:
[661233.728189]  perf_guest_get_msrs+0x1a/0x30
[661233.728194]  atomic_switch_perf_msrs+0x2d/0xa0 [kvm_intel]
[661233.728197]  vmx_vcpu_run+0x13a/0x5e0 [kvm_intel]
[661233.728210]  kvm_arch_vcpu_ioctl_run+0x841/0x16d0 [kvm]
[661233.728212]  ? futex_wake+0x90/0x170
[661233.728223]  ? kvm_arch_vcpu_load+0x68/0x250 [kvm]
[661233.728229]  kvm_vcpu_ioctl+0x339/0x620 [kvm]
[661233.728236]  ? kvm_vcpu_ioctl+0x339/0x620 [kvm]
[661233.728237]  do_vfs_ioctl+0xa6/0x620
[661233.728244]  ? kvm_on_user_return+0x70/0xa0 [kvm]
[661233.728245]  SyS_ioctl+0x79/0x90
[661233.728247]  do_syscall_64+0x73/0x130
[661233.728248]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[661233.728249] RIP: 0033:0x7f82e0264017
[661233.728249] RSP: 002b:00007f8195ffc538 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[661233.728250] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f82e0264017
[661233.728251] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000033
[661233.728251] RBP: 0000000000000000 R08: 00007f82d2461ef0 R09: 00000000000000ff
[661233.728251] R10: 00007f82f8b81000 R11: 0000000000000246 R12: 00007f82d27e1000
[661233.728252] R13: 00007f82f8b80000 R14: 0000000000000000 R15: 00007f82d27e1000
[661233.728253] Code: 07 00 00 00 00 31 c0 c3 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 c7 c0 a0 f3 00 00 48 89 e5 65 48 03 05 30 47 20 64 <48> 8b 90 80 0c 00 00 48 8b 35 6a fc 64 01 48 05 88 0c 00 00 48
...
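
To get an overview of which processes and kernel functions show up in backtraces like the ones above, a small script along these lines might help. This is only a rough sketch: the function name `summarize_backtraces` and both regular expressions are assumptions written against the line format in this particular log, and they may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Header line of each backtrace, e.g.
# "[...] CPU: 0 PID: 24709 Comm: kvm Tainted: G      D   4.15.18-11-pve #1"
HEADER = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (.+?) Tainted: ")
# Instruction pointer line, e.g.
# "[...] RIP: 0010:paging64_walk_addr_generic+0x1ca/0x7d0 [kvm]"
RIP = re.compile(r"RIP: [0-9a-f]{4}:(\S+)")


def summarize_backtraces(log_text):
    """Count (command, RIP location) pairs, one per backtrace header."""
    counts = Counter()
    pending_comm = None  # set by a header, cleared by the first RIP after it
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            pending_comm = header.group(3)
            continue
        rip = RIP.search(line)
        if rip and pending_comm is not None:
            # Drop the "+0xoffset/0xsize" suffix of kernel symbols;
            # user-space RIPs have no symbol and stay as raw addresses.
            location = rip.group(1).split("+")[0]
            counts[(pending_comm, location)] += 1
            pending_comm = None
    return counts


if __name__ == "__main__":
    # Three lines copied from the log above, used as a tiny demo input.
    sample = "\n".join([
        "[661233.727956] CPU: 0 PID: 24709 Comm: kvm Tainted: G      D          4.15.18-11-pve #1",
        "[661233.727978] RIP: 0010:paging64_walk_addr_generic+0x1ca/0x7d0 [kvm]",
        "[661233.728148] RIP: 0033:0x7f6e6140c017",
    ])
    for (comm, location), n in summarize_backtraces(sample).most_common():
        print(f"{n:4d}  {comm:<16} {location}")
```

Only the first RIP line after each header is counted, so the user-space RIP printed at the end of a kernel call trace is ignored. To run it on a full log, the saved dmesg output could be read from a file and passed to `summarize_backtraces` as a single string.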

My PVE version:
Code:
root@hostname~ # pveversion -v
proxmox-ve: 5.3-1 (running kernel: 4.15.18-11-pve)
pve-manager: 5.3-11 (running version: 5.3-11/d4907f84)
pve-kernel-4.15: 5.3-2
pve-kernel-4.15.18-11-pve: 4.15.18-33
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: not correctly installed
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-47
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-38
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-33
pve-container: 2.0-34
pve-docs: 5.3-3
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-17
pve-firmware: 2.0-6
pve-ha-manager: 2.0-6
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 2.12.1-2
pve-xtermjs: 3.10.1-1
qemu-server: 5.0-46
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3

Is anyone else here facing the same issue, and do you have a solution?
Thanks.
 
