100% CPU on Host after scheduled backup

Enrico90

New Member
Jun 3, 2016
10
0
1
33
Hello, I have seen this problem happening two times on two different and unrelated servers.
Here is my configuration:
Dedicated server with proxmox 4.2
only 1 VM running inside, with Debian 8.4, that takes ~80% RAM and 6 of 8 physical cores.
At 3:00 AM a NFS backup starts; after that I can no longer reach the website hosted on the VM.
Symptoms are these:
-zabbix agent inside the VM reports low cpu usage.
-ram is not full
-proxmox server views 100% for each core (so 600% CPU usage)
-no KVM connection possible
-no TCP connection possible (network is natted and I used KVM port forwarding)
-after KVM stop and restart, everything is back to normal.

In my log I see lines like:


Jun 3 10:24:10 ns326125 kernel: [ 480.345124] lzop D ffff88082904fa58 0 1950 1926 0x00000000
Jun 3 10:24:10 ns326125 kernel: [ 480.345127] ffff88082904fa58 ffffffffffffffff ffff88085b9c1b80 ffff88084d15d280
Jun 3 10:24:10 ns326125 kernel: [ 480.345129] ffff880829050000 ffff88087fc97180 7fffffffffffffff ffffffff81845170
Jun 3 10:24:10 ns326125 kernel: [ 480.345130] ffff88082904fbb0 ffff88082904fa70 ffffffff81844885 0000000000000000
Jun 3 10:24:10 ns326125 kernel: [ 480.345131] Call Trace:
Jun 3 10:24:10 ns326125 kernel: [ 480.345137] [<ffffffff81845170>] ? bit_wait_timeout+0xa0/0xa0
Jun 3 10:24:10 ns326125 kernel: [ 480.345139] [<ffffffff81844885>] schedule+0x35/0x80
Jun 3 10:24:10 ns326125 kernel: [ 480.345141] [<ffffffff81847ac5>] schedule_timeout+0x235/0x2d0
Jun 3 10:24:10 ns326125 kernel: [ 480.345143] [<ffffffff811eadf1>] ? ___slab_alloc+0x341/0x430
Jun 3 10:24:10 ns326125 kernel: [ 480.345146] [<ffffffff810f568c>] ? ktime_get+0x3c/0xb0
Jun 3 10:24:10 ns326125 kernel: [ 480.345147] [<ffffffff81845170>] ? bit_wait_timeout+0xa0/0xa0
Jun 3 10:24:10 ns326125 kernel: [ 480.345149] [<ffffffff81843d9b>] io_schedule_timeout+0xbb/0x140
Jun 3 10:24:10 ns326125 kernel: [ 480.345150] [<ffffffff8184518b>] bit_wait_io+0x1b/0x70
Jun 3 10:24:10 ns326125 kernel: [ 480.345151] [<ffffffff81844c3f>] __wait_on_bit+0x5f/0x90
Jun 3 10:24:10 ns326125 kernel: [ 480.345153] [<ffffffff81845170>] ? bit_wait_timeout+0xa0/0xa0
Jun 3 10:24:10 ns326125 kernel: [ 480.345154] [<ffffffff81844cf1>] out_of_line_wait_on_bit+0x81/0xb0
Jun 3 10:24:10 ns326125 kernel: [ 480.345157] [<ffffffff810c3dc0>] ? autoremove_wake_function+0x40/0x40
Jun 3 10:24:10 ns326125 kernel: [ 480.345168] [<ffffffffc08f7c94>] nfs_wait_on_request+0x34/0x40 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345174] [<ffffffffc08fc9ce>] nfs_updatepage+0x15e/0x920 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345179] [<ffffffffc08ecdd4>] nfs_write_end+0x154/0x500 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345181] [<ffffffff813fed9f>] ? iov_iter_copy_from_user_atomic+0x8f/0x230
Jun 3 10:24:10 ns326125 kernel: [ 480.345184] [<ffffffff8118cf94>] generic_perform_write+0x114/0x1c0
Jun 3 10:24:10 ns326125 kernel: [ 480.345186] [<ffffffff8118f126>] __generic_file_write_iter+0x1a6/0x1f0
Jun 3 10:24:10 ns326125 kernel: [ 480.345188] [<ffffffff812297b3>] ? touch_atime+0x33/0xd0
Jun 3 10:24:10 ns326125 kernel: [ 480.345189] [<ffffffff8118f254>] generic_file_write_iter+0xe4/0x1e0
Jun 3 10:24:10 ns326125 kernel: [ 480.345193] [<ffffffffc08ec54a>] nfs_file_write+0x9a/0x160 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345195] [<ffffffff8120c1bb>] new_sync_write+0x9b/0xe0
Jun 3 10:24:10 ns326125 kernel: [ 480.345197] [<ffffffff8120c226>] __vfs_write+0x26/0x40
Jun 3 10:24:10 ns326125 kernel: [ 480.345198] [<ffffffff8120c899>] vfs_write+0xa9/0x190
Jun 3 10:24:10 ns326125 kernel: [ 480.345199] [<ffffffff8120c7d4>] ? vfs_read+0x114/0x130
Jun 3 10:24:10 ns326125 kernel: [ 480.345201] [<ffffffff8120d675>] SyS_write+0x55/0xc0
Jun 3 10:24:10 ns326125 kernel: [ 480.345203] [<ffffffff818489b6>] entry_SYSCALL_64_fastpath+0x16/0x75


Jun 3 04:47:25 ns326125 kernel: [574672.937159] audit: type=1400 audit(1464922045.521:227): apparmor="DENIED" operation="sendmsg" profile="/usr/sbin/named" name="/run/systemd/journal/dev-log" pid=1567 comm="named" requested_mask="w" denied_mask="w" fsuid=109 ouid=0
Jun 3 04:47:26 ns326125 kernel: [574674.029074] audit: type=1400 audit(1464922046.613:228): apparmor="DENIED" operation="sendmsg" profile="/usr/sbin/named" name="/run/systemd/journal/dev-log" pid=1568 comm="named" requested_mask="w" denied_mask="w" fsuid=109 ouid=0



any idea??

thanks
 
sorry, actually CPU usage inside VM was not low. That was a wrong assumption.


here it is some recovered log from the VM. backup starts at 03:00


Jun 3 03:01:14 prestashop-impact kernel: [313013.861098] ata3: hard resetting link
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] sending NMI to all CPUs:
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] NMI backtrace for cpu 1
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] task: ffff8806f9ecca20 ti: ffff8806f9ed4000 task.ti: ffff8806f9ed4000
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RIP: 0010:[<ffffffff81051ad6>] [<ffffffff81051ad6>] native_write_msr_safe+0x6/0x10
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RSP: 0018:ffff880715c43e08 EFLAGS: 00000046
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RAX: 0000000000000400 RBX: 0000000000000001 RCX: 0000000000000830
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RDX: 0000000000000001 RSI: 0000000000000400 RDI: 0000000000000830
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RBP: ffffffff818e28e0 R08: ffffffff818e28e0 R09: 00000000000037fe
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] R10: 61206f7420494d4e R11: 3a73555043206c6c R12: 0000000000000001
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] R13: 000000000000a0ea R14: 0000000000000002 R15: 0000000000080000
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] FS: 0000000000000000(0000) GS:ffff880715c40000(0000) knlGS:0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] CR2: 00007f0ff7b4a000 CR3: 00000006c5c91000 CR4: 00000000000006e0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Stack:
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] ffffffff81049f6d 0000000000000002 0000000000000082 ffff880715c4d660
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] ffffffff81853780 0000000000000001 ffffffff818e2e40 0000000000000047
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] ffffffff81853780 ffffffff810469c3 ffff880715c4d660 ffffffff810c5add
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Call Trace:
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] <IRQ>
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81049f6d>] ? __x2apic_send_IPI_mask+0xad/0xe0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810469c3>] ? arch_trigger_all_cpu_backtrace+0xc3/0x140
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810c5add>] ? rcu_check_callbacks+0x60d/0x630
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810d02d0>] ? tick_sched_handle.isra.16+0x60/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81074ba0>] ? update_process_times+0x40/0x70
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810d0290>] ? tick_sched_handle.isra.16+0x20/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810d030c>] ? tick_sched_timer+0x3c/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8108b187>] ? __run_hrtimer+0x67/0x1c0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8108b539>] ? hrtimer_interrupt+0xe9/0x220
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff815178eb>] ? smp_apic_timer_interrupt+0x3b/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8151597d>] ? apic_timer_interrupt+0x6d/0x80
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] <EOI>
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81099c35>] ? sched_clock_local+0x15/0x80
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8101c8b0>] ? idle_notifier_unregister+0x20/0x20
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81051c12>] ? native_safe_halt+0x2/0x10
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8101c8c9>] ? default_idle+0x19/0xb0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810a83e0>] ? cpu_startup_entry+0x340/0x400
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81042cbf>] ? start_secondary+0x20f/0x2d0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Code: c2 48 89 d0 c3 89 f9 0f 32 31 c9 48 c1 e2 20 89 c0 89 0e 48 09 c2 48 89 d0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 89 f0 89 f9 0f 30 <31> c0 c3 0f 1f 80 00 00 00 00 89 f9 0f 33 48 c1 e2 20 89 c0 48
Jun 3 03:01:37 prestashop-impact kernel: [313035.971500] NMI backtrace for cpu 0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971503] CPU: 0 PID: 242 Comm: vballoon Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
Jun 3 03:01:37 prestashop-impact kernel: [313035.971504] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Jun 3 03:01:37 prestashop-impact kernel: [313035.971505] task: ffff8806f66ce390 ti: ffff8806f8e68000 task.ti: ffff8806f8e68000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971506] RIP: 0010:[<ffffffffa016e486>] [<ffffffffa016e486>] e1000_intr+0x16/0x120 [e1000]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971530] RSP: 0018:ffff880715c03ef0 EFLAGS: 00000086
Jun 3 03:01:37 prestashop-impact kernel: [313035.971531] RAX: 0000000000000080 RBX: ffff8806f5c02000 RCX: 000000000000ac2f
Jun 3 03:01:37 prestashop-impact kernel: [313035.971532] RDX: 000000000000ac2f RSI: ffff8806f5c02000 RDI: 000000000000000a
Jun 3 03:01:37 prestashop-impact kernel: [313035.971533] RBP: 000000000000000a R08: 0000000000000000 R09: ffff880714c00008
Jun 3 03:01:37 prestashop-impact kernel: [313035.971533] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000003a
Jun 3 03:01:37 prestashop-impact kernel: [313035.971534] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971535] FS: 0000000000000000(0000) GS:ffff880715c00000(0000) knlGS:0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971536] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 3 03:01:37 prestashop-impact kernel: [313035.971537] CR2: 00007f6315cff000 CR3: 00000006f6a28000 CR4: 00000000000006f0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971539] Stack:
Jun 3 03:01:37 prestashop-impact kernel: [313035.971539] ffff8806f618a7c0 ffffffff810bb375 ffff880715007800 ffff880715007800
Jun 3 03:01:37 prestashop-impact kernel: [313035.971541] ffff8807150078c4 000000000000003a 000000000000003a 0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971542] 0000000000000000 ffffffff810bb508 ffff880715007800 ffffffff818e2a00
Jun 3 03:01:37 prestashop-impact kernel: [313035.971543] Call Trace:
Jun 3 03:01:37 prestashop-impact kernel: [313035.971544] <IRQ>
Jun 3 03:01:37 prestashop-impact kernel: [313035.971545] [<ffffffff810bb375>] ? handle_irq_event_percpu+0x35/0x190
Jun 3 03:01:37 prestashop-impact kernel: [313035.971549] [<ffffffff810bb508>] ? handle_irq_event+0x38/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313035.971551] [<ffffffff810be743>] ? handle_fasteoi_irq+0x83/0x150
Jun 3 03:01:37 prestashop-impact kernel: [313035.971554] [<ffffffff810150cd>] ? handle_irq+0x1d/0x30
Jun 3 03:01:37 prestashop-impact kernel: [313035.971556] [<ffffffff81517819>] ? do_IRQ+0x49/0xe0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971558] [<ffffffff8151566d>] ? common_interrupt+0x6d/0x6d
Jun 3 03:01:37 prestashop-impact kernel: [313035.971559] <EOI>
Jun 3 03:01:37 prestashop-impact kernel: [313035.971559] [<ffffffff812bf4cb>] ? iowrite8+0x1b/0x40
Jun 3 03:01:37 prestashop-impact kernel: [313035.971570] [<ffffffffa018d0ae>] ? vp_set+0x4e/0x60 [virtio_pci]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971572] [<ffffffffa026a65a>] ? balloon+0xea/0x2a0 [virtio_balloon]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971573] [<ffffffffa026a570>] ? update_balloon_stats+0xe0/0xe0 [virtio_balloon]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971576] [<ffffffff8108809d>] ? kthread+0xbd/0xe0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971577] [<ffffffff81087fe0>] ? kthread_create_on_node+0x180/0x180
Jun 3 03:01:37 prestashop-impact kernel: [313035.971579] [<ffffffff81514958>] ? ret_from_fork+0x58/0x90
Jun 3 03:01:37 prestashop-impact kernel: [313035.971581] [<ffffffff81087fe0>] ? kthread_create_on_node+0x180/0x180
Jun 3 03:01:37 prestashop-impact kernel: [313035.971581] Code: 51 94 ef e0 48 c7 c7 38 4b 18 a0 31 c0 e9 95 d8 39 e1 0f 1f 00 0f 1f 44 00 00 53 48 89 f3 48 8b 86 28 0c 00 00 8b 80 c0 00 00 00 <85> c0 0f 84 f1 00 00 00 48 8b 96 10 10 00 00 83 e2 04 0f 85 a2
Jun 3 03:01:37 prestashop-impact kernel: [313014.200064] NMI backtrace for cpu 3
Jun 3 03:01:37 prestashop-impact kernel: [313014.200064] CPU: 3 PID: 11431 Comm: kworker/3:1 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
Jun 3 03:01:37 prestashop-impact kernel: [313014.200064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!