100% CPU on Host after scheduled backup

Enrico90

New Member
Jun 3, 2016
10
0
1
34
Hello, I have seen this problem happening two times on two different and unrelated servers.
Here is my configuration:
Dedicated server with proxmox 4.2
only 1 VM running inside, with Debian 8.4, that takes ~80% RAM and 6 of 8 physical cores.
At 3:00 AM a NFS backup starts; after that I can no longer reach the website hosted on the VM.
Symptoms are these:
-zabbix agent inside the VM reports low cpu usage.
-ram is not full
-proxmox server views 100% for each core (so 600% CPU usage)
-no KVM connection possible
-no TCP connection possible (network is natted and I used KVM port forwarding)
-after KVM stop and restart, everything is back to normal.

In my log I see lines like:


Jun 3 10:24:10 ns326125 kernel: [ 480.345124] lzop D ffff88082904fa58 0 1950 1926 0x00000000
Jun 3 10:24:10 ns326125 kernel: [ 480.345127] ffff88082904fa58 ffffffffffffffff ffff88085b9c1b80 ffff88084d15d280
Jun 3 10:24:10 ns326125 kernel: [ 480.345129] ffff880829050000 ffff88087fc97180 7fffffffffffffff ffffffff81845170
Jun 3 10:24:10 ns326125 kernel: [ 480.345130] ffff88082904fbb0 ffff88082904fa70 ffffffff81844885 0000000000000000
Jun 3 10:24:10 ns326125 kernel: [ 480.345131] Call Trace:
Jun 3 10:24:10 ns326125 kernel: [ 480.345137] [<ffffffff81845170>] ? bit_wait_timeout+0xa0/0xa0
Jun 3 10:24:10 ns326125 kernel: [ 480.345139] [<ffffffff81844885>] schedule+0x35/0x80
Jun 3 10:24:10 ns326125 kernel: [ 480.345141] [<ffffffff81847ac5>] schedule_timeout+0x235/0x2d0
Jun 3 10:24:10 ns326125 kernel: [ 480.345143] [<ffffffff811eadf1>] ? ___slab_alloc+0x341/0x430
Jun 3 10:24:10 ns326125 kernel: [ 480.345146] [<ffffffff810f568c>] ? ktime_get+0x3c/0xb0
Jun 3 10:24:10 ns326125 kernel: [ 480.345147] [<ffffffff81845170>] ? bit_wait_timeout+0xa0/0xa0
Jun 3 10:24:10 ns326125 kernel: [ 480.345149] [<ffffffff81843d9b>] io_schedule_timeout+0xbb/0x140
Jun 3 10:24:10 ns326125 kernel: [ 480.345150] [<ffffffff8184518b>] bit_wait_io+0x1b/0x70
Jun 3 10:24:10 ns326125 kernel: [ 480.345151] [<ffffffff81844c3f>] __wait_on_bit+0x5f/0x90
Jun 3 10:24:10 ns326125 kernel: [ 480.345153] [<ffffffff81845170>] ? bit_wait_timeout+0xa0/0xa0
Jun 3 10:24:10 ns326125 kernel: [ 480.345154] [<ffffffff81844cf1>] out_of_line_wait_on_bit+0x81/0xb0
Jun 3 10:24:10 ns326125 kernel: [ 480.345157] [<ffffffff810c3dc0>] ? autoremove_wake_function+0x40/0x40
Jun 3 10:24:10 ns326125 kernel: [ 480.345168] [<ffffffffc08f7c94>] nfs_wait_on_request+0x34/0x40 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345174] [<ffffffffc08fc9ce>] nfs_updatepage+0x15e/0x920 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345179] [<ffffffffc08ecdd4>] nfs_write_end+0x154/0x500 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345181] [<ffffffff813fed9f>] ? iov_iter_copy_from_user_atomic+0x8f/0x230
Jun 3 10:24:10 ns326125 kernel: [ 480.345184] [<ffffffff8118cf94>] generic_perform_write+0x114/0x1c0
Jun 3 10:24:10 ns326125 kernel: [ 480.345186] [<ffffffff8118f126>] __generic_file_write_iter+0x1a6/0x1f0
Jun 3 10:24:10 ns326125 kernel: [ 480.345188] [<ffffffff812297b3>] ? touch_atime+0x33/0xd0
Jun 3 10:24:10 ns326125 kernel: [ 480.345189] [<ffffffff8118f254>] generic_file_write_iter+0xe4/0x1e0
Jun 3 10:24:10 ns326125 kernel: [ 480.345193] [<ffffffffc08ec54a>] nfs_file_write+0x9a/0x160 [nfs]
Jun 3 10:24:10 ns326125 kernel: [ 480.345195] [<ffffffff8120c1bb>] new_sync_write+0x9b/0xe0
Jun 3 10:24:10 ns326125 kernel: [ 480.345197] [<ffffffff8120c226>] __vfs_write+0x26/0x40
Jun 3 10:24:10 ns326125 kernel: [ 480.345198] [<ffffffff8120c899>] vfs_write+0xa9/0x190
Jun 3 10:24:10 ns326125 kernel: [ 480.345199] [<ffffffff8120c7d4>] ? vfs_read+0x114/0x130
Jun 3 10:24:10 ns326125 kernel: [ 480.345201] [<ffffffff8120d675>] SyS_write+0x55/0xc0
Jun 3 10:24:10 ns326125 kernel: [ 480.345203] [<ffffffff818489b6>] entry_SYSCALL_64_fastpath+0x16/0x75


Jun 3 04:47:25 ns326125 kernel: [574672.937159] audit: type=1400 audit(1464922045.521:227): apparmor="DENIED" operation="sendmsg" profile="/usr/sbin/named" name="/run/systemd/journal/dev-log" pid=1567 comm="named" requested_mask="w" denied_mask="w" fsuid=109 ouid=0
Jun 3 04:47:26 ns326125 kernel: [574674.029074] audit: type=1400 audit(1464922046.613:228): apparmor="DENIED" operation="sendmsg" profile="/usr/sbin/named" name="/run/systemd/journal/dev-log" pid=1568 comm="named" requested_mask="w" denied_mask="w" fsuid=109 ouid=0



any idea??

thanks
 
sorry, actually CPU usage inside VM was not low. That was a wrong assumption.


here it is some recovered log from the VM. backup starts at 03:00


Jun 3 03:01:14 prestashop-impact kernel: [313013.861098] ata3: hard resetting link
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] sending NMI to all CPUs:
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] NMI backtrace for cpu 1
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] task: ffff8806f9ecca20 ti: ffff8806f9ed4000 task.ti: ffff8806f9ed4000
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RIP: 0010:[<ffffffff81051ad6>] [<ffffffff81051ad6>] native_write_msr_safe+0x6/0x10
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RSP: 0018:ffff880715c43e08 EFLAGS: 00000046
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RAX: 0000000000000400 RBX: 0000000000000001 RCX: 0000000000000830
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RDX: 0000000000000001 RSI: 0000000000000400 RDI: 0000000000000830
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] RBP: ffffffff818e28e0 R08: ffffffff818e28e0 R09: 00000000000037fe
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] R10: 61206f7420494d4e R11: 3a73555043206c6c R12: 0000000000000001
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] R13: 000000000000a0ea R14: 0000000000000002 R15: 0000000000080000
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] FS: 0000000000000000(0000) GS:ffff880715c40000(0000) knlGS:0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] CR2: 00007f0ff7b4a000 CR3: 00000006c5c91000 CR4: 00000000000006e0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Stack:
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] ffffffff81049f6d 0000000000000002 0000000000000082 ffff880715c4d660
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] ffffffff81853780 0000000000000001 ffffffff818e2e40 0000000000000047
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] ffffffff81853780 ffffffff810469c3 ffff880715c4d660 ffffffff810c5add
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Call Trace:
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] <IRQ>
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81049f6d>] ? __x2apic_send_IPI_mask+0xad/0xe0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810469c3>] ? arch_trigger_all_cpu_backtrace+0xc3/0x140
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810c5add>] ? rcu_check_callbacks+0x60d/0x630
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810d02d0>] ? tick_sched_handle.isra.16+0x60/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81074ba0>] ? update_process_times+0x40/0x70
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810d0290>] ? tick_sched_handle.isra.16+0x20/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810d030c>] ? tick_sched_timer+0x3c/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8108b187>] ? __run_hrtimer+0x67/0x1c0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8108b539>] ? hrtimer_interrupt+0xe9/0x220
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff815178eb>] ? smp_apic_timer_interrupt+0x3b/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8151597d>] ? apic_timer_interrupt+0x6d/0x80
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] <EOI>
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81099c35>] ? sched_clock_local+0x15/0x80
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8101c8b0>] ? idle_notifier_unregister+0x20/0x20
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81051c12>] ? native_safe_halt+0x2/0x10
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff8101c8c9>] ? default_idle+0x19/0xb0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff810a83e0>] ? cpu_startup_entry+0x340/0x400
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] [<ffffffff81042cbf>] ? start_secondary+0x20f/0x2d0
Jun 3 03:01:37 prestashop-impact kernel: [313034.880014] Code: c2 48 89 d0 c3 89 f9 0f 32 31 c9 48 c1 e2 20 89 c0 89 0e 48 09 c2 48 89 d0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 89 f0 89 f9 0f 30 <31> c0 c3 0f 1f 80 00 00 00 00 89 f9 0f 33 48 c1 e2 20 89 c0 48
Jun 3 03:01:37 prestashop-impact kernel: [313035.971500] NMI backtrace for cpu 0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971503] CPU: 0 PID: 242 Comm: vballoon Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
Jun 3 03:01:37 prestashop-impact kernel: [313035.971504] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Jun 3 03:01:37 prestashop-impact kernel: [313035.971505] task: ffff8806f66ce390 ti: ffff8806f8e68000 task.ti: ffff8806f8e68000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971506] RIP: 0010:[<ffffffffa016e486>] [<ffffffffa016e486>] e1000_intr+0x16/0x120 [e1000]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971530] RSP: 0018:ffff880715c03ef0 EFLAGS: 00000086
Jun 3 03:01:37 prestashop-impact kernel: [313035.971531] RAX: 0000000000000080 RBX: ffff8806f5c02000 RCX: 000000000000ac2f
Jun 3 03:01:37 prestashop-impact kernel: [313035.971532] RDX: 000000000000ac2f RSI: ffff8806f5c02000 RDI: 000000000000000a
Jun 3 03:01:37 prestashop-impact kernel: [313035.971533] RBP: 000000000000000a R08: 0000000000000000 R09: ffff880714c00008
Jun 3 03:01:37 prestashop-impact kernel: [313035.971533] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000003a
Jun 3 03:01:37 prestashop-impact kernel: [313035.971534] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971535] FS: 0000000000000000(0000) GS:ffff880715c00000(0000) knlGS:0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971536] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 3 03:01:37 prestashop-impact kernel: [313035.971537] CR2: 00007f6315cff000 CR3: 00000006f6a28000 CR4: 00000000000006f0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971539] Stack:
Jun 3 03:01:37 prestashop-impact kernel: [313035.971539] ffff8806f618a7c0 ffffffff810bb375 ffff880715007800 ffff880715007800
Jun 3 03:01:37 prestashop-impact kernel: [313035.971541] ffff8807150078c4 000000000000003a 000000000000003a 0000000000000000
Jun 3 03:01:37 prestashop-impact kernel: [313035.971542] 0000000000000000 ffffffff810bb508 ffff880715007800 ffffffff818e2a00
Jun 3 03:01:37 prestashop-impact kernel: [313035.971543] Call Trace:
Jun 3 03:01:37 prestashop-impact kernel: [313035.971544] <IRQ>
Jun 3 03:01:37 prestashop-impact kernel: [313035.971545] [<ffffffff810bb375>] ? handle_irq_event_percpu+0x35/0x190
Jun 3 03:01:37 prestashop-impact kernel: [313035.971549] [<ffffffff810bb508>] ? handle_irq_event+0x38/0x60
Jun 3 03:01:37 prestashop-impact kernel: [313035.971551] [<ffffffff810be743>] ? handle_fasteoi_irq+0x83/0x150
Jun 3 03:01:37 prestashop-impact kernel: [313035.971554] [<ffffffff810150cd>] ? handle_irq+0x1d/0x30
Jun 3 03:01:37 prestashop-impact kernel: [313035.971556] [<ffffffff81517819>] ? do_IRQ+0x49/0xe0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971558] [<ffffffff8151566d>] ? common_interrupt+0x6d/0x6d
Jun 3 03:01:37 prestashop-impact kernel: [313035.971559] <EOI>
Jun 3 03:01:37 prestashop-impact kernel: [313035.971559] [<ffffffff812bf4cb>] ? iowrite8+0x1b/0x40
Jun 3 03:01:37 prestashop-impact kernel: [313035.971570] [<ffffffffa018d0ae>] ? vp_set+0x4e/0x60 [virtio_pci]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971572] [<ffffffffa026a65a>] ? balloon+0xea/0x2a0 [virtio_balloon]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971573] [<ffffffffa026a570>] ? update_balloon_stats+0xe0/0xe0 [virtio_balloon]
Jun 3 03:01:37 prestashop-impact kernel: [313035.971576] [<ffffffff8108809d>] ? kthread+0xbd/0xe0
Jun 3 03:01:37 prestashop-impact kernel: [313035.971577] [<ffffffff81087fe0>] ? kthread_create_on_node+0x180/0x180
Jun 3 03:01:37 prestashop-impact kernel: [313035.971579] [<ffffffff81514958>] ? ret_from_fork+0x58/0x90
Jun 3 03:01:37 prestashop-impact kernel: [313035.971581] [<ffffffff81087fe0>] ? kthread_create_on_node+0x180/0x180
Jun 3 03:01:37 prestashop-impact kernel: [313035.971581] Code: 51 94 ef e0 48 c7 c7 38 4b 18 a0 31 c0 e9 95 d8 39 e1 0f 1f 00 0f 1f 44 00 00 53 48 89 f3 48 8b 86 28 0c 00 00 8b 80 c0 00 00 00 <85> c0 0f 84 f1 00 00 00 48 8b 96 10 10 00 00 83 e2 04 0f 85 a2
Jun 3 03:01:37 prestashop-impact kernel: [313014.200064] NMI backtrace for cpu 3
Jun 3 03:01:37 prestashop-impact kernel: [313014.200064] CPU: 3 PID: 11431 Comm: kworker/3:1 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
Jun 3 03:01:37 prestashop-impact kernel: [313014.200064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014