Testsystem with lxc, strange load

hight

Jan 18, 2016
Hi guys, I'm coming from Xen and am currently testing Proxmox 4 with LXC containers on LVM. I'm noticing fairly high load at random, lasting minutes or hours, with nothing productive running.
It's just a Debian 8 host with one container running (practically no activity in it, just a basic PHP 7 test system).
[Attachments: five screenshots dated 2016-01-20 (forum page and byobu terminal sessions)]

Rebooting the VM brings the load back down to 0.0x, and top shows no activity in the VM until, after a random amount of time, the load jumps up again to somewhere between 2 and 5. Any idea what might be going wrong here?
 
This happens on a second server too, with the same hardware except SSDs instead of HDDs and ZFS instead of LVM.
That host is also Debian 8. Same issue there: at some point the server starts generating load that can't be traced to any process, and it stays that way until it's restarted.
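
For anyone trying to track down load like this: the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so load can climb while top shows no CPU activity. Below is a minimal Python sketch that lists D-state processes by reading /proc/<pid>/stat. The function names are my own, and this is just one way to look for blocked tasks, not something Proxmox ships.

```python
import os

def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    left, right = text.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state

def blocked_tasks(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process exited (or was unreadable) while scanning
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Running it a few times while the load is high shows whether the same PIDs stay stuck in D, which is more telling than a single snapshot.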

[Attachment: Munin screenshot for localhost.localdomain, dated 2016-01-24]

In logs:
Code:
Jan 28 11:04:58 forum kernel: [163784.545088] INFO: task ps:27839 blocked for more than 120 seconds.
Jan 28 11:04:58 forum kernel: [163784.545117]       Tainted: P           O    4.2.6-1-pve #1
Jan 28 11:04:58 forum kernel: [163784.545140] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 28 11:04:58 forum kernel: [163784.545181] ps              D ffff881031516a00     0 27839  12383 0x00000104
Jan 28 11:04:58 forum kernel: [163784.545184]  ffff8803260dbbe8 0000000000000082 ffff880fea3be740 ffff880f28cabb00
Jan 28 11:04:58 forum kernel: [163784.545186]  0000000000000246 ffff8803260dc000 ffff8803260dbc38 ffff880fd5130a30
Jan 28 11:04:58 forum kernel: [163784.545187]  ffff880fd825c840 fffffffffffffe00 ffff8803260dbc08 ffffffff81804257
Jan 28 11:04:58 forum kernel: [163784.545188] Call Trace:
Jan 28 11:04:58 forum kernel: [163784.545193]  [<ffffffff81804257>] schedule+0x37/0x80
Jan 28 11:04:58 forum kernel: [163784.545196]  [<ffffffff812fa8d3>] request_wait_answer+0x163/0x280
Jan 28 11:04:58 forum kernel: [163784.545198]  [<ffffffff810bd790>] ? wait_woken+0x90/0x90
Jan 28 11:04:58 forum kernel: [163784.545200]  [<ffffffff812faa80>] __fuse_request_send+0x90/0xa0
Jan 28 11:04:58 forum kernel: [163784.545201]  [<ffffffff812faab7>] fuse_request_send+0x27/0x30
Jan 28 11:04:58 forum kernel: [163784.545202]  [<ffffffff81305198>] fuse_direct_io+0x3a8/0x5b0
Jan 28 11:04:58 forum kernel: [163784.545203]  [<ffffffff813053e4>] __fuse_direct_read+0x44/0x60
Jan 28 11:04:58 forum kernel: [163784.545205]  [<ffffffff81305440>] fuse_direct_read_iter+0x40/0x60
Jan 28 11:04:58 forum kernel: [163784.545206]  [<ffffffff811fca34>] new_sync_read+0x94/0xd0
Jan 28 11:04:58 forum kernel: [163784.545209]  [<ffffffff811fdfb5>] SyS_read+0x55/0xc0
Jan 28 11:06:58 forum kernel: [163904.358279] INFO: task ps:27839 blocked for more than 120 seconds.
Jan 28 11:06:58 forum kernel: [163904.358307]       Tainted: P           O    4.2.6-1-pve #1
Jan 28 11:06:58 forum kernel: [163904.358331] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 28 11:06:58 forum kernel: [163904.358371] ps              D ffff881031516a00     0 27839  12383 0x00000104
Jan 28 11:06:58 forum kernel: [163904.358374]  ffff8803260dbbe8 0000000000000082 ffff880fea3be740 ffff880f28cabb00
Jan 28 11:06:58 forum kernel: [163904.358376]  0000000000000246 ffff8803260dc000 ffff8803260dbc38 ffff880fd5130a30
Jan 28 11:06:58 forum kernel: [163904.358377]  ffff880fd825c840 fffffffffffffe00 ffff8803260dbc08 ffffffff81804257
Jan 28 11:06:58 forum kernel: [163904.358378] Call Trace:
Jan 28 11:06:58 forum kernel: [163904.358383]  [<ffffffff81804257>] schedule+0x37/0x80
Jan 28 11:06:58 forum kernel: [163904.358386]  [<ffffffff812fa8d3>] request_wait_answer+0x163/0x280
Jan 28 11:06:58 forum kernel: [163904.358388]  [<ffffffff810bd790>] ? wait_woken+0x90/0x90
Jan 28 11:06:58 forum kernel: [163904.358390]  [<ffffffff812faa80>] __fuse_request_send+0x90/0xa0
Jan 28 11:06:58 forum kernel: [163904.358391]  [<ffffffff812faab7>] fuse_request_send+0x27/0x30
Jan 28 11:06:58 forum kernel: [163904.358392]  [<ffffffff81305198>] fuse_direct_io+0x3a8/0x5b0
Jan 28 11:06:58 forum kernel: [163904.358393]  [<ffffffff813053e4>] __fuse_direct_read+0x44/0x60
Jan 28 11:06:58 forum kernel: [163904.358395]  [<ffffffff81305440>] fuse_direct_read_iter+0x40/0x60
Jan 28 11:06:58 forum kernel: [163904.358397]  [<ffffffff811fca34>] new_sync_read+0x94/0xd0
Jan 28 11:06:58 forum kernel: [163904.358398]  [<ffffffff811fca96>] __vfs_read+0x26/0x40
Jan 28 11:06:58 forum kernel: [163904.358399]  [<ffffffff811fd0ea>] vfs_read+0x8a/0x130
Jan 28 11:06:58 forum kernel: [163904.358400]  [<ffffffff811fdfb5>] SyS_read+0x55/0xc0
Jan 28 11:06:58 forum kernel: [163904.358402]  [<ffffffff81808372>] entry_SYSCALL_64_fastpath+0x16/0x75
Jan 28 11:08:58 forum kernel: [164024.171239] INFO: task ps:27839 blocked for more than 120 seconds.
Jan 28 11:08:58 forum kernel: [164024.171267]       Tainted: P           O    4.2.6-1-pve #1
Jan 28 11:08:58 forum kernel: [164024.171290] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 28 11:08:58 forum kernel: [164024.171331] ps              D ffff881031516a00     0 27839  12383 0x00000104
Jan 28 11:08:58 forum kernel: [164024.171334]  ffff8803260dbbe8 0000000000000082 ffff880fea3be740 ffff880f28cabb00
Jan 28 11:08:58 forum kernel: [164024.171335]  0000000000000246 ffff8803260dc000 ffff8803260dbc38 ffff880fd5130a30
Jan 28 11:08:58 forum kernel: [164024.171336]  ffff880fd825c840 fffffffffffffe00 ffff8803260dbc08 ffffffff81804257
Jan 28 11:08:58 forum kernel: [164024.171338] Call Trace:
Jan 28 11:08:58 forum kernel: [164024.171343]  [<ffffffff81804257>] schedule+0x37/0x80
Jan 28 11:08:58 forum kernel: [164024.171345]  [<ffffffff812fa8d3>] request_wait_answer+0x163/0x280
Jan 28 11:08:58 forum kernel: [164024.171348]  [<ffffffff810bd790>] ? wait_woken+0x90/0x90
Jan 28 11:08:58 forum kernel: [164024.171350]  [<ffffffff812faa80>] __fuse_request_send+0x90/0xa0
Jan 28 11:08:58 forum kernel: [164024.171351]  [<ffffffff812faab7>] fuse_request_send+0x27/0x30
Jan 28 11:08:58 forum kernel: [164024.171352]  [<ffffffff81305198>] fuse_direct_io+0x3a8/0x5b0
Jan 28 11:08:58 forum kernel: [164024.171353]  [<ffffffff813053e4>] __fuse_direct_read+0x44/0x60
Jan 28 11:08:58 forum kernel: [164024.171354]  [<ffffffff81305440>] fuse_direct_read_iter+0x40/0x60
Jan 28 11:08:58 forum kernel: [164024.171356]  [<ffffffff811fca34>] new_sync_read+0x94/0xd0
Jan 28 11:08:58 forum kernel: [164024.171357]  [<ffffffff811fca96>] __vfs_read+0x26/0x40
Jan 28 11:08:58 forum kernel: [164024.171358]  [<ffffffff811fd0ea>] vfs_read+0x8a/0x130
Jan 28 11:08:58 forum kernel: [164024.171359]  [<ffffffff811fdfb5>] SyS_read+0x55/0xc0
Jan 28 11:08:58 forum kernel: [164024.171361]  [<ffffffff81808372>] entry_SYSCALL_64_fastpath+0x16/0x75
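
The reports above repeat every two minutes for the same task (ps, PID 27839), and each trace runs through the kernel's FUSE read path. Since lxcfs is a FUSE filesystem, that would be consistent with ps waiting on an lxcfs-backed file, though the log alone doesn't prove it. Here is a rough Python sketch for summarizing such reports from a syslog dump. The regexes and function names are my own assumptions, fitted to the line format shown here.

```python
import re

# "INFO: task ps:27839 blocked for more than 120 seconds."
HUNG = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# "[<ffffffff812faa80>] __fuse_request_send+0x90/0xa0"
# Frames marked "?" are unreliable stack entries and are deliberately not matched.
FRAME = re.compile(r"\[<[0-9a-f]+>\] (?P<func>[\w.]+)\+0x")

def summarize(lines):
    """Group call-trace frames under the hung-task report that precedes them."""
    reports = []
    current = None
    for line in lines:
        m = HUNG.search(line)
        if m:
            current = {
                "comm": m["comm"],
                "pid": int(m["pid"]),
                "secs": int(m["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        f = FRAME.search(line)
        if f and current is not None:
            current["frames"].append(f["func"])
    return reports

def waits_on_fuse(report):
    """True if any reliable frame in the trace is a FUSE function."""
    return any("fuse" in func for func in report["frames"])
```

Note that this simple version attaches every later frame to the most recent hung-task report, so it should be fed only the hung-task section of the log, not the NMI backtraces that follow.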
 
Code:
Jan 28 12:10:58 jessie kernel: [164143.984328]       Tainted: P           O    4.2.6-1-pve #1
Jan 28 12:10:58 jessie kernel: [164143.984395]  ffff8803260dbbe8 0000000000000082 ffff880fea3be740 ffff880f28cabb00
Jan 28 12:10:58 jessie kernel: [164143.984398]  ffff880fd825c840 fffffffffffffe00 ffff8803260dbc08 ffffffff81804257
Jan 28 12:10:58 jessie kernel: [164143.984404]  [<ffffffff81804257>] schedule+0x37/0x80
Jan 28 12:10:58 jessie kernel: [164143.984410]  [<ffffffff810bd790>] ? wait_woken+0x90/0x90
Jan 28 12:10:58 jessie kernel: [164143.984413]  [<ffffffff812faab7>] fuse_request_send+0x27/0x30
Jan 28 12:10:58 jessie kernel: [164143.984415]  [<ffffffff813053e4>] __fuse_direct_read+0x44/0x60
Jan 28 12:10:58 jessie kernel: [164143.984418]  [<ffffffff811fca34>] new_sync_read+0x94/0xd0
Jan 28 12:10:58 jessie kernel: [164143.984420]  [<ffffffff811fd0ea>] vfs_read+0x8a/0x130
Jan 28 12:10:58 jessie kernel: [164143.984423]  [<ffffffff81808372>] entry_SYSCALL_64_fastpath+0x16/0x75

Jan 28 15:30:30 jessie kernel: [176097.293811]  [<ffffffff8104d943>] start_secondary+0x183/0x1c0
Jan 28 15:30:30 jessie kernel: [176097.293812] NMI backtrace for cpu 3
Jan 28 15:30:30 jessie kernel: [176097.293814] task: ffff880fe45a5880 ti: ffff880e2bf80000 task.ti: ffff880e2bf80000
Jan 28 15:30:30 jessie kernel: [176097.293816] RAX: 0000000000000000 RBX: 0000000000080000 RCX: 0000000000000100
Jan 28 15:30:30 jessie kernel: [176097.293817] R10: 0000000000000004 R11: 0000000000000002 R12: 00000000000000ff
Jan 28 15:30:30 jessie kernel: [176097.293819] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 28 15:30:30 jessie kernel: [176097.293821] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 28 15:30:30 jessie kernel: [176097.293823]  000000022bf83dc8 000000000000e540 0000000000000003 0000000000000003
Jan 28 15:30:30 jessie kernel: [176097.293824]  [<ffffffff8105649e>] __x2apic_send_IPI_mask+0x13e/0x160
Jan 28 15:30:30 jessie kernel: [176097.293826]  [<ffffffff814cbb13>] sysrq_handle_showallcpus+0x13/0x20
Jan 28 15:30:30 jessie kernel: [176097.293827]  [<ffffffff81268438>] proc_reg_write+0x48/0x70
Jan 28 15:30:30 jessie kernel: [176097.293829]  [<ffffffff811fa1da>] ? filp_close+0x5a/0x80
Jan 28 15:30:30 jessie kernel: [176097.293831]  [<ffffffff81808372>] entry_SYSCALL_64_fastpath+0x16/0x75
Jan 28 15:30:30 jessie kernel: [176097.293833] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P           O    4.2.6-1-pve #1
Jan 28 15:30:30 jessie kernel: [176097.293835] RIP: 0010:[<ffffffff814581ef>]  [<ffffffff814581ef>] intel_idle+0xcf/0x140
Jan 28 15:30:30 jessie kernel: [176097.293836] RDX: 0000000000000000 RSI: ffffffff81eb9680 RDI: 0000000001e0d000
Jan 28 15:30:30 jessie kernel: [176097.293838] R13: 0000000000000005 R14: 0000000000000006 R15: ffff880fea3dc000
Jan 28 15:30:30 jessie kernel: [176097.293840] CR2: 00007fcfb5ddc000 CR3: 0000000001e0d000 CR4: 00000000003426e0
Jan 28 15:30:30 jessie kernel: [176097.293842] Stack:
Jan 28 15:30:30 jessie kernel: [176097.293843]  00ff881031513b00 ffffffff81eb98d8 00000000d641c7a8 ffffffff81f3ff60
Jan 28 15:30:30 jessie kernel: [176097.293845]  [<ffffffff8168b917>] cpuidle_enter+0x17/0x20
Jan 28 15:30:30 jessie kernel: [176097.293847]  [<ffffffff810bdeb7>] cpu_startup_entry+0x297/0x360
Jan 28 15:30:30 jessie kernel: [176097.293848] NMI backtrace for cpu 5
Jan 28 15:30:30 jessie kernel: [176097.293850] task: ffff880fea3e8000 ti: ffff880fea3f0000 task.ti: ffff880fea3f0000
Jan 28 15:30:30 jessie kernel: [176097.293852] RAX: 0000000000000040 RBX: 0000000000000020 RCX: 0000000000000001
Jan 28 15:30:30 jessie kernel: [176097.293854] R10: 00000001029fa943 R11: 0000000000001691 R12: 0000000000000040
Jan 28 15:30:30 jessie kernel: [176097.293856] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 28 15:30:30 jessie kernel: [176097.293857] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 28 15:30:30 jessie kernel: [176097.293859]  ffff880fea3f3e88 ffffffff8168b775 ffff88103155e200 0000a068d643ed64
Jan 28 15:30:30 jessie kernel: [176097.293861]  [<ffffffff8168b775>] cpuidle_enter_state+0xb5/0x220
Jan 28 15:30:30 jessie kernel: [176097.293862]  [<ffffffff8168b8f3>] ? cpuidle_select+0x13/0x20
Jan 28 15:30:30 jessie kernel: [176097.293864] Code: 3f 01 00 48 89 d1 48 2d f8 3f 00 00 0f 01 c8 65 48 8b 04 25 04 3f 01 00 48 8b 80 08 c0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <65> 48 8b 04 25 04 3f 01 00 f0 80 a0 0a c0 ff ff df 0f ae f0 65 
Jan 28 15:30:30 jessie kernel: [176097.293865] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P           O    4.2.6-1-pve #1
Jan 28 15:30:30 jessie kernel: [176097.293867] task: ffff880fea3e8ec0 ti: ffff880fea3f4000 task.ti: ffff880fea3f4000
Jan 28 15:30:30 jessie kernel: [176097.293868] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
Jan 28 15:30:30 jessie kernel: [176097.293870] R10: 00000001029fa911 R11: 000000000000282f R12: 0000000000000020
Jan 28 15:30:30 jessie kernel: [176097.293872] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 28 15:30:30 jessie kernel: [176097.293874] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 28 15:30:30 jessie kernel: [176097.293875]  ffff880fea3f7e88 ffffffff8168b775 ffff88103159e200 0000a068d6cfb87a
Jan 28 15:30:30 jessie kernel: [176097.293877]  [<ffffffff8168b775>] cpuidle_enter_state+0xb5/0x220
Jan 28 15:30:30 jessie kernel: [176097.293878]  [<ffffffff8168b8f3>] ? cpuidle_select+0x13/0x20
Jan 28 15:30:30 jessie kernel: [176097.293880] Code: 3f 01 00 48 89 d1 48 2d f8 3f 00 00 0f 01 c8 65 48 8b 04 25 04 3f 01 00 48 8b 80 08 c0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <65> 48 8b 04 25 04 3f 01 00 f0 80 a0 0a c0 ff ff df 0f ae f0 65 
Jan 28 15:30:30 jessie kernel: [176097.293882] Hardware name: FUJITSU D3401-H1/D3401-H1, BIOS V5.0.0.11 R1.7.0.SR.2 for D3401-H1x                11/25/2015
Jan 28 15:30:30 jessie kernel: [176097.293883] RIP: 0010:[<ffffffff814581ef>]  [<ffffffff814581ef>] intel_idle+0xcf/0x140
Jan 28 15:30:30 jessie kernel: [176097.293885] RAX: 0000000000000040 RBX: 0000000000000020 RCX: 0000000000000001
Jan 28 15:30:30 jessie kernel: [176097.293886] RBP: ffff880fea3fbe28 R08: 00000000004b1df8 R09: 0000000000000018
Jan 28 15:30:30 jessie kernel: [176097.293887] R13: 0000000000000005 R14: 0000000000000006 R15: ffff880fea3f8000
Jan 28 15:30:30 jessie kernel: [176097.293888] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 28 15:30:30 jessie kernel: [176097.293889] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 28 15:30:30 jessie kernel: [176097.293890] Stack:
Jan 28 15:30:30 jessie kernel: [176097.293891]  ffff880fea3fbe88 ffffffff8168b775 ffff8810315de200 0000a068d647f7c2
Jan 28 15:30:30 jessie kernel: [176097.293893] Call Trace:
Jan 28 15:30:30 jessie kernel: [176097.293894]  [<ffffffff8168b917>] cpuidle_enter+0x17/0x20
Jan 28 15:30:30 jessie kernel: [176097.293895]  [<ffffffff8168b8f3>] ? cpuidle_select+0x13/0x20
Jan 28 15:30:30 jessie kernel: [176097.293896]  [<ffffffff8104d943>] start_secondary+0x183/0x1c0


Code:
proxmox-ve: 4.1-34 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-5 (running version: 4.1-5/f910ef5c)
pve-kernel-4.2.6-1-pve: 4.2.6-34
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-31
qemu-server: 4.0-49
pve-firmware: 1.1-7
libpve-common-perl: 4.0-45
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-3
pve-container: 1.0-39
pve-firewall: 2.0-15
pve-ha-manager: 1.0-19
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-6
lxcfs: 0.13-pve3
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5.2-2
 
Cause: ps stuck in uninterruptible sleep (state D), seen on the node and inside the VM:
node: nobody 8579 0.0 0.0 8684 780 ? D 18:30 0:00 \_ ps --no-header -eo s
VM: nobody 8580 0.0 0.0 8684 780 ? D 17:30 0:00 ps --no-header -eo s

My guess is that it's an issue with running Munin inside a VM. I stopped the service and the issue hasn't come up since on 5 servers (all with the same configuration and the same test VMs).