I see this in syslog just before the web interface froze:
Note: The output below is from syslog; I removed the repeating timestamps to stay within the 10k-character forum post limit.
Both htop & iotop respond normally, and system load on PVE is only 4.05.
There's no iotop on the NFS box, but its htop also responds and reports a load of 10.30.
I restarted pvedaemon & it didn't help.
I've been having I/O errors (see this thread: http://forum.proxmox.com/threads/4385-ext3-I-O-error) that prevent backups from completing, so I'm doing them manually now in preparation for a PVE re-install.
Would someone kindly explain what these hung-task messages mean & whether they're related?
Code:
Aug 19 09:23:00 bascule kernel: INFO: task kswapd0:84 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kswapd0 D 0000000000000000 0 84 2 0x00000000
kernel: ffff88085b0a1750 0000000000000046 0000000000000000 0000000000000000
kernel: 0000000000000004 ffff8808596be400 ffff880859124740 0000000000000010
kernel: ffff88085b0a16e0 000000000000fb08 ffff88085b0a1fd8 ffff88085b0aade0
kernel: Call Trace:
kernel: [<ffffffff810c9d2d>] ? cpu_quiet_msk+0x7d/0x130
kernel: [<ffffffff8156de02>] io_schedule+0x52/0x70
kernel: [<ffffffffa02d349e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
kernel: [<ffffffff8156e662>] __wait_on_bit+0x62/0x90
kernel: [<ffffffffa02d3490>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
kernel: [<ffffffffa02d3490>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
kernel: [<ffffffff8156e709>] out_of_line_wait_on_bit+0x79/0x90
kernel: [<ffffffff81085be0>] ? wake_bit_function+0x0/0x50
kernel: [<ffffffffa02d347f>] nfs_wait_on_request+0x2f/0x40 [nfs]
kernel: [<ffffffffa02d8a93>] nfs_sync_mapping_wait+0x113/0x260 [nfs]
kernel: [<ffffffffa02d8c6b>] nfs_wb_page+0x8b/0xf0 [nfs]
kernel: [<ffffffffa02c7b00>] nfs_release_page+0x60/0x80 [nfs]
kernel: [<ffffffff810f3592>] try_to_release_page+0x32/0x60
kernel: [<ffffffff81101b4d>] shrink_page_list+0x57d/0x840
kernel: [<ffffffff8113d2d3>] ? mem_cgroup_del_lru_list+0x23/0xb0
kernel: [<ffffffff8113d3d9>] ? mem_cgroup_del_lru+0x39/0x40
kernel: [<ffffffff811010a8>] ? isolate_pages_global+0x198/0x290
kernel: [<ffffffff8110247b>] shrink_list+0x2fb/0x8d0
kernel: [<ffffffff810fd087>] ? get_dirty_limits+0x27/0x2d0
kernel: [<ffffffff81102dfa>] shrink_zone+0x3aa/0x550
kernel: [<ffffffff81103cbd>] kswapd+0x70d/0x800
kernel: [<ffffffff81100f10>] ? isolate_pages_global+0x0/0x290
kernel: [<ffffffff81085ba0>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff811035b0>] ? kswapd+0x0/0x800
kernel: [<ffffffff811035b0>] ? kswapd+0x0/0x800
kernel: [<ffffffff810857f6>] kthread+0x96/0xb0
kernel: [<ffffffff8101422a>] child_rip+0xa/0x20
kernel: [<ffffffff81085760>] ? kthread+0x0/0xb0
Aug 19 09:23:00 bascule kernel: [<ffffffff81014220>] ? child_rip+0x0/0x20
Aug 19 09:23:10 bascule proxwww[21639]: Starting new child 21639
Aug 19 09:23:34 bascule proxwww[21682]: Starting new child 21682
Aug 19 09:23:37 bascule proxwww[21686]: Starting new child 21686
Aug 19 09:23:44 bascule proxwww[21698]: Starting new child 21698
Aug 19 09:25:00 bascule kernel: INFO: task kswapd0:84 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kswapd0 D 0000000000000000 0 84 2 0x00000000
kernel: ffff88085b0a1750 0000000000000046 0000000000000000 0000000000000000
kernel: 0000000000000004 ffff8808596be400 ffff880859124740 0000000000000010
kernel: ffff88085b0a16e0 000000000000fb08 ffff88085b0a1fd8 ffff88085b0aade0
kernel: Call Trace:
kernel: [<ffffffff810c9d2d>] ? cpu_quiet_msk+0x7d/0x130
kernel: [<ffffffff8156de02>] io_schedule+0x52/0x70
kernel: [<ffffffffa02d349e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
kernel: [<ffffffff8156e662>] __wait_on_bit+0x62/0x90
kernel: [<ffffffffa02d3490>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
kernel: [<ffffffffa02d3490>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
kernel: [<ffffffff8156e709>] out_of_line_wait_on_bit+0x79/0x90
kernel: [<ffffffff81085be0>] ? wake_bit_function+0x0/0x50
kernel: [<ffffffffa02d347f>] nfs_wait_on_request+0x2f/0x40 [nfs]
kernel: [<ffffffffa02d8a93>] nfs_sync_mapping_wait+0x113/0x260 [nfs]
kernel: [<ffffffffa02d8c6b>] nfs_wb_page+0x8b/0xf0 [nfs]
kernel: [<ffffffffa02c7b00>] nfs_release_page+0x60/0x80 [nfs]
kernel: [<ffffffff810f3592>] try_to_release_page+0x32/0x60
kernel: [<ffffffff81101b4d>] shrink_page_list+0x57d/0x840
kernel: [<ffffffff8113d2d3>] ? mem_cgroup_del_lru_list+0x23/0xb0
kernel: [<ffffffff8113d3d9>] ? mem_cgroup_del_lru+0x39/0x40
kernel: [<ffffffff811010a8>] ? isolate_pages_global+0x198/0x290
kernel: [<ffffffff8110247b>] shrink_list+0x2fb/0x8d0
kernel: [<ffffffff810fd087>] ? get_dirty_limits+0x27/0x2d0
kernel: [<ffffffff81102dfa>] shrink_zone+0x3aa/0x550
kernel: [<ffffffff81103cbd>] kswapd+0x70d/0x800
kernel: [<ffffffff81100f10>] ? isolate_pages_global+0x0/0x290
kernel: [<ffffffff81085ba0>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff811035b0>] ? kswapd+0x0/0x800
kernel: [<ffffffff811035b0>] ? kswapd+0x0/0x800
kernel: [<ffffffff810857f6>] kthread+0x96/0xb0
kernel: [<ffffffff8101422a>] child_rip+0xa/0x20
kernel: [<ffffffff81085760>] ? kthread+0x0/0xb0
kernel: [<ffffffff81014220>] ? child_rip+0x0/0x20
kernel: INFO: task cstream:20710 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: cstream D ffff88002830fc60 0 20710 20708 0x00000000
kernel: ffff880027837148 0000000000000082 ffff880027837118 ffff880027837114
kernel: 0000000000000000 0000000000000000 ffff8800278370e8 0000000000000097
kernel: 0000000000000000 000000000000fb08 ffff880027837fd8 ffff880858405bc0
kernel: Call Trace:
kernel: [<ffffffff8156de02>] io_schedule+0x52/0x70
kernel: [<ffffffffa02d349e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
kernel: [<ffffffff8156e662>] __wait_on_bit+0x62/0x90
kernel: [<ffffffffa02d3490>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
kernel: [<ffffffffa02d3490>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
kernel: [<ffffffff8156e709>] out_of_line_wait_on_bit+0x79/0x90
kernel: [<ffffffff81085be0>] ? wake_bit_function+0x0/0x50
kernel: [<ffffffffa02d347f>] nfs_wait_on_request+0x2f/0x40 [nfs]
kernel: [<ffffffffa02d8a93>] nfs_sync_mapping_wait+0x113/0x260 [nfs]
kernel: [<ffffffffa02d8c6b>] nfs_wb_page+0x8b/0xf0 [nfs]
kernel: [<ffffffffa02c7b00>] nfs_release_page+0x60/0x80 [nfs]
kernel: [<ffffffff810f3592>] try_to_release_page+0x32/0x60
kernel: [<ffffffff81101b4d>] shrink_page_list+0x57d/0x840
kernel: [<ffffffff8113d2d3>] ? mem_cgroup_del_lru_list+0x23/0xb0
kernel: [<ffffffff8113d3d9>] ? mem_cgroup_del_lru+0x39/0x40
kernel: [<ffffffff811010a8>] ? isolate_pages_global+0x198/0x290
kernel: [<ffffffff8110247b>] shrink_list+0x2fb/0x8d0
kernel: [<ffffffff8113d3d9>] ? mem_cgroup_del_lru+0x39/0x40
kernel: [<ffffffff81101001>] ? isolate_pages_global+0xf1/0x290
kernel: [<ffffffff81102dfa>] shrink_zone+0x3aa/0x550
kernel: [<ffffffff810f99e7>] ? get_page_from_freelist+0x157/0x850
kernel: [<ffffffff81104222>] do_try_to_free_pages+0xc2/0x3c0
kernel: [<ffffffff81104636>] try_to_free_pages+0x76/0x80
kernel: [<ffffffff81100f10>] ? isolate_pages_global+0x0/0x290
kernel: [<ffffffff810fa601>] __alloc_pages_nodemask+0x3f1/0x700
kernel: [<ffffffff8112b8fc>] alloc_pages_current+0x8c/0xe0
kernel: [<ffffffff811336f7>] new_slab+0x247/0x300
kernel: [<ffffffff81135d37>] __slab_alloc+0x137/0x480
kernel: [<ffffffff812b96eb>] ? radix_tree_preload+0x3b/0xb0
kernel: [<ffffffff812b96eb>] ? radix_tree_preload+0x3b/0xb0
kernel: [<ffffffff811362fa>] kmem_cache_alloc+0x12a/0x140
kernel: [<ffffffff812b96eb>] radix_tree_preload+0x3b/0xb0
kernel: [<ffffffff810f46fa>] add_to_page_cache_locked+0x7a/0x160
kernel: [<ffffffff810f480e>] add_to_page_cache_lru+0x2e/0x90
kernel: [<ffffffff810f5b89>] grab_cache_page_write_begin+0x99/0xc0
kernel: [<ffffffffa02d9158>] ? nfs_updatepage+0x1f8/0x570 [nfs]
kernel: [<ffffffffa02c7bfc>] nfs_write_begin+0x7c/0x1f0 [nfs]
kernel: [<ffffffff810f4cd6>] generic_file_buffered_write+0x116/0x290
kernel: [<ffffffff810f6299>] __generic_file_aio_write+0x259/0x470
kernel: [<ffffffff810f6512>] generic_file_aio_write+0x62/0xd0
kernel: [<ffffffffa02c89d6>] nfs_file_write+0x136/0x210 [nfs]
kernel: [<ffffffff811443c9>] do_sync_write+0xf9/0x140
kernel: [<ffffffff81085ba0>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff8156d728>] ? thread_return+0x51/0x6d9
kernel: [<ffffffff81253526>] ? security_file_permission+0x16/0x20
kernel: [<ffffffff81144a3b>] vfs_write+0xcb/0x1a0
kernel: [<ffffffff81144c05>] sys_write+0x55/0x90
Aug 19 09:25:00 bascule kernel: [<ffffffff810131f2>] system_call_fastpath+0x16/0x1b