After upgrading from 2.3 to 3.0, the first day of operation was fine.
But on the second day I got this:
Code:
May 27 23:00:01 idol vzdump[431722]: INFO: Starting Backup of VM 136 (openvz)
May 27 23:00:02 idol kernel: EXT3-fs: barriers disabled
May 27 23:00:02 idol kernel: kjournald starting. Commit interval 5 seconds
May 27 23:00:02 idol kernel: EXT3-fs (dm-5): using internal journal
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 324636380
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 174129940
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 181264390
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 401885131
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 401885132
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 401885164
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 284287008
May 27 23:00:02 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 284286987
<skipped>
May 27 23:00:03 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 172935472
May 27 23:00:03 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 172935471
May 27 23:00:03 idol kernel: ext3_orphan_cleanup: deleting unreferenced inode 172935470
May 27 23:00:03 idol kernel: EXT3-fs (dm-5): 134 orphan inodes deleted
May 27 23:00:03 idol kernel: EXT3-fs (dm-5): recovery complete
May 27 23:00:03 idol kernel: EXT3-fs (dm-5): mounted filesystem with ordered data mode
May 27 23:03:48 idol pvestatd[4067]: VM 112 qmp command failed - VM 112 qmp command 'balloon' failed - got timeout
May 27 23:03:48 idol pvestatd[4067]: WARNING: VM 112 qmp command 'balloon' failed - got timeout
May 27 23:03:49 idol pvestatd[4067]: status update time (6.851 seconds)
May 27 23:03:55 idol pvestatd[4067]: WARNING: unable to connect to VM 104 socket - timeout after 31 retries
May 27 23:03:58 idol pvestatd[4067]: WARNING: unable to connect to VM 120 socket - timeout after 31 retries
May 27 23:04:02 idol pvestatd[4067]: status update time (9.855 seconds)
May 27 23:04:05 idol pvestatd[4067]: WARNING: unable to connect to VM 112 socket - timeout after 31 retries
May 27 23:04:08 idol pvestatd[4067]: WARNING: unable to connect to VM 104 socket - timeout after 31 retries
May 27 23:04:11 idol pvestatd[4067]: WARNING: unable to connect to VM 120 socket - timeout after 31 retries
May 27 23:04:14 idol pvestatd[4067]: WARNING: unable to connect to VM 132 socket - timeout after 31 retries
May 27 23:04:15 idol pvestatd[4067]: status update time (12.884 seconds)
May 27 23:04:18 idol pvestatd[4067]: WARNING: unable to connect to VM 112 socket - timeout after 31 retries
May 27 23:04:21 idol pvestatd[4067]: WARNING: unable to connect to VM 104 socket - timeout after 31 retries
May 27 23:04:24 idol pvestatd[4067]: WARNING: unable to connect to VM 120 socket - timeout after 31 retries
May 27 23:04:27 idol pvestatd[4067]: WARNING: unable to connect to VM 132 socket - timeout after 31 retries
May 27 23:04:28 idol pvestatd[4067]: status update time (12.845 seconds)
May 27 23:04:31 idol pvestatd[4067]: WARNING: unable to connect to VM 112 socket - timeout after 31 retries
May 27 23:04:34 idol pvestatd[4067]: WARNING: unable to connect to VM 104 socket - timeout after 31 retries
<skipped>
May 27 23:06:11 idol pvestatd[4067]: status update time (12.873 seconds)
May 27 23:06:14 idol pvestatd[4067]: WARNING: unable to connect to VM 112 socket - timeout after 31 retries
May 27 23:06:17 idol pvestatd[4067]: WARNING: unable to connect to VM 104 socket - timeout after 31 retries
May 27 23:06:17 idol kernel: INFO: task kjournald:1928 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: kjournald D ffff8824697e3220 0 1928 2 0 0x00000000
May 27 23:06:17 idol kernel: ffff882453c17c70 0000000000000046 0000000000000000 ffff8823d1151ac0
May 27 23:06:17 idol kernel: ffff882453c17c30 ffffffff8140b5ec ffff882453c17c20 ffffffff8109f5b3
May 27 23:06:17 idol kernel: ffff882453c17c20 0000000000000282 ffff882453c17fd8 ffff882453c17fd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff8109f5b3>] ? ktime_get_ts+0xb3/0xf0
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c77a0>] sync_buffer+0x40/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c470>] __wait_on_bit+0x60/0x90
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c51c>] out_of_line_wait_on_bit+0x7c/0x90
May 27 23:06:17 idol kernel: [<ffffffff81095340>] ? wake_bit_function+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff811c80f6>] __wait_on_buffer+0x26/0x30
May 27 23:06:17 idol kernel: [<ffffffffa01c7100>] journal_commit_transaction+0xc00/0x1200 [jbd]
May 27 23:06:17 idol kernel: [<ffffffff8107c8d8>] ? lock_timer_base.isra.76+0x38/0x70
May 27 23:06:17 idol kernel: [<ffffffff81095300>] ? autoremove_wake_function+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffff8107e3c4>] ? try_to_del_timer_sync+0x84/0xe0
May 27 23:06:17 idol kernel: [<ffffffffa01ccd05>] kjournald+0xe5/0x230 [jbd]
May 27 23:06:17 idol kernel: [<ffffffff81095300>] ? autoremove_wake_function+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffffa01ccc20>] ? kjournald+0x0/0x230 [jbd]
May 27 23:06:17 idol kernel: [<ffffffff81094d68>] kthread+0x88/0x90
May 27 23:06:17 idol kernel: [<ffffffff8100c22a>] child_rip+0xa/0x20
May 27 23:06:17 idol kernel: [<ffffffff81094ce0>] ? kthread+0x0/0x90
May 27 23:06:17 idol kernel: [<ffffffff8100c220>] ? child_rip+0x0/0x20
May 27 23:06:17 idol kernel: INFO: task flush-253:3:2163 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: flush-253:3 D ffff88246bab8340 0 2163 2 0 0x00000000
May 27 23:06:17 idol kernel: ffff8824539d37b0 0000000000000046 0000000000000000 ffff8823d1151ac0
May 27 23:06:17 idol kernel: ffff8824539d3770 ffffffff8140b5ec ffff8824539d3740 ffff882469cb3800
May 27 23:06:17 idol kernel: ffff88246ae57200 00000001071d9d68 ffff8824539d3fd8 ffff8824539d3fd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c77a0>] sync_buffer+0x40/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c31b>] __wait_on_bit_lock+0x5b/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c3fc>] out_of_line_wait_on_bit_lock+0x7c/0x90
May 27 23:06:17 idol kernel: [<ffffffff81095340>] ? wake_bit_function+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff811c92f0>] ? end_buffer_async_write+0x0/0x180
May 27 23:06:17 idol kernel: [<ffffffff811c8136>] __lock_buffer+0x36/0x40
May 27 23:06:17 idol kernel: [<ffffffff811c9e85>] __block_write_full_page+0x185/0x4e0
May 27 23:06:17 idol kernel: [<ffffffff81045550>] ? flush_tlb_others_ipi+0x120/0x130
May 27 23:06:17 idol kernel: [<ffffffff811c92f0>] ? end_buffer_async_write+0x0/0x180
May 27 23:06:17 idol kernel: [<ffffffff811c76f0>] ? generic_submit_bh_handler+0x0/0x10
May 27 23:06:17 idol kernel: [<ffffffff8126f250>] ? prio_tree_next+0x70/0x170
May 27 23:06:17 idol kernel: [<ffffffff81045658>] ? flush_tlb_page+0x48/0xb0
May 27 23:06:17 idol kernel: [<ffffffff811ca686>] generic_block_write_full_page+0x106/0x110
May 27 23:06:17 idol kernel: [<ffffffff811ca6a8>] block_write_full_page_endio+0x18/0x20
May 27 23:06:17 idol kernel: [<ffffffff811ca6c5>] block_write_full_page+0x15/0x20
May 27 23:06:17 idol kernel: [<ffffffffa01df3dd>] ext3_ordered_writepage+0x1ed/0x240 [ext3]
May 27 23:06:17 idol kernel: [<ffffffff81137887>] __writepage+0x17/0x40
May 27 23:06:17 idol kernel: [<ffffffff811380b6>] write_cache_pages+0x1f6/0x440
May 27 23:06:17 idol kernel: [<ffffffff81137870>] ? __writepage+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffff81138324>] generic_writepages+0x24/0x30
May 27 23:06:17 idol kernel: [<ffffffff81138dbd>] do_writepages+0x3d/0x50
May 27 23:06:17 idol kernel: [<ffffffff811be472>] __writeback_single_inode+0xb2/0x280
May 27 23:06:17 idol kernel: [<ffffffff811be6c3>] writeback_single_inode+0x83/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811ae370>] ? iput+0x30/0x70
May 27 23:06:17 idol kernel: [<ffffffff811bf233>] writeback_sb_inodes+0x103/0x1e0
May 27 23:06:17 idol kernel: [<ffffffff811bf45f>] writeback_inodes_wb+0xff/0x170
May 27 23:06:17 idol kernel: [<ffffffff811bf78b>] wb_writeback+0x2bb/0x3f0
May 27 23:06:17 idol kernel: [<ffffffff811bfa46>] wb_do_writeback+0x186/0x240
May 27 23:06:17 idol kernel: [<ffffffff8107b900>] ? process_timeout+0x0/0x10
May 27 23:06:17 idol kernel: [<ffffffff811bfb5b>] bdi_writeback_task+0x5b/0x1b0
May 27 23:06:17 idol kernel: [<ffffffff81094fa7>] ? bit_waitqueue+0x17/0xc0
May 27 23:06:17 idol kernel: [<ffffffff8114d4d2>] bdi_start_fn+0x92/0x100
May 27 23:06:17 idol kernel: [<ffffffff8114d440>] ? bdi_start_fn+0x0/0x100
May 27 23:06:17 idol kernel: [<ffffffff81094d68>] kthread+0x88/0x90
May 27 23:06:17 idol kernel: [<ffffffff810096d2>] ? __switch_to+0xc2/0x2f0
May 27 23:06:17 idol kernel: [<ffffffff8100c22a>] child_rip+0xa/0x20
May 27 23:06:17 idol kernel: [<ffffffff81094ce0>] ? kthread+0x0/0x90
May 27 23:06:17 idol kernel: [<ffffffff8100c220>] ? child_rip+0x0/0x20
May 27 23:06:17 idol kernel: INFO: task mysqld:5279 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: mysqld D ffff88243d899020 0 5279 4297 140 0x00000000
May 27 23:06:17 idol kernel: ffff88242e87bce8 0000000000000082 0000000000000000 ffffffff8140b5ec
May 27 23:06:17 idol kernel: ffff88242e87bca8 ffffffff811208b3 ffff88242e87bc68 0000000000000286
May 27 23:06:17 idol kernel: ffff88242e87bc98 00000001071d8c83 ffff88242e87bfd8 ffff88242e87bfd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff811208b3>] ? find_get_pages_tag+0x43/0x130
May 27 23:06:17 idol kernel: [<ffffffff8111fe90>] ? sync_page+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff8111fecb>] sync_page+0x3b/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c470>] __wait_on_bit+0x60/0x90
May 27 23:06:17 idol kernel: [<ffffffff81120060>] wait_on_page_bit+0x80/0x90
May 27 23:06:17 idol kernel: [<ffffffff81095340>] ? wake_bit_function+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8112039a>] wait_on_page_writeback_range.part.36+0xea/0x180
May 27 23:06:17 idol kernel: [<ffffffff81120d45>] wait_on_page_writeback_range+0x15/0x20
May 27 23:06:17 idol kernel: [<ffffffff81120dbf>] filemap_write_and_wait_range+0x6f/0x80
May 27 23:06:17 idol kernel: [<ffffffff811c4333>] vfs_fsync_range+0xa3/0x180
May 27 23:06:17 idol kernel: [<ffffffff811c447d>] vfs_fsync+0x1d/0x20
May 27 23:06:17 idol kernel: [<ffffffff811c46a6>] do_fsync+0x66/0xa0
May 27 23:06:17 idol kernel: [<ffffffff811c4e40>] sys_fsync+0x10/0x20
May 27 23:06:17 idol kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
May 27 23:06:17 idol kernel: INFO: task mysqld:29906 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: mysqld D ffff882323f96780 0 29906 4297 140 0x00000000
May 27 23:06:17 idol kernel: ffff88231eb23ce8 0000000000000082 0000000000000000 ffffffff8140b5ec
May 27 23:06:17 idol kernel: ffff88231eb23ca8 ffffffff811208b3 ffff88231eb23c68 0000000000000286
May 27 23:06:17 idol kernel: ffff88231eb23c98 00000001071d957c ffff88231eb23fd8 ffff88231eb23fd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff811208b3>] ? find_get_pages_tag+0x43/0x130
May 27 23:06:17 idol kernel: [<ffffffff8111fe90>] ? sync_page+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff8111fecb>] sync_page+0x3b/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c470>] __wait_on_bit+0x60/0x90
May 27 23:06:17 idol kernel: [<ffffffff81120060>] wait_on_page_bit+0x80/0x90
May 27 23:06:17 idol kernel: [<ffffffff81095340>] ? wake_bit_function+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8112039a>] wait_on_page_writeback_range.part.36+0xea/0x180
May 27 23:06:17 idol kernel: [<ffffffff81120d45>] wait_on_page_writeback_range+0x15/0x20
May 27 23:06:17 idol kernel: [<ffffffff81120dbf>] filemap_write_and_wait_range+0x6f/0x80
May 27 23:06:17 idol kernel: [<ffffffff811c4333>] vfs_fsync_range+0xa3/0x180
May 27 23:06:17 idol kernel: [<ffffffff811c447d>] vfs_fsync+0x1d/0x20
May 27 23:06:17 idol kernel: [<ffffffff811c46a6>] do_fsync+0x66/0xa0
May 27 23:06:17 idol kernel: [<ffffffff811c4e40>] sys_fsync+0x10/0x20
May 27 23:06:17 idol kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
May 27 23:06:17 idol kernel: INFO: task mysqld:29918 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: mysqld D ffff882337eb81c0 0 29918 4297 140 0x00000000
May 27 23:06:17 idol kernel: ffff88231eb69938 0000000000000082 0000000000000000 ffff8823d1151ac0
May 27 23:06:17 idol kernel: ffff88231eb698f8 ffffffff8140b5ec 00000000001e9cb8 ffff88014a650e50
May 27 23:06:17 idol kernel: 00000000000353fa 00000001071d9c16 ffff88231eb69fd8 ffff88231eb69fd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c77a0>] sync_buffer+0x40/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c31b>] __wait_on_bit_lock+0x5b/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c3fc>] out_of_line_wait_on_bit_lock+0x7c/0x90
May 27 23:06:17 idol kernel: [<ffffffff81095340>] ? wake_bit_function+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff811c8136>] __lock_buffer+0x36/0x40
May 27 23:06:17 idol kernel: [<ffffffff811c87a5>] __sync_dirty_buffer+0x35/0xe0
May 27 23:06:17 idol kernel: [<ffffffff811c8863>] sync_dirty_buffer+0x13/0x20
May 27 23:06:17 idol kernel: [<ffffffffa01c597a>] journal_dirty_data+0x19a/0x230 [jbd]
May 27 23:06:17 idol kernel: [<ffffffffa01dfdc0>] ext3_journal_dirty_data+0x20/0x50 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01dfe15>] journal_dirty_data_fn+0x25/0x30 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01dea6d>] walk_page_buffers+0x8d/0xd0 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01dfdf0>] ? journal_dirty_data_fn+0x0/0x30 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01e32bb>] ext3_ordered_write_end+0xab/0x1c0 [ext3]
May 27 23:06:17 idol kernel: [<ffffffff81120f45>] generic_file_buffered_write_iter+0x175/0x2a0
May 27 23:06:17 idol kernel: [<ffffffff810727e6>] ? current_fs_time+0x16/0x60
May 27 23:06:17 idol kernel: [<ffffffff811213e8>] __generic_file_write_iter+0x188/0x380
May 27 23:06:17 idol kernel: [<ffffffff81121663>] __generic_file_aio_write+0x83/0xa0
May 27 23:06:17 idol kernel: [<ffffffff81121708>] generic_file_aio_write+0x88/0x100
May 27 23:06:17 idol kernel: [<ffffffff81192a1e>] do_sync_write+0xfe/0x140
May 27 23:06:17 idol kernel: [<ffffffff81095300>] ? autoremove_wake_function+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffff81278157>] ? __strncpy_from_user+0x27/0x60
May 27 23:06:17 idol kernel: [<ffffffff811931b1>] vfs_write+0xa1/0x190
May 27 23:06:17 idol kernel: [<ffffffff8119350a>] sys_write+0x4a/0x90
May 27 23:06:17 idol kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
May 27 23:06:17 idol kernel: INFO: task nginx:18752 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: nginx D ffff882381ad2140 0 18752 18748 114 0x00000000
May 27 23:06:17 idol kernel: ffff8823802ed938 0000000000000082 ffff8823802ed8fc ffff8823d1151ac0
May 27 23:06:17 idol kernel: ffff8823802ed8f8 ffffffff8160bbc0 000000000005ce65 ffff88014a791070
May 27 23:06:17 idol kernel: 0000000000000000 00000001071da1bd ffff8823802edfd8 ffff8823802edfd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c77a0>] sync_buffer+0x40/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c31b>] __wait_on_bit_lock+0x5b/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811c7760>] ? sync_buffer+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff8151c3fc>] out_of_line_wait_on_bit_lock+0x7c/0x90
May 27 23:06:17 idol kernel: [<ffffffff81095340>] ? wake_bit_function+0x0/0x50
May 27 23:06:17 idol kernel: [<ffffffff811c8136>] __lock_buffer+0x36/0x40
May 27 23:06:17 idol kernel: [<ffffffff811c87a5>] __sync_dirty_buffer+0x35/0xe0
May 27 23:06:17 idol kernel: [<ffffffff811c8863>] sync_dirty_buffer+0x13/0x20
May 27 23:06:17 idol kernel: [<ffffffffa01c597a>] journal_dirty_data+0x19a/0x230 [jbd]
May 27 23:06:17 idol kernel: [<ffffffffa01dfdc0>] ext3_journal_dirty_data+0x20/0x50 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01dfe15>] journal_dirty_data_fn+0x25/0x30 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01dea6d>] walk_page_buffers+0x8d/0xd0 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01dfdf0>] ? journal_dirty_data_fn+0x0/0x30 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01e32bb>] ext3_ordered_write_end+0xab/0x1c0 [ext3]
May 27 23:06:17 idol kernel: [<ffffffff81120f45>] generic_file_buffered_write_iter+0x175/0x2a0
May 27 23:06:17 idol kernel: [<ffffffff811213e8>] __generic_file_write_iter+0x188/0x380
May 27 23:06:17 idol kernel: [<ffffffff81121663>] __generic_file_aio_write+0x83/0xa0
May 27 23:06:17 idol kernel: [<ffffffff81121708>] generic_file_aio_write+0x88/0x100
May 27 23:06:17 idol kernel: [<ffffffff81192a1e>] do_sync_write+0xfe/0x140
May 27 23:06:17 idol kernel: [<ffffffff81095300>] ? autoremove_wake_function+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffff81194bd9>] ? __fput+0x199/0x220
May 27 23:06:17 idol kernel: [<ffffffff811931b1>] vfs_write+0xa1/0x190
May 27 23:06:17 idol kernel: [<ffffffff8119350a>] sys_write+0x4a/0x90
May 27 23:06:17 idol kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
May 27 23:06:17 idol kernel: INFO: task kvm:438885 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: kvm D ffff88213e87a200 0 438885 1 0 0x00000000
May 27 23:06:17 idol kernel: ffff881853787a68 0000000000000082 0000000000000000 ffff8823d1151ac0
May 27 23:06:17 idol kernel: ffff881853787a28 ffffffff8140b5ec 0000000000000001 0000000000000200
May 27 23:06:17 idol kernel: 0000000000000000 00000001071ddbd0 ffff881853787fd8 ffff881853787fd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811d1bcc>] __blockdev_direct_IO_newtrunc+0xa8c/0xce0
May 27 23:06:17 idol kernel: [<ffffffff811d1e7c>] __blockdev_direct_IO+0x5c/0xd0
May 27 23:06:17 idol kernel: [<ffffffffa01e2100>] ? ext3_get_block+0x0/0x130 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01e118a>] ext3_direct_IO+0xea/0x330 [ext3]
May 27 23:06:17 idol kernel: [<ffffffffa01e2100>] ? ext3_get_block+0x0/0x130 [ext3]
May 27 23:06:17 idol kernel: [<ffffffff811200b8>] mapping_direct_IO.isra.25+0x48/0x70
May 27 23:06:17 idol kernel: [<ffffffff81123542>] generic_file_read_iter+0x622/0x6a0
May 27 23:06:17 idol kernel: [<ffffffff8100bcce>] ? apic_timer_interrupt+0xe/0x20
May 27 23:06:17 idol kernel: [<ffffffff8111fbcf>] ? generic_segment_checks+0x7f/0xb0
May 27 23:06:17 idol kernel: [<ffffffff8112365f>] generic_file_aio_read+0x9f/0xb0
May 27 23:06:17 idol kernel: [<ffffffff81192b5e>] do_sync_read+0xfe/0x140
May 27 23:06:17 idol kernel: [<ffffffff81095300>] ? autoremove_wake_function+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffff8107068e>] ? do_exit+0x46e/0x900
May 27 23:06:17 idol kernel: [<ffffffff8119333e>] vfs_read+0x9e/0x190
May 27 23:06:17 idol kernel: [<ffffffff811935ca>] sys_pread64+0x7a/0xa0
May 27 23:06:17 idol kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
May 27 23:06:17 idol kernel: INFO: task lvremove:438062 blocked for more than 120 seconds.
May 27 23:06:17 idol kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 27 23:06:17 idol kernel: lvremove D ffff8819e14493e0 0 438062 431722 0 0x00000000
May 27 23:06:17 idol kernel: ffff88182368fa88 0000000000000082 ffffea00444f9d80 ffff8823d1151ac0
May 27 23:06:17 idol kernel: ffff88182368fa48 ffffffff8140b5ec 0000000000000008 0000000000001000
May 27 23:06:17 idol kernel: 0000000000000000 00000001071d90ef ffff88182368ffd8 ffff88182368ffd8
May 27 23:06:17 idol kernel: Call Trace:
May 27 23:06:17 idol kernel: [<ffffffff8140b5ec>] ? dm_table_unplug_all+0x5c/0x100
May 27 23:06:17 idol kernel: [<ffffffff8151b563>] io_schedule+0x73/0xc0
May 27 23:06:17 idol kernel: [<ffffffff811d1bcc>] __blockdev_direct_IO_newtrunc+0xa8c/0xce0
May 27 23:06:17 idol kernel: [<ffffffff81120346>] ? wait_on_page_writeback_range.part.36+0x96/0x180
May 27 23:06:17 idol kernel: [<ffffffff811d1e7c>] __blockdev_direct_IO+0x5c/0xd0
May 27 23:06:17 idol kernel: [<ffffffff811cec90>] ? blkdev_get_blocks+0x0/0xd0
May 27 23:06:17 idol kernel: [<ffffffff811cef07>] blkdev_direct_IO+0x57/0x60
May 27 23:06:17 idol kernel: [<ffffffff811cec90>] ? blkdev_get_blocks+0x0/0xd0
May 27 23:06:17 idol kernel: [<ffffffff811200b8>] mapping_direct_IO.isra.25+0x48/0x70
May 27 23:06:17 idol kernel: [<ffffffff81123542>] generic_file_read_iter+0x622/0x6a0
May 27 23:06:17 idol kernel: [<ffffffff8126c817>] ? kobject_put+0x27/0x60
May 27 23:06:17 idol kernel: [<ffffffff8140a710>] ? dm_blk_open+0x70/0x80
May 27 23:06:17 idol kernel: [<ffffffff811cfd69>] ? __blkdev_get+0x1b9/0x3e0
May 27 23:06:17 idol kernel: [<ffffffff8112365f>] generic_file_aio_read+0x9f/0xb0
May 27 23:06:17 idol kernel: [<ffffffff811cef52>] blkdev_aio_read+0x42/0x90
May 27 23:06:17 idol kernel: [<ffffffff811917e4>] ? nameidata_to_filp+0x44/0x60
May 27 23:06:17 idol kernel: [<ffffffff81192b5e>] do_sync_read+0xfe/0x140
May 27 23:06:17 idol kernel: [<ffffffff81095300>] ? autoremove_wake_function+0x0/0x40
May 27 23:06:17 idol kernel: [<ffffffff811a6afa>] ? vfs_ioctl+0x2a/0xa0
May 27 23:06:17 idol kernel: [<ffffffff811a712e>] ? do_vfs_ioctl+0x7e/0x570
May 27 23:06:17 idol kernel: [<ffffffff8151c56d>] ? mutex_lock+0x1d/0x50
May 27 23:06:17 idol kernel: [<ffffffff8119333e>] vfs_read+0x9e/0x190
May 27 23:06:17 idol kernel: [<ffffffff8119347a>] sys_read+0x4a/0x90
May 27 23:06:17 idol kernel: [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
May 27 23:06:20 idol pvestatd[4067]: WARNING: unable to connect to VM 120 socket - timeout after 31 retries
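To get a quick overview of which processes the kernel reported as stuck, the "blocked for more than 120 seconds" lines can be counted per task name. This is only a rough sketch, assuming the syslog format shown in the log above; the function name `summarize_hung_tasks` and the `sample` list are made up for illustration, and the sample lines are copied from the log.

```python
import re
from collections import Counter

# Matches kernel hung-task lines such as:
# "... kernel: INFO: task mysqld:5279 blocked for more than 120 seconds."
# The task name itself may contain colons (e.g. "flush-253:3"), so the
# greedy name group is followed by the numeric PID.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Return a Counter mapping task name -> number of distinct blocked PIDs."""
    seen = set()
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            seen.add((match.group("name"), int(match.group("pid"))))
    return Counter(name for name, _pid in seen)


# A few lines taken from the log above (hypothetical variable name).
sample = [
    "May 27 23:06:17 idol kernel: INFO: task kjournald:1928 blocked for more than 120 seconds.",
    "May 27 23:06:17 idol kernel: INFO: task flush-253:3:2163 blocked for more than 120 seconds.",
    "May 27 23:06:17 idol kernel: INFO: task mysqld:5279 blocked for more than 120 seconds.",
    "May 27 23:06:17 idol kernel: INFO: task mysqld:29906 blocked for more than 120 seconds.",
    "May 27 23:06:17 idol kernel: Call Trace:",
]

for name, count in sorted(summarize_hung_tasks(sample).items()):
    print(name, count)
```

In practice the lines would come from the syslog file instead of the `sample` list, for example by passing an open file object to `summarize_hung_tasks`.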
By the way, the server was still working before I saw this, but almost all commands such as ps or top hang once they are started.
Some additional information:
This is a two-node cluster.
Code:
root@idol:/var/log# pvecm status
Version: 6.2.0
Config Version: 2
Cluster Name: XXX
Cluster Id: 571
Cluster Member: Yes
Cluster Generation: 212
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: idol
Node ID: 1
Multicast addresses: x.x.x.x
Node addresses: x.x.x.x
Both nodes have the same software installed, as shown below:
Code:
root@idol:/var/log# pveversion -v
pve-manager: 3.0-20 (pve-manager/3.0/0428106c)
running kernel: 2.6.32-20-pve
proxmox-ve-2.6.32: 3.0-100
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-18-pve: 2.6.32-88
lvm2: 2.02.95-pve3
clvm: 2.02.95-pve3
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-4
qemu-server: 3.0-15
pve-firmware: 1.0-22
libpve-common-perl: 3.0-4
libpve-access-control: 3.0-4
libpve-storage-perl: 3.0-6
vncterm: 1.1-3
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-12
ksm-control-daemon: 1.1-1
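One way to check that both nodes really run the same versions is to save the `pveversion -v` output from each node and compare them. The following is a minimal sketch, assuming the "name: version" layout shown above; the function names are hypothetical, and the `node_b` text is an invented example (not output from my second node) used only to show what a mismatch looks like.

```python
def parse_pveversion(text):
    """Parse 'name: version' lines into a dict; other lines are skipped."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.partition(": ")
        if sep:
            versions[name.strip()] = version.strip()
    return versions


def diff_versions(a, b):
    """Return {package: (version_on_a, version_on_b)} for every mismatch."""
    return {
        pkg: (a.get(pkg), b.get(pkg))
        for pkg in sorted(set(a) | set(b))
        if a.get(pkg) != b.get(pkg)
    }


# Excerpt of the output above.
node_a = """pve-manager: 3.0-20 (pve-manager/3.0/0428106c)
running kernel: 2.6.32-20-pve
lvm2: 2.02.95-pve3
"""

# Invented example: same packages, but still booted into the older kernel.
node_b = """pve-manager: 3.0-20 (pve-manager/3.0/0428106c)
running kernel: 2.6.32-18-pve
lvm2: 2.02.95-pve3
"""

for pkg, (ver_a, ver_b) in diff_versions(
    parse_pveversion(node_a), parse_pveversion(node_b)
).items():
    print(pkg, ver_a, ver_b)
```

An empty result means the two outputs list identical versions.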
It looks like an old bug reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=603938
or here:
http://bugs.centos.org/view.php?id=4515
So I think the problem is in the kernel, and a kernel update is strongly recommended.