Hi there,
I've been using Proxmox for a couple of years without any major issues. I've just replaced my Proxmox server with a brand-new machine (Intel Xeon E31220 CPU, 8 GB RAM). Since then I've experienced multiple crashes. Here is the message displayed on the console:
Code:
Feb 14 10:06:24 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 102 socket - timeout after 31 retries
Feb 14 10:06:27 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 101 socket - timeout after 31 retries
Feb 14 10:06:30 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 100 socket - timeout after 31 retries
Feb 14 10:06:30 proxmox-plateforme pvestatd[1915]: status update time (12.053 seconds)
Feb 14 10:06:33 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 103 socket - timeout after 31 retries
Feb 14 10:06:36 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 102 socket - timeout after 31 retries
Feb 14 10:06:39 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 101 socket - timeout after 31 retries
Feb 14 10:06:42 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 100 socket - timeout after 31 retries
Feb 14 10:06:42 proxmox-plateforme pvestatd[1915]: status update time (12.052 seconds)
Feb 14 10:06:45 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 103 socket - timeout after 31 retries
Feb 14 10:06:48 proxmox-plateforme kernel: INFO: task scsi_eh_1:305 blocked for more than 120 seconds.
Feb 14 10:06:48 proxmox-plateforme kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 14 10:06:48 proxmox-plateforme kernel: scsi_eh_1 D ffff88020a24afd0 0 305 2 0 0x00000000
Feb 14 10:06:48 proxmox-plateforme kernel: ffff8802095b1bb0 0000000000000046 ffffffffa000a456 ffff88000001a680
Feb 14 10:06:48 proxmox-plateforme kernel: ffff8802095b1cb0 0000000000000000 0000000000000001 ffff880000033a00
Feb 14 10:06:48 proxmox-plateforme kernel: ffff88020e0b6090 ffff88020a24b580 ffff8802095b1fd8 ffff8802095b1fd8
Feb 14 10:06:48 proxmox-plateforme kernel: Call Trace:
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8109cacf>] ? up+0x2f/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528ce5>] schedule_timeout+0x215/0x2e0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528953>] wait_for_common+0x123/0x190
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81059ed0>] ? default_wake_function+0x0/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa0001b4b>] ? enqueue_cmd_and_start_io+0x11b/0x180 [hpsa]
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528a7d>] wait_for_completion+0x1d/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa0004016>] hpsa_eh_device_reset_handler+0x116/0x3c0 [hpsa]
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8136f5cc>] scsi_eh_ready_devs+0x23c/0x860
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff813702e3>] scsi_error_handler+0x4f3/0x6d0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8136fdf0>] ? scsi_error_handler+0x0/0x6d0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff810964f6>] kthread+0x96/0xa0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81096460>] ? kthread+0x0/0xa0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: INFO: task kjournald:386 blocked for more than 120 seconds.
Feb 14 10:06:48 proxmox-plateforme kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 14 10:06:48 proxmox-plateforme kernel: kjournald D ffff880209aa0200 0 386 2 0 0x00000000
Feb 14 10:06:48 proxmox-plateforme kernel: ffff880209a99c50 0000000000000046 ffff880209a99c10 ffffffff8141e08c
Feb 14 10:06:48 proxmox-plateforme kernel: ffff880209a99bc0 ffffffff81012b79 ffff880209a99c00 ffffffff810a1959
Feb 14 10:06:48 proxmox-plateforme kernel: 0000000003d3d0e8 ffff880209aa07b0 ffff880209a99fd8 ffff880209a99fd8
Feb 14 10:06:48 proxmox-plateforme kernel: Call Trace:
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8141e08c>] ? dm_table_unplug_all+0x5c/0x100
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81012b79>] ? read_tsc+0x9/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff810a1959>] ? ktime_get_ts+0xa9/0xe0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caab0>] ? sync_buffer+0x0/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff815285f3>] io_schedule+0x73/0xc0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caaf5>] sync_buffer+0x45/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528fbf>] __wait_on_bit+0x5f/0x90
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caab0>] ? sync_buffer+0x0/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81529068>] out_of_line_wait_on_bit+0x78/0x90
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81096b10>] ? wake_bit_function+0x0/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cbce6>] __wait_on_buffer+0x26/0x30
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa00b1f1e>] journal_commit_transaction+0x9fe/0x12f0 [jbd]
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8107fbfc>] ? lock_timer_base+0x3c/0x70
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8108085b>] ? try_to_del_timer_sync+0x7b/0xe0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa00b7358>] kjournald+0xe8/0x250 [jbd]
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81096ad0>] ? autoremove_wake_function+0x0/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa00b7270>] ? kjournald+0x0/0x250 [jbd]
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff810964f6>] kthread+0x96/0xa0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81096460>] ? kthread+0x0/0xa0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: INFO: task flush-253:0:920 blocked for more than 120 seconds.
Feb 14 10:06:48 proxmox-plateforme kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 14 10:06:48 proxmox-plateforme kernel: flush-253:0 D ffff88020cc46d10 0 920 2 0 0x00000000
Feb 14 10:06:48 proxmox-plateforme kernel: ffff88020d2d77b0 0000000000000046 0000000000000000 ffffffff8141e08c
Feb 14 10:06:48 proxmox-plateforme kernel: ffff8802094e6338 0000000000000008 0000000003d2ba98 0000000000800000
Feb 14 10:06:48 proxmox-plateforme kernel: ffff88020d2d7800 ffff88020cc472c0 ffff88020d2d7fd8 ffff88020d2d7fd8
Feb 14 10:06:48 proxmox-plateforme kernel: Call Trace:
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8141e08c>] ? dm_table_unplug_all+0x5c/0x100
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caab0>] ? sync_buffer+0x0/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff815285f3>] io_schedule+0x73/0xc0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caaf5>] sync_buffer+0x45/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528e6a>] __wait_on_bit_lock+0x5a/0xc0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caab0>] ? sync_buffer+0x0/0x50
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528f48>] out_of_line_wait_on_bit_lock+0x78/0x90
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81096b10>] ? wake_bit_function+0x0/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8113df1c>] ? test_clear_page_writeback+0x8c/0x180
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cc0a0>] ? end_buffer_async_write+0x0/0x180
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cbe66>] __lock_buffer+0x36/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cca04>] __block_write_full_page+0x484/0x4b0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81125f34>] ? end_page_writeback+0x44/0x60
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cc0a0>] ? end_buffer_async_write+0x0/0x180
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caa40>] ? generic_submit_bh_handler+0x0/0x10
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caa40>] ? generic_submit_bh_handler+0x0/0x10
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811caa40>] ? generic_submit_bh_handler+0x0/0x10
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cd4d7>] generic_block_write_full_page+0x137/0x140
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cd4f8>] block_write_full_page_endio+0x18/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811cd515>] block_write_full_page+0x15/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa00d37ad>] ext3_ordered_writepage+0x1ed/0x240 [ext3]
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8113be47>] __writepage+0x17/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8113cd4b>] write_cache_pages+0x1cb/0x480
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8113be30>] ? __writepage+0x0/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8113d024>] generic_writepages+0x24/0x30
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8113d065>] do_writepages+0x35/0x40
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c195d>] __writeback_single_inode+0xdd/0x290
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c1b4a>] writeback_single_inode+0x3a/0xc0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c1e41>] writeback_sb_inodes+0xf1/0x210
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c20b0>] writeback_inodes_wb+0x150/0x1a0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c23db>] wb_writeback+0x2db/0x430
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81527e54>] ? thread_return+0xba/0x7e6
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c26d9>] wb_do_writeback+0x1a9/0x250
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8107fd10>] ? process_timeout+0x0/0x10
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811c27e3>] bdi_writeback_task+0x63/0x1b0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff810969a7>] ? bit_waitqueue+0x17/0xc0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811511f0>] ? bdi_start_fn+0x0/0x110
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81151285>] bdi_start_fn+0x95/0x110
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff811511f0>] ? bdi_start_fn+0x0/0x110
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff810964f6>] kthread+0x96/0xa0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81096460>] ? kthread+0x0/0xa0
Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Feb 14 10:06:48 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 102 socket - timeout after 31 retries
Feb 14 10:06:51 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 101 socket - timeout after 31 retries
Feb 14 10:06:54 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 100 socket - timeout after 31 retries
Feb 14 10:06:54 proxmox-plateforme pvestatd[1915]: status update time (12.053 seconds)
Feb 14 10:06:57 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 103 socket - timeout after 31 retries
Feb 14 10:07:00 proxmox-plateforme pvestatd[1915]: WARNING: unable to connect to VM 102 socket - timeout after 31 retries
Could the "writeback*" entries suggest some kind of disk controller issue (the machine has a Smart Array G6 controller)?
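In case it makes the trace easier to read, here is a rough Python sketch that lists each task reported as blocked and the kernel modules named in its call trace. The names `summarize_hung_tasks`, `HUNG_RE` and `MODULE_RE` are my own, and the regular expressions only assume the syslog format shown above, so treat it as an illustration rather than a proper diagnostic tool.

```python
import re

# Hung-task header, e.g. "INFO: task kjournald:386 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# Call-trace frame that names a module, e.g. "...+0x116/0x3c0 [hpsa]"
MODULE_RE = re.compile(r"\+0x[0-9a-f]+/0x[0-9a-f]+\s+\[(\w+)\]")


def summarize_hung_tasks(lines):
    """Map each blocked task name to the sorted module names seen in its trace."""
    summary = {}
    current = None
    for line in lines:
        hung = HUNG_RE.search(line)
        if hung:
            # A new "blocked for more than N seconds" report starts here.
            current = hung.group(1)
            summary.setdefault(current, set())
            continue
        module = MODULE_RE.search(line)
        if module and current is not None:
            summary[current].add(module.group(1))
    return {task: sorted(mods) for task, mods in summary.items()}


# A few lines copied from the log above, used as sample input.
sample = [
    "Feb 14 10:06:48 proxmox-plateforme kernel: INFO: task scsi_eh_1:305 blocked for more than 120 seconds.",
    "Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffff81528ce5>] schedule_timeout+0x215/0x2e0",
    "Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa0004016>] hpsa_eh_device_reset_handler+0x116/0x3c0 [hpsa]",
    "Feb 14 10:06:48 proxmox-plateforme kernel: INFO: task flush-253:0:920 blocked for more than 120 seconds.",
    "Feb 14 10:06:48 proxmox-plateforme kernel: [<ffffffffa00d37ad>] ext3_ordered_writepage+0x1ed/0x240 [ext3]",
]

for task, modules in summarize_hung_tasks(sample).items():
    print(task, modules)
```

Run over the full log, this should pair scsi_eh_1 with hpsa, kjournald with jbd, and flush-253:0 with ext3. The scsi_eh_1 task in particular is waiting inside hpsa_eh_device_reset_handler.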
Any help appreciated...