Kernel panics with 4.2

anigwei

Hi!

I'm deploying a new server (Intel S2600WT2R) with Proxmox 4.2. It has a hardware (MegaRAID) RAID1 array of 2.8 TB.

After doing intensive I/O (dumping the initial VMs) I've seen some strange kernel panics, and I don't know what the cause is. Any ideas?

Thank you!!

Entire log: http://pastebin.com/PLKnEABP

[ 19.124998] vmbr0: port 1(eth0) entered forwarding state
[ 19.125007] vmbr0: port 1(eth0) entered forwarding state
[ 19.125147] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready
[ 20.492941] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 20.524592] ip_set: protocol 6
[ 98.459628] device tap400i0 entered promiscuous mode
[ 98.464789] vmbr0: port 2(tap400i0) entered forwarding state
[ 98.464797] vmbr0: port 2(tap400i0) entered forwarding state
[ 101.937493] kvm: zapping shadow pages for mmio generation wraparound
[ 101.942927] kvm: zapping shadow pages for mmio generation wraparound
[ 106.693088] kvm [3151]: vcpu0 unhandled rdmsr: 0x570
[ 106.693248] kvm [3151]: vcpu1 unhandled rdmsr: 0x570
[ 106.693396] kvm [3151]: vcpu2 unhandled rdmsr: 0x570
[ 106.693493] kvm [3151]: vcpu3 unhandled rdmsr: 0x570
[ 106.693623] kvm [3151]: vcpu4 unhandled rdmsr: 0x570
[ 106.693788] kvm [3151]: vcpu5 unhandled rdmsr: 0x570
[ 480.529253] INFO: task lvs:3298 blocked for more than 120 seconds.
[ 480.529350] Tainted: P O 4.4.8-1-pve #1
[ 480.529440] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 480.529564] lvs D ffff8807c13c79f8 0 3298 2461 0x00000000
[ 480.529569] ffff8807c13c79f8 ffff880466ffe800 ffff88046c3b44c0 ffff88086589a940
[ 480.529572] ffff8807c13c8000 ffff88046e797180 7fffffffffffffff ffff88086589a940
[ 480.529573] ffff88045d8f0500 ffff8807c13c7a10 ffffffff818448a5 0000000000000000
[ 480.529575] Call Trace:
[ 480.529585] [<ffffffff818448a5>] schedule+0x35/0x80
[ 480.529588] [<ffffffff81847ae5>] schedule_timeout+0x235/0x2d0
[ 480.529593] [<ffffffff813bc570>] ? generic_make_request+0x110/0x1f0
[ 480.529595] [<ffffffff81843dbb>] io_schedule_timeout+0xbb/0x140
[ 480.529600] [<ffffffff8124c7bc>] do_blockdev_direct_IO+0x1b1c/0x2be0
[ 480.529606] [<ffffffff813cd001>] ? exact_lock+0x11/0x20
[ 480.529609] [<ffffffff81247640>] ? I_BDEV+0x20/0x20
[ 480.529611] [<ffffffff8124d8c3>] __blockdev_direct_IO+0x43/0x50
[ 480.529613] [<ffffffff81247d18>] blkdev_direct_IO+0x58/0x80
[ 480.529616] [<ffffffff8118ebdf>] generic_file_read_iter+0x46f/0x5c0
[ 480.529618] [<ffffffff812480e7>] blkdev_read_iter+0x37/0x40
[ 480.529623] [<ffffffff8120bfe4>] new_sync_read+0x94/0xd0
[ 480.529624] [<ffffffff8120c046>] __vfs_read+0x26/0x40
[ 480.529626] [<ffffffff8120c686>] vfs_read+0x86/0x130
[ 480.529629] [<ffffffff8120d4f5>] SyS_read+0x55/0xc0
[ 480.529631] [<ffffffff818489b6>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 548.495487] device tap100i0 entered promiscuous mode
 
Hi,

Something strange is happening related to the storage...

A VM also gets that kind of panic! It is related to jbd2/vda2-8 (storage?).

[ 230.361136] sched: RT throttling activated
[ 360.048079] INFO: task jbd2/vda2-8:147 blocked for more than 120 seconds.
[ 360.048151] Not tainted 3.19.0-49-generic #55~14.04.1-Ubuntu
[ 360.048194] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.048242] jbd2/vda2-8 D ffff880036aa3b88 0 147 2 0x00000000
[ 360.048249] ffff880036aa3b88 ffff8800b8f613a0 0000000000013e80 ffff880036aa3fd8
[ 360.048251] 0000000000013e80 ffff880139b5a740 ffff8800b8f613a0 ffff880036aa3c30
[ 360.048253] ffff88013fc14778 ffff880036aa3c30 ffff88013ffc2898 0000000000000002
[ 360.048255] Call Trace:
[ 360.048287] [<ffffffff817b4380>] ? bit_wait+0x50/0x50
[ 360.048289] [<ffffffff817b3b50>] io_schedule+0xa0/0x130
[ 360.048291] [<ffffffff817b43ac>] bit_wait_io+0x2c/0x50
[ 360.048292] [<ffffffff817b3fe5>] __wait_on_bit+0x65/0x90
[ 360.048294] [<ffffffff817b4380>] ? bit_wait+0x50/0x50
[ 360.048296] [<ffffffff817b4082>] out_of_line_wait_on_bit+0x72/0x80
[ 360.048310] [<ffffffff810b4fa0>] ? autoremove_wake_function+0x40/0x40
[ 360.048320] [<ffffffff81220156>] __wait_on_buffer+0x36/0x40
[ 360.048326] [<ffffffff812bafbf>] jbd2_journal_commit_transaction+0x183f/0x1a80
[ 360.048331] [<ffffffff810dbeef>] ? try_to_del_timer_sync+0x4f/0x70
[ 360.048334] [<ffffffff812beb5b>] kjournald2+0xbb/0x240
[ 360.048336] [<ffffffff810b4f60>] ? prepare_to_wait_event+0x110/0x110
[ 360.048337] [<ffffffff812beaa0>] ? commit_timeout+0x10/0x10
[ 360.048344] [<ffffffff810938d2>] kthread+0xd2/0xf0
[ 360.048346] [<ffffffff81093800>] ? kthread_create_on_node+0x1c0/0x1c0
[ 360.048350] [<ffffffff817b7b58>] ret_from_fork+0x58/0x90
[ 360.048352] [<ffffffff81093800>] ? kthread_create_on_node+0x1c0/0x1c0
[ 360.048376] INFO: task java:1422 blocked for more than 120 seconds.
[ 360.048417] Not tainted 3.19.0-49-generic #55~14.04.1-Ubuntu
[ 360.048456] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 
Those are not kernel panics; the kernel is telling you that a kernel task has hung (waited) for more than 2 minutes. Usually this indicates either a deadlock in kernel space (which would be a bug) or severe I/O congestion (your case). If your host I/O is so congested that it slows to a crawl, it's only natural that VMs using the same storage experience the same congestion ;)
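
A quick way to confirm it is to watch the extended iostat output (from the sysstat package) on the host while such a transfer runs; the 2-second interval below is just an example:

# extended per-device statistics, refreshed every 2 seconds
iostat -x 2

Sustained high await and %util values on the RAID volume mean the array simply cannot keep up with the writes. The 120 seconds in the message are only the hung task watchdog threshold (kernel.hung_task_timeout_secs), not the actual problem.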
 

Hi Fabian,

Thanks for your answer.

I was thinking of I/O congestion too... but I have other servers (some of them older) and I've never seen I/O problems while transferring VMs onto them.

This is a brand-new Intel S2600WT2R with an LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] controller (Intel-branded) and a RAID1 of SATA disks.

While transferring those VMs via NFS (and while those messages appeared), iostat showed a speed of about 80 MB/s.
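
That figure is only the throughput column, though; if it helps, I can also capture the latency columns (await, %util) during the next transfer with something like the extended output (just an example invocation):

iostat -x 2

and post those values here.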

Thanks!