kernel:BUG: soft lockup - CPU#6 stuck for 67s!

ictdude

I am running Proxmox VE 3.1-12/93bf03d4.

Lately I have been getting this error:

Message from syslogd: kernel:BUG: soft lockup - CPU#6 stuck for 67s! [kswapd0:140]

When that happens I can't log in to the Proxmox GUI and have to reboot.
What could be causing this, and how can I diagnose it?

Any advice? Is this software related?
 
We sometimes see this problem and have linked it to cheap network adapters that have only two MSI-X lines.
 
I have only seen this happen on LXC nodes (taking them down completely).

In the logs, I only saw:
Code:
Feb  6 19:50:50 dx411-s11 kernel: [564033.729200]  0000000000000286 00000000b1327e45 ffff8829c95e3c90 ffffffff813f9523
Feb  6 19:50:50 dx411-s11 kernel: [564033.729208]  ffff8829c95e3cc8 ffffffff81191ffb ffff882f2c0c5400 ffff882f2c0c5400
Feb  6 19:50:50 dx411-s11 kernel: [564033.729227]  [<ffffffff813f9523>] dump_stack+0x63/0x90
Feb  6 19:50:50 dx411-s11 kernel: [564033.729240]  [<ffffffff81191ffb>] ? find_lock_task_mm+0x3b/0x80
Feb  6 19:50:50 dx411-s11 kernel: [564033.729249]  [<ffffffff811fe78f>] ? mem_cgroup_iter+0x1cf/0x380
Feb  6 19:50:50 dx411-s11 kernel: [564033.729258]  [<ffffffff812014f7>] mem_cgroup_oom_synchronize+0x347/0x360
Feb  6 19:50:50 dx411-s11 kernel: [564033.729267]  [<ffffffff81192cc4>] pagefault_out_of_memory+0x44/0xc0
Feb  6 19:50:50 dx411-s11 kernel: [564033.729276]  [<ffffffff8106b723>] __do_page_fault+0x3e3/0x410
Feb  6 19:50:50 dx411-s11 kernel: [564033.729286]  [<ffffffff8185e3f8>] page_fault+0x28/0x30
Feb  6 19:50:50 dx411-s11 kernel: [564033.729297] memory: usage 1047064kB, limit 1048576kB, failcnt 763346
Feb  6 19:50:50 dx411-s11 kernel: [564033.729302] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Feb  6 19:50:50 dx411-s11 kernel: [564033.729337] Memory cgroup stats for /lxc/3493/user.slice: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Feb  6 19:50:50 dx411-s11 kernel: [564033.729393] Memory cgroup stats for /lxc/3493/user.slice/user-0.slice/session-c173.scope: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Feb  6 19:50:50 dx411-s11 kernel: [564033.729456] Memory cgroup stats for /lxc/3493/user.slice/user-0.slice/session-c176.scope: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Feb  6 19:50:50 dx411-s11 kernel: [564033.729513] Memory cgroup stats for /lxc/3493/user.slice/user-0.slice/session-c327.scope: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB

(repeating many times)
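
For anyone reading a trace like the one above: the `memory: usage ... limit ... failcnt ...` line is the part that shows how close the cgroup is to its limit. Here is a minimal sketch in Python for pulling those numbers out of a log line. The function name `parse_memcg_line` and the field names are my own, not anything from Proxmox or the kernel, and the regex only assumes the line format shown in this log.

```python
import re

# Matches the memory cgroup summary line from the kernel OOM report, e.g.
# "memory: usage 1047064kB, limit 1048576kB, failcnt 763346".
# The "kmem:" line has a different prefix and is intentionally not matched.
MEMCG_RE = re.compile(
    r"\bmemory: usage (?P<usage>\d+)kB, limit (?P<limit>\d+)kB, "
    r"failcnt (?P<failcnt>\d+)"
)


def parse_memcg_line(line):
    """Return usage/limit figures from one log line, or None if it does not match."""
    match = MEMCG_RE.search(line)
    if match is None:
        return None
    usage = int(match.group("usage"))
    limit = int(match.group("limit"))
    return {
        "usage_kb": usage,
        "limit_kb": limit,
        "failcnt": int(match.group("failcnt")),
        "headroom_kb": limit - usage,
        # Guard against a zero limit so the helper never divides by zero.
        "used_pct": 100.0 * usage / limit if limit else 0.0,
    }
```

Fed the `memory:` line from this log, it reports a 1048576 kB (1 GiB) limit with 1512 kB of headroom, so the cgroup is essentially full. The high `failcnt` means the usage has hit the limit many times, which would be consistent with the container constantly reclaiming memory, though that alone does not prove the cause of the lockup.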

And soon after:

Code:
Feb  6 19:56:29 dx411-s11 kernel: [564373.050767] NMI watchdog: BUG: soft lockup - CPU#31 stuck for 22s! [bash:27082]
Feb  6 19:56:29 dx411-s11 kernel: [564373.050835] Modules linked in: xt_recent tcp_diag inet_diag nfnetlink_queue bluetooth dm_snapshot ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security$
Feb  6 19:56:29 dx411-s11 kernel: [564373.050941] CPU: 31 PID: 27082 Comm: bash Tainted: P  O L  4.4.35-2-pve #1
Feb  6 19:56:29 dx411-s11 kernel: [564373.050943] Hardware name: Dell Inc. PowerEdge R420/0K29HN, BIOS 2.4.2 01/29/2015
Feb  6 19:56:29 dx411-s11 kernel: [564373.050945] task: ffff882c4a470e00 ti: ffff88300f22c000 task.ti: ffff88300f22c000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050946] RIP: 0010:[<ffffffff811a1e26>]  [<ffffffff811a1e26>] get_lru_size+0x16/0x40
Feb  6 19:56:29 dx411-s11 kernel: [564373.050956] RSP: 0000:ffff88300f22f748  EFLAGS: 00000282
Feb  6 19:56:29 dx411-s11 kernel: [564373.050957] RAX: 0000000000000000 RBX: ffff88300f22f968 RCX: 0000000000000000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050958] RDX: ffff88187fffbf80 RSI: 0000000000000001 RDI: ffff88153d5be340
Feb  6 19:56:29 dx411-s11 kernel: [564373.050959] RBP: ffff88300f22f850 R08: 0000000000000000 R09: 0000000000000000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050960] R10: 000000000231571c R11: 0000000000000333 R12: 0000000000000000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050961] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88300f22f888
Feb  6 19:56:29 dx411-s11 kernel: [564373.050963] FS:  00007fb6b5eb7700(0000) GS:ffff88301f3c0000(0000) knlGS:0000000000000000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050964] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  6 19:56:29 dx411-s11 kernel: [564373.050966] CR2: 0000000000499050 CR3: 0000002c900e4000 CR4: 00000000001426e0
Feb  6 19:56:29 dx411-s11 kernel: [564373.050967] Stack:
Feb  6 19:56:29 dx411-s11 kernel: [564373.050969]  ffffffff811a5a3f ffff882dbbda0400 0000000000000000 0000000000000020
Feb  6 19:56:29 dx411-s11 kernel: [564373.050971]  ffff882dbbb41400 ffff882c00000003 ffff882f12fe0001 ffff882c00000000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050973]  ffff88153d5be340 0000000000000000 0000000000000000 0000000000000000
Feb  6 19:56:29 dx411-s11 kernel: [564373.050975] Call Trace:
Feb  6 19:56:29 dx411-s11 kernel: [564373.050980]  [<ffffffff811a5a3f>] ? shrink_lruvec+0x12f/0x750
Feb  6 19:56:29 dx411-s11 kernel: [564373.050987]  [<ffffffff81116d56>] ? css_next_descendant_pre+0x46/0x60
Feb  6 19:56:29 dx411-s11 kernel: [564373.050992]  [<ffffffff811fe78f>] ? mem_cgroup_iter+0x1cf/0x380
Feb  6 19:56:29 dx411-s11 kernel: [564373.050995]  [<ffffffff811a614b>] shrink_zone+0xeb/0x2d0
Feb  6 19:56:29 dx411-s11 kernel: [564373.050997]  [<ffffffff811a64b3>] do_try_to_free_pages+0x183/0x480
Feb  6 19:56:29 dx411-s11 kernel: [564373.051000]  [<ffffffff811a69f4>] try_to_free_mem_cgroup_pages+0xc4/0x1a0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051003]  [<ffffffff811fdd26>] try_charge+0x1a6/0x680
Feb  6 19:56:29 dx411-s11 kernel: [564373.051007]  [<ffffffff81201fdc>] mem_cgroup_try_charge+0x9c/0x1b0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051010]  [<ffffffff8118f480>] __add_to_page_cache_locked+0x60/0x1f0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051012]  [<ffffffff8118f667>] add_to_page_cache_lru+0x37/0x90
Feb  6 19:56:29 dx411-s11 kernel: [564373.051016]  [<ffffffff812e88a4>] ext4_mpage_readpages+0x184/0x920
Feb  6 19:56:29 dx411-s11 kernel: [564373.051022]  [<ffffffff811e1f72>] ? alloc_pages_current+0x92/0x120
Feb  6 19:56:29 dx411-s11 kernel: [564373.051027]  [<ffffffff8129aef6>] ext4_readpages+0x36/0x40
Feb  6 19:56:29 dx411-s11 kernel: [564373.051032]  [<ffffffff8119d757>] __do_page_cache_readahead+0x197/0x230
Feb  6 19:56:29 dx411-s11 kernel: [564373.051034]  [<ffffffff8118f6ed>] ? pagecache_get_page+0x2d/0x1b0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051036]  [<ffffffff811912a0>] filemap_fault+0x360/0x3e0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051040]  [<ffffffff811ccc41>] ? page_add_file_rmap+0x51/0x60
Feb  6 19:56:29 dx411-s11 kernel: [564373.051043]  [<ffffffff812a4066>] ext4_filemap_fault+0x36/0x50
Feb  6 19:56:29 dx411-s11 kernel: [564373.051048]  [<ffffffff811bda10>] __do_fault+0x50/0xe0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051050]  [<ffffffff811c25e3>] handle_mm_fault+0x10f3/0x19c0
Feb  6 19:56:29 dx411-s11 kernel: [564373.051055]  [<ffffffff8106b772>] ? do_page_fault+0x22/0x30
Feb  6 19:56:29 dx411-s11 kernel: [564373.051060]  [<ffffffff8185e3f8>] ? page_fault+0x28/0x30
Feb  6 19:56:29 dx411-s11 kernel: [564373.051062]  [<ffffffff8106b4dd>] __do_page_fault+0x19d/0x410
Feb  6 19:56:29 dx411-s11 kernel: [564373.051067]  [<ffffffff81003885>] ? syscall_trace_enter_phase1+0xc5/0x140
Feb  6 19:56:29 dx411-s11 kernel: [564373.051069]  [<ffffffff8106b772>] do_page_fault+0x22/0x30
Feb  6 19:56:29 dx411-s11 kernel: [564373.051072]  [<ffffffff8185e3f8>] page_fault+0x28/0x30
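
If the lockup message repeats for many CPUs, it can help to tally which tasks were stuck and for how long. Below is a rough Python sketch that extracts the CPU number, duration, task name and PID from `soft lockup` lines. The names `parse_soft_lockup` and `count_stuck_tasks` are hypothetical helpers of mine, and the pattern assumes only the message format seen in this thread.

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#31 stuck for 22s! [bash:27082]".
# The task name is matched greedily so names containing ':' still parse.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return cpu, seconds, task name and pid for a soft lockup line, else None."""
    match = LOCKUP_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match.group("cpu")),
        "seconds": int(match.group("secs")),
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
    }


def count_stuck_tasks(lines):
    """Count how often each task name appears in soft lockup messages."""
    counts = Counter()
    for line in lines:
        event = parse_soft_lockup(line)
        if event is not None:
            counts[event["comm"]] += 1
    return counts
```

Passing every line of the kernel log to `count_stuck_tasks` gives a quick picture of whether one process or many are getting stuck. In the trace above the stuck task is `bash` inside memory cgroup reclaim (`try_to_free_mem_cgroup_pages`), whereas the first post in this thread shows `kswapd0`.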
 
Please don't hijack unrelated old threads. Open a new one and include the complete logs and the output of "pveversion -v".
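
To make gathering that information less error-prone, here is a small Python sketch that runs a list of commands and joins their output into one text report that can be pasted into a new thread. Only `pveversion -v` comes from the request above. Using `dmesg` as the source of kernel logs is my assumption, and `run_command` and `build_report` are hypothetical helper names.

```python
import subprocess


def run_command(argv):
    """Run a command and return its combined output, or a note if it cannot run."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages in the report
            universal_newlines=True,
            timeout=30,
        )
    except OSError as exc:
        # Covers a missing binary or missing permission to execute it.
        return "(could not run {}: {})".format(argv[0], exc)
    except subprocess.TimeoutExpired:
        return "(timed out: {})".format(argv[0])
    return result.stdout


def build_report(commands):
    """Join each command line and its output into one plain-text report."""
    sections = []
    for argv in commands:
        sections.append("$ " + " ".join(argv))
        sections.append(run_command(argv).rstrip())
    return "\n".join(sections)


# Example (run on the affected node):
#   print(build_report([["pveversion", "-v"], ["dmesg"]]))
```

A missing command is reported inline rather than raising, so the same script can be run on nodes where one of the tools is not installed.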
 
