Hi,
Let me briefly summarize my setup. My host currently runs Debian 8.10 with Proxmox version 4.4-22/2728f613 and the PVE kernel 4.4.114-108.
On the host I have set up an IPv6 route to one of my containers. In this container I currently run accel-ppp in an L2TP/IPsec configuration. The clients in the tunnel each receive an IPv6 address from a small /122 subnet (public IPs). In the container I set up a neighbor discovery proxy entry for every assigned IPv6 address. For example:
ip neigh add proxy 2001:41d0:3:ed13:1ccd:a2bb:f18b:10 dev eth0
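To illustrate the per-address setup, here is a minimal sketch of how such proxy entries could be generated for several client addresses. The function name `proxy_cmds`, the `DEV` variable, and the second address are assumptions made for this example and are not taken from my actual configuration. Note that proxy entries only take effect if `net.ipv6.conf.<interface>.proxy_ndp` is enabled on the interface.

```shell
#!/bin/sh
# Sketch: print one "ip neigh add proxy" command per assigned client address.
# DEV and the example addresses below are illustrative assumptions.
DEV="${DEV:-eth0}"

proxy_cmds() {
    # Only print the commands so they can be reviewed before being run as root.
    for addr in "$@"; do
        printf 'ip neigh add proxy %s dev %s\n' "$addr" "$DEV"
    done
}

proxy_cmds 2001:41d0:3:ed13:1ccd:a2bb:f18b:10 2001:41d0:3:ed13:1ccd:a2bb:f18b:11
```

The printed lines can be piped to `sh` once they look correct.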
All of this works fine. However, as soon as at least one neighbor discovery proxy entry has been added, the client is connected through the tunnel, and the container is then shut down, one CPU core becomes fully saturated and the load average goes through the roof.
After that, the entire host stops responding.
The kernel logs the following:
Mar 28 22:15:08 jane pvedaemon[1865]: <root@pam> starting task UPID:jane:000057D2:0000D7D3:5ABBF7CC:vzshutdown:106:root@pam:
Mar 28 22:16:29 jane kernel: [ 634.115013] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:306 dev_watchdog+0x22e/0x240()
Mar 28 22:16:29 jane kernel: [ 634.125229] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.4.114-1-pve #1
Mar 28 22:16:29 jane kernel: [ 634.126805] 0000000000000286 acb48b767558c006 ffff88081f203da0 ffffffff81402343
Mar 28 22:16:29 jane kernel: [ 634.128318] 0000000000000000 ffff8807f606e280 0000000000000000 ffff8807f5a90000
Mar 28 22:16:29 jane kernel: [ 634.130512] [<ffffffff81083546>] warn_slowpath_common+0x86/0xc0
Mar 28 22:16:29 jane kernel: [ 634.133286] [<ffffffff810ef997>] call_timer_fn+0x37/0x140
Mar 28 22:16:29 jane kernel: [ 634.136083] [<ffffffff8108840e>] irq_exit+0x8e/0x90
Mar 28 22:16:29 jane kernel: [ 634.138809] [<ffffffff816eeb57>] cpuidle_enter+0x17/0x20
Mar 28 22:16:29 jane kernel: [ 634.141521] [<ffffffff818679ac>] rest_init+0x7c/0x80
Mar 28 22:16:29 jane kernel: [ 634.144303] [<ffffffff81f7a623>] x86_64_start_kernel+0x14a/0x16d
Mar 28 22:16:35 jane kernel: [ 640.382705] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lxc-start:13935]
Mar 28 22:16:35 jane kernel: [ 640.391995] Hardware name: /DH67BL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
Mar 28 22:16:35 jane kernel: [ 640.394264] RSP: 0018:ffff880035fdb450 EFLAGS: 00000286
Mar 28 22:16:35 jane kernel: [ 640.397343] R10: ffff8806ebe6e000 R11: ffff88074a14a900 R12: ffffffff81f19038
Mar 28 22:16:35 jane kernel: [ 640.400416] CR2: 00000000006cd000 CR3: 00000000bd164000 CR4: 0000000000160670
Mar 28 22:16:35 jane kernel: [ 640.403462] 0000000081834477 0000000000000010 ffff8806ebe6fc28 ffff8806ebe6e198
Mar 28 22:16:35 jane kernel: [ 640.406467] [<ffffffff81750304>] ? netif_rx_internal+0x44/0x110
Mar 28 22:16:35 jane kernel: [ 640.409410] [<ffffffff81800cc7>] ip6_output+0x57/0x110
Mar 28 22:16:35 jane kernel: [ 640.412298] [<ffffffff81822800>] ? ipv6_icmp_sysctl_init+0x40/0x40
Mar 28 22:16:35 jane kernel: [ 640.414473] [<ffffffff81825f13>] __ipv6_dev_mc_dec+0xc3/0x120
Mar 28 22:16:35 jane kernel: [ 640.416697] [<ffffffff8175e812>] neigh_ifdown+0xc2/0xf0
Mar 28 22:16:35 jane kernel: [ 640.419674] [<ffffffff8174a7f5>] call_netdevice_notifiers_info+0x35/0x60
Mar 28 22:16:35 jane kernel: [ 640.422348] [<ffffffff8174c89b>] unregister_netdevice_many+0x1b/0xa0
Mar 28 22:16:35 jane kernel: [ 640.424774] [<ffffffff8108cf83>] ? ns_capable+0x13/0x20
Mar 28 22:16:35 jane kernel: [ 640.426952] [<ffffffff81761e00>] ? rtnetlink_rcv+0x30/0x30
Mar 28 22:16:35 jane kernel: [ 640.428956] [<ffffffff817861fd>] netlink_sendmsg+0x34d/0x3c0
Mar 28 22:16:35 jane kernel: [ 640.430924] [<ffffffff813a78c3>] ? aa_sk_perm+0x73/0x220
Mar 28 22:16:35 jane kernel: [ 640.432760] [<ffffffff81735082>] SyS_sendmsg+0x12/0x20
Mar 28 22:16:43 jane kernel: [ 648.375844] Modules linked in: drbg ansi_cprng authenc echainiv esp4 xfrm4_mode_transport l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox cfg80211 xfrm_user xfrm_algo ipt_MASQUERADE nf_nat_masquerade_ipv4 binfmt_misc veth ip_set pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) softdog nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip6table_filter ip6_tables iptable_filter xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables nfnetlink_log nfnetlink i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel gpio_ich ppdev kvm video drm_kms_helper irqbypass crct10dif_pclmul drm crc32_pclmul
Mar 28 22:16:43 jane kernel: [ 648.382563] Hardware name: /DH67BL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
Mar 28 22:16:43 jane kernel: [ 648.386087] RAX: 0000000000000101 RBX: ffffffff81f19034 RCX: 0000000000000101
Mar 28 22:16:43 jane kernel: [ 648.389698] FS: 0000000000000000(0000) GS:ffff88081f240000(0000) knlGS:0000000000000000
Mar 28 22:16:43 jane kernel: [ 648.393432] ffff8807f5a8bde8 ffffffff81873e89 ffff8807f5a8be20 ffffffff8175e947
Mar 28 22:16:43 jane kernel: [ 648.397333] [<ffffffff8175e947>] neigh_periodic_work+0x37/0x1d0
Mar 28 22:16:43 jane kernel: [ 648.401193] [<ffffffff810a3560>] ? kthread_park+0x60/0x60
Mar 28 22:17:03 jane kernel: [ 668.385704] Modules linked in: drbg ansi_cprng authenc echainiv esp4 xfrm4_mode_transport l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox cfg80211 xfrm_user xfrm_algo ipt_MASQUERADE nf_nat_masquerade_ipv4 binfmt_misc veth ip_set pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) softdog nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip6table_filter ip6_tables iptable_filter xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables nfnetlink_log nfnetlink i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel gpio_ich ppdev kvm video drm_kms_helper irqbypass crct10dif_pclmul drm crc32_pclmul
Mar 28 22:17:03 jane kernel: [ 668.397213] RIP: 0010:[<ffffffff810cf3af>] [<ffffffff810cf3af>] queued_write_lock_slowpath+0x3f/0x90
Mar 28 22:17:03 jane kernel: [ 668.401323] R10: ffff8806ebe6e000 R11: ffff88074a14a900 R12: ffffffff81f19038
Mar 28 22:17:03 jane kernel: [ 668.405351] Stack:
Mar 28 22:17:03 jane kernel: [ 668.409399] [<ffffffff81873e89>] _raw_write_lock_bh+0x29/0x30
Mar 28 22:17:03 jane kernel: [ 668.413354] [<ffffffff81800c06>] ip6_finish_output+0xa6/0x110
Mar 28 22:17:03 jane kernel: [ 668.417229] [<ffffffff81822800>] ? ipv6_icmp_sysctl_init+0x40/0x40
Mar 28 22:17:03 jane kernel: [ 668.421298] [<ffffffff81817b6b>] pndisc_destructor+0x5b/0x80
Mar 28 22:17:03 jane kernel: [ 668.425009] [<ffffffff8174a7f5>] call_netdevice_notifiers_info+0x35/0x60
Mar 28 22:17:03 jane kernel: [ 668.428342] [<ffffffff8176125e>] rtnl_delete_link+0x4e/0x80
Mar 28 22:17:03 jane kernel: [ 668.431261] [<ffffffff811f6168>] ? __kmalloc_node_track_caller+0x258/0x310
Mar 28 22:17:03 jane kernel: [ 668.433988] [<ffffffff81785e00>] netlink_unicast+0x190/0x240
Mar 28 22:17:03 jane kernel: [ 668.436506] [<ffffffff813a78c3>] ? aa_sk_perm+0x73/0x220
Mar 28 22:17:03 jane kernel: [ 668.438866] [<ffffffff818741c8>] entry_SYSCALL_64_fastpath+0x1c/0xbb
Mar 28 22:17:08 jane kernel: [ 672.854205] (t=15000 jiffies g=68579 c=68578 q=105197)
Mar 28 22:17:08 jane kernel: [ 672.856278] 0000000000000003 ffffffff81e5cec0 ffff88081f2c3dc0 ffffffff810b4269
Mar 28 22:17:08 jane kernel: [ 672.858575] [<ffffffff810b4269>] dump_cpu_task+0x39/0x40
Mar 28 22:17:08 jane kernel: [ 672.860853] [<ffffffff810b4cf9>] ? account_system_time+0x79/0x120
Mar 28 22:17:08 jane kernel: [ 672.864377] [<ffffffff8110211d>] tick_sched_timer+0x3d/0x70
Mar 28 22:17:08 jane kernel: [ 672.867934] [<ffffffff81877803>] smp_apic_timer_interrupt+0x43/0x60
Mar 28 22:17:08 jane kernel: [ 672.872859] [<ffffffff81750304>] ? netif_rx_internal+0x44/0x110
Mar 28 22:17:08 jane kernel: [ 672.878161] [<ffffffff81788ea3>] ? nf_iterate+0x63/0x80
Mar 28 22:17:08 jane kernel: [ 672.883475] [<ffffffff81824706>] igmp6_group_dropped+0x116/0x220
Mar 28 22:17:08 jane kernel: [ 672.888691] [<ffffffff818195e3>] ndisc_netdev_event+0xa3/0xf0
Mar 28 22:17:08 jane kernel: [ 672.894010] [<ffffffff8174c616>] rollback_registered_many+0x136/0x330
Mar 28 22:17:08 jane kernel: [ 672.898905] [<ffffffff8138c26f>] ? aa_capable+0xff/0x3b0
Mar 28 22:17:08 jane kernel: [ 672.903567] [<ffffffff81761e00>] ? rtnetlink_rcv+0x30/0x30
Mar 28 22:17:08 jane kernel: [ 672.908105] [<ffffffff813a81d1>] ? aa_sock_msg_perm+0x61/0x150
Mar 28 22:17:08 jane kernel: [ 672.912611] [<ffffffff81732754>] ? move_addr_to_user+0xb4/0xd0
Mar 28 22:17:08 jane kernel: [ 672.916170] INFO: rcu_bh self-detected stall on CPU
Mar 28 22:17:08 jane kernel: [ 672.916176] ffff8806f2556200 c807cdebe85a131f ffff88081f2c3da8 ffffffff810b1abf
Mar 28 22:17:08 jane kernel: [ 672.916182] [<ffffffff810b4269>] dump_cpu_task+0x39/0x40
Mar 28 22:17:08 jane kernel: [ 672.916188] [<ffffffff811020e0>] ? tick_sched_do_timer+0x30/0x30
Mar 28 22:17:08 jane kernel: [ 672.916194] [<ffffffff810f30c8>] hrtimer_interrupt+0xa8/0x1a0
Mar 28 22:17:08 jane kernel: [ 672.916201] [<ffffffff81873e89>] _raw_write_lock_bh+0x29/0x30
Mar 28 22:17:08 jane kernel: [ 672.916207] [<ffffffff81800c06>] ip6_finish_output+0xa6/0x110
Mar 28 22:17:08 jane kernel: [ 672.916213] [<ffffffff81822800>] ? ipv6_icmp_sysctl_init+0x40/0x40
Mar 28 22:17:08 jane kernel: [ 672.916220] [<ffffffff81817b6b>] pndisc_destructor+0x5b/0x80
Mar 28 22:17:08 jane kernel: [ 672.916225] [<ffffffff8174a7f5>] call_netdevice_notifiers_info+0x35/0x60
Mar 28 22:17:08 jane kernel: [ 672.916231] [<ffffffff8176125e>] rtnl_delete_link+0x4e/0x80
Mar 28 22:17:08 jane kernel: [ 672.916238] [<ffffffff811f6168>] ? __kmalloc_node_track_caller+0x258/0x310
Mar 28 22:17:08 jane kernel: [ 672.916244] [<ffffffff81785e00>] netlink_unicast+0x190/0x240
Mar 28 22:17:08 jane kernel: [ 672.916251] [<ffffffff813a78c3>] ? aa_sk_perm+0x73/0x220
Mar 28 22:17:08 jane kernel: [ 672.916258] [<ffffffff818741c8>] entry_SYSCALL_64_fastpath+0x1c/0xbb
Mar 28 22:17:08 jane kernel: [ 672.951377] (detected by 1, t=15002 jiffies, g=68579, c=68578, q=105198)
Mar 28 22:17:08 jane kernel: [ 672.953168] 0000000000000000 0000000000000000 0000000000000000 0000000000000003
Mar 28 22:17:08 jane kernel: [ 672.955086] [<ffffffff813a7beb>] ? aa_sock_perm+0x4b/0xe0
Mar 28 22:17:08 jane kernel: [ 672.956999] [<ffffffff818741c8>] ? entry_SYSCALL_64_fastpath+0x1c/0xbb
Mar 28 22:17:35 jane kernel: [ 700.379762] Modules linked in: drbg ansi_cprng authenc echainiv esp4 xfrm4_mode_transport l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox cfg80211 xfrm_user xfrm_algo ipt_MASQUERADE nf_nat_masquerade_ipv4 binfmt_misc veth ip_set pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) softdog nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip6table_filter ip6_tables iptable_filter xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables nfnetlink_log nfnetlink i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel gpio_ich ppdev kvm video drm_kms_helper irqbypass crct10dif_pclmul drm crc32_pclmul
Mar 28 22:17:35 jane kernel: [ 700.387264] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [lxc-start:13935]
Mar 28 22:17:35 jane kernel: [ 700.387315] task: ffff8806f2556200 ti: ffff880035fd8000 task.ti: ffff880035fd8000
Mar 28 22:17:35 jane kernel: [ 700.387321] RBP: ffff880035fdb460 R08: 000000000001ab60 R09: ffff8807fe803600
Mar 28 22:17:35 jane kernel: [ 700.387324] CR2: 00000000006cd000 CR3: 00000000bd164000 CR4: 0000000000160670
Mar 28 22:17:35 jane kernel: [ 700.387327] Call Trace:
Mar 28 22:17:35 jane kernel: [ 700.387339] [<ffffffff811b6448>] ? pcpu_alloc_area+0x2a8/0x3e0
Mar 28 22:17:35 jane kernel: [ 700.387347] [<ffffffff81824035>] NF_HOOK_THRESH.constprop.37+0x45/0xb0
Mar 28 22:17:35 jane kernel: [ 700.387354] [<ffffffff8182691f>] ipv6_dev_mc_dec+0x2f/0x60
Mar 28 22:17:35 jane kernel: [ 700.387360] [<ffffffff810a48c6>] raw_notifier_call_chain+0x16/0x20
Mar 28 22:17:35 jane kernel: [ 700.387367] [<ffffffff8174c89b>] unregister_netdevice_many+0x1b/0xa0
Mar 28 22:17:35 jane kernel: [ 700.387376] [<ffffffff81761ea4>] rtnetlink_rcv_msg+0xa4/0x230
Mar 28 22:17:35 jane kernel: [ 700.387384] [<ffffffff81761df8>] rtnetlink_rcv+0x28/0x30
Mar 28 22:17:35 jane kernel: [ 700.387391] [<ffffffff81734855>] ___sys_sendmsg+0x285/0x2a0
Mar 28 22:17:35 jane kernel: [ 700.387398] [<ffffffff81735082>] SyS_sendmsg+0x12/0x20
Mar 28 22:17:35 jane kernel: [ 700.442972] RSP: 0018:ffff8807f5a8bdb8 EFLAGS: 00000202
Mar 28 22:17:35 jane kernel: [ 700.445599] R13: ffffffff81f18f48 R14: ffffffff81f19034 R15: ffffffff81f18e40
Mar 28 22:17:35 jane kernel: [ 700.448277] ffff8807f5a8bdd8 ffffffff810cf3f0 ffff8807f60ece40 ffff8807fe99b000
Mar 28 22:17:35 jane kernel: [ 700.451044] [<ffffffff81873e89>] _raw_write_lock_bh+0x29/0x30
Mar 28 22:17:35 jane kernel: [ 700.453871] [<ffffffff810a365c>] kthread+0xfc/0x120