LXC container reboot fails - LXC becomes unusable

Discussion in 'Proxmox VE: Installation and configuration' started by denos, Feb 7, 2018.

  1. denos

    denos Member

    Joined:
    Jul 27, 2015
    Messages:
    74
    Likes Received:
    34
    Correct, 4.14.20 (or later) completely resolves the issue.
     
  2. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    Today we had a different issue.

    We terminated CT 154, and the node went RED. CT 154 got deleted; the node and all other CTs are pinging fine.

    Result of ps aux | grep 154

    root@P158:~# ps aux | grep 154
    root 154 0.0 0.0 0 0 ? S< Mar02 0:00 [netns]
    27 5095 0.0 0.0 113276 1548 ? Ss Mar02 0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
    root 5148 0.0 0.0 20576 1548 ? Ss Mar02 0:03 /usr/sbin/dovecot
    114 5389 0.0 0.0 225184 15484 ? S Mar02 0:08 /usr/lib/postgresql/9.4/bin/postgres -D /var/lib/postgresql/9.4/main -c config_file=/etc/postgresql/9.4/main/postgresql.conf
    root 6154 0.0 0.0 0 0 ? S 15:34 0:00 [kworker/10:0]
    root 15428 0.0 0.0 125092 3544 ? Ss Mar04 0:06 /sbin/init
    110 15449 0.0 0.0 47292 5684 ? S 15:41 0:00 smtpd -n smtp -t inet -u -c -o stress= -s 2
    110 15450 0.0 0.0 39960 3232 ? S 15:41 0:00 proxymap -t unix -u
    root 15453 0.0 0.1 544616 59176 ? S 11:30 0:15 pvedaemon worker
    110 15460 0.0 0.0 39960 3268 ? S 15:41 0:00 anvil -l -t unix -u -c
    root 20988 0.0 0.1 111540 63076 ? Ss Mar02 1:52 /usr/lib/systemd/systemd-journald
    root 22626 0.0 0.0 12788 920 pts/19 S+ 15:43 0:00 grep 154
    root 25154 0.2 0.3 1065716 99068 ? Sl Mar04 3:10 /usr/bin/node ./eachDomainWise/domainWise rockstarfly.com 1342
     
  3. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    service pvestatd restart

    This made the node green, but all CTs are still grey.
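    If the CT status stays grey after that, restarting the other PVE status/GUI daemons is sometimes suggested as well (a hedged aside, not something verified in this thread; the service names below are the standard ones shipped with PVE):
    Code:
    # restart the daemons that feed the web UI status (suggestion only, not a confirmed fix)
    systemctl restart pvestatd pvedaemon pveproxy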
     
    FibreFoX likes this.
  4. denos

    denos Member

    Joined:
    Jul 27, 2015
    Messages:
    74
    Likes Received:
    34
    That command doesn't show processes from container 154. It's simply matching anything in the output of ps with the text "154" anywhere on the line.
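    If you want to list only the processes that actually belong to a CT, filtering by cgroup path is more reliable than grepping ps output. A rough sketch, assuming the host places container tasks under a cgroup path containing "/lxc/<CTID>" (the exact layout differs between cgroup versions):
    Code:
    # print PID and command line of every task whose cgroup path mentions CT 154
    for pid in /proc/[0-9]*; do
        if grep -q '/lxc/154' "$pid/cgroup" 2>/dev/null; then
            echo "${pid##*/} $(tr '\0' ' ' < "$pid/cmdline")"
        fi
    done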
     
    #24 denos, Mar 6, 2018
    Last edited: Mar 6, 2018
  5. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    What I meant is that the lxc monitor process is not running, like in the other issue.

    I didn't mention it in detail.

    I assumed we all knew.
     
  6. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,283
    Likes Received:
    506
    could you please test 4.14 (the first release of the 4.14 kernel series) and report whether it works or not? if it does not, this trims down the range of potentially fixing commits quite a lot!
     
  7. denos

    denos Member

    Joined:
    Jul 27, 2015
    Messages:
    74
    Likes Received:
    34
    Will do. If it fails, I'll figure out exactly which version works, but it may take a few days as I can only test in the evenings.
     
    FibreFoX likes this.
  8. FibreFoX

    FibreFoX New Member

    Joined:
    Feb 26, 2018
    Messages:
    10
    Likes Received:
    2
    Thanks a lot, I'm hitting this bug too and was confused, thinking I had done something wrong. Hopefully you can help pinpoint this annoying bug. Thanks a lot! (I just recently started using PROXMOX, so this is no upgrade bug for me, and it started to get frustrating not being able to use LXC containers...)
     
    denos likes this.
  9. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    You are lucky.

    I have 25 live nodes, and for the last week I have been having sleepless nights.

    I lost complete faith in Proxmox.
     
    apusgrz likes this.
  10. denos

    denos Member

    Joined:
    Jul 27, 2015
    Messages:
    74
    Likes Received:
    34
    Good news: I bisected the 4.14 kernel releases and determined that 4.14.4 is the first working kernel version. Looking at the change log, I'd bet that this is the fix we're after:
    Code:
    commit 84779085fa10014b9e8208d7e71b54bced73075c
    Author: Vasily Averin <vvs@virtuozzo.com>
    Date:   Thu Nov 2 13:03:42 2017 +0300
    
        lockd: lost rollback of set_grace_period() in lockd_down_net()
        
        commit 3a2b19d1ee5633f76ae8a88da7bc039a5d1732aa upstream.
        
        Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect,
        it removes lockd_manager and disarm grace_period_end for init_net only.
        
        If nfsd was started from another net namespace lockd_up_net() calls
        set_grace_period() that adds lockd_manager into per-netns list
        and queues grace_period_end delayed work.
        
        These action should be reverted in lockd_down_net().
        Otherwise it can lead to double list_add on after restart nfsd in netns,
        and to use-after-free if non-disarmed delayed work will be executed after netns destroy.
        
        Fixes: efda760fe95e ("lockd: fix lockd shutdown race")
        Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
        Signed-off-by: J. Bruce Fields <bfields@redhat.com>
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Fingers crossed and looking forward to the next pve-kernel to test.
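    For anyone who wants to narrow this down from a release to a single commit, a rough sketch of a bisect over the stable tree could look like this (assumes a linux-stable checkout and a git new enough for custom bisect terms; each step still means building, installing and rebooting into the candidate kernel, then running a container stop/start loop):
    Code:
    # v4.14.3 is the last release where container reboots hang, v4.14.4 the first that works
    git bisect start --term-old=broken --term-new=fixed
    git bisect broken v4.14.3
    git bisect fixed v4.14.4
    # after testing each candidate kernel, mark it:
    #   git bisect broken   (the copy_net_ns hang reappears)
    #   git bisect fixed    (container reboots keep working)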
     
    FibreFoX and afsal like this.
  11. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    Great news.

    I hope we will get the new update for proxmox soon.

    That will end my sleepless nights.
     
    FibreFoX and afsal like this.
  12. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,283
    Likes Received:
    506
    are all of you mounting or exporting NFS shares inside your containers? if so, it would have been a good idea to include this in your reports, as it is a setup that we advise against and do not test at all.

    I can reproduce a hang ONLY when I either mount or export NFS shares within a container (which requires modifying / disabling AppArmor!), and even then rebooting the container in question generates the following kernel BUG trace, which would have immediately pointed to NFS as the culprit:
    Code:
    Mar 07 12:50:14 host kernel: ------------[ cut here ]------------
    Mar 07 12:50:14 host kernel: kernel BUG at fs/nfs_common/grace.c:107!
    Mar 07 12:50:14 host kernel: invalid opcode: 0000 [#1] SMP PTI
    Mar 07 12:50:14 host kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss veth rbd libceph nfsv3 nfs_acl nfs lockd grace fscache ip_set ip6table_filter ip6_tables xfs iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack softdog nfnetlink_log nfnetlink dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper ppdev hid_generic cryptd zfs(PO) zunicode(PO) zavl(PO) icp(PO) snd_pcm snd_timer snd soundcore pcspkr joydev input_leds serio_raw shpchp parport_pc parport qemu_fw_cfg mac_hid usbhid hid zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp
    Mar 07 12:50:14 host kernel:  libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs xor raid6_pq psmouse virtio_net virtio_scsi floppy pata_acpi i2c_piix4
    Mar 07 12:50:14 host kernel: CPU: 1 PID: 90 Comm: kworker/u4:2 Tainted: P           O    4.13.13-6-pve #1
    Mar 07 12:50:14 host kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
    Mar 07 12:50:14 host kernel: Workqueue: netns cleanup_net
    Mar 07 12:50:14 host kernel: task: ffff941fe5475f00 task.stack: ffffb9a181d30000
    Mar 07 12:50:14 host kernel: RIP: 0010:grace_exit_net+0x24/0x30 [grace]
    Mar 07 12:50:14 host kernel: RSP: 0000:ffffb9a181d33dc8 EFLAGS: 00010212
    Mar 07 12:50:14 host kernel: RAX: ffff941fe6f209e0 RBX: ffff941f902aaf80 RCX: 0000000000000000
    Mar 07 12:50:14 host kernel: RDX: ffff941f9010ed38 RSI: ffffffffc0ac1020 RDI: ffff941f902aaf80
    Mar 07 12:50:14 host kernel: RBP: ffffb9a181d33dc8 R08: ffff941f9010e0c0 R09: 000000018015000d
    Mar 07 12:50:14 host kernel: R10: ffffb9a181d33d18 R11: 0000000000000000 R12: ffffb9a181d33e20
    Mar 07 12:50:14 host kernel: R13: ffffffffc0ac1018 R14: ffffffffc0ac1020 R15: 0000000000000000
    Mar 07 12:50:14 host kernel: FS:  0000000000000000(0000) GS:ffff941fffd00000(0000) knlGS:0000000000000000
    Mar 07 12:50:14 host kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 07 12:50:14 host kernel: CR2: 000056234982b078 CR3: 000000029700a002 CR4: 00000000003606e0
    Mar 07 12:50:14 host kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Mar 07 12:50:14 host kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Mar 07 12:50:14 host kernel: Call Trace:
    Mar 07 12:50:14 host kernel:  ops_exit_list.isra.8+0x3b/0x70
    Mar 07 12:50:14 host kernel:  cleanup_net+0x1ca/0x2b0
    Mar 07 12:50:14 host kernel:  process_one_work+0x1ee/0x410
    Mar 07 12:50:14 host kernel:  worker_thread+0x4b/0x420
    Mar 07 12:50:14 host kernel:  kthread+0x10c/0x140
    Mar 07 12:50:14 host kernel:  ? process_one_work+0x410/0x410
    Mar 07 12:50:14 host kernel:  ? kthread_create_on_node+0x70/0x70
    Mar 07 12:50:14 host kernel:  ret_from_fork+0x35/0x40
    Mar 07 12:50:14 host kernel: Code: 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 15 79 22 00 00 48 8b 87 88 12 00 00 55 48 89 e5 48 8b 04 d0 48 8b 10 48 39 d0 75 02 5d c3 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 15 49 22
    Mar 07 12:50:14 host kernel: RIP: grace_exit_net+0x24/0x30 [grace] RSP: ffffb9a181d33dc8
    Mar 07 12:50:14 host kernel: ---[ end trace ce4a24d79fcca3bb ]---
    
    I'll verify that the commit in question fixes the issue (it seems very likely).
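    For reference, mounting or exporting NFS inside a CT means loosening the container's AppArmor confinement, roughly something like the following raw LXC key in /etc/pve/lxc/<CTID>.conf (the key name depends on the LXC version - older releases use lxc.aa_profile - and this is exactly the kind of setup we advise against):
    Code:
    # drop AppArmor confinement so the CT can do NFS mounts (NOT recommended)
    lxc.apparmor.profile: unconfined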
     
  13. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    I don't use NFS.

    I use ZFS.

    Only local drives.

    I have 25 live nodes, all with local ZFS drives, and all 25 nodes are facing issues at least once every 2 days.
     
  14. FibreFoX

    FibreFoX New Member

    Joined:
    Feb 26, 2018
    Messages:
    10
    Likes Received:
    2
    I can second this: I do not use NFS, only ZFS (raid1/mirror mode) with two drives, without any special ZFS configuration (out-of-the-box settings).

    I have several QEMU/KVM VMs running, some of which produce load. Running containers work, but after rebooting/shutting down via the web interface, they shut down but fail to come up again.

    This is not NFS-related AFAIK.
     
  15. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,283
    Likes Received:
    506
    @FibreFoX @Vasu Sreekumar

    then it is likely that your issue is a different one and not the one @denos bisected to - the commit in question is for code that is only used for NFS AFAICT. I tried reproducing with just an NFS mount and/or export on the PVE node itself, but that does not trigger the problem so far.

    @denos: can you confirm whether you are using NFS inside the container or not?
     
  16. FibreFoX

    FibreFoX New Member

    Joined:
    Feb 26, 2018
    Messages:
    10
    Likes Received:
    2
    @fabian
    Thanks for the response, I will try to create something to reproduce it. Let's wait for @denos to give some more input :)
     
  17. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    My issue is exactly what denos is talking about, and it has no NFS involved.
     
  18. Vasu Sreekumar

    Vasu Sreekumar Active Member

    Joined:
    Mar 3, 2018
    Messages:
    123
    Likes Received:
    34
    Reproducing is easy.

    Create 5 LXC CTs and run a cron job to stop and start each CT every minute (see the sketch below).

    Within minutes you will see the issue.
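    A minimal sketch of such a stress loop (the CT IDs 101-105 and the script path are placeholders):
    Code:
    #!/bin/bash
    # /root/ct-restart-test.sh - stop and start a handful of test CTs
    for ct in 101 102 103 104 105; do
        pct stop "$ct"
        pct start "$ct"
    done
    # run it every minute, e.g. via /etc/cron.d/ct-restart-test:
    # * * * * * root /root/ct-restart-test.sh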
     
    FibreFoX likes this.
  19. FibreFoX

    FibreFoX New Member

    Joined:
    Feb 26, 2018
    Messages:
    10
    Likes Received:
    2
    Seems not so easy for the Proxmox team ;) so I'll try to make this reproducible within some VirtualBox setup or something like that.
     
  20. denos

    denos Member

    Joined:
    Jul 27, 2015
    Messages:
    74
    Likes Received:
    34
    Fabian: I do use NFS inside containers on my home server, but not on two of the hypervisors at work that have had a network namespace lockup. It has been very easy to duplicate the issue at home (minutes - likely the NFS patch listed), but much harder on the hypervisors at work (up to 3 days of reboots before it occurs). I was excited to have made some progress, but I agree with your assessment - the patch looks like it's only addressing an NFS namespace issue. I think we're looking at more than one network namespace kernel issue that has been addressed somewhere in 4.14, which is frustrating for everyone.

    I appreciate that this is a very difficult issue to investigate, especially without steps to duplicate it, and I am grateful for everyone's effort trying to pin it down.

    If you have landed on this thread and want to confirm that it's relevant, wait for the issue to occur then run this command:
    Code:
    grep copy_net_ns /proc/*/stack
    If that returns anything, this thread will be relevant. If not, you have a different issue.

    As noted earlier in this thread, Docker users have reported an error with similar symptoms and a similar stack trace (hang on copy_net_ns):
    https://github.com/coreos/bugs/issues/254
    The bottom post in that thread leads on to this thread:
    https://github.com/moby/moby/issues/5618
    where they indicate kernel patches introduced as recently as Feb / 2018 may be relevant.

    To reiterate, any server running a plain 4.14.20 kernel or later has had no further recurrence of this issue.
     
    Vasu Sreekumar, Alwin and FibreFoX like this.