[SOLVED] After 5.4 -> 6.0 upgrade, LXC containers randomly start to a blank flashing cursor

Mitch Maris

Hello all,

I've recently upgraded my Proxmox host from 5.4 to 6.0. It was originally installed from the Proxmox ISO.

The upgrade went fine and passed pve5to6 without warnings.

On first boot, when I started a container, the green |> icon was not showing up. I've got 35 containers or so, and all of them were in the strange state described above: 0 CPU and 0 RAM usage, sitting at a blank flashing cursor. I stopped the containers and started them again; some would boot normally and some would boot into the same state.

I have rebooted several times: no change. 'apt install -f': no change.

This is a Standalone node - no cluster defined.

Code:
CPU(s) 48 x Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (2 Sockets)
Kernel Version Linux 5.0.15-1-pve #1 SMP PVE 5.0.15-1 (Wed, 03 Jul 2019 10:51:57 +0200)
PVE Manager Version pve-manager/6.0-4/2a719255


Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.15-1-pve)
pve-manager: 6.0-4 (running version: 6.0-4/2a719255)
pve-kernel-5.0: 6.0-5
pve-kernel-helper: 6.0-5
pve-kernel-4.15: 5.4-6
pve-kernel-5.0.15-1-pve: 5.0.15-1
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 12.2.11+dfsg1-2.1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.10-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-2
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-5
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-61
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-5
pve-cluster: 6.0-4
pve-container: 3.0-3
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-5
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-2
pve-qemu-kvm: 4.0.0-3
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-5
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.1-pve1
 
hi,

some questions:

* can you enter these containers with `pct enter CTID`?

* what do you see in the debug logs? `lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log`

* what does the config look like? (`pct config CTID`)
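
for example, for a hypothetical container with ID 100 (substitute your own CTID):

Code:
# made-up CTID of 100, purely illustrative; use your real container ID
pct enter 100
lxc-start -n 100 -F -l DEBUG -o /tmp/lxc-100.log
pct config 100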
 

I've currently got all my containers back into running states, but I'll turn them off and back on to replicate this issue. One min please, Oguz :D Thanks!
 
I was able to replicate it.


* can you enter these containers with `pct enter CTID`?
root@pve1:~# pct enter 107
Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 397.

* what do you see in the debug logs? `lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log`
root@pve1:~# lxc-start -n 107 -F -l DEBUG -o /tmp/lxc-107.log
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization lxc.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.2 LTS!

Set hostname to <LMD-LegalRank-Prod>.
Failed to create control group inotify object: Too many open files
Failed to allocate manager object: Too many open files
[!!!!!!] Failed to allocate manager object, freezing.
Freezing execution.

* what does the config look like? (`pct config CTID`)
root@pve1:~# pct config 107
Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 397.
 
It looks to me @fabian like this patch, which intentionally increased the inotify limits, wasn't applied or was lost on my upgrade:

https://git.proxmox.com/?p=pve-container.git;a=commitdiff;h=02209345486fa0ddb3f69b29926dbb9210e15b41


Code:
root@pve1:/etc/sysctl.d# ls -la
total 28
drwxr-xr-x  2 root root  4096 Jul 17 14:56 .
drwxr-xr-x 98 root root 12288 Jul 17 19:55 ..
lrwxrwxrwx  1 root root    14 May 24 15:58 99-sysctl.conf -> ../sysctl.conf
-rw-r--r--  1 root root   324 May 31  2018 protect-links.conf
-rw-r--r--  1 root root   187 Nov 29  2018 pve.conf
-rw-r--r--  1 root root   639 May 17  2018 README.sysctl
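
With that file missing, I'd expect the host to still be running with the kernel defaults. The live values can be checked directly:

Code:
# print the host's current inotify limits
sysctl fs.inotify.max_queued_events fs.inotify.max_user_instances fs.inotify.max_user_watches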
 
like the bug status says - it has been applied, but not yet released as part of an updated package. the next pve-container update will include it.
 
To apply this fix myself before the updated package is released, this file should be added under /etc/sysctl.d/, yes?

10-pve-ct-inotify-limits.conf

with contents

Code:
fs.inotify.max_queued_events = 8388608
fs.inotify.max_user_instances = 65536
fs.inotify.max_user_watches = 4194304
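
and then, presumably, reload the sysctl fragments without a reboot - something like:

Code:
# re-read all sysctl configuration fragments, including the new file
sysctl --system

?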
 
yes, that looks right.