[SOLVED] After 5.4 -> 6 upgrade, LXC containers starting to a blank flashing cursor randomly

Mitch Maris

Hello all,

I've recently upgraded my Proxmox host from 5.4 to 6.0. I installed originally from Proxmox ISO.

Upgrade went fine, passed pve5to6 without warning.

On the first boot, when I started a container, the green |> icon did not show up. I've got 35 containers or so. All of them were in the strange state described above: 0 CPU and 0 RAM usage, sitting at a blank flashing cursor. I stopped each container and started it again. Some of them would boot normally and some would end up in the same state.

I have rebooted several times, no change. 'apt install -f' no change.

This is a Standalone node - no cluster defined.

Code:
CPU(s) 48 x Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (2 Sockets)

Kernel Version Linux 5.0.15-1-pve #1 SMP PVE 5.0.15-1 (Wed, 03 Jul 2019 10:51:57 +0200)

PVE Manager Version pve-manager/6.0-4/2a719255


Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.15-1-pve)
pve-manager: 6.0-4 (running version: 6.0-4/2a719255)
pve-kernel-5.0: 6.0-5
pve-kernel-helper: 6.0-5
pve-kernel-4.15: 5.4-6
pve-kernel-5.0.15-1-pve: 5.0.15-1
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 12.2.11+dfsg1-2.1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.10-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-2
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-5
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-61
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-5
pve-cluster: 6.0-4
pve-container: 3.0-3
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-5
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-2
pve-qemu-kvm: 4.0.0-3
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-5
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.1-pve1
 
hi,

some questions:

* can you enter these containers with `pct enter CTID`?

* what do you see in the debug logs? `lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log`

* what does the config look like? (`pct config CTID`)
 
I have currently got all my containers back into a running state, but I will turn them off and back on to replicate this issue. One min please Oguz :D Thanks!
 
I was able to replicate it.


* can you enter these containers with `pct enter CTID`?
root@pve1:~# pct enter 107
Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 397.

* what do you see in the debug logs? `lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log`
root@pve1:~# lxc-start -n 107 -F -l DEBUG -o /tmp/lxc-107.log
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization lxc.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.2 LTS!

Set hostname to <LMD-LegalRank-Prod>.
Failed to create control group inotify object: Too many open files
Failed to allocate manager object: Too many open files
[!!!!!!] Failed to allocate manager object, freezing.
Freezing execution.

* what does the config look like? (`pct config CTID`)
root@pve1:~# pct config 107
Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 397.
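The "Too many open files" errors above all come from creating inotify objects, which points at the host's inotify limits rather than at ordinary file descriptors. A minimal sketch for checking this on the host follows. The function names are made up for illustration, the count is only a rough estimate, and `-lname` assumes GNU find (as shipped with Debian):

```shell
#!/bin/sh
# Print the current inotify limits. The optional directory argument only
# exists so the function can be pointed at a copy of the files; by default
# it reads the live values from /proc.
show_inotify_limits() {
    dir="${1:-/proc/sys/fs/inotify}"
    for name in max_queued_events max_user_instances max_user_watches; do
        if [ -r "$dir/$name" ]; then
            printf 'fs.inotify.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'fs.inotify.%s = (unreadable)\n' "$name"
        fi
    done
}

# Roughly count the inotify instances currently open on the host by looking
# for file descriptors that point at an inotify anon inode. Run as root to
# see every process; otherwise unreadable entries are silently skipped.
count_inotify_instances() {
    find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l | tr -d ' '
}

show_inotify_limits
printf 'inotify instances in use (approx.): %s\n' "$(count_inotify_instances)"
```

If the instance count is close to `fs.inotify.max_user_instances`, that would be consistent with the failures shown above.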
 
It looks to me, @fabian, like this patch, which intentionally increased the limit, wasn't applied or was lost during my upgrade.

https://git.proxmox.com/?p=pve-container.git;a=commitdiff;h=02209345486fa0ddb3f69b29926dbb9210e15b41


Code:
root@pve1:/etc/sysctl.d# ls -la
total 28
drwxr-xr-x  2 root root  4096 Jul 17 14:56 .
drwxr-xr-x 98 root root 12288 Jul 17 19:55 ..
lrwxrwxrwx  1 root root    14 May 24 15:58 99-sysctl.conf -> ../sysctl.conf
-rw-r--r--  1 root root   324 May 31  2018 protect-links.conf
-rw-r--r--  1 root root   187 Nov 29  2018 pve.conf
-rw-r--r--  1 root root   639 May 17  2018 README.sysctl
 
like the bug status says - it has been applied, but not yet released as part of an updated package. the next pve-container update will include it.
 
to apply this change myself before the updated package is released, this file should be added, yes?

10-pve-ct-inotify-limits.conf

with contents

Code:
fs.inotify.max_queued_events = 8388608
fs.inotify.max_user_instances = 65536
fs.inotify.max_user_watches = 4194304
 
yes, that looks right.
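For anyone applying the workaround by hand, here is a minimal sketch of the steps. It assumes the file goes into /etc/sysctl.d (the directory listed earlier in this thread; the thread itself does not confirm the exact location), and `write_inotify_limits` is just a helper name made up for this example:

```shell
#!/bin/sh
# Write the three inotify limits quoted in this thread to the given file.
write_inotify_limits() {
    cat > "$1" <<'EOF'
fs.inotify.max_queued_events = 8388608
fs.inotify.max_user_instances = 65536
fs.inotify.max_user_watches = 4194304
EOF
}

# Run as root on the host. The /etc/sysctl.d location is an assumption:
#   write_inotify_limits /etc/sysctl.d/10-pve-ct-inotify-limits.conf
#   sysctl --system                        # reload all sysctl configuration files
#   sysctl fs.inotify.max_user_instances   # confirm the new value is active
```

Containers that already froze at the blank cursor will most likely need to be stopped and started again after the new limits are loaded.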
 
