HA cluster problem with more than 100 CTs per node

AhmedF

Renowned Member
Dec 26, 2012
26
1
68
Hi,

I'm running an HA cluster with 8 nodes and a shared NAS device, totaling about 1000 CTs. Everything runs very smoothly, but when I need to reboot one of the nodes, I first stop rgmanager to relocate the HA CTs to the other nodes, then reboot. Once the node is back up, I see these errors in the syslog, and the CTs are not coming back to this node as configured in cluster.conf with nofailback=0 (that used to work fine before):

Code:
Dec  7 12:21:16 clusterxxx rgmanager[41730]: [pvevm] got empty cluster VM list
Dec  7 12:21:16 cluster3b1 rgmanager[41732]: [pvevm] Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 388.
Dec  7 12:21:16 cluster3b1 rgmanager[41733]: [pvevm] CT xxxxx is already stopped
Dec  7 12:21:16 cluster3b1 rgmanager[41728]: [pvevm] Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 388.
Dec  7 12:21:16 cluster3b1 rgmanager[41750]: [pvevm] Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 388.
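For reference, the drain procedure I use before rebooting is roughly this (assuming the stock init scripts on Proxmox VE 3.x / redhat-cluster):

```shell
# Stopping rgmanager relocates the HA-managed CTs to the other cluster nodes
service rgmanager stop

# Reboot once the services have been migrated away
reboot
```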

I am running:

Code:
proxmox-ve-2.6.32: 3.3-139 (running kernel: 2.6.32-34-pve)
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03)
pve-kernel-2.6.32-32-pve: 2.6.32-136
pve-kernel-2.6.32-34-pve: 2.6.32-139
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.3-3
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-25
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Can you please advise?
 
Increase the fs.inotify.max_user_instances sysctl value (somewhere around twice the number of containers you want to run should do), and add it to /etc/sysctl.conf to make the change permanent.
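Concretely, something like the following; the 2048 value is just an assumption here, scale it to roughly twice the number of CTs per node (note these commands require root):

```shell
# Check the current limit (the kernel default is often 128)
cat /proc/sys/fs/inotify/max_user_instances

# Raise the limit at runtime, effective immediately
sysctl -w fs.inotify.max_user_instances=2048

# Persist the setting across reboots
echo "fs.inotify.max_user_instances = 2048" >> /etc/sysctl.conf
```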
 

Thanks for your reply, I will give this a try.
 
That helped and fixed this error:
Code:
Dec  7 12:21:16 cluster3b1 rgmanager[41732]: [pvevm] Unable to create new inotify object: Too many open files at /usr/share/perl5/PVE/INotify.pm line 388.

but after rebooting the same node, I am still getting these errors:

Code:
Dec  7 16:22:59 clusterxxx rgmanager[3273]: stop on pvevm "xxxxx" returned 2 (invalid argument(s))
Dec  7 16:22:59 clusterxxx rgmanager[40489]: [pvevm] got empty cluster VM list

Thanks in advance