WARNING: Crashed HA cluster with Ceph due to

Deleted member 33567

Guest
Code:
root@n01-sxb-pve01:~# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256052
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 256052
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

It seems that pveadmin crashed because it opened too many connections to the Graphite server.

We had a lot of:

Code:
ipcc_send_rec failed: Connection refused

There were plenty of these errors, and netstat showed a very large number of connections to my Graphite server. How can this be controlled?
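For anyone who wants to check for the same symptom, here is a rough sketch of how the connections could be counted. It assumes Graphite's carbon receiver listens on its default plaintext port 2003 (adjust `GRAPHITE_PORT` if your setup differs) and that `ss` from iproute2 is installed. The `count_conns` helper is just a name made up for this example.

```shell
#!/bin/sh
# Assumption: Graphite's carbon receiver listens on its default plaintext port 2003.
GRAPHITE_PORT="${GRAPHITE_PORT:-2003}"

# Count lines from `ss -tn` (or `netstat -tn`) on stdin whose peer
# address (5th column in both tools) ends in ":<port>".
count_conns() {
    awk -v port="$1" '$5 ~ (":" port "$") { n++ } END { print n + 0 }'
}

# Only run the live check when ss is installed.
if command -v ss >/dev/null 2>&1; then
    echo "connections to port $GRAPHITE_PORT: $(ss -tn | count_conns "$GRAPHITE_PORT")"
    # This is the limit of the current shell; a daemon's own limit is
    # listed in /proc/<pid>/limits.
    echo "open files soft limit of this shell: $(ulimit -n)"
fi
```

If the count gets anywhere near the open files limit (1024 in the ulimit output above), that would at least be consistent with a process running out of file descriptors.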

Restarting services did not help; I had to reboot each server one by one.

For now I have completely disabled Graphite and InfluxDB.

Please investigate this, because it could cause problems for many users.