Ceph Alerts Module SMTP-Error

Kadrim

Hi there,

I am running Proxmox 6.2-10 with Ceph 14.2.9.

I also installed the Ceph manager dashboard package (to get the alerts module) via

apt install ceph-mgr-dashboard

and then configured the alerts module like this:

Bash:
ceph mgr module enable alerts
ceph config set mgr mgr/alerts/smtp_host '172.18.0.112'
ceph config set mgr mgr/alerts/smtp_ssl false
ceph config set mgr mgr/alerts/smtp_port 25
ceph config set mgr mgr/alerts/smtp_destination 'user@example.com'
ceph config set mgr mgr/alerts/smtp_sender 'pve@example.com'
ceph config set mgr mgr/alerts/smtp_user 'pve'
ceph config set mgr mgr/alerts/smtp_password 'my-scrambled-pass'
ceph config set mgr mgr/alerts/smtp_from_name 'Ceph Cluster Foo'

If I then run the test command, or deliberately do something that changes the health of the Ceph cluster, no email is sent.

Instead, I find the following in my logs:

Code:
# ceph-mgr.pve.log

2020-07-29 11:30:07.220 7f6b7d943700 -1 ceph_set_health_checks check ALERTS_SMTP_ERROR unexpected key count

and the output of ceph health detail:

Bash:
HEALTH_WARN unable to send alert email
ALERTS_SMTP_ERROR unable to send alert email
    [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:727)

On the mail server (a mail relay) I can see that a connection is made, but it is dropped immediately:

Code:
Jul 29 11:30:07 Mail postfix/smtpd[352]: connect from unknown[172.18.0.2]
Jul 29 11:30:07 Mail postfix/smtpd[352]: lost connection after UNKNOWN from unknown[172.18.0.2]
Jul 29 11:30:07 Mail postfix/smtpd[352]: disconnect from unknown[172.18.0.2] unknown=0/3 commands=0/3
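
An `[SSL: WRONG_VERSION_NUMBER]` error on the client, combined with "lost connection after UNKNOWN" on the Postfix side, is what one would typically expect when a client starts a TLS handshake on a port that speaks plaintext SMTP. A minimal sketch for checking what a port actually speaks, assuming bash with `/dev/tcp` support; `classify_banner` and `probe_port` are hypothetical helper names and not part of Ceph or Proxmox:

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helpers (not part of Ceph, Proxmox or Postfix).

# classify_banner LINE
# A plaintext SMTP server greets first with a line starting with "220".
# An implicit-TLS port stays silent until the client sends a handshake.
classify_banner() {
    case "$1" in
        220*) echo "plaintext-smtp" ;;
        "")   echo "no-greeting (possibly implicit TLS)" ;;
        *)    echo "unknown" ;;
    esac
}

# probe_port HOST PORT
# Opens a TCP connection through bash's /dev/tcp and waits up to
# 5 seconds for the first line the server sends on its own.
probe_port() {
    local host="$1" port="$2" line=""
    if ! { read -r -t 5 line <&3 || true; } 2>/dev/null 3<>"/dev/tcp/${host}/${port}"; then
        echo "connect-failed"
        return 1
    fi
    classify_banner "$line"
}

# Example, run from the Ceph node (addresses taken from the post above):
#   probe_port 172.18.0.112 25
#   probe_port 172.18.0.112 465
```

If port 25 reports `plaintext-smtp` while the module still fails with the SSL error, that would suggest the module is attempting TLS despite `smtp_ssl false`.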

The mail server works and can send mail. I even put together a simple test program in PHP to check whether SMTP auth works, which it does.

Does anyone have a hint for me?

Thanks in advance!
 
So maybe you meant this tracker?
Hm... no I didn't. :D I just thought about the email alert being more completely broken. :rolleyes:

But if I read that correctly, this should already have been fixed in 14.2.9? Am I wrong?
Some of the things should have landed in 14.2.9, yes. But I suppose there might still be some bugs floating around.
 
OK, as a workaround I activated implicit SSL on port 465 on my mail server with a self-signed (aka snakeoil) cert, because the machine is only a relay and not reachable from the public.

To do that, one has to remove the smtp_port and smtp_ssl settings, so that the module falls back to its defaults (SSL on port 465, per the Ceph docs), via

Bash:
ceph config rm mgr mgr/alerts/smtp_ssl
ceph config rm mgr mgr/alerts/smtp_port
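
To confirm which values the module ends up with after removing the overrides, the settings can be read back with `ceph config get`. A small sketch under the assumption that the `ceph` CLI is available on the node; `show_alert_smtp_settings` and the `CEPH_BIN` override are hypothetical and only there so the function can be dry-run:

```shell
# Hypothetical helper; CEPH_BIN is an illustrative override so the
# function can be dry-run without a cluster (e.g. CEPH_BIN=echo).
show_alert_smtp_settings() {
    local ceph="${CEPH_BIN:-ceph}" key
    for key in smtp_host smtp_port smtp_ssl; do
        printf 'mgr/alerts/%s = ' "$key"
        # Prints the effective value for the mgr daemons.
        "$ceph" config get mgr "mgr/alerts/${key}"
    done
}

# On a cluster node:
#   show_alert_smtp_settings
#   ceph alerts send    # ask the alerts module to send a mail right away
```

After that, `ceph health detail` should no longer list ALERTS_SMTP_ERROR if the mail went through.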

That works for now, but I too think that the alerts module needs some fixing from the Ceph team ;-)
 