bayes=undefined autolearn=no autolearn_force=no

poetry

Active Member
May 28, 2020
Can someone explain the following log entry, bayes=undefined autolearn=no autolearn_force=no, and what needs to be done to enable these settings? How long after enabling them does it take for them to start working?

I had Bayes and autolearn turned off for a while, and I am now at the point where I would like to try these settings again.
I have tried enabling these settings:
Configuration - Spam Detector - Options - Use auto-whitelists - Yes
Configuration - Spam Detector - Options - Use bayesian filter - Yes

After changing this, I also ran pmgconfig sync --restart 1 on the command line and rebooted the server.
I still see the same thing in the logs as before: bayes=undefined autolearn=no autolearn_force=no
On some messages I see scores such as AWL(5.500), AWL(0.375), AWL(2.500).
I also see autolearn=ham on some messages. Is there an explanation of how this works and why it is added?
The same goes for autolearn=spam, which I also see; the manual does not explain it.

Why is this score not added to all messages?
Where do we see the "learning" status or any more details about this system?

Do User Whitelist and User Blacklist entries also influence these AWL scores?

I would like to know in detail how each setting (bayes, autolearn, autolearn_force) works and what we can do to make them work well.
The manual is not much help in this regard: https://pmg.proxmox.com/pmg-docs/pmg-admin-guide.html. It offers a lot of generic, dumbed-down information and gives no specifics about how it really works. Which auto-learning algorithm? Which "certain words"? How is it trained, etc.? This is all it says:

Auto-learning algorithm
Proxmox Mail Gateway gathers statistical information about spam emails. This information is used by an auto-learning algorithm, meaning the system becomes smarter over time.

use_awl: <boolean> (default = 1)
Use the Auto-Whitelist plugin.

Bayesian Filter - Automatically trained statistical filters
Certain words have a higher probability of occurring in spam emails than in legitimate emails. By being trained to recognize those words, the Bayesian filter checks every email and adjusts the probabilities of it being a spam word or not in its database. This is done automatically.
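
To make the "certain words" idea concrete, here is a toy sketch of a naive-Bayes-style word classifier. It is only an illustration of the general technique under my own assumptions (whitespace tokenization, add-one smoothing, the function names `train` and `spam_probability`); it is not SpamAssassin's actual implementation, which tokenizes and combines probabilities differently.

```python
import math
from collections import Counter


def train(messages):
    """Count, per class, how many messages contain each token.

    `messages` is an iterable of (label, text) pairs, label being
    "spam" or "ham". Hypothetical helper, for illustration only.
    """
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for label, text in messages:
        totals[label] += 1
        counts[label].update(set(text.lower().split()))
    return counts, totals


def spam_probability(text, counts, totals):
    """Combine per-token evidence in log space, with add-one smoothing."""
    log_odds = math.log(totals["spam"] / totals["ham"])
    for token in set(text.lower().split()):
        p_spam = (counts["spam"][token] + 1) / (totals["spam"] + 2)
        p_ham = (counts["ham"][token] + 1) / (totals["ham"] + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))


corpus = [
    ("spam", "cheap pills now"),
    ("spam", "cheap watches"),
    ("ham", "meeting agenda attached"),
    ("ham", "lunch meeting tomorrow"),
]
counts, totals = train(corpus)
```

With this tiny corpus, `spam_probability("cheap pills", counts, totals)` comes out above 0.5 and `spam_probability("meeting agenda", counts, totals)` below it. Every newly learned message shifts the per-token counts, which is the "adjusts the probabilities" part of the manual text.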

use_bayes: <boolean> (default = 1)
Whether to use the naive-Bayesian-style classifier.

Is there anything more detailed, from your experience, that helped you understand these settings better?
 
The Bayes filter simply uses SpamAssassin's Bayes implementation. See the SpamAssassin docs for more details,
e.g. https://cwiki.apache.org/confluence/display/SPAMASSASSIN/AutolearningNotWorking
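
As a rough sketch of what the autolearn=ham / autolearn=spam / autolearn=no values mean: as far as I know, SpamAssassin's stock defaults are bayes_auto_learn_threshold_nonspam 0.1 and bayes_auto_learn_threshold_spam 12.0, and learning as spam additionally needs at least 3 points each from header and body rules unless a rule flagged autolearn_force fired. The function below is a simplified illustration under those assumptions (the name `autolearn_decision` and its parameters are made up); the real plugin computes the score from a restricted rule set and has more conditions.

```python
def autolearn_decision(score, header_points, body_points, force=False,
                       ham_threshold=0.1, spam_threshold=12.0):
    """Return "ham", "spam" or "no" for a scanned message.

    Simplified model: thresholds are assumed SpamAssassin defaults,
    and `force` stands in for a hit on an autolearn_force rule.
    """
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        # Without force, the points must be spread over header and body.
        if force or (header_points >= 3 and body_points >= 3):
            return "spam"
    return "no"


print(autolearn_decision(-1.0, 0.0, 0.0))              # very low score
print(autolearn_decision(15.0, 10.0, 1.0))             # high, but body too weak
print(autolearn_decision(15.0, 10.0, 1.0, force=True)) # same, with force
```

Under this model most mail lands between the two thresholds, which would explain why autolearn=no is what you see on the majority of messages.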
"Do User Whitelist and User Blacklists also influence this AWL scores?"
No, as this is simply SpamAssassin's auto-whitelist feature. See:
https://cwiki.apache.org/confluence/display/SPAMASSASSIN/autowhitelist
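
To illustrate where values like AWL(5.500) come from: the auto-whitelist keeps a running average score per sender and pushes the current message's score part of the way towards that average. The sketch below assumes the documented default auto_whitelist_factor of 0.5; the function name and the way history is passed in are my own simplification, not SpamAssassin code.

```python
def awl_adjustment(current_score, history_total, history_count, factor=0.5):
    """Score delta that moves the current score towards the sender's mean.

    history_total / history_count is the sender's average score so far.
    A sender without history gets no adjustment in this sketch.
    """
    if history_count == 0:
        return 0.0
    mean = history_total / history_count
    return (mean - current_score) * factor


# Sender averaged 12.0 over 3 earlier mails, the current mail scores 1.0.
print(awl_adjustment(1.0, 36.0, 3))   # positive: pulled up towards the mean
# Sender averaged 2.0 over 2 earlier mails, the current mail scores 13.0.
print(awl_adjustment(13.0, 4.0, 2))   # negative: pulled down towards the mean
```

So despite the name, AWL can add points as well as subtract them, depending on the sender's history.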

I hope this helps!

We will disable Bayes by default in the next major release, as it usually causes mail to be misclassified more often than it helps.
 
