bayes=undefined autolearn=no autolearn_force=no

poetry · Dec 24, 2022

Can someone explain the following in the logs: bayes=undefined autolearn=no autolearn_force=no, and what to do to enable these settings? How long does it take for them to start working after you enable them?

I have had bayes and autolearn off for a while, and I am now at the point where I would like to try these settings again.
I have tried to enable these settings:
Configuration - Spam Detector - Options - Use auto-whitelists - Yes
Configuration - Spam Detector - Options - Use bayesian filter - Yes

After changing this, I also ran pmgconfig sync --restart 1 on the command line and rebooted the server.
I am still seeing the same in the logs as before: bayes=undefined autolearn=no autolearn_force=no
On some messages I am seeing scores such as AWL(5.500), AWL(0.375), AWL(2.500).
I am also seeing autolearn=ham on some messages. Is there any explanation of how this works and why it's added?
The same goes for autolearn=spam, which I also see, without any explanation in the manual...
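
For comparing these values across many messages, a small parser can pull the fields out of a log line. This is only a sketch: the SAMPLE line below is made up for illustration (only the bayes=/autolearn=/autolearn_force= fields and the AWL(...) notation come from the log excerpts above), and the function name parse_sa_fields is hypothetical.

```python
import re

# Hypothetical sample line; the real log format may contain more fields.
SAMPLE = ("SA score=3/5 bayes=undefined autolearn=no autolearn_force=no "
          "hits=AWL(5.500),HTML_MESSAGE(0.001)")

# key=value fields mentioned in the thread
FIELD_RE = re.compile(r"\b(bayes|autolearn|autolearn_force)=(\S+)")
# rule hits written as NAME(score), e.g. AWL(5.500)
HIT_RE = re.compile(r"([A-Z][A-Z0-9_]*)\((-?\d+(?:\.\d+)?)\)")


def parse_sa_fields(line):
    """Return (fields, hits) extracted from one log line."""
    fields = dict(FIELD_RE.findall(line))
    hits = {name: float(score) for name, score in HIT_RE.findall(line)}
    return fields, hits


if __name__ == "__main__":
    fields, hits = parse_sa_fields(SAMPLE)
    print(fields)
    print(hits)
```

Feeding it the relevant log lines makes it easier to count how often each autolearn value and AWL score actually occurs.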

Why is this score not added on all messages?
Where do we see the "learning" status or any more details about this system?

Do User Whitelists and User Blacklists also influence these AWL scores?

I would like to know in detail how each setting (bayes, autolearn, autolearn_force) works and what we can do to make them work well.
The manual is not that helpful in this regard: https://pmg.proxmox.com/pmg-docs/pmg-admin-guide.html contains a lot of generic and dumbed-down information. It does not tell us any specifics about how it really works. What auto-learning algorithm? Which "certain words"? How is it trained, etc.?

Auto-learning algorithm
Proxmox Mail Gateway gathers statistical information about spam emails. This information is used by an auto-learning algorithm, meaning the system becomes smarter over time.

use_awl: <boolean> (default = 1)
Use the Auto-Whitelist plugin.

Bayesian Filter - Automatically trained statistical filters
Certain words have a higher probability of occurring in spam emails than in legitimate emails. By being trained to recognize those words, the Bayesian filter checks every email and adjusts the probabilities of it being a spam word or not in its database. This is done automatically.

use_bayes: <boolean> (default = 1)
Whether to use the naive-Bayesian-style classifier.
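
To make the manual's "certain words" description more concrete, here is a toy naive Bayes token classifier. It is not SpamAssassin's implementation (which has its own tokenizer, database and probability combining); the class name ToyBayes, the whitespace tokenizer and the add-one smoothing are all assumptions chosen for brevity. It returns None while it has no training data for one of the two classes, since it cannot give a meaningful answer yet.

```python
import math
from collections import Counter


class ToyBayes:
    """Toy per-token naive Bayes; illustrative only, not SpamAssassin's code."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Count each distinct token once per message.
        self.messages[label] += 1
        self.tokens[label].update(set(text.lower().split()))

    def spam_probability(self, text):
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        if n_spam == 0 or n_ham == 0:
            return None  # not enough training data to classify
        # Start from the prior odds, then add each token's evidence.
        log_odds = math.log(n_spam / n_ham)
        for tok in set(text.lower().split()):
            p_spam = (self.tokens["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


if __name__ == "__main__":
    nb = ToyBayes()
    for _ in range(2):
        nb.learn("cheap pills online", "spam")
        nb.learn("meeting agenda tomorrow", "ham")
    print(nb.spam_probability("cheap pills"))
    print(nb.spam_probability("meeting tomorrow"))
```

With this tiny training set, "cheap pills" comes out at about 0.9 and "meeting tomorrow" at about 0.1. The quality of such a filter depends entirely on what it was trained on, so wrongly learned messages skew every later result.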

Is there anything more detailed from your experience that helped you understand these settings better?

Stoiko Ivanov · Dec 27, 2022

The Bayes filter simply uses SpamAssassin's Bayes implementation - see the SpamAssassin docs for more details:
e.g. https://cwiki.apache.org/confluence/display/SPAMASSASSIN/AutolearningNotWorking
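
As a rough illustration of what autolearn=ham, autolearn=spam and autolearn=no mean, the following sketch mimics the decision as the SpamAssassin documentation describes it. It is a simplification under stated assumptions: the thresholds 0.1 and 12.0 are SpamAssassin's documented defaults (bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam), and PMG may configure different values; the score used for autolearning is computed separately from the final score (Bayes and AWL hits are left out); and the function name and parameters here are hypothetical.

```python
def autolearn_decision(head_points, body_points, other_points=0.0,
                       force=False, ham_threshold=0.1, spam_threshold=12.0):
    """Return "ham", "spam" or "no" for one scanned message (simplified)."""
    score = head_points + body_points + other_points
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        # Without the force flag, header and body rules must each
        # contribute at least 3 points before a mail is learned as spam.
        if force or (head_points >= 3.0 and body_points >= 3.0):
            return "spam"
    return "no"


if __name__ == "__main__":
    print(autolearn_decision(-0.5, 0.0))              # very low score
    print(autolearn_decision(2.0, 1.5))               # between thresholds
    print(autolearn_decision(7.0, 6.0))               # high, header and body
    print(autolearn_decision(12.5, 0.5))              # high, header only
    print(autolearn_decision(12.5, 0.5, force=True))  # forced
```

Most mail scores somewhere between the two thresholds, which would explain why autolearn=no is the common case. In SpamAssassin, the force flag corresponds to rules carrying the autolearn_force tflag. Separately, SpamAssassin by default only uses Bayes results once at least 200 spam and 200 ham messages have been learned (bayes_min_spam_num, bayes_min_ham_num).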

poetry said:
Do User Whitelists and User Blacklists also influence these AWL scores?

No, as this is simply SpamAssassin's auto-whitelist feature - see:
https://cwiki.apache.org/confluence/display/SPAMASSASSIN/autowhitelist
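
The idea behind the AWL score can be sketched as follows: despite its name, the auto-whitelist keeps a running average score per sender and pushes each new message part of the way towards that average. This is a simplified model, not SpamAssassin's code. It assumes the default auto_whitelist_factor of 0.5 and keys the history by sender address only (SpamAssassin also takes the sending IP range into account); the class name ToyAWL is hypothetical.

```python
class ToyAWL:
    """Toy model of score averaging per sender; illustrative only."""

    def __init__(self, factor=0.5):
        self.factor = factor
        self.history = {}  # sender -> (message count, total score)

    def adjust(self, sender, score):
        """Return the AWL adjustment for this message, then record it."""
        count, total = self.history.get(sender, (0, 0.0))
        delta = 0.0
        if count > 0:
            mean = total / count
            # Move the score part of the way towards the sender's mean.
            delta = (mean - score) * self.factor
        self.history[sender] = (count + 1, total + score)
        return delta


if __name__ == "__main__":
    awl = ToyAWL()
    print(awl.adjust("sender@example.com", 12.0))  # first contact
    print(awl.adjust("sender@example.com", 1.0))   # sender has history now
```

In this model the first message from a sender gets no adjustment because there is no history yet, and the second call returns 5.5: a sender with a bad track record has a low-scoring mail pushed upwards. A positive value like AWL(5.500) would be consistent with this, and the lack of history for new senders is one plausible reason why the score does not appear on every message.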

I hope this helps!

We will disable Bayes by default in the next major release, as it usually causes mails to get misclassified more often than it helps.

poetry

Active Member

Stoiko Ivanov

Proxmox Staff Member
