[SOLVED] Bayes still not auto learning?

killmasta93

Renowned Member
Aug 13, 2017
Hi,
I was wondering how long it takes Bayes to learn spam and ham. From what I've read, it mostly says 250 emails, but I just wanted to be sure. This is one of the logs:

Code:
0FDE5BAC3ECA4A402: SA score=1/5 time=1.525 bayes=undefined autolearn=no autolearn_force=no hits=DKIM_SIGNED,FORGED_HOTMAIL_RCVD2,FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS,T_DKIM_INVALID,T_FREEMAIL_DOC_PD


Thank you
 
Thanks for the reply. So far I've got around 600 emails, but I'm not sure what you mean by level > 12?

Thank you
 
when an email gets analysed for SPAM, it gets a Spam-Score assigned.
this Spam-Score is used for various filtering (e.g. whether to put the email into quarantine; or whether to block it completely; or...)
this Spam-Score is also used to automatically train the bayes filter.
since spamassassin tries hard to not train on false positives, the threshold is rather high: any email that got more than 12 points will be used for auto-training (this is quite high, given that emails with a score of 10 are considered spam-enough to be discarded).

if you manually collect spam-mails to train the bayes db, the Spam-Score doesn't matter (and will mostly be *much* lower, as in <<3), because you - as a human - will hopefully only present real spam for training.
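
a rough sketch of that auto-learn decision, in case it helps. the function name is made up for illustration, and the two thresholds are assumptions taken from what i believe are SpamAssassin's defaults (`bayes_auto_learn_threshold_spam` 12.0 and `bayes_auto_learn_threshold_nonspam` 0.1); check your own config. the real logic has more conditions than this, so treat it as a simplification:

```python
# Simplified, hypothetical sketch of the auto-learn decision.
# Threshold values are ASSUMPTIONS (believed SpamAssassin defaults);
# the real implementation applies additional conditions.
SPAM_AUTOLEARN_THRESHOLD = 12.0  # assumed bayes_auto_learn_threshold_spam
HAM_AUTOLEARN_THRESHOLD = 0.1    # assumed bayes_auto_learn_threshold_nonspam


def autolearn_decision(score):
    """Return 'spam', 'ham' or 'no' for a given Spam-Score."""
    if score > SPAM_AUTOLEARN_THRESHOLD:
        return "spam"  # confident enough to train as spam
    if score < HAM_AUTOLEARN_THRESHOLD:
        return "ham"   # confident enough to train as ham
    return "no"        # in between: not used for auto-training
```

with these assumed values, the mail from the log above (score=1) lands between the two thresholds, which would match its `autolearn=no`.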
 
Thank you for the reply, that really helped. So in theory, if I don't get spam scoring more than 12 points, Bayes won't be trained. Is your recommendation to train the Bayes DB manually?
 
the bayes won't get trained unless you have a spam-score that is greater than 12. ("10-12 points" qualifies as less than 12, so those mails don't automatically train the database).

so yes: you should train the bayes-db manually.
 
Thanks for the reply. So the only way to train the Bayes DB would be an IMAP account, moving bad email to a bad folder and good email to a good folder, right?
 
no.

you somehow have to call `sa-learn` on good and bad emails.
the problem is mainly how to get the spam/ham mails from the users to the pmg.
one option is to use imap, another one is to use rsync, a third one is to use a USB stick,....
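
once the mails are on the pmg, the training step itself could look roughly like this. the helper names and the directory paths are hypothetical; only `sa-learn` with its `--spam` / `--ham` options is the real tool. it assumes the collected mails sit as individual files in one directory per class:

```python
import subprocess


def build_sa_learn_cmd(kind, mail_dir):
    """Build the sa-learn command line for a directory of collected mails.

    kind is 'spam' or 'ham'; mail_dir is a hypothetical path to the
    directory holding the messages.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, mail_dir]


def train(kind, mail_dir):
    """Run sa-learn on the directory; raises if sa-learn exits non-zero."""
    subprocess.run(build_sa_learn_cmd(kind, mail_dir), check=True)
```

e.g. `train("spam", "/srv/training/spam")` followed by `train("ham", "/srv/training/ham")` (paths are placeholders). training on both classes matters, since bayes needs examples of good mail as well as bad.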
 
The only issue there is that the users have to learn to sort the good and bad emails and then get them over via USB stick or rsync. I guess I'm going to have to stick with auto-learning.
 
