[SOLVED] Bayes still not auto learning?

killmasta93

Renowned Member
Aug 13, 2017
Hi,
I was wondering how long it takes Bayes to learn spam and ham. From what I've read, it mostly says 250 emails, but I just wanted to be sure. This is one of the logs:

Code:
0FDE5BAC3ECA4A402: SA score=1/5 time=1.525 bayes=undefined autolearn=no autolearn_force=no hits=DKIM_SIGNED,FORGED_HOTMAIL_RCVD2,FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS,T_DKIM_INVALID,T_FREEMAIL_DOC_PD


Thank you
 
Thanks for the reply. So far I've got around 600ish emails, but I'm not sure what you mean by level > 12?

Thank you
 
When an email gets analysed for spam, it is assigned a spam score.
This spam score is used for various filtering decisions (e.g. whether to put the email into quarantine, or whether to block it completely, ...).
The spam score is also used to automatically train the Bayes filter.
Since SpamAssassin tries hard not to train on false positives, the threshold is rather high: only an email that scores more than 12 points will be used for auto-training as spam (this is quite high, given that emails with a score of 10 are considered spam-enough to be discarded).

If you manually collect spam mails to train the Bayes db, the spam score doesn't matter (and will mostly be *much* lower, as in <<3), because you - as a human - will hopefully only present real spam for training.
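
To make the threshold logic concrete, here is a minimal sketch in Python. It is not SpamAssassin's actual code. The spam threshold of 12 is the value mentioned above; the ham threshold of 0.1 is an assumption based on the stock SpamAssassin default (`bayes_auto_learn_threshold_nonspam`) and may be configured differently on your system. The function name `autolearn_decision` is made up for illustration, and the real auto-learner applies further conditions that are not modelled here.

```python
# Simplified sketch of the auto-learn decision, NOT SpamAssassin's real code.
# SPAM_THRESHOLD is the value discussed in this thread; HAM_THRESHOLD is an
# assumption taken from the stock SpamAssassin default.
SPAM_THRESHOLD = 12.0
HAM_THRESHOLD = 0.1


def autolearn_decision(score):
    """Return 'spam', 'ham' or 'no' for a given spam score (hypothetical helper)."""
    if score > SPAM_THRESHOLD:
        return "spam"  # confident enough to train as spam
    if score < HAM_THRESHOLD:
        return "ham"   # confident enough to train as ham
    return "no"        # everything in between is not auto-learned


if __name__ == "__main__":
    for score in (0.0, 1.0, 11.5, 12.0, 14.2):
        print(score, "->", autolearn_decision(score))
```

Under these assumptions, a mail scoring 1 (like the one in the log above) falls in the middle band, which would be consistent with `autolearn=no`.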
 
Thank you for the reply, that really helped. So in theory, if I only get spam scoring 10-12 points, Bayes won't be trained. Is your recommendation to train the Bayes db manually?
 
Bayes won't get auto-trained on spam unless a mail has a spam score greater than 12. ("10-12 points" is not more than 12, so those mails don't automatically train the database.)

So yes: you should train the Bayes db manually.
 
Thanks for the reply. So the only way to train the Bayes db would be an IMAP account, moving bad email to the bad folder and good email to the good folder, right?
 
no.

You somehow have to call `sa-learn` on good and bad emails.
The problem is mainly how to get the spam/ham mails from the users to the PMG.
One option is to use IMAP, another is to use rsync, a third is to use a USB stick, ...
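
As a rough sketch of the `sa-learn` step, assuming the collected mails have already been copied to the PMG: the directory paths below are hypothetical placeholders and the helper names are made up for illustration. `--spam` and `--ham` are the standard `sa-learn` switches. By default the script only prints the commands, so nothing is trained until you set `dry_run=False`.

```python
import subprocess


def sa_learn_cmd(kind, path):
    """Build the sa-learn command line for a folder of 'spam' or 'ham' mails."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, path]


def train(kind, path, dry_run=True):
    """Print the command (dry run) or actually run sa-learn on the folder."""
    cmd = sa_learn_cmd(kind, path)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)  # raises if sa-learn exits non-zero
    return cmd


if __name__ == "__main__":
    # Hypothetical directories; replace with wherever the mails end up.
    train("spam", "/srv/training/spam")
    train("ham", "/srv/training/ham")
```

Afterwards, `sa-learn --dump magic` should show how many spam and ham messages the database has learned.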
 
The only issue there is that the users have to learn to sort the good and bad emails and then get them to the PMG via USB stick or rsync. I guess I'm going to have to stick with auto-learning.