Bayes and Spam Marking - Question

thebiggeek

So I have spam email coming in from someone. It is evidently spam; while most of my rules blocked the first few emails, the following emails from the same sender are getting through. The sending server is the same, and the subject and body are all the same. Here are two SA scores:

SA score=2/5 time=2.729 bayes=0.36 autolearn=no autolearn_force=no hits=BAYES_40(-0.001),DCC_CHECK(1.1),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),KAM_LAZY_DOMAIN_SECURITY(1),RCVD_IN_DNSWL_NONE(-0.0001),RCVD_IN_SORBS_PROBLEMS(0.5),SPF_HELO_NONE(0.001),SPF_NONE(0.001)

SA score=10/5 time=2.767 bayes=0.00 autolearn=no autolearn_force=no hits=BAYES_00(-1.9),DCC_REPUT_70_89(0.1),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),KAM_LAZY_DOMAIN_SECURITY(1),KAM_LIST3(11),SPF_HELO_NONE(0.001),SPF_NONE(0.001)

Does anyone have input on how to fix something like this?
 
Somewhere the body must differ between the two mails. Look at the hits: the one rule that changes the result quite harshly is
KAM_LIST3 (which adds 11 points for the second mail, but not for the first mail).
From the description:
Code:
Mailing List Purveyor Spam
Judging from the rule itself, it matches the body and headers of an e-mail (the subject needs to contain some mention of contacts, qualified leads and the like), and the body also needs to indicate these things.
(You can check for yourself by searching for KAM_LIST3 in /usr/share/spamassassin-extra/KAM.cf.)

I hope this explains it
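
To see which rules differ between the two messages without eyeballing the headers, the hits lists can be compared programmatically. This is only a minimal Python sketch, assuming the header format shown in the first post; `parse_hits` is a hypothetical helper name and not part of SpamAssassin or PMG.

```python
import re

# Header lines copied from the first post.
FIRST = (
    "SA score=2/5 time=2.729 bayes=0.36 autolearn=no autolearn_force=no "
    "hits=BAYES_40(-0.001),DCC_CHECK(1.1),HTML_MESSAGE(0.001),"
    "KAM_DMARC_STATUS(0.01),KAM_LAZY_DOMAIN_SECURITY(1),"
    "RCVD_IN_DNSWL_NONE(-0.0001),RCVD_IN_SORBS_PROBLEMS(0.5),"
    "SPF_HELO_NONE(0.001),SPF_NONE(0.001)"
)
SECOND = (
    "SA score=10/5 time=2.767 bayes=0.00 autolearn=no autolearn_force=no "
    "hits=BAYES_00(-1.9),DCC_REPUT_70_89(0.1),HTML_MESSAGE(0.001),"
    "KAM_DMARC_STATUS(0.01),KAM_LAZY_DOMAIN_SECURITY(1),KAM_LIST3(11),"
    "SPF_HELO_NONE(0.001),SPF_NONE(0.001)"
)


def parse_hits(line):
    """Hypothetical helper: map each rule name after 'hits=' to its score."""
    hits_part = line.split("hits=", 1)[1]
    pairs = re.findall(r"([A-Z0-9_]+)\((-?[0-9.]+)\)", hits_part)
    return {name: float(score) for name, score in pairs}


first_hits = parse_hits(FIRST)
second_hits = parse_hits(SECOND)

# Rules that fired for one mail but not for the other.
only_in_first = {r: s for r, s in first_hits.items() if r not in second_hits}
only_in_second = {r: s for r, s in second_hits.items() if r not in first_hits}

print("only in first mail: ", only_in_first)
print("only in second mail:", only_in_second)
```

For the two headers above, KAM_LIST3 shows up only in the second mail, together with the different Bayes and DCC results.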
 

Thanks. I am now getting a lot of negative results because of AWL and BAYES and am working on improving those. I went through this - thoughtcrimes I have no visibility of the mail, they looked similar. Case in point: I have some personal rules that assign a high score to emails that we want to block (those containing certain keywords), as India-related spam filters don't exist. These rules are also kicking in, but the BAYES and AWL scores reduce the total. Let's look at a few headers here:

SA score=1/5 time=2.692 bayes=0.00 autolearn=no autolearn_force=no hits=AWL(-0.792),BAYES_00(-1.9),DCC_REPUT_70_89(0.1),DKIM_INVALID(0.1),DKIM_SIGNED(0.1),HEADER_FROM_DIFFERENT_DOMAINS(0.001),HTML_MESSAGE(0.001),HTTPS_HTTP_MISMATCH(0.1),KAM_DMARC_STATUS(0.01),RCVD_IN_DNSWL_NONE(-0.0001),RCVD_IN_MSPIKE_H3(0.001),RCVD_IN_MSPIKE_WL(0.001),S3BMS_HEADER_15(4),SPF_HELO_NONE(0.001),SPF_PASS(-0.001)

Here, S3BMS_HEADER_15 has triggered and given it 4 points, but the -0.792 from AWL and the -1.9 from BAYES are hurting a bit.

Similarly

SA score=3/5 time=2.820 bayes=0.00 autolearn=no autolearn_force=no hits=AWL(-3.341),BAYES_00(-1.9),DKIM_SIGNED(0.1),DKIM_VALID(-0.1),DKIM_VALID_AU(-0.1),DKIM_VALID_EF(-0.1),HTML_IMAGE_RATIO_02(0.001),HTML_MESSAGE(0.001),KAM_NUMSUBJECT(0.5),MAILING_LIST_MULTI(-1),RCVD_IN_RP_RNBL(1.31),RCVD_IN_SORBS_PROBLEMS(0.5),S3BMS_BODY_45(4),S3BMS_HEADER_19(4),SPF_HELO_NONE(0.001),SPF_PASS(-0.001)

In the last one, two rules from our lists actually triggered, giving it 8 points, and RNBL gave it 1.31 points, but AWL and Bayes collectively reduced the total. Is there something I can do to avoid this?
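
To put a number on how much AWL and Bayes pull the score down, the hits from the last header can be summed with and without those two contributions. This is a rough Python sketch under the assumption that the header format stays as shown; `parse_hits` is a hypothetical helper, and the calculation is plain arithmetic on the logged values, not a change to any SpamAssassin or PMG setting.

```python
import re

# Last header line from the post above.
HEADER = (
    "SA score=3/5 time=2.820 bayes=0.00 autolearn=no autolearn_force=no "
    "hits=AWL(-3.341),BAYES_00(-1.9),DKIM_SIGNED(0.1),DKIM_VALID(-0.1),"
    "DKIM_VALID_AU(-0.1),DKIM_VALID_EF(-0.1),HTML_IMAGE_RATIO_02(0.001),"
    "HTML_MESSAGE(0.001),KAM_NUMSUBJECT(0.5),MAILING_LIST_MULTI(-1),"
    "RCVD_IN_RP_RNBL(1.31),RCVD_IN_SORBS_PROBLEMS(0.5),S3BMS_BODY_45(4),"
    "S3BMS_HEADER_19(4),SPF_HELO_NONE(0.001),SPF_PASS(-0.001)"
)


def parse_hits(line):
    """Hypothetical helper: map each rule name after 'hits=' to its score."""
    hits_part = line.split("hits=", 1)[1]
    pairs = re.findall(r"([A-Z0-9_]+)\((-?[0-9.]+)\)", hits_part)
    return {name: float(score) for name, score in pairs}


hits = parse_hits(HEADER)
total = sum(hits.values())

# Contributions that come from AWL and the Bayes classifier.
learned = {r: s for r, s in hits.items() if r == "AWL" or r.startswith("BAYES_")}
without_learned = total - sum(learned.values())

print(f"total as scored:        {total:.3f}")
print(f"AWL + Bayes share:      {sum(learned.values()):.3f}")
print(f"total without AWL/Bayes: {without_learned:.3f}")
```

With the values above this gives about 3.87 as scored and about 9.11 without the AWL and Bayes contributions, which would be above the threshold of 5 shown in the header.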
 
I see the same effect since the day the BAYES_xx rules began to work (automatically) for SA scoring; it took several months before the BAYES_xx rules became active. I am thinking about deactivating both AWL and Bayes... or playing around with each option.
 

I have disabled AWL for now. I'm not sure about Bayes, as that is an integral part too, but as for the BAYES_XXX rules, let's see. How is it set on your public systems: do you have Bayes and AWL enabled?
 
Using PMG defaults, which means both are on. But currently I'm not sure whether they are counterproductive...
 
Noted, would love to hear what @heutger thinks about AWL and BAYES
For me it's only important to see whether it works for me or not. I never used sa-learn to train Bayes, which means the system's autolearn is not actually doing a good job for my site. Maybe just giving BAYES_00 a score of 0 would also improve the results, along with turning off AWL, who knows.
 
So how are you training Bayes if not with sa-learn? Do you have a hook into all mailboxes?
 
Nothing; Bayes automatically becomes active with the BAYES_XX SA rules after some months of incoming email flow. It's only the quite high BAYES_00 rule, which deducts 1.9 points. That seems too much for my site's results... So do you use sa-learn (manually)?
 
My five cents on that: I have AWL and Bayes enabled. Both work fine for me; however, Bayes autolearn isn't working on either of my installations. My private installation certainly has far too few mails to ever reach the required volume of spam (ham is no problem). My commercial test installation didn't reach the volume either, even though our business is solely online, so we get a lot of mail; we are no ISP, but it's a reasonable volume. (Recently my colleagues did a production installation and also purchased a license for it. I would recommend everyone do so to support the product and future development; you also get commercial support on particular plans, which may be required for commercial setups.) However, although my blacklists work fine up front, some spam still comes through, and as the commercial production setup is a fresh one, I can see the differences between my recent AWL learning and the current state, so I would argue that AWL and Bayes are both usable. I'm unsure about TxRep, which is the successor of AWL, but as you still seem to have the options for AWL, I believe PMG is still using AWL and didn't switch to TxRep (@Stoiko Ivanov, are there plans to switch to TxRep anytime in the future?).
 
, I believe, PMG is still using AWL and didn't switch to TxRep
This is correct

are there plans to switch to TxRep anytime in future?
Currently not. In our experience from our various support channels (which might be biased, because people only seek support if something is not working well), AWL is not always helpful (meaning we've seen quite a few installations where it caused false negatives/false positives).

From a quick look at the wiki page (https://cwiki.apache.org/confluence/display/SPAMASSASSIN/TxRep), it would also need training, just like Bayes (which is currently not available in PMG).
 

Thanks for the response. Training is always required in some form, but it is also the only way, besides blacklists and some settings, to improve the filter over time.
 