How to train for spam

EagleEye

New Member
Feb 2, 2018
Hi,
I have set up the GW 5.0, but today I am getting a lot of spam. My old gateway had a function to learn what is spam.
Is there such an option in this gateway solution, or do I have to define my own rules?
 
- The Bayesian filter and auto whitelist features have auto-learning algorithms. (Admin Manual -> Spam Detection Features)
- You can examine the X-Spam headers in several emails to see how SpamAssassin evaluates them, make small adjustments to the score of one of the rules, and examine the headers again. (Admin Manual -> Custom SpamAssassin Configuration)
- Collect spam mails and feed them to SpamAssassin manually.
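
As a rough illustration of the second point, the following sketch prints only the X-Spam headers of a message that has been saved to a file. The function name `show_spam_headers` and the idea of saving the message as a plain file are assumptions for this example, not something PMG provides.

```shell
#!/bin/sh
# Hypothetical helper: print the X-Spam-* headers of one saved message,
# including folded continuation lines. Reading stops at the first blank
# line, which ends the header section.
show_spam_headers() {
    awk '
        /^\r?$/   { exit }                    # end of headers
        /^[ \t]/  { if (keep) print; next }   # continuation of previous header
        { keep = (tolower($0) ~ /^x-spam-/); if (keep) print }
    ' "$1"
}
```

Called as `show_spam_headers message.eml`, it shows which rules fired and what they scored. A score change is then a line of the form `score RULE_NAME 2.5` in the custom configuration (rule name and value are placeholders), and manual feeding is done with `sa-learn --spam` or `sa-learn --ham` followed by a file or directory.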
 
collect spam mails and feed them to spamassassin manually
Hi, this is what I intend to do. Is there a function in the gateway to do this?
Or do I have to copy the mails onto the machine manually and pass them to SpamAssassin myself?
 
I ran the same old system (Scrollout F1), and it had an IMAP account into which you could put whitelist / blacklist / training emails. It was actually a very convenient way to train the system.
 
Has Proxmox done any work on this kind of feature yet? Is it planned for future releases?
There really does need to be an easy way to train for spam messages missed by the system.
 
Unsupervised learning from SpamAssassin rules (= autolearning) is enabled. It's efficient and fast. For a site-wide Bayesian filter, I doubt that additional (manual) training by users gives better results, especially when they submit ham as spam or vice versa. I personally don't see the need for additional manual learning, even if you start from scratch.
 
Hi Folks,

Sorry to drag up an oldish thread.

We have just moved over from Barracuda to PMG, and as a result we are, quite understandably given the lack of learning so far, seeing a lot of spam getting through.

Whilst I appreciate the 'auto learn' feature and all its benefits, when the calls (no, screams) to turn the Barracuda back on reach the level I now have to defend against, there has to be something that can be done to expedite the 'learning'.

We have had the PMG in place for 2 months now and, in all honesty, the catch rate isn't really improving.

Being able to set up a pair of 'SPAM' / 'HAM' mailboxes and use them to teach SA is a must.

Has anyone used the manual sa-learn utility with PMG? What method do you use to feed the mail into the system, given that PMG isn't a full mail server and hence has no mbox/mbx/maildir folders to scan?

Cheers
 
I agree, this is a much-needed addition!

 
I doubt that this will help. Instead, make sure that your settings and rule system are optimized.

If you have a suitable subscription agreement, just send your config to our support team; they can check your settings and help you find the best configuration for your needs.
 
Hi,

I am not using PMG, but I am a long-time Postfix / mail admin, and over many years I have tried many things for spam protection. In my opinion, if you want to reduce spam, training SpamAssassin has a very low impact.
The most effective setup is to use several tools that have nothing to do with your mail stack, such as:

- a firewall; the effectiveness could be around 10%
- block as many country IP networks as you can (you must check which countries can be blocked in your case; in my case I block 3 countries: RU, UA, CN); this reduced my spam rate by 40%
- use header/body filters
- user training and good blacklists
- block failed authentication via SMTP/Dovecot (I drop in the firewall any IP that has 4 or more failed logins within 60 minutes)
- run your own DNS for unwanted countries, e.g. let the DNS domain of country .XY return 127.0.0.24, so that no email from it can be sent to you
- deny connections to your SMTP server from any client whose reverse DNS name does not indicate a fixed IP (names like ppp*, ppoe*, dynamic*, etc.)
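
The last item is essentially a pattern match on the client's reverse DNS name. The sketch below shows that matching step in isolation; the function name `looks_dynamic` and the exact prefix list are assumptions for illustration, and a real list would have to be tuned against your own logs. In Postfix, this kind of check is normally expressed as an access table (for example with `check_reverse_client_hostname_access`) rather than a script.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a reverse-DNS name looks like a
# dynamic/dial-up address, 1 otherwise. The prefixes are illustrative only.
looks_dynamic() {
    # Compare case-insensitively by lowercasing the name first.
    name=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    case "$name" in
        ppp*|ppoe*|pppoe*|dynamic*|dyn-*|dhcp*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `looks_dynamic ppp-12-34.example.net` succeeds, while `looks_dynamic mail.example.org` does not.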

In my country it is not possible to run a mail server on a dynamic IP, and TCP port 25 is blocked by every ISP. Unfortunately, many other EU countries don't apply this simple rule. Another problem that makes spam delivery much easier is that many of the biggest ISPs in the EU take no action against their own clients when they receive a documented spam complaint (maybe GDPR will change this), even when the ISP is very large (I do not want to name names).

So spam is possible because many huge ISPs do not want to solve this.

As I said, in my case, with these settings I receive in the worst case around 2-3 spams for around 100 mail users.
 
So I got a bit sick of having no real solution to this - so I hacked something together....

My path structure is /vmail/$user/ for email. I use a Spam folder which ends up at /vmail/$user/.Spam/cur and /vmail/$user/.Spam/new

Set up an SSH public / private key to allow you to log into your PMG from your mail server. I won't cover this here - the instructions are a google search away if required.

On your mail server end, add the following script to your cron:

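Bash:
#!/bin/bash
# Feed every message in the users' Spam folders to sa-learn on the PMG host.
MAILFILTER=10.1.1.94
TMPFILE=$(mktemp /tmp/tempmail.XXXXXX) || exit 1
trap 'rm -f "$TMPFILE"' EXIT

for i in /vmail/*/.Spam/cur/* /vmail/*/.Spam/new/*; do
        [ -f "$i" ] || continue

        # Messages may be stored compressed; unpack them first.
        STATUS=$(file -b "$i")
        case "$STATUS" in
                *gzip*)        gunzip -c "$i" > "$TMPFILE" ;;
                *bzip2*)       bzip2 -d -c "$i" > "$TMPFILE" ;;
                *"SMTP mail"*) cp "$i" "$TMPFILE" ;;
                *)             echo "Skipping $i: unrecognised type ($STATUS)" >&2; continue ;;
        esac

        # "report" is mapped to "sa-learn --spam" by the script on the PMG.
        if ! ssh "root@$MAILFILTER" report < "$TMPFILE"; then
                echo "Error running sa-learn. Aborting." >&2
                exit 1
        fi
        # Only remove the message once it has been reported successfully.
        rm -f "$i"
done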

On the PMG end, add `command="/root/bin/spam-reporter"` to your public key in /root/.ssh/authorized_keys.
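
For reference, a forced-command entry is the `command="..."` option placed in front of the key on the same line. The sketch below builds such a line from a public key; the function name `forced_key_line` is made up for this example, and adding the `restrict` option (OpenSSH 7.2 or later) is my own suggestion rather than part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: turn a public key line ("type base64 comment")
# into an authorized_keys entry that can only run the reporter script.
# "restrict" additionally disables forwarding and PTY allocation.
forced_key_line() {
    printf 'command="/root/bin/spam-reporter",restrict %s\n' "$1"
}
```

On the PMG, something like `forced_key_line "$(cat mailserver.pub)" >> /root/.ssh/authorized_keys` would append the entry, where `mailserver.pub` stands for wherever you copied the mail server's public key.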

Put this on your PMG as /root/bin/spam-reporter:
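Bash:
#!/bin/sh
# Forced command: map the keyword sent by the mail server to an sa-learn call.
# sa-learn reads the message from standard input.
case "$SSH_ORIGINAL_COMMAND" in
        report)
                sa-learn --spam
                ;;
        revoke)
                sa-learn --ham
                ;;
        *)
                # Unknown keyword: fail so the calling script notices.
                echo "Wwwwhat?" >&2
                exit 1
                ;;
esac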

What will happen is that your mail server will end up reporting each spam message via sa-learn on the PMG system - which should allow your bayes filter to learn your spam a little better...
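
One thing worth checking after a while is whether the Bayes database has actually seen enough mail. With SpamAssassin's default settings, Bayes scoring is not applied until at least 200 spam and 200 ham messages have been learned, so reporting only spam may not be enough on its own. The sketch below reads the output of `sa-learn --dump magic` and compares both counters with that minimum. The function name `bayes_ready` is hypothetical, and the parsing assumes the counter is the third column of the `nspam` / `nham` lines.

```shell
#!/bin/sh
# Hypothetical helper: read "sa-learn --dump magic" output on stdin, print
# the learned message counts, and succeed only if both reach the minimum
# (default 200, SpamAssassin's stock threshold).
bayes_ready() {
    min=${1:-200}
    awk -v min="$min" '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            exit !(nspam >= min && nham >= min)
        }
    '
}
```

On the PMG it would be used as `sa-learn --dump magic | bayes_ready`. If `nham` stays low, the `revoke` keyword above is the place to feed in legitimate mail.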
 
Great job. That looks fine. However, using Sieve to invoke sa-learn on PMG would be my goal, or, much better, seeing the Proxmox team integrate PMG with archiving and a mail server (e.g. something like SOGo or Kolab).
 
Quick question: what I don't understand is how the spam would get to the mail server when it goes straight to the PMG. Or do the users put it in the spam folder, and the PMG then tries to learn from it?
 
It only works by having users move spam that got through the filters into the Spam folder on their IMAP account. We only really care about spam that made it through the filters - if it's caught and blocked, great.

The annoying part is when spam gets quarantined - as the only way to feed it back as spam is to deliver the spam to the user, who then places it in their spam folder.
 
That’s exactly the issue:

As PMG doesn't see itself as the final destination (e.g. as an archiving server or a mail server like Kolab), there needs to be an integration with the final server. The options are: fetching each spam box from the IMAP servers (which would only train spam); having a centralized mailbox for spam and ham (mails need to be copied there, e.g. via user action buttons in the mail client); or (the best solution on IMAP setups) using Sieve to invoke learning of spam and ham when messages are moved between folders (needs scripting).

Finally, and this would make a reasonable feature request: if mails are already quarantined on PMG (not my preferred setup, but many do so), it's hard to understand why sa-learn isn't invoked for spam or ham on deletion or release. That would be the easiest way, since the mail is still on PMG and doesn't need to be transferred back; it's just a call to sa-learn.
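
For the Sieve variant, the Sieve script on the mail server needs a small program to pipe the moved message into. The following is only a sketch of that program, assuming the SSH forced-command setup with the `report` / `revoke` keywords and the PMG address 10.1.1.94 from the script posted earlier in this thread. The function name `forward_to_pmg` and the `LEARN_CMD` override are hypothetical, and the Sieve side itself is not shown.

```shell
#!/bin/sh
# Hypothetical pipe target: reads one message on stdin and forwards it to
# the PMG. The first argument is "spam" or "ham".
# LEARN_CMD can be set to another command for a dry run; by default the
# SSH call from earlier in this thread is used.
forward_to_pmg() {
    case "$1" in
        spam) verb=report ;;
        ham)  verb=revoke ;;
        *)    echo "usage: forward_to_pmg spam|ham" >&2; return 2 ;;
    esac
    # Deliberately unquoted so that the default splits into command + host.
    ${LEARN_CMD:-ssh root@10.1.1.94} "$verb"
}
```

A move into the Spam folder would then end in `forward_to_pmg spam`, and a move out of it in `forward_to_pmg ham`, which also covers the ham side that a pure spam-folder fetch misses.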
 
So let me get this straight: this script is for emails that get through the PMG filter, so that users can put them in the spam folder and the PMG will learn. But what I don't get is this part:

The annoying part is when spam gets quarantined - as the only way to feed it back as spam is to deliver the spam to the user, who then places it in their spam folder.

But isn't that a good thing?
 
That takes the quarantine ad absurdum. The idea of the quarantine is not to deliver potential spam, and to have a single place to check it and decide whether to release it (if it isn't spam) or delete it (if it is spam). But now you need to release spam just to move it into the spam folder, so that the script can fetch it and pass it to sa-learn on the PMG. That really is a stupid requirement. It would be much easier (and more logical) if spam could be learned simply by clicking delete in the quarantine.

The only reasonable argument against that is that potentially stupid users would click delete and train mails as spam that aren't spam. However, the same happens if they blacklist mail addresses or do other stupid stuff, so the filter can only be as good as the people using it. I believe that's reasonable.

I am only raising my voice to get this implemented because it makes a lot of sense to me (maybe as a configurable option, to be safe from stupid users in particular environments). For my own setups I don't need the improvement: I will never use the quarantine, I don't like the concept and it's not useful for me, so I will never delete or release any message there.
 
I still use the quarantine globally, simply because, to give PMG a workout, I ended up putting it in front of the mailing list servers that I operate for community projects. The quarantine lets me look at stuff that might have been posted to the list if it wasn't for the spam filter.

Some of these lists are postable by anyone as they serve a special purpose (i.e. the committee mailing list) and need to be open to non-subscribers.

As I don't think you can control the quarantine for each individual user, it means I have to have it turned on for myself as well...
 
Makes sense. However, I prefer not to use a quarantine for our company, or in production environments generally, as most of us work in time-critical business, so I can't always have my staff recheck the quarantine for mails that may have been caught there. It's much easier just to deliver mails with tags, so you can treat mail with a spam tag in the subject as spam without needing the quarantine.
 
