Information for a good start with PMG

Stoiko Ivanov · Jul 15, 2020

heutger said:
@Stoiko Ivanov maybe you can explain why PMG doesn't use Pyzor by default as well. For DCC, I read it was about license issues (if it is shipped with software, fees are due; however, it could be an option in your subscriptions, or you could provide an easy install script that needs to be invoked manually). But I don't see any reason against Pyzor, and Pyzor also has good scores and a good influence on PMG spam detection quality.

TBH, I haven't taken a closer look at Pyzor in quite a while and don't know how well it performs at classifying spam. Do you have any experience with it?

OTOH, as for licensing and pricing: the client and server software should be fine. However, as far as I understand it, Pyzor only works sensibly for most installs with a publicly hosted server, and the one referenced in their docs and in my Google search (public.pyzor.org:24441):
a) is currently not reachable (at least from the 2 places I just tested it from):

Code:

nc -v public.pyzor.org 24441
nc: connect to public.pyzor.org port 24441 (tcp) failed: Connection refused
nc: connect to public.pyzor.org port 24441 (tcp) failed: Connection refused

b) I couldn't find their AUP, i.e. the conditions under which the server may be contacted, and whether it requires a subscription, payment, sending e-mails to them, etc.

SpamExperts, who originally developed it (my guess, based on the GitHub URL [0] and on the fact that they run 'public.pyzor.org', see [1]), seem to have merged with SolarWinds, and on the homepage I could not find anything related to an AUP for public.pyzor.org. However, this is after searching for just 10 minutes, so if you have any pointers to the conditions for using public.pyzor.org (or have great experiences running your own Pyzor server), please share them!

I hope this explains it!

[0] https://github.com/SpamExperts/pyzor
[1] https://pyzor.readthedocs.io/en/release-1-0-0/introduction.html

heutger · Jul 15, 2020

@Stoiko Ivanov My experience is great; the additional spam scores work well for me. As far as I remember, the terms of use were similar to e.g. Spamhaus. Your connection failed because nc used TCP; from what I read, it should be UDP.
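
To illustrate why a plain `nc -v` test like the one above can report "Connection refused" even when a UDP service is up, here is a minimal Python sketch. It runs entirely against localhost: it does not contact public.pyzor.org and does not speak the Pyzor protocol. The function name and the `ping`/`pong` payloads are made up for the demo.

```python
import socket


def tcp_vs_udp_demo():
    """Bind a UDP-only service on localhost, then probe it with TCP and UDP."""
    # Stand-in for a UDP-only service; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2)
    port = server.getsockname()[1]

    # TCP probe (what `nc -v host port` does by default): nothing listens on TCP.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.settimeout(2)
    try:
        tcp.connect(("127.0.0.1", port))
        tcp_result = "connected"
    except ConnectionRefusedError:
        tcp_result = "refused"
    finally:
        tcp.close()

    # UDP probe: the datagram reaches the service, which answers the sender.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    client.sendto(b"ping", ("127.0.0.1", port))
    _, client_addr = server.recvfrom(1024)
    server.sendto(b"pong", client_addr)
    udp_reply, _ = client.recvfrom(1024)

    client.close()
    server.close()
    return tcp_result, udp_reply
```

On a typical Linux host, `tcp_vs_udp_demo()` should come back with a refused TCP connection and a UDP reply. For a real check against a Pyzor server, the client's own `pyzor ping` command is probably the more meaningful test, since it sends a properly formed request.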

ittk · Jul 16, 2020

@Stoiko Ivanov This should explain your connection / timeout "issue" with the nc command, which is not actually an issue:

https://cwiki.apache.org/confluence/display/SPAMASSASSIN/NetTestFirewallIssues

Pyzor
Pyzor uses both udp and tcp port 24441. It looks as though the client communicates with the server via udp but the server answers back with a tcp connection

Stoiko Ivanov · Jul 16, 2020

heutger said:
I remember terms of use were similar to e.g. Spamhaus.

Any chance you have a link to their terms somewhere? (would help in making an informed decision)

heutger said:
Why your connection failed is because of TCP, I read it should be UDP.

Thanks for the hint, I overlooked that in the documentation.

This still leaves the question of the acceptable use policy of public.pyzor.org - and we cannot integrate it into PMG if this is not permitted.

ittk · Jul 16, 2020

As already linked a few posts earlier: https://pyzor.readthedocs.io/en/release-1-0-0/introduction.html#license
https://pyzor.readthedocs.io/en/release-1-0-0/about.html#history

https://readthedocs.org/projects/pyzor/downloads/pdf/latest/

https://github.com/SpamExperts/pyzor

https://github.com/SpamExperts/pyzor/blob/master/README.rst

About
Pyzor is a Python implementation of a spam-blocking networked system that use spam signatures to identify them.

License
GPL-2.0 License

tom · Jul 16, 2020

@ittk we are talking about the usage policy of public.pyzor.org, not the license of the software.

Something like what you see here for Spamhaus: https://www.spamhaus.org/organization/dnsblusage/

ittk · Jul 16, 2020

I don't know if there is one for it...

Andy_red · Jul 16, 2020

@tom.193
For us it would be fantastic if only the client were integrated, with an option to set the server address or multiple servers.
If you manage many servers, it can be useful to have an internal one too.

tom · Jul 16, 2020

We do not integrate anything before we are sure that it is a great enhancement for our users.

We always test and validate everything before we implement it.

heutger · Jul 21, 2020

You're welcome to test. Pyzor was a big step forward in spam detection for me. DCC was additionally very helpful, but I know of the licensing issues.

thebiggeek · Jul 23, 2020

Thanks to the lovely guide by @heutger and the lovely people at Proxmox, we made some progress installing PMG in a cluster for our inbound spam handling. We also have some EFA servers that we are now moving away from. I am facing one particular issue: not all my emails are checked by Pyzor, and some of the spam passes through. I'm not sure whether Pyzor should be invoked every time, or whether some of you have installed your own Pyzor server. The mail traffic passing through the PMG cluster right now is under 20,000 emails a day, and we should stay at that level for now, so I don't think I am running into any query limits, though I would like to check this (no clue how).

I also noticed that DCC is crashing randomly, and I get a Broken pipe message. Restarting DCC helps, and I have now set up a monitor to check whether DCC has crashed, plus a cron job that restarts the service. But I am still not sure why Pyzor is not run on each email, or how to find the reason for the DCC crashes.
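
A monitor-and-restart job of the kind described above could look roughly like this. It is a minimal Python sketch, not the setup actually used here: the process name `dccifd` is taken from the log message quoted later in the thread, while the restart command (and the unit name `dcc` in the usage note) is a placeholder to adapt to your installation.

```python
import subprocess


def process_running(name, run=subprocess.run):
    """Return True if `pgrep -x NAME` finds at least one matching process."""
    result = run(["pgrep", "-x", name], capture_output=True)
    return result.returncode == 0


def ensure_running(name, restart_cmd, run=subprocess.run):
    """Restart the service if its process is gone; report what was done."""
    if process_running(name, run):
        return "running"
    # restart_cmd is an assumption supplied by the caller, not a known default.
    run(restart_cmd, check=False)
    return "restarted"
```

Called from cron as, for example, `ensure_running("dccifd", ["systemctl", "restart", "dcc"])` (hypothetical unit name), this restarts the daemon only when it is missing. Logging each restart with a timestamp makes it easier to correlate crashes with other log entries later.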

heutger · Jul 23, 2020

You're welcome. I'm sorry that I currently have no clue about your stability issues either. My installations work fine; however, they are far away from your numbers. Maybe you could also start voting for the Proxmox team to integrate Pyzor and to consider offering DCC as an install script or subscription option because of the license limitations. Maybe you should consider running your own Pyzor server (maybe the public one just had high load, or it counts connections and not only mails). You may also ask @ittk about his findings on DCC; maybe you should rotate logs and clean the database on a regular basis.

thebiggeek · Jul 23, 2020

Thank you, I appreciate your points.

1. Big shout-out to the Proxmox team. I agree that this is the macOS of what EFA/MailScanner were.
2. The number of 20,000 is the high-end number. Right now I believe we are doing nearly 2,000 per server per day, but once they go into production the number may reach that. After the Fail2Ban implementation, the numbers have dropped.
3. Yes, I believe Pyzor and DCC should be integrated. How do I vote on this?
4. I am looking at setting up a Pyzor server, for two reasons: I am in India, where right now we are seeing a lot of "ban this, ban that" on our networks, and there are several timeouts on our international backbones because there is just too much traffic being generated. Let me see how to do this and how to integrate it; I will keep you posted.
5. The only thing I notice DCC crashing on is the timeout. Now that I have the service under monitoring, I will have some more log information collected. I will keep you posted, and also see what @ittk has to say.
6. Which database are you talking about here, the one that should be cleaned?
7. I am rotating my logs as a standard practice.

BTW @heutger, questions:
1. Are you still using a milter?
2. What are the final lists you are using?

ittk · Jul 24, 2020

As for your point 5:

1. Timeouts are not program crashes!
2. Do you mean the normal DCC timeout that you set yourself in the SA config file init.pre.in here?
It defaults to 8 seconds, and this should be enough to obtain the DCC result. I can only imagine these scenarios for running into timeouts: you have very poor internet connection metrics (latency, RTT, overloaded bandwidth, DNS servers not responding fast enough, and so on), so you will sometimes run into timeouts.

dcc_timeout n (default: 8)
How many seconds you wait for DCC to complete, before scanning continues without the DCC results.
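
To see which `dcc_timeout` is actually in effect, one option is to scan the SpamAssassin configuration text for the setting. The following is a minimal Python sketch under stated assumptions: the fallback of 8 is the default quoted above, the helper name is made up, and which config files your installation really loads is something you need to check yourself.

```python
import re

DCC_TIMEOUT_DEFAULT = 8.0  # default quoted in the plugin documentation above


def effective_dcc_timeout(config_text, default=DCC_TIMEOUT_DEFAULT):
    """Return the last uncommented dcc_timeout value found in the config text."""
    value = default
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.fullmatch(r"dcc_timeout\s+(\d+(?:\.\d+)?)", line)
        if match:
            value = float(match.group(1))  # a later setting overrides earlier ones
    return value
```

Feeding it the concatenated contents of your config files (in load order) shows whether something overrides the default, for example a value set by a package or an earlier manual edit.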

thebiggeek · Jul 24, 2020

Thank you for your input. Actually, I have not modified the timeout; it is currently set to the default of 10 and I have not changed that. The only reason I concluded it could be the cause is that, just before DCC crashed with the message "dccifd[30843]: write(MTA socket,123): Broken pipe", there was a timeout in the lookup. I have decent bandwidth, with about 45 Mbps available to this server. The latency is something I can't control at the moment, until I install my own DCC server (if that is possible).

While this cluster was using our local Unbound-based servers, I have now installed Unbound on the same machine. Since then I have not had a crash. I am still keeping this under observation, and I thank you for taking the time to answer this query.
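
To check whether a lookup timeout really precedes each Broken pipe entry, the log can be scanned for that message together with the lines just before it. This is a minimal Python sketch: the pattern is built from the single message quoted above, and the helper name, the log file location, and the format of the surrounding lines are assumptions.

```python
import re
from collections import deque

# Pattern derived from the one log message quoted above.
BROKEN_PIPE = re.compile(r"dccifd\[\d+\]: write\(MTA socket,\d+\): Broken pipe")


def broken_pipe_context(lines, before=3):
    """Yield (preceding_lines, matching_line) for each dccifd broken-pipe entry."""
    recent = deque(maxlen=before)  # rolling window of the last few lines
    for line in lines:
        line = line.rstrip("\n")
        if BROKEN_PIPE.search(line):
            yield list(recent), line
        recent.append(line)
```

Passing an open log file (for example `broken_pipe_context(open(path), before=5)`, with `path` pointing at wherever your system logs mail daemon messages) gives the context for every occurrence, which should make a recurring timeout pattern easy to spot.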

heutger · Jul 24, 2020

thebiggeek said:
Thank you, I appreciate your points.

1. Big shout-out to the Proxmox team. I agree that this is the macOS of what EFA/MailScanner were.
2. The number of 20,000 is the high-end number. Right now I believe we are doing nearly 2,000 per server per day, but once they go into production the number may reach that. After the Fail2Ban implementation, the numbers have dropped.
3. Yes, I believe Pyzor and DCC should be integrated. How do I vote on this?
4. I am looking at setting up a Pyzor server, for two reasons: I am in India, where right now we are seeing a lot of "ban this, ban that" on our networks, and there are several timeouts on our international backbones because there is just too much traffic being generated. Let me see how to do this and how to integrate it; I will keep you posted.
5. The only thing I notice DCC crashing on is the timeout. Now that I have the service under monitoring, I will have some more log information collected. I will keep you posted, and also see what @ittk has to say.
6. Which database are you talking about here, the one that should be cleaned?
7. I am rotating my logs as a standard practice.

BTW @heutger, questions:
1. Are you still using a milter?
2. What are the final lists you are using?

3. You could open a feature request ticket in the bug tracker or, if such a ticket already exists, add your comments there. You could also send a mail to the developer mailing list, contribute your own code there, or open a feature request ticket for a voting system for new features (Proxmox currently doesn't have one, but I believe it would be a good idea; it helped e.g. Plesk very much). The bug tracker is here: https://bugzilla.proxmox.com

5./6./7. In an internal conversation with @ittk we talked about some "routine jobs" that may need to be added for DCC, so maybe he can share his investigation results on also rotating the DCC logs and cleaning the database (from https://www.dcc-servers.net/dcc/INSTALL.html):

Install a daily or more frequent cron job like misc/crontab and /var/dcc/libexec/cron-dccd to prune dccm or dccifdlog files and the prune dccd database with dbclean.

1. Yes, as I have not yet had time to upgrade, but I will do so ASAP. With PMG 6.2 I will for sure use the pre-queue system from Proxmox.
2. More or less the same as in my advancing thread, with three adjustments:
a) At the end you see only xxx, which indicates that these lists are secret (they need to be paid for; you need to sign up with invaluement and will then get their list names and list access).
b) Some lists in my documentation I used with *2; I recently changed that to *1 to prevent any false positives at all, even on the lists I trust very much.
c) I changed zen.spamhaus.org to the data feed name and signed up for the data feed service as described at https://forum.proxmox.com/threads/support-for-spamhaus-dqs.69619/. I really don't see any impact, but it's also hard to tell whether the data feed, released just minutes faster, would have caught just one or two more spam mails than the usual DNS service. However, I now have great statistics on my usage, which is nice. But you need to consider that you may exceed the message limit or, most importantly, the connection limit if you check, as I do, for particular rhsbl occurrences like helo, sender, address, and client host name, as well as via SpamAssassin and the normal rbl check.
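
To make the point about limits more concrete, here is a minimal Python sketch of how an IP-based rbl lookup name is formed and how quickly lookups add up. The function names are made up, the figure of six lookups per mail simply counts the checks listed above (helo, sender, address, client host name, SpamAssassin, normal rbl check), and DNS caching is ignored, so treat the result as a rough upper bound only. rhsbl lookups use domain names instead of reversed IPs and are not covered by the first function.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name queried for an IPv4 address in an IP-based DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def estimated_daily_queries(mails_per_day, lookups_per_mail):
    """Rough upper bound on list queries per day, ignoring DNS caching."""
    return mails_per_day * lookups_per_mail
```

For example, `dnsbl_query_name("192.0.2.10", "zen.spamhaus.org")` builds the name `10.2.0.192.zen.spamhaus.org`, and `estimated_daily_queries(20000, 6)` gives 120,000 lookups for the 20,000 mails a day mentioned earlier in the thread. Whether that stays within a given provider's limits depends on their current terms.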

Regards,
Christian
