Does not support Chinese character set filtering

yanfei

Active Member
Mar 7, 2019
Hello everyone
We have many rules for Chinese spam. Our "body" rules do not match when the original text of the mail uses a character set such as GBK/GB2312/GB18030; when the mail is UTF-8, they do match. How can I make Proxmox support filtering the content of these character sets?

1622691430819.png

1622691489704.png
 
Is XXX_SPAM_KECHENG a custom rule? (I cannot find it in the rulesets distributed and updated by PMG.) If yes, what does it look like?
 
body __ZHCN_SPAM_KECHENG1 /元{0,1}.{0,1}\/.{0,1}人/
body __ZHCN_SPAM_KECHENG2 /课程|讲.{0,1}师|大.{0,1}纲/
body __ZHCN_SPAM_KECHENG3 /管理|中层|团队|税务|人力|高新技术|阿米巴|经营/
meta ZHCN_SPAM_KECHENG (( __ZHCN_SPAM_KECHENG1 + __ZHCN_SPAM_KECHENG2 + __ZHCN_SPAM_KECHENG3) > 2)
describe ZHCN_SPAM_KECHENG The spam mail for peixun and kecheng
score ZHCN_SPAM_KECHENG 5.9
This is the content of the rule.
 
Sadly I cannot offer an easy solution - but at least a quick analysis of what I think is not working here:
* the body rule you created contains the characters as UTF-8 characters - this is why the mails which are encoded in UTF-8 match
* SpamAssassin does not (at least to my knowledge, but I only quickly searched online) convert/re-encode messages or your rules (this would also not be possible in general, e.g. many characters do not have an encoded version in ISO-8859-1)
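
To illustrate the two points above, here is a minimal Python sketch (not SpamAssassin itself) that assumes the rule and the mail body are compared as raw bytes. The helper name `matches` is made up for this example.

```python
# The rule file is saved as UTF-8, so the pattern is stored as UTF-8 bytes.
pattern = "课程".encode("utf-8")


def matches(raw_body: bytes, raw_pattern: bytes) -> bool:
    """Naive byte-level substring search, standing in for a body rule."""
    return raw_pattern in raw_body


body_text = "课程大纲"

# Same text, two different byte sequences on the wire.
utf8_body = body_text.encode("utf-8")
gbk_body = body_text.encode("gbk")

print(matches(utf8_body, pattern))  # UTF-8 mail vs. UTF-8 pattern
print(matches(gbk_body, pattern))   # GBK mail vs. UTF-8 pattern
```

The text is identical in both mails, but only the UTF-8 body contains the byte sequence the rule is looking for.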

The only thing that could work (although it is not really comfortable) is to create the rule again, but instead of writing the Chinese characters as you type them, you write their byte sequences in the GBK encoding.
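
A rough way to produce those byte sequences is a small Python helper like the one below. It is only a sketch: the function name `gbk_escape` and the rule name `__ZHCN_SPAM_KECHENG2_GBK` are made up for this example, and I have not tested the resulting rule on a PMG installation. It leaves ASCII (including the regex syntax) untouched and replaces every other character with `\xHH` escapes of its GBK bytes.

```python
def gbk_escape(pattern: str, encoding: str = "gbk") -> str:
    """Replace each non-ASCII character with \\xHH escapes of its encoded bytes."""
    out = []
    for ch in pattern:
        if ord(ch) < 128:
            # Keep regex operators and plain ASCII text exactly as written.
            out.append(ch)
        else:
            out.append("".join(f"\\x{b:02x}" for b in ch.encode(encoding)))
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical rule name; the pattern is taken from the rule posted above.
    escaped = gbk_escape("课程|讲.{0,1}师|大.{0,1}纲")
    print(f"body __ZHCN_SPAM_KECHENG2_GBK /{escaped}/")
```

One caveat, stated as an assumption: if the rule is matched byte-wise, a `.` covers one byte, so a gap meant to span one Chinese character (two bytes in GBK) may need `.{0,2}` instead of `.{0,1}`. For the common characters that are already part of GB2312, the bytes should be the same in GB2312, GBK and GB18030, so one extra rule per pattern may be enough.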

I hope this helps!