[SOLVED] PMG does not translate UTF8 subject when filtering spam

bougatoyta

New Member
Jun 8, 2021
Hi,

We received a large spam run whose subject tries to make users believe it was sent by an admin:

"RE: Administrateur du système"

So we created a regex blocking "Administrateur du système"

But then we received another spam with the same subject and it was not blocked by PMG.

Looking deeper, the subject this time was "=?UTF-8?B?UkU6IEFkbWluaXN0cmF0ZXVyIGR1IHN5c3TDqG1l?=", which is "RE: Administrateur du système" as a Base64-encoded UTF-8 header.

And since it was encoded PMG let the spam be delivered.
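For readers who want to check what such a subject contains, here is a minimal sketch using Python's standard `email.header` module. It only illustrates the RFC 2047 decoding step and makes no claim about how PMG decodes headers internally; the variable names are illustrative.

```python
from email.header import decode_header, make_header

# The raw Subject header value quoted in the post above.
raw_subject = "=?UTF-8?B?UkU6IEFkbWluaXN0cmF0ZXVyIGR1IHN5c3TDqG1l?="

# decode_header() splits the value into (bytes, charset) pairs;
# make_header() joins them back into a single Unicode string.
parts = decode_header(raw_subject)
decoded_subject = str(make_header(parts))

print(decoded_subject)
```

The decoded text includes the "RE: " prefix, so a rule applied to the decoded header sees the same words as in the first spam.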

Any configuration or option I might have missed to avoid this ?

Regards
 

dcsapak

Proxmox Staff Member
Staff member
Feb 1, 2016
Vienna
No, we actually decode the headers when we do a 'match field', so that should not be the problem...
Most likely the problem is that the value in the What object does not get saved as UTF-8 (but as ISO), so the strings do not match...
You could try changing your regex to 'Administrateur du syst..?me' (so that both one-byte and two-byte characters can match).
That regex should be close enough.
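A rough byte-level sketch of why the relaxed pattern helps, written in Python rather than Perl. It assumes, as suggested above, that the stored pattern ends up as ISO-8859-1 bytes while the subject may arrive as UTF-8 bytes; this is a model of the mismatch, not PMG's actual matching code.

```python
import re

subject = "RE: Administrateur du système"
subject_utf8 = subject.encode("utf-8")         # è becomes two bytes: C3 A8
subject_latin1 = subject.encode("iso-8859-1")  # è becomes one byte: E8

# Assumption: the rule text is stored as ISO-8859-1 bytes.
strict = re.compile("Administrateur du système".encode("iso-8859-1"))
# The suggested workaround: accept one or two arbitrary bytes in place of è.
relaxed = re.compile(rb"Administrateur du syst..?me")

strict_hits = (bool(strict.search(subject_utf8)),
               bool(strict.search(subject_latin1)))
relaxed_hits = (bool(relaxed.search(subject_utf8)),
                bool(relaxed.search(subject_latin1)))

print("strict :", strict_hits)   # (UTF-8 subject, ISO subject)
print("relaxed:", relaxed_hits)
```

The strict pattern only matches the ISO-encoded subject, while `..?` matches both encodings, at the cost of also accepting any other one or two bytes in that position.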
 

bougatoyta

Thanks for your fast answer.

(?i)^.*Administrateur du syst..?me.*$ works as expected!

On a more technical note (and because I'm curious), why are the values in What objects stored as ISO and not UTF-8, which tends to be the standard?

Regards
 

dcsapak

Because Perl handles strings weirdly ;) If a string contains characters with codepoints >127 but <256 (which is the case for è: 232 or 0xE8), it writes them out as single bytes. (For codepoints >255 it converts the string to a UTF-8 byte sequence.)
So the problem is backwards compatibility: we cannot simply UTF-8 encode all new inputs, because we would have to decode them on read, but we don't know after the fact whether a stored value was ISO or UTF-8.
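The "we don't know after the fact" part can be illustrated with a small Python sketch. The two stored values below are hypothetical; the point is only that the same bytes are valid under both encodings, so the reader cannot tell which one the writer meant.

```python
# Hypothetical value saved by an ISO-8859-1 writer: the two characters "Ã¨".
legacy = "Ã¨".encode("iso-8859-1")
# Hypothetical value saved by a UTF-8 writer: the single character "è".
modern = "è".encode("utf-8")

# Both end up as the same two bytes on disk (C3 A8).
same_bytes = legacy == modern

# Reading those bytes back succeeds under either assumption,
# but yields different text.
as_utf8 = modern.decode("utf-8")
as_latin1 = modern.decode("iso-8859-1")

print(same_bytes, repr(as_utf8), repr(as_latin1))
```

A "try UTF-8 first, fall back to ISO" heuristic would guess right most of the time, but as the example shows it cannot be correct for every existing value.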

PMG has this issue in some places, and one place where it's tracked is here: https://bugzilla.proxmox.com/show_bug.cgi?id=2057
Most of the time users can use a workaround like the one I suggested, but we really ought to fix this soon...
 
