I have a problem with spam messages whose Subject field is Base64-encoded UTF-8 and contains lookalike characters meant to fool my filter rules.
Example:
Raw subject of the incoming email:
Subject: =?UTF-8?B?UklGSVVU0J4gREkgUklOTtCeVtCe?=#821538
When SpamAssassin decodes it, the subject contains the character О (Cyrillic, U+041E) instead of the Latin O:
__SUBJ_NOT_SHORT ======> got hit: "RIFIUTО DI RINNOVO"
so this rule does not trigger:
header __SUBJECT_PHISHING_3 Subject=~ /(RIFIUTО DI RINNОVО)/i
However, email clients (Outlook or Thunderbird) display these characters like an ordinary O, so the subject reads as correct Italian and fools the user:
RIFIUTО DI RINNОVО
So the spammer inserts lookalike characters, knowing that the client will render them as correct Italian while SpamAssassin will not trigger the rule.
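To make the mismatch concrete, here is a minimal Python sketch (not part of my SpamAssassin setup) that decodes the raw subject above and lists its non-ASCII characters. The helper names and the one-entry `CONFUSABLES` table are hypothetical and only for illustration; a real lookalike table would need many more entries.

```python
import re
import unicodedata
from email.header import decode_header

RAW_SUBJECT = "=?UTF-8?B?UklGSVVU0J4gREkgUklOTtCeVtCe?=#821538"

# Hypothetical, deliberately tiny lookalike table: Cyrillic capital O -> Latin O.
CONFUSABLES = {0x041E: "O"}


def decode_subject(value):
    """Decode RFC 2047 encoded-words in a header value into a plain str."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


def non_ascii_chars(text):
    """Return (character, Unicode name) for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if ord(ch) > 127]


subject = decode_subject(RAW_SUBJECT)
# The pattern below is written with plain Latin O characters only.
pattern = re.compile(r"RIFIUTO DI RINNOVO", re.IGNORECASE)

print(non_ascii_chars(subject))                            # which characters are not ASCII
print(bool(pattern.search(subject)))                       # match on the decoded subject
print(bool(pattern.search(subject.translate(CONFUSABLES))))  # match after folding lookalikes
```

The point of the sketch is only that the decoded text and the pattern differ at the code-point level even though they look identical on screen; folding lookalikes to their Latin form before matching is one assumed way to close that gap.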
Is there a way to match these characters, or to decode them the way the email client does, without having to create a new rule every time the spammer inserts a special character to bypass the filter?
I found the same problem discussed, with some hints, at https://users.spamassassin.apache.narkive.com/LhGDKXkm/utf-8-spam-rules
To restate: the subject arrives as RIFIUTО DI RINNOVO (О instead of O), which the mail client displays as Rifiuto di rinnovo, correct in Italian. So if I create a rule to block emails with the subject Rifiuto di rinnovo, the spammer manages to bypass it. I would like to understand whether SpamAssassin has a way to decode special characters for a defined language (Italian), so that I do not have to create ad hoc rules every time a new modified subject arrives.