Spam or not spam


 

Ideally, the structure of legitimate e-mails could be described so completely that the set of mails which cannot be parsed is identical to the set of spam mails. This ideal case is rare, but not impossible: company mail, for example, could be required to follow an exactly defined structure. Usually, however, the IMP filter works with approximations, and only a subset of the mails is recognized with certainty as spam or non-spam. The other filters of the Spamihilator therefore remain necessary.

 

TextTransformer projects are usually made to produce target texts from source texts. For the Spamihilator, the target text is simply "1", "0" or "-1", meaning non-spam, indifferent and spam respectively.

 

 

Non-spam       "1"
Spam           "-1"
Indifferent    "0"
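
The three target texts can also be handled as named values on the receiving side. The following Python sketch is only an illustration and not part of TextTransformer or the Spamihilator; the names Verdict, target_text and parse_target_text are hypothetical.

```python
from enum import IntEnum


class Verdict(IntEnum):
    """The three results from the table above (names are hypothetical)."""
    SPAM = -1
    INDIFFERENT = 0
    NON_SPAM = 1


def target_text(verdict: Verdict) -> str:
    """Return the target text for a verdict: "-1", "0" or "1"."""
    return str(int(verdict))


def parse_target_text(text: str) -> Verdict:
    """Convert a target text back into a verdict.

    Surrounding whitespace is tolerated; any other value raises ValueError.
    """
    return Verdict(int(text.strip()))
```

For example, target_text(Verdict.NON_SPAM) yields "1", and parse_target_text("-1") yields Verdict.SPAM.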

 

 

The returned text is produced in so-called semantic actions, which are enclosed in double braces. The definition of the salutation (Anrede) above is therefore extended as follows:

 

Anrede ::=
("Lieber" | "Hallo")
(
    "Heinz" {{iResult = 1; }}
  | WORT    {{iResult = -1; }}
)

 

"iResult" is a variable that must be declared beforehand. The examples show how this is done.
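
To make the effect of the two semantic actions concrete, the rule can be imitated outside TextTransformer. The following Python sketch is an approximation under stated assumptions: WORT is taken to be a token for a single word and is approximated by the regular expression \w+, iResult is assumed to start at 0 (indifferent), and a text without a matching salutation simply keeps that start value. How the real project treats mails that cannot be parsed is not shown here. The function name classify_salutation is hypothetical.

```python
import re

# ("Lieber" | "Hallo") followed by one word.
# Assumption: the WORT token is approximated by \w+.
SALUTATION = re.compile(r"\s*(Lieber|Hallo)\s+(\w+)")


def classify_salutation(text: str) -> int:
    """Imitate the Anrede rule: 1 = non-spam, -1 = spam, 0 = indifferent."""
    i_result = 0  # assumption: declared beforehand with the value 0
    match = SALUTATION.match(text)
    if match is None:
        # No salutation recognized; the start value is kept.
        return i_result
    if match.group(2) == "Heinz":
        i_result = 1   # corresponds to {{iResult = 1; }}
    else:
        i_result = -1  # corresponds to {{iResult = -1; }}
    return i_result
```

With this sketch, classify_salutation("Hallo Heinz, ...") yields 1, classify_salutation("Lieber Kunde") yields -1, and a text that begins with neither "Lieber" nor "Hallo" yields 0.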