Hi,
I want to write about a new feature my hoster Uberspace now offers. Until now, they provided a global SpamAssassin and the option to install any other solution locally in my user account.
Unfortunately, SpamAssassin was not ideal for me because I encrypt all emails with my public key on arrival. Training SpamAssassin was therefore nearly impossible (I would have had to decrypt the mails on my PC and somehow send them back to SpamAssassin).
Now there is also a global setup for DSPAM, BUT learning is still done per user AND DSPAM can learn from nothing more than the signature it has given the mail. What this means is this:
You do not have to give DSPAM the complete unencrypted mail for retraining, only the signature, as long as it has seen the unencrypted mail beforehand! And the signature is just a header and therefore not encrypted! Yeah! So all I have to do now is send an email to my DSPAM address with the emails for retraining attached.
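To make the idea concrete, here is a minimal sketch of how such a report mail could be built with Python's standard email package. The addresses, the example mail, and the helper name build_report are made up for illustration; in practice the "forward as attachment" function of a mail client should produce the same kind of message.

```python
import email
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report(raw_mails, sender, recipient):
    """Wrap each raw mail as a message/rfc822 attachment of one report mail."""
    report = MIMEMultipart()
    report["From"] = sender
    report["To"] = recipient
    report["Subject"] = "DSPAM retraining"
    report.attach(MIMEText("Mails for retraining are attached.\n"))
    for raw in raw_mails:
        # The attached mail keeps its own headers, including
        # X-DSPAM-Signature, even though its body is encrypted.
        report.attach(MIMEMessage(email.message_from_string(raw)))
    return report


# Made-up example mail: encrypted body, plain-text signature header.
EXAMPLE_MAIL = (
    "From: someone@example.org\n"
    "Subject: cheap pills\n"
    "X-DSPAM-Signature: 0123456789abcdef0123456\n"
    "\n"
    "-----BEGIN PGP MESSAGE-----\n"
    "(encrypted body)\n"
    "-----END PGP MESSAGE-----\n"
)

# Hypothetical addresses; the recipient stands for the special learning address.
report = build_report([EXAMPLE_MAIL], "me@example.org", "spam-learn@example.org")
report_text = report.as_string()
```

The resulting report_text can then be handed to whatever is used for sending mail; the sketch deliberately stops before that step.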
I only had to write a small Python script which processes all emails sent to a special email address. It takes all the attachments, looks for a DSPAM signature in each and retrains DSPAM to recognize it as spam or ham!
dspam-learn-from-attachments.py
#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys
import email.parser
import subprocess
import re
from string import Template

# Command used to retrain DSPAM from a signature alone (low CPU and I/O priority).
sub_temp = Template("nice -n 19 ionice -c3 /package/host/localhost/dspam/bin/dspam \
--source=error --class=$type --signature='$sig'")
# Only accept signatures that are exactly 23 hex digits.
sig_test = re.compile("^[0-9A-Fa-f]{23}$")

# check the Python version as we are using argparse instead of optparse
if sys.version_info[:2] < (2, 7):
    print("This program uses packages only available in Python 2.7 and above")
    sys.exit(1)

# this is built in since Python 2.7
import argparse


def main():
    parser = argparse.ArgumentParser(description='Pipe input mail to dspam-learn as ham or spam')
    parser.add_argument('val', choices=['innocent', 'spam'])
    args = parser.parse_args()
    email_obj = email.parser.Parser().parse(sys.stdin)

    # Walk the whole MIME tree, including mails attached as message/rfc822.
    tree = [email_obj]
    while tree:
        pay = tree.pop()
        if pay.is_multipart():
            tree.extend(pay.get_payload())
        sig = pay["X-DSPAM-Signature"]
        # The signature is validated as hex, so it is safe to pass to the shell.
        if sig is not None and sig_test.match(sig.strip()):
            subprocess.call(sub_temp.substitute(type=args.val, sig=sig.strip()), shell=True)


if __name__ == '__main__':
    main()
The script is called by qmail using this:
|~/bin/dspam-learn-from-attachments.py spam
or
|~/bin/dspam-learn-from-attachments.py innocent
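To check which signatures a report mail would yield without actually calling dspam, the tree walk can be tried on its own. This is only a sketch: the function name collect_signatures and the sample report are made up, and it uses the same 23-hex-digit pattern as the script.

```python
import email
import re

# Same pattern as in the script: exactly 23 hex digits.
SIG_TEST = re.compile(r"^[0-9A-Fa-f]{23}$")


def collect_signatures(msg):
    """Return all valid X-DSPAM-Signature values found in msg and its attachments."""
    found = []
    tree = [msg]
    while tree:
        part = tree.pop()
        if part.is_multipart():
            tree.extend(part.get_payload())
        sig = part["X-DSPAM-Signature"]
        if sig is not None and SIG_TEST.match(sig.strip()):
            found.append(sig.strip())
    return found


# Made-up report: one attached mail with a valid signature, one with a malformed one.
RAW_REPORT = """\
From: me@example.org
To: spam-learn@example.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: message/rfc822

Subject: first
X-DSPAM-Signature: 0123456789abcdef0123456

body
--XX
Content-Type: message/rfc822

Subject: second
X-DSPAM-Signature: not-a-signature

body
--XX--
"""

signatures = collect_signatures(email.message_from_string(RAW_REPORT))
print(signatures)
```

Only the first attachment's signature passes the check; the malformed one is silently skipped, which is also what the script does.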