Hello,

I was looking for a way to store my emails encrypted on my Uberspace account. While this does not protect your data on its way to your server, it means that emails already stored on the server cannot be read if they are stolen later on.

I found a Perl solution by Mike Cardwell (gpgit), but I could not get it to work on my Uberspace account and I do not like Perl very much, so I decided to write a Python version. The initial version used Python 2; this version is Python 3 only (or at least I have not tested whether it works with Python 2). I also removed all external dependencies: in the past I used gnupg-python to access GnuPG, but this caused a lot of trouble, so it was removed.

Enjoy!

gpgit3.py

#!/usr/bin/env python3
# -*- coding: ascii -*-
##############################################################################
# Copyright 2016, Frederik Lauber - https://flambda.de/ #
# #
# based on gpgit.pl by
# Copyright 2011, Mike Cardwell - https://grepular.com/ #
# #
# This program is free software; you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation; either version 3 of the License, or #
# any later version. #
# #
# This program is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details. #
# #
# You should have received a copy of the GNU General Public License #
# along with this program; if not, write to the Free Software #
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA #
# #
##############################################################################
"""
Word definitions:
    keyid = last 8 symbols of the key fingerprint
    identifier:
        fingerprint
        0xkeyid
        email_address
            lookup to uids in public keys
    argument:
        regex%identifier
            encrypt if regex matches "To"
        identifier
            encrypt all
        if multiple match, the mail will be encrypted with multiple keys
"""

"""
Some thoughts:
    How to handle emails which cannot be encrypted?
        At the moment, an exception is raised and it is the caller's duty
        to either deliver or retry the email
    Why are you not using the gnupg python package?
        because there are multiple packages all with slightly different option names and
        encoding handling.
        After a lot of pain, I decided to just write the two functions I need and
        drop the external dependency
        This also means that I do not support all gpg options anymore but I just
        do not see the point as I am not using most of them anyway
    You dropped a lot of the options Mike Cardwell's package has, why?
        See above, I myself was not using them and I am also not sure if they
        are really needed. They also add a lot more complexity and I wanted to have this work first
        and if someone needs more options, implement them then.
    You dropped the config file and the logger support?
        Yes, see above, it added a lot of complexity for very little benefit.
        If someone wants to have it again, I can reimplement it but for now I am just happy that
        I have gotten around the encoding issues and hence have a working python3 version.
"""



GPG_PATH = u"gpg"
#home dir of the user running the script will be used
import sys
import email.utils
import email.mime.multipart
import email.encoders
import email.mime.application
import re
import argparse
import copy
import subprocess


class EncryptionFailed(Exception):
    pass


class NoValidKeyIdentifier(Exception):
    pass


class PrivateKeyNotFound(Exception):
    pass


class UncompilableRegex(Exception):
    pass

userid_cache = None
def get_possible_userids():
    """Return all fingerprints, 0xkeyids and email addresses gpg knows.
    The result is cached, so gpg is only called once per run."""
    global userid_cache
    if userid_cache is None:
        #raw bytes literals, so that \s and \. reach the re module unchanged
        fingerprint_regex = re.compile(rb"""((?:[a-fA-F0-9]{4}\s+){10})""")
        email_regex = re.compile(rb"""([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+)""")
        tmp = set()

        with subprocess.Popen([GPG_PATH, u"--fingerprint"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE
                        ) as proc:
            #communicate() drains stdout and stderr, so gpg cannot block on a full pipe
            output, _errs = proc.communicate()
        for match in re.findall(fingerprint_regex, output):
            tmp.add(match.strip().replace(b" ",b"").decode("ascii"))

        user_ids = list(tmp)
        user_ids.extend([u"0x" + x[-8:] for x in tmp])
        for match in re.findall(email_regex, output):
            user_ids.append(match.strip().decode("ascii"))
        userid_cache = user_ids
        return user_ids
    else:
        return userid_cache


def encrypt(data, recipients):
    """Encrypt data (bytes) for all recipients and return the ascii armored result."""
    command = [GPG_PATH, u"--encrypt", u"-a", u"--trust-model", u"always"]
    for rec in recipients:
        command.append(u"--recipient")
        command.append(rec)
    proc = subprocess.Popen(command,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        stdin=subprocess.PIPE)
    #communicate() writes the data, closes all pipes and waits for gpg to exit
    encrypted, errs = proc.communicate(data)
    if proc.returncode != 0:
        #pass the gpg error output along to make failures easier to debug
        raise EncryptionFailed(errs.decode("utf-8", "replace"))
    return encrypted


def encrypt_email(email_obj, fingerprint_list):
    new_mail_obj = email.mime.multipart.MIMEMultipart(_subtype="encrypted", protocol="application/pgp-encrypted")
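    #items() returns a copy, so deleting headers while looping is safe
    for name, value in email_obj.items():
        if name.lower() == "mime-version":
            #the new multipart message already sets its own MIME-Version
            continue
        #header names are case insensitive
        if not name.lower().startswith("content"):
            new_mail_obj.add_header(name, value)
            del email_obj[name]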
    encryption_object = encrypt(email_obj.as_bytes(), fingerprint_list)

    mime_version_part = email.mime.application.MIMEApplication("Version: 1", _encoder=email.encoders.encode_noop, _subtype="pgp-encrypted")
    mime_version_part.set_payload("Version: 1")
    mime_version_part.add_header("Content-Description", "PGP/MIME version identification")
    encrypted_data = email.mime.application.MIMEApplication(encryption_object, "octet-stream", _encoder=email.encoders.encode_noop)
    encrypted_data.add_header("Content-Description", "OpenPGP encrypted message", name="encrypted.asc")
    encrypted_data.add_header("Content-Disposition", "inline", filename='encrypted.asc')
    new_mail_obj.attach(mime_version_part)
    new_mail_obj.attach(encrypted_data)
    return new_mail_obj


def _remove_corrupt_headers(email_obj):
    """Some headers are invalid after encryption.
    Right now, I only remove the ones gpgit.pl also removes
    If you have more information, please notify me"""
    del email_obj["DKIM-Signature"]
    del email_obj["DomainKey-Signature"]



def is_encrypted_mime(email_obj):
    #mime encrypted if:
        #email is multipart
        #email content subtype is "encrypted"
    return email_obj.is_multipart() and email_obj.get_content_subtype() == "encrypted"


def is_encrypted_inline(email_obj):
    #look for the typical PGP message lines in the payload
    #this might not be accurate, I guess, if the mail is only partially encrypted
    #also, check how this reacts to inline only signed messages
    #while this is not ideal, inline encryption should die in a fire anyway
    for pay in email_obj.walk():
        if pay.get_content_type() == "text/plain" and "-----BEGIN PGP MESSAGE-----" in pay.get_payload() and "-----END PGP MESSAGE-----" in pay.get_payload():
            return True
    return False


def is_encrypted(email_obj):
    return is_encrypted_mime(email_obj) or is_encrypted_inline(email_obj)


def _matching_finderprints(email_obj, regex_fingerprint_list):
    #function will match the given regex information
    #to find all keys the mail is supposed to be
    #encrypted with
    fingerprints = set()
    for rec in regex_fingerprint_list:
        if rec.match(email_obj):
            fingerprints.add(rec.identifier.encode("ascii"))
    if len(fingerprints) == 0:
        raise NoValidKeyIdentifier
    return list(fingerprints)

#helper function to generate the argparse and the helper classes needed
def _return_arg_parser():
    class LineFormatter(argparse.HelpFormatter):
        def _split_lines(self, text, width):
            tmp = []
            for manual_line in text.splitlines():
                tmp.extend(argparse.HelpFormatter._split_lines(self, manual_line, width))
            return tmp

    class Recipient(object):
        def __init__(self, identifier, regex=re.compile(".*")):
            self.regex = regex
            self.identifier = identifier

        def __hash__(self):
            #__hash__ has to return an int, not the string itself
            return hash(self.identifier)

        def __eq__(self, other):
            return self.identifier == other.identifier

        def __repr__(self):
            return self.identifier

        def __str__(self):
            return self.identifier

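        def match(self, email_obj):
            #email_obj["to"] is a single string (or None), so looping over it
            #would test single characters. Collect all To headers instead
            #and match the regex against each address.
            to_headers = [str(h) for h in email_obj.get_all("to", [])]
            addresses = [addr for _name, addr in email.utils.getaddresses(to_headers)]
            #without any To address, only a catch all regex like ".*" can match
            for to in addresses or [""]:
                if self.regex.match(to):
                    return True
            return False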

        @classmethod
        def _from_identifier_regex(cls, identifier, regex=".*"):
            try:
                return cls(identifier, cls.compile_regex(regex))
            #except NoValidKeyIdentifier:
            #   msg = "%s is not a valid key meaning it is neither a key id nor an email address" % identifier
            #   raise argparse.ArgumentTypeError(msg)
            #except PrivateKeyNotFound:
            #   msg = "No public key found for %s" % identifier
            #   raise argparse.ArgumentTypeError(msg)
            except UncompilableRegex:
                msg = "%s does not compile to a Regex Expression" % regex
                raise argparse.ArgumentTypeError(msg)

        @classmethod
        def from_argparse_string(cls, string):
            """takes a recipient string from argparse and
            makes sure a valid recipient object with a
            valid fingerprint and compiled regex is returned.
            If the recipient string was invalid, an exception
            explaining the problem will be raised.
            This function can only be called during argparse due to
            the use of argparse exceptions."""
            try:
                return cls._from_identifier_regex(*reversed(string.split("%")))
            except TypeError:
                msg = "%s contains more than one %%" % string
                raise argparse.ArgumentTypeError(msg)

        @staticmethod
        def compile_regex(pr): #pr = possible_regex
            try:
                return re.compile(pr)
            except re.error:
                raise UncompilableRegex

    parser = argparse.ArgumentParser(description='Encrypt EMails with gpg', prog="gpgit, python edition", formatter_class=LineFormatter)
    parser.add_argument('-i', '--input-file', nargs='?', type=argparse.FileType('rb'), default=sys.stdin.buffer, help="input file, default stdin")
    parser.add_argument('-o', '--output_file', nargs='?', type=argparse.FileType('wb'), default=sys.stdout.buffer, help="output file, default stdout")
    #parser.add_argument('input', nargs='?', type=argparse.FileContext('r'), default=sys.stdin)
    #parser.add_argument('output', nargs='?', type=argparse.FileContext('w'), default=sys.stdout)
    #we should switch to FileContext here once it becomes available in 3.4 or use binary file opens
    #see http://bugs.python.org/issue14156
    #see http://bugs.python.org/issue13824
    parser.add_argument('regex_identifier_list', metavar="recipients", type=Recipient.from_argparse_string, nargs='+',
    choices = tuple(map(Recipient, get_possible_userids())),
    help="""A list of regex%%identifier or identifier terms.\n
    (keyid are the last 8 symbols of a keys fingerprint)\n
    identifier: {fingerprint, 0xkeyid, email_address}\n
    regex%%identifier -> encrypt if regex matches "To"\n
    identifier -> encrypt all\n
    If multiple match, the mail will be encrypted with multiple keys.
    At least one valid key for an email must be supplied.
    All keys are trusted but expired keys are ignored (they are still listed here as valid but encryption will fail as gpg does not accept them).
    With your current gnupg config the available identifiers are:\n """ + u"\n".join(get_possible_userids()))
    return parser

def main():
    parser = _return_arg_parser()
    args = parser.parse_args()
    file_in = args.input_file
    file_out = args.output_file
    regex_fingerprint_list = args.regex_identifier_list
    email_obj = email.message_from_binary_file(file_in)
    if not is_encrypted(email_obj):
        identifiers = _matching_finderprints(email_obj, regex_fingerprint_list)
        email_obj = encrypt_email(email_obj, identifiers)
        _remove_corrupt_headers(email_obj)
    #any exception above propagates, so nothing is written for a failed mail
    file_out.write(email_obj.as_bytes())

if __name__ == '__main__':
    main()

Usage

On Uberspace, you can integrate the script very easily via maildrop.

Copy the script to ~/bin and make it executable. Make sure that your gpg installation has access to your public key. Add this to your maildrop file:

xfilter "python3 ~/bin/gpgit3.py  ^foo%foo@bar.de 0x01ABCDEF"

This would encrypt all emails addressed to an email address starting with "foo" with the public key matching foo@bar.de as well as the key with the key ID 0x01ABCDEF. All other emails are encrypted with 0x01ABCDEF only. It is highly advisable to have such a catch-all, as every email which does not match any public key will raise an exception.
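
To make the matching rules more concrete, here is a minimal standalone sketch of how the arguments select keys. It is not the code from gpgit3.py: the function name select_identifiers is made up for this illustration, and it assumes the "To" addresses have already been extracted from the mail.

```python
import re


def select_identifiers(to_addresses, arguments):
    """Return the key identifiers whose regex matches one of the addresses.

    Hypothetical helper for illustration only, not part of gpgit3.py.
    """
    selected = []
    for argument in arguments:
        if "%" in argument:
            # "regex%identifier": only used if the regex matches a To address
            pattern, identifier = argument.split("%", 1)
        else:
            # a bare identifier acts as a catch-all
            pattern, identifier = ".*", argument
        regex = re.compile(pattern)
        if any(regex.match(address) for address in to_addresses):
            if identifier not in selected:
                selected.append(identifier)
    return selected


arguments = ["^foo%foo@bar.de", "0x01ABCDEF"]
# address starts with "foo": both keys are selected
print(select_identifiers(["foo.news@example.org"], arguments))
# ['foo@bar.de', '0x01ABCDEF']
# any other address: only the catch-all key is selected
print(select_identifiers(["someone@example.org"], arguments))
# ['0x01ABCDEF']
```

Without the catch-all argument, the second call would return an empty list, which is the case where the script raises an exception.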

This is the desired behaviour on Uberspace, as the email is then requeued, but your setup might delete such emails instead. So check how your system reacts to this before using it in production.
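
One way to test this before going live is to run the script by hand on a saved mail and check the result. The sketch below only checks whether a message has the outer PGP/MIME structure (multipart/encrypted with a version part and a data part). The function name looks_pgp_mime_encrypted and the placeholder sample are made up for this illustration, and the check does not verify that the content can actually be decrypted.

```python
import email
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def looks_pgp_mime_encrypted(raw_message):
    """Return True if raw_message (bytes) has the outer PGP/MIME structure.

    Hypothetical helper for illustration only, not part of gpgit3.py.
    """
    msg = email.message_from_bytes(raw_message)
    if msg.get_content_type() != "multipart/encrypted":
        return False
    parts = msg.get_payload()
    if not isinstance(parts, list) or len(parts) != 2:
        return False
    return (parts[0].get_content_type() == "application/pgp-encrypted"
            and parts[1].get_content_type() == "application/octet-stream")


# placeholder message with the same outer layout, no real ciphertext inside
sample = MIMEMultipart(_subtype="encrypted", protocol="application/pgp-encrypted")
sample.attach(MIMEApplication(b"Version: 1\n", _subtype="pgp-encrypted"))
sample.attach(MIMEApplication(b"ciphertext placeholder"))

print(looks_pgp_mime_encrypted(sample.as_bytes()))
print(looks_pgp_mime_encrypted(MIMEText("hello").as_bytes()))
```

In practice you would write the output of the script to a file using its -o option, read that file in binary mode and pass the bytes to the function.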

Also, make sure the public keys are not expired. GPG lists all known public keys, including expired ones, but rejects encryption with them. An expired key is therefore not usable, even though the script lists it as a valid identifier.
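
If you want to check for expired keys up front, something along these lines could work. It is a sketch that relies on the machine readable output of "gpg --list-keys --with-colons", where the second field of a "pub" record holds the validity ("e" for expired) and the fifth field the long key ID. The function names and the sample listing are made up for this illustration.

```python
import subprocess


def expired_key_ids(colon_listing):
    """Return the long key IDs of all expired public keys in the listing."""
    expired = []
    for line in colon_listing.splitlines():
        fields = line.split(":")
        # field 1 = record type, field 2 = validity ("e" means expired),
        # field 5 = long key ID
        if len(fields) > 4 and fields[0] == "pub" and fields[1] == "e":
            expired.append(fields[4])
    return expired


def list_expired_keys(gpg_path="gpg"):
    """Ask gpg for its public keys and return the IDs of the expired ones."""
    result = subprocess.run([gpg_path, "--list-keys", "--with-colons"],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            check=True)
    return expired_key_ids(result.stdout.decode("utf-8", "replace"))


# made-up listing: the first key is expired, the second one is not
sample = (
    "pub:e:2048:1:AAAAAAAA01ABCDEF:1400000000:1450000000::-:::sc::::::\n"
    "pub:u:4096:1:BBBBBBBB12345678:1500000000:::u:::scESC::::::\n"
)
print(expired_key_ids(sample))
```

Calling list_expired_keys() on the server runs gpg with the keyring of the current user, which is the same keyring the script uses.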

