How SPF, or rather SRS, broke my mail filtering and cost me a lot of sleep recently

I am using e-mail addresses on my “personal domains” (like Hayek.name), which implies some sort of forwarding to an IMAP server.

The company that hosts my “personal domains” (UDAG) also forwards massive amounts of e-mail to some companies that dominate the German market (like GMX and Web.de). The latter recently introduced SRS (the obligatory variant!), which in turn caused a lot of e-mail to bounce back to UDAG. Apparently UDAG was a little surprised by that, and they in turn implemented SRS as well, without any announcement that their customers would have noticed.

How did that lightning hit me? (Almost) all of my incoming e-mail messages were no longer matched properly by my mail filtering, and because SpamAssassin classified them as more or less bad spam, all these messages went to spam folders.

What kind of e-mail was concerned?

  • recruiting offers
  • banking events
  • telephone calls (or rather their notifications) to my home phone numbers
  • various mailing lists

I can tell you, it was a big, big mess. A lot of messages in the wrong IMAP folders. That’s a true plague.

My preferred approach to fixing the trouble was to get SRS rewriting switched off (for me). UDAG flatly refused. (I guess I am just not important at all.) (Further down I will tell you how I sort of continued down that path anyway.)

Why did it hit me in the first place? Because I query “Return-Path” (which is the main focus of SRS) and not “From”. In 20 years of procmail rule writing I have learned that querying Return-Path is far more reliable than querying From.

I talked to a couple of support guys, and what did they suggest?

  • I should query From; apparently that’s their general approach to mail filtering. I will not do that, From is not reliable.
  • I should rewrite my filtering rules to comply with SRS. You don’t rewrite your procmail regular expressions in an SRS way. Not if you have more than a hundred lines of code, or 20,000 LOC as I do.

I was also told that procmail would be replaced by Sieve sooner or later in my IMAP server’s environment, so my traditional procmail filtering would stop working then anyway. WOW! More bad news please!

What to do?

  • rewrite the procmail rules by using a very simple DSL-like approach
  • create the original procmail rules from that DSL
  • and also create procmail rules with regular expressions that cover the SRS-mangled e-mail addresses, but only SRS Level 0. You might not be aware of this, but SRS address mangling can come in multiple levels.
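
To make the idea concrete, here is a minimal sketch of such a generator, not my actual DSL code. It assumes the common SRS0 layout `SRS0=hash=timestamp=original-domain=original-local@forwarder-domain`, a rule that files into a maildir folder, and hypothetical names (`procmail_recipe`, the folder `lists`, the address `news@example.org`) that are only there for illustration.

```python
# Characters that have a special meaning in procmail's egrep-like regexes.
REGEX_SPECIALS = set(".+*?()[]{}|^$\\")


def escape(text):
    """Backslash-escape regex metacharacters so the text matches literally."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in text)


def srs0_pattern(address):
    """Regex for the SRS0-mangled form of a plain address (assumed layout:
    SRS0=hash=timestamp=domain=local@forwarder)."""
    local, domain = address.rsplit("@", 1)
    return ("SRS0[=+-][^=]+=[^=]+="
            + escape(domain) + "=" + escape(local) + "@[^>]+>")


def procmail_recipe(folder, addresses, with_srs0=True):
    """Build one procmail recipe that matches Return-Path against a list of
    plain addresses and, optionally, their SRS0-mangled variants."""
    alternatives = []
    for address in addresses:
        alternatives.append(escape(address) + ">")
        if with_srs0:
            alternatives.append(srs0_pattern(address))
    condition = "* ^Return-Path:.*<(" + "|".join(alternatives) + ")"
    # A trailing slash on the destination makes procmail deliver to a maildir.
    return ":0\n" + condition + "\n" + folder + "/\n"


if __name__ == "__main__":
    print(procmail_recipe("lists", ["news@example.org"]))
```

Generating the rule without the SRS0 alternatives (`with_srs0=False`) gives back the original style of rule, so both variants come from the same list of plain addresses.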

I focused on the rules that only query Return-Path. I assume I am covering more than 80% of my code that way.

Another 5% of my code queries List-ID and List-Unsubscribe. They are really, really reliable and also very stable. I will create a DSL feature that expresses their use independently of procmail. Later! Not now! But I guess rather soon, because it looks so intriguing.

I started writing procmail rules 20 years ago, and back then they ran on my notebook for a couple of years. But at some point I started creating duplicates (with slight changes) that would run on “my IMAP server out there”. That was smart but also silly. Duplicates with changes are always a PITA. I should have taken that DSL approach from the beginning, and I actually did, but in a way that was just not simple enough and too painful to carry through. I got stuck with that 15 years ago. I always knew that procmail would die in “public environments” one day, and that I would either have to go with Sieve or build my own mail filtering around IMAP. I started on the latter, and I mentioned it in conversations with ESR in the context of fetchmail, and he made that public, and (shame on me!) I never delivered. And you can still find “the shame” on some websites.

Rewriting my procmail rules in that DSL way allows me to merge those duplicates again. I rather enjoyed that, but it was a lot of work, truly a lot of work. With procmail you have to write the addresses you want to match as regular expressions. Not a big thing, not difficult at all. If somebody has a couple of addresses, and you want to match them all through the same procmail rule, you can also do that using a smarter regular expression. Not a big thing either. With my DSL I added a feature that allows listing plain addresses as plain addresses. So for quite a few rules I rewrote the regular expressions as a simple list of plain addresses. These rules look far more readable now.

The current code generator of my DSL targets only procmail. But I kept in mind from the very beginning that targeting Sieve or something different should be quite easy to achieve.

Another solution path that I envisaged after talking to various support staff is to get the incoming message rewritten back to a state where Return-Path looks as it has always looked. I implemented a 20-line Perl script and a test set-up with a couple of sample messages, and that looked very promising. The communication with the respective support / sysadmin staff dragged on for quite a few days; finally they told me

  • they can’t do that for me
  • but I can actually do it myself.

Really? Yes, procmail has a feature that allows it to rewrite its input and then process the rewritten input. Fantastic! 😆 It looks like this voids the DSL work I had done before, but actually it does not. Once the incoming messages look again as they always have, I can remove the generated rules that cope with SRS Level 0 (un)mangling. My new DSL code looks far, far better than my previous code.
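
The un-mangling step itself is small. What follows is a minimal sketch in Python rather than my Perl script, again assuming the SRS0 layout `SRS0=hash=timestamp=original-domain=original-local@forwarder-domain`. The function names and sample addresses are hypothetical. The hash is not verified (only the forwarder knows the secret), and higher SRS levels such as SRS1 are deliberately left untouched.

```python
import re

# Assumed SRS0 layout: SRS0=hash=timestamp=domain=local@forwarder
SRS0 = re.compile(
    r"^SRS0[=+-](?P<hash>[^=]+)=(?P<ts>[^=]+)=(?P<domain>[^=]+)"
    r"=(?P<local>.+)@(?P<forwarder>[^@]+)$",
    re.IGNORECASE,
)

# DOTALL lets the last group carry a trailing newline along unchanged.
RETURN_PATH = re.compile(r"^(Return-Path:\s*<)([^>]*)(>.*)$",
                         re.IGNORECASE | re.DOTALL)


def unmangle_srs0(address):
    """Return the original address for an SRS0 address, else the input."""
    match = SRS0.match(address)
    if match is None:
        return address
    return match.group("local") + "@" + match.group("domain")


def rewrite_return_path(header_line):
    """Un-mangle the address in a Return-Path line; pass other lines through."""
    match = RETURN_PATH.match(header_line)
    if match is None:
        return header_line
    return match.group(1) + unmangle_srs0(match.group(2)) + match.group(3)


def rewrite_headers(header_text):
    """Apply rewrite_return_path to every line of a header block."""
    return "".join(rewrite_return_path(line)
                   for line in header_text.splitlines(keepends=True))
```

Wrapped in a small script that reads the header from standard input and writes it to standard output, something like this could be called from a procmail filter recipe (the `f` flag together with `h`), so that all later rules see the restored Return-Path.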

I spent like 30 extra hours during the last ten days on this effort, and I think the work accomplished was rather worth it.

