Procmail sandbox

---
Last modified 2003 JUL 02 10:34:53 GMT

This text is primarily the scrawlings of a draft. It is incomplete, but it may be ages before I get the time to revise it further, so I'm publishing it as-is. I've mentioned sandboxes in many posts to the procmail-users list; the two which spring to mind from the archives are a walkthrough and a more concise cut-and-paste version. This document is intended to supersede those texts.

A sandbox is a testing environment in which your tests are confined to a controlled space - things stay "in the sandbox".

By developing your procmail filters within a sandbox, you avoid several of the common pitfalls which users encounter when learning procmail or attempting to perform complex operations. Chief among these pitfalls is sending autoreply messages to people while you're testing a filter. This is generally a bad thing to do - especially if, while testing, you generate scores of these autoreplies or cause a loop condition.

A sandbox won't prevent you from taking a bad filter and using it in your live procmail config, but it can present you with an opportunity to test your filters in a consistent and "mail-safe" manner before attempting to use them in a live config.

The suggested way to invoke your MTA within procmail is via the $SENDMAIL variable (this holds true even if your MTA is not Sendmail - Postfix, for example, has a sendmail "stub"). When you forward a message using the bang (!) action, that uses $SENDMAIL too. So long as you stick to this, your recipes are well-suited to being developed largely unchanged within a sandbox: instead of accepting the default definition of SENDMAIL, we define it ourselves to point at a small script (which should be in the PATH you define in your sandbox rc):

SENDMAIL=sendmailtest.sh

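#!/bin/sh
# sendmailtest.sh
# record the arguments the MTA would have been invoked with...
echo "Parms: $*" >> mail.sendmailtest
# ...followed by the message itself (read from stdin)
cat - >> mail.sendmailtest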

This way, when you have a script which would send an autoreply, the message simply appears in mail.sendmailtest, preceded by a Parms: line. (No, the mail.sendmailtest file IS NOT appropriate for feeding into formail to split it and perform other operations -- it is intended solely for examining what _would_have_ been sent, and the arguments which sendmail would have been invoked with.) You could optionally redirect _testing_ output from sendmail, but the point is that the message isn't actually SENT.
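
To see what the stub records, it can be exercised by hand, without procmail in the picture at all. The following is only a sketch: it assumes a throwaway directory created with mktemp, and the recipient address and message text are invented for illustration.

```shell
#!/bin/sh
# Exercise the sendmail stub by hand (no procmail involved).
# Assumptions: a throwaway directory from mktemp; the address and the
# message text are invented purely for illustration.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Recreate the stub described above.
cat > sendmailtest.sh <<'EOF'
#!/bin/sh
echo "Parms: $*" >> mail.sendmailtest
cat - >> mail.sendmailtest
EOF
chmod +x sendmailtest.sh

# Feed it a message the way "| $SENDMAIL -t" would.
printf 'To: someone@example.com\nSubject: autoreply test\n\nHello.\n' \
    | ./sendmailtest.sh -t

# Nothing was sent; the would-be message sits in the capture file.
cat mail.sendmailtest
```

The capture file starts with the Parms: line holding the arguments, followed by the message exactly as it was piped in.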

Here is the sandbox rc itself (sandbox.rc):

# Sean's little sandbox.
#
# 20020619/0915 SBS		Updated with additional notes, added standard header
#						extractions and defined ORGMAIL.
# 20030424/2019 SBS		more header extractions
# 20040511/1212 SBS		Added RELAYHOST extraction
# 20041212/1124 SBS		Revised RELAYHOST to include original HOST
#
# Note: variables defined here (such as NL, AUTOREPLY, and the header
# extractions) should be mimicked in your regular .procmailrc OR ELSE REMOVED
# ENTIRELY from this sandbox -- these features have proven themselves nice to
# have, but if they're not in your live .procmailrc, then having them here is
# only going to set you up for problems if your tested scripts use them, and
# then you deploy those scripts into your live filters without the benefit
# of the supporting variables.
# 
# You might find it handy to take some variable definitions and place them
# in a separate file, which you can include here as well as into your
# live .procmailrc, say, like so:
#
# INCLUDERC=$HOME/.procmail/variables.rc
#
# Doing so would make your test environ better mimic your live one, although
# you need to ensure that the included file does not define variables, or
# run filters, which would conflict with testing.
#

# IMPORTANT: this dir is different from where your regular procmailrc
# would deliver messages.  Your recipes should always avoid delivering to
# hard-coded paths - define directories using variables so that the paths
# can be changed external to the script.
MAILDIR=$HOME/.procmail/sandbox

# Disable icky delivery notifications (if you like 'em, go ahead and enable
# them here).
COMSAT=no

# Newline for use in logging (this negates the need to add hard newlines
# into logging statements - you can use $NL instead).
NL="
"

# zero or more whitespace characters (space or tab)
wsstar='[ 	]*'

# Define a flag to indicate that we're running in a sandbox.
# this can be useful when you're inserting a test filter into an otherwise
# LIVE rcfile.
SANDBOX=1

# Specify a logfile, separate from your live procmail logfile.
# (note that you may want to purge the log file between test runs)
LOGFILE=sandbox.log

# define a temp directory (for explicit lockfiles and the like)
TEMP=/tmp

# We're testing, so it's a good bet we probably want verbosity in the logging
VERBOSE=ON

# This will log additional delivery abstract information - it isn't of much
# significance if VERBOSE=ON above, but if you choose to switch VERBOSE=OFF,
# this will provide more info than the barebones logging.
LOGABSTRACT=ALL

# Specify the default mail delivery mailbox file
# For *MY* testing purposes, anything NOT specifically filtered goes to the 
# ether (remember, we're piping into this ruleset from a saved file) - the 
# messages I'm interested in when running a test script tend to be the ones
# expressly matched by the recipe.  If I have a need to retain copies of
# unmatched messages, I can achieve that through the included test_filter.rc
# For your purposes, you might want to set this to ./default.mbox or 
# something - in any event, IT MUST NOT BE YOUR DEFAULT MAILBOX, or you'll 
# deliver mail to yourself when running tests.  Your included test_filter.rc 
# script can of course override this, meaning that WHEN you have a need to 
# save unfiltered mail, you can do so without altering THIS template.
DEFAULT=/dev/null

# When manually invoked, procmail will not have an $ORGMAIL defined.  Since
# some scripts may rely on this, you may wish to define it as /dev/null, or
# to some test repository file.  Recall from the documentation that in normal
# operation, the initial value of $DEFAULT is defined to be the same as 
# $ORGMAIL.
ORGMAIL=/dev/null

# redefine SENDMAIL for the purpose of testing
# (will be executed relative to $MAILDIR)
SENDMAIL=./sendmail.sh

# define other variables you'd normally define

# Path to text files used for bounces and autoreplies, etc.
AUTOREPLY=$HOME/.procmail/autoreply

# Common header extractions I use (not necessary in YOUR sandbox, but these
# have proven to be useful to have on hand rather than extracting them in
# individual recipes).  Note above comments about setting the same variables
# here as you do in your live .procmailrc
#
# NOTE: I MAKE EXTENSIVE USE OF SOME OF THESE VARIABLES WITHIN MY OWN
# RECIPES.
#
:0
{
	:0
	* ^Subject:[ 	]*\/[^ 	].*
	{
		SUBJECT=$MATCH
	}

	:0
	* ^To:[ 	]*\/[^ 	].*
	{
		TO=$MATCH
	}

	:0
	* ^From:[ 	]*\/[^ 	].*
	{
		FROM=$MATCH
	}

	# This is an optional header - your MTA configuration may not insert
	# it (bummer for you).  It is very useful to have.
	:0
	* ^X-Envelope-To: *<\/[^>]*
	{
		ENVTO=$MATCH
	}

	# This is also an optional header.  If you don't have this, you can
	# get the same information through the commented out rule which
	# follows.
	:0
	* ^X-Envelope-From: *\/[^ 	].*
	{
		ENVFROM=$MATCH
	}

	# alternative to X-Envelope-From:
#	:0
#	* ^From \/[^ ]*
#	{
#		ENVFROM=$MATCH
#	}

	# Here we have to call shell.... -rt will parse return address
	# according to RFC rules.  Note we only process HEADER.
	:0 h
	SENDER=|formail -b -rtzxTo:

	# get the From: address as an address component ONLY (no comments)
	:0 h
	CLEANFROM=|formail -IReply-To: -rtzxTo:

	# username portion
	:0
	* CLEANFROM ?? ^\/[^@]+
	{
		FROM_USER=$MATCH
	}

	# domain portion
	:0
	* CLEANFROM ?? @\/.*
	{
		FROM_DOMAIN=$MATCH
	}


	# Obtain the hostname of the host which relayed the message to us.
	# This is found in the topmost received header.

	RELAYHOSTX=`formail -u Received: -czx Received:`

	# The hostname provided in the SMTP EHLO exchange will be the first
	# token on this line.
	:0
	* RELAYHOSTX ?? ^from \/[^ 	]*
	{
		RELAYHOSTEHLO=$MATCH
	}

	# Then isolate the hostname portion (if any) in the parenthetical.
	:0
	* RELAYHOSTX ?? ^from [^ 	]* \(\/[^)]*\)\>+by\>
	* MATCH ?? ^\/[^)]+
	{
		RELAYHOSTX=$MATCH

		:0
		* RELAYHOSTX ?? ^\/[^[ ]+
		{
			# grab whatever up to the first space or the open
			# brackets for the IP
			RELAYHOST=$MATCH
		}

		:0
		* RELAYHOSTX ?? ()\[\/[^] ]+
		{
			# grab the apparent host IP from the brackets
			RELAYHOSTIP=$MATCH
		}
	}

	# null out RELAYHOSTX (temp variable used in the extraction process)
	RELAYHOSTX=

	# if the relay host has no rDNS, RELAYHOST should be undefined.
}


# Finally, include your test filter HERE - it's in a separate file, where 
# it should stay (once it tests good, it's an easy matter to move it into 
# your live procmail config).
INCLUDERC=test_filter.rc

sendmail.sh contains:

#!/bin/sh
# script: sendmail.sh
# author: Sean B. Straw
#
# This script is intended to be used for sandbox testing of procmail scripts
# so that we don't annoy the hell out of the universe because of some
# oversight in a script implementation.  It permits you to write the body of
# your procmail script just as you would use it in a live context, but by
# redefining $SENDMAIL in your sandbox wrapper, your included script invokes
# this instead of the real MTA.
#
# To use from procmail, simply redefine $SENDMAIL:
#
# SENDMAIL=/path/to/sendmail.sh
#
# Note: if you want to define the output filename dynamically from within
# your procmail config, you could define SENDMAIL above with the filename
# as the first argument and then change this script accordingly.
#
# This script uses 'lockfile', which is a supplemental procmail utility.
#

# set mailbox name, or perhaps it is a passed argument...
MBOXNAME=test.sent.mail

# create a lockfile: 2 second delay between attempts, 6 retries.
if lockfile -2 -r 6 "$MBOXNAME.lock"; then
 ( echo "X-MTA-Parameters: $*" ; echo "X-MTA-Send-Date: `date`" ; cat - ) >> "$MBOXNAME"
 rm -f "$MBOXNAME.lock"
else
 # emit an error message to STDERR
 echo "FAILURE OF $0" >&2
 # return a non-zero exit status to our caller so they know we failed
 exit 1
fi
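
The note in the script header about passing the output filename as the first argument could be sketched roughly as below. This is an untested illustration, not the script I actually run: the function name fake_sendmail and the filename autoreply.sent are made up, and whether a SENDMAIL value containing an argument is handled the way you expect for every action type is something to verify in your own sandbox.

```shell
#!/bin/sh
# Sketch: sendmail.sh variant whose first argument names the capture file.
# Hypothetical names: fake_sendmail, autoreply.sent.  In a sandbox rc it
# might be wired up along the lines of (assumption, verify for yourself):
#   SENDMAIL="./sendmail.sh autoreply.sent"
fake_sendmail() {
    mboxname=$1
    shift    # what remains is what the real MTA would have been given

    # lockfile ships with procmail: 2 second sleeps, 6 retries.
    if lockfile -2 -r 6 "$mboxname.lock"; then
        ( echo "X-MTA-Parameters: $*"
          echo "X-MTA-Send-Date: $(date)"
          cat - ) >> "$mboxname"
        rm -f "$mboxname.lock"
    else
        echo "FAILURE OF fake_sendmail" >&2
        return 1
    fi
}

# Only act when called with arguments (the capture file must come first).
if [ $# -gt 0 ]; then
    fake_sendmail "$@"
fi
```

Keeping the filename as an argument means one script can serve several test filters, each writing to its own capture file.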

And here's an example test_filter.rc:
#	File: 		noreply.rc
#	Description:	ProcMail script for NoReply email address
#
#	19961002/2052 SBS	Initial coding
#
#	First, we file a copy, then we reply to it and stop.
#

:0
* ^TONoReply@(somehost\.|)somedomain\.tld
{
	# I archive off a compressed copy for safekeeping (and future tests)
	:0c:
	|gzip -9fc>>$MAILDIR/noreply.gz

	# If it is looped or from the mailer daemon, do nothing more.

	:0 w
	* !^FROM_DAEMON
	* !^X-Loop: NoReply@somehost.somedomain.tld
	| ( formail -rt -A "X-Loop: NoReply@somehost.somedomain.tld" \
		-I "From: NoReply@somehost.somedomain.tld (No Reply robot)" ;\
	cat $AUTOREPLY/noreply.msg ) | $SENDMAIL -t

	# in case it was from daemon, looped, or failed the above delivery,
	# ditch it.
	:0
	/dev/null
}

I often run standalone filter tests (such as experimenting with a new spam filter - or obtaining VERBOSE logging of a message which failed to be caught by my spam filters, to determine why it wasn't matched) by redirecting an existing mailbox into the testing filter, like so:

formail -s procmail -m testing.rc < junkmail.mbx

or (since *I* archive volumes of email using compression):

gzip -dc < junkmail.gz | formail -s procmail -m testing.rc

You can easily dummy up a message - take a copy of a saved message and edit it to resemble the sort of message you wish to filter (the subject line, the To: or From: or whatever), then redirect or pipe it into the formail invocation above.
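
If you have no suitable saved message to start from, a test message can also be written from scratch. A minimal sketch follows; every address, date and subject in it is invented, and the temporary filename comes from mktemp rather than anything in my own setup.

```shell
#!/bin/sh
# Sketch: build a one-message mbox to throw at a test filter.
# Every address, date and subject here is made up for illustration.
msgfile=$(mktemp) || exit 1

cat > "$msgfile" <<'EOF'
From someone@example.com Wed Jul  2 10:34:53 2003
From: Some One <someone@example.com>
To: NoReply@somehost.somedomain.tld
Subject: sandbox test message
Date: Wed, 2 Jul 2003 10:34:53 +0000

This body only exists to give the filter something to chew on.
EOF

echo "wrote $msgfile"
# Then feed it to the sandbox exactly like a saved mailbox:
#   formail -s procmail -m testing.rc < "$msgfile"
```

Edit the headers in the here-document to match whatever the recipe under test is supposed to catch (or to miss).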

Using this sort of framework will permit you to quickly and easily develop reliable filters for your procmail configuration. It is very easy to slap some filters together to test a theory this way - wondering if a certain filter would catch a message? Write it, and throw the message at it. You'll answer a LOT of your own questions if you simply experiment. You'll also discover a variety of filtering tricks once you realize that you can experiment without subjecting your live mailspool to your experimental filters. When you want, you can edit the test message to be precisely what you want it to be, and feed that into the test script. Between runs, you'll probably want to delete the testing.log file.

You might even make a shell script to run the procmail process, then show the log, and delete the log:

#!/bin/sh
# (this script should be chmod +x)
# delete the log from previous run (-f: no complaint if there isn't one yet)
rm -f testing.log
# run the test filter
formail -s procmail -m testing.rc < my_message_file
# view the log
less testing.log
# edit the test filter
vi test_filter.rc

When you run the script, the log from the previous run is deleted, the filters are processed, and the log is displayed so you can see how things worked out. Once you exit the pager (less), the editor is invoked on the test filter so you can make tweaks, and then you run the script again.
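
A VERBOSE log gets long quickly, and when the question is simply "why didn't this recipe match?", it can help to look at the condition results alone. The helper below is a sketch under a couple of assumptions: the name show_matches is made up, and the log is taken to be testing.log unless another file is named.

```shell
#!/bin/sh
# Sketch: boil a VERBOSE procmail log down to its condition results.
# Hypothetical helper name: show_matches.  Assumes the log is named
# testing.log unless another file is given as the first argument.
show_matches() {
    logfile=${1:-testing.log}
    # VERBOSE logging reports each tested condition as
    # 'procmail: Match on ...' or 'procmail: No match on ...'.
    grep -E '^procmail: (No match|Match) on ' "$logfile"
}
```

In the runner script, something like "show_matches testing.log | less" could stand in for the plain less invocation when you only care about which conditions fired.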

I use something vaguely similar to the sandbox for post-processing some mailboxes (say to retrieve a false positive from a spam mailbox) and re-inject them into my mailspool.

If you structure your .procmailrc right, and include the bulk of your recipes into it, you could easily take your recipes and run them against a test message with verbose logging without ACTUALLY redelivering or forwarding that message.


Professional Software Engineering
Post Box 751224
Petaluma, CA 94975-1224 USA

EMail to: PSE-L@mail.professional.org

Copyright © 1995-2024 Professional Software Engineering, All Rights Reserved