How to create a cronjob to use sa-learn to teach spamassassin - mbox



Last Modified: Mar 15, 2011, 12:03 am
This guide describes how to give yourself the ability to teach spamassassin what is and is not spam.  It covers the mbox format; a separate Maildir version of this guide is also available.

This guide assumes that you've already installed spamassassin and have it running in its default state.
The domain will be called "domain.com", the system account "username", and the email user "bob".
The folders will be called teach-isspam and teach-isnotspam, but you can use whatever names you want.

1) Create a new IMAP folder to hold your spam, and a second folder to hold false positives (so they don't get tagged again)

cd /home/username/imap/domain.com/bob/mail
touch teach-isspam teach-isnotspam
chown username:username teach*
chmod 600 teach*
cd ..
echo "teach-isspam" >> .mailboxlist
echo "teach-isnotspam" >> .mailboxlist

Now you have 2 new mailboxes under user bob when accessed via IMAP (squirrelmail or roundcube).  Test them by moving a few messages into each to ensure they function correctly.
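
If the commands in step 1 are run more than once, the two echo lines append duplicate entries to .mailboxlist.  A minimal sketch of a guard against that follows; add_mailbox is a hypothetical helper name, not something provided by DirectAdmin or spamassassin.

```shell
#!/bin/sh
# Sketch: subscribe a folder in .mailboxlist only if it is not listed yet.
# add_mailbox is a hypothetical helper used for illustration only.
add_mailbox() {
    list="$1"   # path to the .mailboxlist file
    name="$2"   # folder name, e.g. teach-isspam
    touch "$list"
    # -x matches the whole line, -F treats the name as a fixed string
    grep -qxF -- "$name" "$list" || echo "$name" >> "$list"
}

# Example (paths follow the guide's naming):
# add_mailbox /home/username/imap/domain.com/bob/.mailboxlist teach-isspam
# add_mailbox /home/username/imap/domain.com/bob/.mailboxlist teach-isnotspam
```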

2) Now that you can put your spam and non-spam messages into their correct places, you'll need to set up a cronjob to check these locations with sa-learn.
Create a shell script at /home/username/.spamassassin/teach.sh.
In it, put:

#!/bin/sh
# Mailboxes holding messages to learn as spam and as ham (non-spam)
FILESPAM=/home/username/imap/domain.com/bob/mail/teach-isspam
FILEHAM=/home/username/imap/domain.com/bob/mail/teach-isnotspam

# --no-sync defers the database sync until both mailboxes are processed
echo "learning spam via $FILESPAM..."
sa-learn --no-sync --spam --mbox "$FILESPAM"

echo ""
echo "learning ham via $FILEHAM..."
sa-learn --no-sync --ham --mbox "$FILEHAM"

echo ""
echo "syncing..."
sa-learn --sync

# Show the Bayes database counters
echo ""
echo "current status:"
sa-learn --dump magic

exit 0

Save the file, then chmod teach.sh to 700.

At this point, you should be able to run teach.sh manually to see if it works.  When you test it, run it as username so that all files it writes are owned by username, and not root.
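
One way to confirm that a test run left no files owned by the wrong account is to list anything under the .spamassassin directory that does not belong to the expected user.  This is only a sketch; not_owned_by is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print every path under a directory that is NOT owned by the given user.
# not_owned_by is a hypothetical helper used for illustration only.
not_owned_by() {
    dir="$1"    # directory to scan
    user="$2"   # expected owner (name or numeric uid)
    find "$dir" ! -user "$user" -print
}

# Example (no output means everything is owned by username):
# not_owned_by /home/username/.spamassassin username
```

If anything is listed, a chown back to username (run as root) should restore the expected ownership.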

3) Now automate teach.sh so you don't have to run it manually every time.
Log into DirectAdmin as username and go to the cronjobs section.  Enter the command

/home/username/.spamassassin/teach.sh

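and for the times, put 0 in the "minute" field, */12 in the "hour" field, and a * character in all remaining fields, so that it runs twice per day (every 12 hours).  Leaving "minute" as * would run the script every minute during those two hours.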
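
For reference, the same schedule written as a plain crontab line would look roughly like the sketch below.  It assumes the standard five-field cron format (minute, hour, day of month, month, day of week), with the minute set to 0 so the job fires once at 00:00 and once at 12:00 rather than every minute of those hours.  CRON_LINE is just an illustrative variable name.

```shell
#!/bin/sh
# Sketch: the crontab entry equivalent to the DirectAdmin cronjob settings.
# Fields: minute hour day-of-month month day-of-week command
CRON_LINE='0 */12 * * * /home/username/.spamassassin/teach.sh'

# Print it so it can be pasted into "crontab -e" when not using DirectAdmin
echo "$CRON_LINE"
```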

That's it.
To use it, if you get email that is spam but wasn't tagged as spam, move it into your teach-isspam folder.
If you get email that was tagged as spam but should not have been, move it to your teach-isnotspam folder.
You can delete the email you've placed there after a day or so, once it has been picked up by the sa-learn program.
Note that sa-learn can process the same email twice and it won't hurt anything.

If you want to empty these 2 folders after each run, add:

: > "$FILESPAM"
: > "$FILEHAM"

just before the final "exit 0" line.  This will reset those folders to 0 bytes, so that you don't have to delete the messages after they're processed.
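
Emptying the folders unconditionally discards messages even when sa-learn could not process them.  A more cautious variant is sketched below: it clears a mailbox only when the learn step succeeded.  It assumes sa-learn exits non-zero on failure; learn_and_clear and the SA_LEARN override are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch: learn from an mbox file and empty it only if sa-learn succeeded.
# SA_LEARN is a hypothetical override so the helper can be tried without spamassassin.
SA_LEARN="${SA_LEARN:-sa-learn}"

learn_and_clear() {
    kind="$1"   # "spam" or "ham"
    mbox="$2"   # path to the mbox file
    if "$SA_LEARN" --no-sync "--$kind" --mbox "$mbox"; then
        : > "$mbox"     # truncate to 0 bytes
    else
        echo "sa-learn failed for $mbox; leaving it untouched" >&2
        return 1
    fi
}

# Example, in place of the two sa-learn lines and the two truncation lines in teach.sh:
# learn_and_clear spam "$FILESPAM"
# learn_and_clear ham  "$FILEHAM"
```

Either way, a message that arrives in the folder between the learn step and the truncation would be removed without being learned, so this is best run at a quiet time.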
 
Related Helpfiles
How to enable SpamAssassin on your server.
How to limit the number of emails sent by each user (prevent spammer)
How to enable realtime blocklists (RBLs) with exim
How to create a cronjob to use sa-learn to teach spamassassin - Maildir
