If you’re running a mail server like Qmail or my new favorite, Postfix, chances are you’re using the Maildir mailbox type. I like Maildir because it stores each mail in a separate file, which makes cleaning and managing mailboxes very easy on the server. Maildir also works much better over network filesystems like NFS, meaning you can have several mail servers using the same mailbox store. If messages for the same person are received on multiple servers, there is no problem with access to the mailbox, because each message is stored in its own unique file.
Recently I was tasked to review a mailbox that had been forgotten about on a mail server and had accumulated a lot of spam. Since there was a chance that there was valid mail in the box, it was decided that someone should go through it. The problem was that there were 74,000+ messages in the mailbox, over half of which were surely spam and viruses.
With a simple
find . -type f -exec grep -l -i "spamword" '{}' ';' | wc -l
I was able to find and count the number of messages that contained spamword. Then, by changing the command to
find . -type f -exec grep -l -i "spamword" '{}' ';' | xargs rm -v
I was able to remove all the messages that contained this word. By removing common spam words like “viagra”, “cialis”, “poker”, and “pharmacy”, I was able to cut down on a lot of messages.
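The count-then-delete steps above can be sketched as a pair of small shell functions that handle several words in one pass. This is only a sketch under a few assumptions: the function names count_spam and purge_spam are made up for illustration, the words are plain text with no regex metacharacters, and the Maildir path is passed as the first argument. It chains two -exec tests in find instead of piping into xargs, so odd filenames cannot be split on whitespace.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Join the given words into one extended regex, e.g. "viagra|poker".
spam_pattern() {
    printf '%s\n' "$@" | paste -sd'|' -
}

# Print how many files under the Maildir contain any of the words.
# Usage: count_spam /path/to/Maildir word1 word2 ...
count_spam() {
    maildir=$1; shift
    pattern=$(spam_pattern "$@")
    # grep -q exits 0 on a match, so -print only runs for matching files.
    find "$maildir" -type f -exec grep -q -i -E "$pattern" '{}' ';' -print \
        | wc -l | tr -d ' '
}

# Delete every file under the Maildir that contains any of the words.
# Usage: purge_spam /path/to/Maildir word1 word2 ...
purge_spam() {
    maildir=$1; shift
    pattern=$(spam_pattern "$@")
    # The second -exec only runs when the grep before it matched.
    find "$maildir" -type f -exec grep -q -i -E "$pattern" '{}' ';' \
        -exec rm -v '{}' ';'
}
```

Running count_spam first with the same word list gives a chance to sanity-check the number before purge_spam removes anything.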
Next was to scan for viruses. Using Clam Antivirus, you can scan a Maildir for viruses with the --mbox command line switch. I chose to move all infected mails to a separate directory so I could check them by hand later (just to be sure).
mkdir /tmp/clamscan-infected
chmod 777 /tmp/clamscan-infected/
clamscan --mbox -i --move /tmp/clamscan-infected/
I use /tmp because the directory needs to be one that clamscan, running in unprivileged mode, can read from and write to.
By scanning for and removing common spam words and viruses, I was able to cut the mailbox down to around 22,000 messages. I’m sure 99% of those are spam too.