Email is a terrible way to communicate and should be avoided where possible. Unfortunately it is also the lowest common denominator on the web and will continue to be for the near future.
In the early days of the internet it was easy to run your own mailserver. The absurd quantity of spam has made this increasingly hard, and many tech-savvy people gave up and switched to Gmail or other services. This is a pity because a decentralized email infrastructure is harder to surveil, subpoena or shut down. I encourage everyone to run their own mail service if possible.
In this guide I will summarize the steps needed to get an effective SpamAssassin (SA) setup.
Find missing Perl modules
SA silently skips functionality whose dependencies are not installed. Run spamassassin -D --lint 2>&1 | grep -i failed and install whatever is reported as missing.
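Install a local Nameserver
Many blacklists work via DNS lookups and operate on a “some for free” model, giving smaller servers a certain number of free requests. When using a shared nameserver, your lookups are counted together with those of everyone else using it, so you will quickly run out of free requests and get blocked. A local caching nameserver on the mail server avoids this, because the queries then come from your own IP address.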
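To make the "DNS lookup" part concrete, here is a minimal sketch of how the hostname for a blacklist query is usually built: the octets of the sender's IPv4 address are reversed and prepended to the blacklist zone. The function name is hypothetical, the address is a documentation address, and the zone is only an example taken from the blacklist snippet later in this guide. The sketch builds the name without sending any query.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname that is resolved to test an IPv4 address against a DNSBL."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for anything but IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


# 192.0.2.99 is a reserved documentation address, not a real sender.
print(dnsbl_query_name("192.0.2.99", "truncate.gbudb.net."))
```

Every blacklist checked for every incoming mail triggers a lookup of this kind, which is why the free quota runs out so quickly behind a shared resolver.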
Use the Checksum Blacklists
Partial checksums of emails seem to be a very effective way to detect spam. SA supports DCC, Razor and Pyzor as main systems. There is also one provided by the German IT magazine iX. You can compare the efficiency of each system here.
- Pyzor has a package available for Debian, so it’s very easy to install.
- DCC needs to be compiled, but the effort is worth it.
- iX Hash is a custom Perl module. It just needs to be copied into the main config folder, usually /etc/spamassassin.
Once installed, you should see either checksum-based hits or error messages in your logs.
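The idea behind such checksums can be illustrated with a deliberately simplified sketch. This is not the algorithm used by DCC, Razor, Pyzor or iX Hash. It only shows the principle that the parts of a message that spammers vary (URLs, numbers, whitespace) are removed before hashing, so that near-identical mails produce the same digest. The function name and sample messages are made up for illustration.

```python
import hashlib
import re


def fuzzy_digest(body: str) -> str:
    """Hash a message body after removing the parts that typically vary between copies."""
    text = body.lower()
    text = re.sub(r"https?://\S+", "", text)  # drop URLs, which are often rotated
    text = re.sub(r"\d+", "", text)           # drop numbers such as order ids
    text = re.sub(r"\s+", "", text)           # drop all whitespace
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical sample messages
a = "Dear customer,\nyour invoice 12345 is ready: http://example.com/a1"
b = "Dear  customer, your invoice 99871 is ready:   http://example.org/zz9"
c = "Lunch at noon?"

print(fuzzy_digest(a) == fuzzy_digest(b))  # same campaign, same digest
print(fuzzy_digest(a) == fuzzy_digest(c))  # unrelated mail, different digest
```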
Use a Bayesian Filter
This filter needs some time to train, but it is very effective against new spam that is not covered by blacklists yet. You can train it on existing, sorted email or let it learn over time.
For performance, use a SQL database as backend.
# Enable the Bayes system
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 1
bayes_auto_expire 1
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:spama_bayes:localhost
bayes_sql_username spamassassin
bayes_sql_password XXXXXX
bayes_sql_override_username mail
Use a moving average spam score
SA has a feature called “Auto Whitelist”. The name is a bit misleading. It pulls the score of each message towards the long-term average score of its sender/IP combination, so “safe” senders are less likely to be flagged as spam, even if they send one abnormal email.
# Use moving average of scores (Auto Whitelist)
use_auto_whitelist 1
auto_whitelist_distinguish_signed 1
auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn dbi:mysql:spama_awl:localhost
user_awl_sql_username spamassassin
user_awl_sql_password XXXXXX
user_awl_sql_table awl
user_awl_sql_override_username vmail
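The effect of the Auto Whitelist on a single message can be sketched in a few lines. As far as I know, SA documents the adjustment for its auto_whitelist_factor setting (default 0.5) as finalscore = score + (mean - score) * factor, where mean is the sender's long-term average score. The function and parameter names below are hypothetical, and the sketch ignores how the history is stored and updated.

```python
def awl_adjusted_score(score: float, total: float, count: int, factor: float = 0.5) -> float:
    """Pull a message score towards the sender's long-term mean score."""
    if count == 0:
        return score  # no history for this sender yet, nothing to adjust
    mean = total / count
    return score + (mean - score) * factor


# A sender with 20 earlier messages averaging 1.0 sends one mail scoring 9.0.
print(awl_adjusted_score(9.0, total=20.0, count=20))

# A sender with a bad history (average 15.0) sends one clean-looking mail.
print(awl_adjusted_score(3.0, total=150.0, count=10))
```

In the first case the score is pulled down to 5.0, in the second it is pulled up to 9.0, so the feature works in both directions despite its name.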
Use additional Blacklists
SA comes with a number of pre-configured blacklists, but adding a few may help. Intra2Net has a nice comparison of blacklists, including their failure rates, and provides SA snippets for quick installation. Just drop them into your local configuration files, e.g.:
# https://www.intra2net.com/en/support/antispam/blacklist.php_dnsbl=RCVD_IN_GBUDB.html
header RCVD_IN_GBUDB eval:check_rbl('gbudb', 'truncate.gbudb.net.', '127.0.0.2')
describe RCVD_IN_GBUDB Listed in truncate.gbudb.net
tflags RCVD_IN_GBUDB net
score RCVD_IN_GBUDB 6
Use a Virus Scanner
Phishing, cryptolockers and other malware arriving by email are a big issue for users. Most will get detected by SA, but some attachments may slip through. You should use ClamAV as a bare minimum. Using an updated commercial scanner will improve your detection rate. Linux virus scanners are mostly bad and unstable, so choose the least-bad one depending on your budget. It should also run in daemonized mode to avoid loading all the definitions for every single scan.
Conclusion
Implementing these steps should help you detect the vast majority of spam emails. As a last step, if you notice a pattern of false negatives, you can fine-tune SA’s weights to your environment:
score URIBL_RED 4
score SPF_FAIL 3
score SPF_SOFTFAIL 3
score RAZOR2_CF_RANGE_E8_51_100 2.5
score RAZOR2_CF_RANGE_51_100 2.5
score DCC_CHECK 3
Alternatively, define your own checks for very specific spam:
# Custom rules
header FAKE_INVOICE_2014_SUBJECT Subject =~ /\bRechnungOnline Monat\b/i
score FAKE_INVOICE_2014_SUBJECT 6
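Before deploying a custom rule, it helps to try the pattern against a few subjects. The following is a minimal sketch that mirrors the regular expression above in Python and adds up rule scores against SA's default threshold of 5.0 (required_score). The helper names and sample subjects are hypothetical, and the real scoring in SA involves many more rules than the three listed here.

```python
import re

REQUIRED_SCORE = 5.0  # SA's default spam threshold (required_score)

# Scores copied from the configuration lines in this guide
SCORES = {"SPF_FAIL": 3.0, "DCC_CHECK": 3.0, "FAKE_INVOICE_2014_SUBJECT": 6.0}

# Python equivalent of the Perl pattern /\bRechnungOnline Monat\b/i
FAKE_INVOICE = re.compile(r"\bRechnungOnline Monat\b", re.IGNORECASE)


def rule_hits(subject: str, other_hits: list) -> list:
    """Return the names of all rules that fire, including the custom subject rule."""
    hits = list(other_hits)
    if FAKE_INVOICE.search(subject):
        hits.append("FAKE_INVOICE_2014_SUBJECT")
    return hits


def is_spam(hits: list) -> bool:
    """A message counts as spam once the summed rule scores reach the threshold."""
    return sum(SCORES.get(name, 0.0) for name in hits) >= REQUIRED_SCORE


# Hypothetical subjects
print(is_spam(rule_hits("Ihre RechnungOnline Monat Mai 2014", [])))
print(is_spam(rule_hits("Meeting tomorrow", ["SPF_FAIL"])))
```

With a score of 6 the custom rule alone pushes a matching mail over the threshold, so a pattern this heavily weighted should be as specific as possible.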
I hope this guide is useful and makes running your own mailserver easier.