spamd and network whitelisting

Clint Pachl
I would like to share my 45-day experience with running spamd and my
observations and how I'm allowing mail from SMTP clusters to bypass
spamd. Feedback and discussion would be greatly appreciated.

I have two domains that I have been using for my businesses: one is 13
years old and the other is 8 years old. I have never had a spam problem
until about six months ago. In October I was getting about 100-200 spams
per day per domain. The spam rate was increasing from month to month.
All mail was going directly to my OpenSMTPd. I was not using filtering
of any kind so the signal-to-noise was very low, and frustrating.

So I read the spamd and related man pages and enabled spamd on my
firewall on November 1. I was astonished! I literally got 6 spam emails
that first week for both domains!

However, the big problem was, I also wasn't getting legitimate business
emails that were sent from SMTP clusters/pools. After studying my logs,
tweaking spamd(8) flags, looking to external solutions (DNSBL, SPF,
reverse IP verification), I had some observations and discovered some
patterns. Here's the solution I'd like to share:

I wrote two very small scripts: spamd-dnsbl and spamclusterd. These
scripts work together to keep spam to a minimum while passing all
legitimate email (in my case so far).

1) spamd-dnsbl: Queries a DNSBL using the IPs in spamdb(8). If an IP is
on a black list it is added as a TRAPPED entry in the spamdb. The script
only checks IPs which have been added since last run. Currently, only
the zen.spamhaus.org DNSBL is queried because I found it to be the most
accurate of all those listed at
http://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists.
Alternatively, multiple DNSBLs could be queried and the results could be
used in aggregate to determine spam status, thus promoted to TRAPPED.
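
The aggregate idea could be sketched roughly as follows. This is only an illustration, not the script described in the post: the zone list, the threshold, and the helper names (reverse_ipv4, lookup_listed, count_listings, should_trap) are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch: trap an IP only when enough DNSBLs list it.
# ASSUMPTIONS: zone list and threshold are illustrative; all function
# names here are hypothetical helpers, not part of spamd.

# Reverse the octets of an IPv4 address: 192.0.2.10 -> 10.2.0.192
reverse_ipv4() {
        echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 }'
}

# Succeeds if the IP has an A record in the given DNSBL zone.
lookup_listed() {
        host -t A "$(reverse_ipv4 "$1").$2" >/dev/null 2>&1
}

# Count how many zones list the IP. The lookup command is the third
# argument so that it can be swapped out for offline testing.
count_listings() {
        ip=$1 zones=$2 lookup=${3:-lookup_listed}
        hits=0
        for zone in $zones
        do
                if "$lookup" "$ip" "$zone"; then hits=$((hits + 1)); fi
        done
        echo "$hits"
}

# Succeeds when the IP is on at least $3 of the zones in $2.
should_trap() {
        [ "$(count_listings "$1" "$2" "${4:-lookup_listed}")" -ge "$3" ]
}
```

A caller might then run something like `should_trap "$ip" "zen.spamhaus.org bl.spamcop.net" 2 && spamdb -t -a "$ip"`, where the choice of two zones and a threshold of 2 is again only an example.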

2) spamclusterd: Queries spamdb(8) for networks to whitelist, which it
adds to a pf table that bypasses spamd. So before this script gets
carried away allowing IP blocks to bypass spamd, the spamdb(8) is first
pruned of spammers using the spamd-dnsbl script.

I've only been running this setup for about 30 days, but I haven't
missed an email yet; plus spam is still about 1 per day across both
domains. I receive emails from all the common SMTP clusters, such as
Gmail, Microsoft (hotmail.com, outlook.com, msn.com, etc.), and Yahoo
but also US government agencies such as mail.mil, usmc.mil, uscg.mil,
irs.gov, etc.

I noticed a pattern of commonalities of these legitimate sending clusters:

1. The envelope's from and to addresses are identical across tuples.

2. The HELOs are very similar, with the TLD from each tuple almost
certainly the same.

3. They make multiple attempts from different IP addresses; however, the
IPs differ only by a few bits. (Caveat: I'm only using IPv4.)

These 3 points are the basis of spamclusterd. It works like this: if two or
more GREY tuples have matching "to" and "from" addresses, HELOs with
matching TLDs, and IPs with matching network bits (/24), then the
/24 network is added to the spamd-cluster table in pf, which bypasses spamd.
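
As a rough illustration of that three-part test on a pair of tuples, here is a minimal sketch. It assumes the "TLD" comparison means the last two labels of the HELO name; the function names are made up for the example.

```shell
#!/bin/sh
# Sketch of the three-part cluster test on two GREY tuples.
# ASSUMPTION: the "TLD" comparison uses the last two labels of the HELO.

# 203.0.113.7 -> 203.0.113.0/24
net24() { echo "${1%.*}.0/24"; }

# mail-a1.example.com -> example.com (empty for single-label HELOs)
helo_suffix() {
        echo "$1" | awk -F. 'NF >= 2 { print $(NF-1) "." $NF }'
}

# Arguments: ip1 helo1 from1 to1 ip2 helo2 from2 to2
same_cluster() {
        [ "$3" = "$7" ] && [ "$4" = "$8" ] &&
        [ -n "$(helo_suffix "$2")" ] &&
        [ "$(helo_suffix "$2")" = "$(helo_suffix "$6")" ] &&
        [ "$(net24 "$1")" = "$(net24 "$5")" ]
}
```

Two attempts from 203.0.113.7 and 203.0.113.99 with the same sender, recipient and HELO suffix would match, and `net24` gives the network to whitelist.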

I was going to get fancy and do an SPF lookup and try to determine the
exact network to whitelist, but simply whitelisting a 256 IP block seems
good enough. Once in a while the subsequent client IP will be outside
this block, but the /24 seems to work better than 90% of the time.

Currently, just two client IPs from the same /24 network is enough to
get that network whitelisted, which seems like a low bar. However, with
the prior DNSBL pruning, this seems sufficient for now.

## Some other observations ##

Spammers, even if sending from the same IP or IP network and regardless
of the TO address, tend to randomize the FROM and/or HELO. Therefore, in
the case of my spamclusterd script, whitelisting a spammer is less likely
when ensuring both HELO and FROM match for multiple tuples. These IPs
will then continue to deal with spamd, and it's business as usual.

I initially tried setting a 1 minute passtime and a 12 hour greyexp
for spamd (i.e. -G 1:12:864), hoping to eventually whitelist a client
IP, originating from a cluster, that has reattempted within that large
window. However, in my first week, I missed a couple of Gmails which
resent for 5+ days and ultimately failed to deliver. What was
interesting was one of the Google server IPs retried after 12 hours and
3 minutes, just missing the grey window, while others retried after 24
hours. I now set -G 1:10:1080.
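
To make the timing concrete, here is a small sketch of the window check. It assumes a retry from the same IP passes when it arrives at or after passtime (minutes) and before greyexp (hours), both counted from the first attempt; the exact boundary behaviour of spamd is not verified here.

```shell
#!/bin/sh
# Sketch: would a retry of the same tuple land inside the grey window?
# ASSUMPTION: the retry passes when it arrives at or after passtime
# (minutes) and before greyexp (hours), measured from first contact.
#
# $1: seconds between the first attempt and the retry
# $2: passtime in minutes
# $3: greyexp in hours
retry_passes() {
        [ "$1" -ge $(( $2 * 60 )) ] && [ "$1" -lt $(( $3 * 3600 )) ]
}
```

Under that assumption, the retry at 12 hours and 3 minutes (43380 seconds) fails `retry_passes 43380 1 12` but would succeed with a 24 hour greyexp.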

It seems safe to assume a spammer if reverse IP lookup returns NXDOMAIN
and IP is on at least 1 reputable DNSBL or lookup returns SERVFAIL after
two attempts.
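
One way to read that rule is sketched below as a pure decision function. The grouping "(NXDOMAIN and listed) or (SERVFAIL twice)" is an assumption about the sentence above, and the caller is expected to do the actual DNS lookups.

```shell
#!/bin/sh
# Sketch of the heuristic as a pure decision function.
# ASSUMPTION: grouping is (NXDOMAIN and listed) or (SERVFAIL twice).
#
# $1: status of the PTR lookup: NOERROR, NXDOMAIN or SERVFAIL
# $2: number of reputable DNSBLs listing the IP
# $3: number of consecutive SERVFAIL results for the PTR lookup
presumed_spammer() {
        case "$1" in
        NXDOMAIN) [ "$2" -ge 1 ] ;;
        SERVFAIL) [ "$3" -ge 2 ] ;;
        *)        return 1 ;;
        esac
}
```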

Using SPF seems unreliable as of 11/22/16. Tested SPF on hundreds of IPs
in spamdb using the ruby spf gem. More than half the IPs did not specify
SPF or it failed in some way.

If the envelope's "from" is our domain (i.e., to and from addresses are
the same domain), it is definitely a spammer because we only send our
mail to the submission port and never to the smtp port. For example,
there are currently 217 grey entries and 31 meet this criterion. However,
these spammers almost never resend, so it is not worth blacklisting them
after the first connection attempt. What would be best is if we could
blacklist these spammers upon first connection (for example, add flag to
spamd(8) that doesn't allow email from ourselves because we authenticate
and submit mail to submission port 587, which could use domains from
spamd.alloweddomains).
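
Counting such entries from spamdb output could look roughly like this. The sketch assumes GREY lines are laid out as TYPE|IP|HELO|FROM|TO|..., that addresses are wrapped in <>, and that neither address contains a "|"; self_addressed_ips is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the source IPs of GREY entries whose envelope sender
# and recipient share a domain.
# ASSUMPTIONS: GREY lines are TYPE|IP|HELO|FROM|TO|..., addresses are
# wrapped in <>, and neither address contains a "|".
self_addressed_ips() {
        awk -F'|' '
        $1 == "GREY" {
                from = tolower($4); to = tolower($5)
                gsub(/[<>]/, "", from); gsub(/[<>]/, "", to)
                sub(/^.*@/, "", from);  sub(/^.*@/, "", to)
                if (from != "" && from == to) print $2
        }'
}
```

For example, `spamdb | self_addressed_ips | sort -u | wc -l` would give the number of distinct IPs meeting the criterion.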

Thank you for reading this far. Please let me know if you would like
clarification or have questions. If there is interest in my scripts, I
can send those as well.

Thanks to all the developers who made spamd; an amazing, simple, clever
tool.

Re: spamd and network whitelisting

Devin Reade
You might also want to look at bgp-spamd.

With respect to dealing with SPF, the simple solution (permitting an
IP if it is on the sending domain's SPF list) doesn't work too well
in the general case since it appears many spammers publish SPF records.

However what I found works well, at least for some low-volume domains,
is to identify the subset of domains for which I would like to honour
the SPF records and automatically whitelist them.

I wrote a little perl script, available as:
   <ftp://ftp.gno.org/pub/tools/gen-spf-whitelist>
The script takes a set of whitelisted domains and queries the DNS to
build up the matching set of whitelisted IPs.  It then puts these into
a file that can be loaded as a pf table.  This permits pf to bypass
spamd for these whitelisted domains.  There is extra usage information
(and a description of current limitations) in comments at the top of
the script.

This does require one to reload the pf configuration, however (due to
paranoia) the current version of the script doesn't do that. Instead,
it mails root if something has changed that would require the
configuration to be updated.  Experience shows that this doesn't trip
very often.

I invoke the script from daily.local as something like:

   /usr/local/sbin/gen-spf-whitelist \
       example.com \
       example.tld \
       something.else.net \
       (...)

I qualified the above by mentioning I was using it on some low-volume
domains because the current mechanism probably doesn't scale well
with respect to maintaining the list of domains.  It could probably
benefit from a couple of substantive changes:

- permit the whitelisted IPs to be updated without needing to have pf reload
  its rules.  This implies updating the pf table directly, in a manner
  similar to what is used for bgp-spamd.

- be able to tie in with a client management system that permits users
  to request domains to be whitelisted (only SPF-publishing domains could
  be whitelisted this way using this mechanism).

Potential candidate domains for inclusion will be obvious.  If you
'grep GREY /var/log/daemon', the most likely potential candidates are
those where you will see multiple delivery attempts from the same domain
to the same recipient but where the originating IPs differ (although
likely in the same net block).

Devin

Re: spamd and network whitelisting

Clint Pachl
In reply to this post by Clint Pachl
Some have requested my scripts and configurations so here they are. Below
you will find the spamd-dnsbl and spamclusterd scripts that are used for
blacklisting spammers and whitelisting networks, respectively. Also
included is dnsbl-check which I use for testing IPs against multiple DNSBLs.

In the crontab below, you will see that I archive the spamdb daily and
save some stats mainly for post analysis. For instance, my initial spam
fighting technique many years ago (prior to enabling spamd actually) was
to block the IP networks (20,000+ IPv4 networks) of the countries in
which we received the most spam, yet weren't expecting legitimate email
from (i.e. China, Russia, India, Brazil, etc.). I still had this enabled
up until 2016-12-17. So I make notes of changes like this to see the
positive or negative effects and I have the spamdb archives to assist
the analysis. Changing spamd_flags is something else I document.

A side note: Years ago, blocking spamming countries, for me here in the
US, essentially got rid of my spam problem, but has become ineffective
as many spammers are sending from US networks now, thus spamd. It has
only been three days since I disabled spam country blocking, but I have
received exactly 2 emails that have made it past spamd, which would have
otherwise been blocked by the country IP block. Not bad, but we'll see
what the stats look like in a couple of weeks. However, I can guarantee
that the number of trapped entries in my spamdb will increase. I
originally created my pf table of spamming countries from
http://www.ipdeny.com/ipblocks/data/countries/

One of the other tests, which had significant impact, was using
spamd.alloweddomains. I tried a few things, but settled on my current
setup: for one email domain I list just the domain part (e.g.
@domain1.com), but for the other domain, which has limited users, I list
the full email addresses of all current accounts (e.g.
[hidden email], [hidden email], ...). This increased my TRAPPED
entries by 30%. These additional TRAPPED IPs were mainly one-shot
spammers, so it was nice to tarpit them while I had the chance. So far
spamd has been very effective so I haven't defined and published any
SPAMTRAP addresses, but this is just another knob I can turn on and
measure if needed.

To assist with spam management without root privileges, I added the spam
administrator to the _spamd group, gave r/w group privileges on
/var/db/spamd, and added a few pfctl commands to the doas.conf.

Overall I am ecstatic about spamd and its integration with pf, as well
as the simple spamdb interface (with the help of grep(1), cut(1),
sort(1), wc(1), column(1), sed(1), etc.). It is an extremely flexible
and powerful toolset. Hopefully my experience and scripts are helpful to
other spam fighters. I think you can look to other projects, like
spamassassin for example, to get ideas of spam fighting techniques which
can be implemented at a lower level using pf and spamd. For example, a
set of factors could determine a spam "score" similar to spamassassin:
if an IP is on multiple DNSBLs (each list weighted by quality), the DNS
PTR doesn't correspond to the HELO, and it fails SPF, then it is
probably safe to blacklist. The bgp-spamd.net project is another tool
that could be added to the mix. You will have to balance complexity and
effectiveness, but I would encourage simplicity and minimal resource usage.
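
A score of that kind might be combined along these lines. The weights and the default threshold are invented purely for illustration, and the individual DNSBL, PTR and SPF checks are assumed to be done by the caller.

```shell
#!/bin/sh
# Sketch of a SpamAssassin-like score built from cheap checks.
# ASSUMPTIONS: the weights (2 for a PTR/HELO mismatch, 1 for an SPF
# failure) and the default threshold of 5 are made up for illustration.
#
# $1: sum of the weights of the DNSBLs listing the IP (integer)
# $2: 1 if the PTR name does not correspond to the HELO, else 0
# $3: 1 if the SPF check failed, else 0
spam_score() {
        echo $(( $1 + 2 * $2 + 1 * $3 ))
}

# Succeeds when the score reaches the threshold in $4 (default 5).
is_probable_spam() {
        [ "$(spam_score "$1" "$2" "$3")" -ge "${4:-5}" ]
}
```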

Again, hats off to all the developers.


=== spamclusterd ===

#!/bin/sh
#
# Whitelist an SMTP cluster network.
#
# NOTE: pipe spamdb(8) or an archive to stdin.

extract_helo_tld() { echo "$1" | sed -En 's/.*[[:<:]]([^.]+\.[^.]+)$/\1/p'; }
extract_ip_net() { echo "${1%.*}"; }

print_ip_net_with_mask() {
        echo "$(extract_ip_net $1).0/24"
}

helo_tld_match()
{
        tld1=$(extract_helo_tld "$1")
        tld2=$(extract_helo_tld "$2")
        [[ -n $tld1 && $tld1 = $tld2 ]]
}

ip_net_match()
{
        net1=$(extract_ip_net $1)
        net2=$(extract_ip_net $2)
        [[ $net1 = $net2 ]]
}

_ip=""
_helo=""
_from=""
_to=""
is_cluster=0

grep "^GREY" |
tr "|" "\t" |
cut -f2-5 |
sort -k3,4 -k2 -k1 |
while read ip helo from to
do
        if [[ $to = $_to && $from = $_from ]] &&
           helo_tld_match "$helo" "$_helo" &&
           ip_net_match "$ip" "$_ip"
        then
                is_cluster=1
        elif [[ $is_cluster = 1 ]]
        then
                is_cluster=0
                print_ip_net_with_mask $_ip
        fi

        _ip="$ip"
        _helo="$helo"
        _from="$from"
        _to="$to"

done




=== spamd-dnsbl ===

#!/bin/sh
#
# Query DNSBL using the IPs in spamdb(8). If an IP is on a black list, add it
# as a TRAPPED entry in the spamdb.
#
# It seems most spammers send once and go away. The 1 minute pass time is
# effective at stopping most of these spammers. The other spammers seem to
# resend 10 minutes to more than an hour later, so a longer pass time won't
# defend against such spammers. That is where DNSBLs can be used to get these
# spammers marked as TRAPPED.
#
# For a list of DNSBL providers, see ~/src/shell/dnsbl-check.
#
# The [bgp-spamd](http://bgp-spamd.net/) project is another option for
# obtaining white and black lists.
#
# TODO: Query multiple DNSBL services and use the results, in aggregate, as the
# factor to determine whether the IP should be TRAPPED. For example, if an IP
# is listed in 3 of 8 black lists, then trap it. Maybe we should have levels.
# First query reputable black lists, then fall back to the aggregate. This
# should speed up the process (i.e. no need to query the entire list of
# services if it is listed with a reputable service). The zen.spamhaus.org
# DNSBL seems reputable.
#

START_TIME=~/.${0##*/}.start
start_time=`date +%s`
prev_start_time=$(cat $START_TIME 2>/dev/null || echo 0)
DNSBL="zen.spamhaus.org"

IFS=\|
spamdb | egrep '^(GREY|WHITE)' |
while read entry
do set -- $entry
        ip=$2
        if [ $1 = GREY ]
        then ts=$6
        else ts=$5 # WHITE timestamp
        fi

        if [ $ts -ge $prev_start_time ]
        then query=$(IFS="."
                        set -- $ip
                        rev_ip="$4.$3.$2.$1"
                        echo "${rev_ip}.${DNSBL}"
                )
                host -t A $query >/dev/null && spamdb -ta $ip
                #host -t A $query >/dev/null && echo $ip # FOR TESTING
        fi
done

echo $start_time > $START_TIME



=== dnsbl-check ===

#!/bin/sh
#
# Check if the given IPv4 address is on a DNS blacklist.
# The list of DNSBL services was taken from
# https://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists.
#
# DNSBLs that return too many false positives:
# - hostkarma.junkemailfilter.com
# - recent.spam.dnsbl.sorbs.net
# - dnsbl.sorbs.net

ip=$1
[[ $ip = [0-9]*.[0-9]*.[0-9]*.[0-9]* ]] || { echo 'IPv4 required'; exit 1; }
rev_ip=$(
        IFS="."
        set -- $ip
        echo "$4.$3.$2.$1"
)

DNSBL_SERVICES='
zen.spamhaus.org
bl.spamcop.net
b.barracudacentral.org
rbl.megarbl.net
all.s5h.net
srnblack.surgate.net
bl.blocklist.de
dnsbl.inps.de
ix.dnsbl.manitu.net
blacklist.hostkarma.com
spamtrap.drbl.drand.net
bl.spamcannibal.org
spam.spamrats.com
dyna.spamrats.com
noptr.spamrats.com
dnsrbl.org
dnsbl.cobion.com
dul.dnsbl.sorbs.net
noservers.dnsbl.sorbs.net
badconf.rhsbl.sorbs.net
escalations.dnsbl.sorbs.net
web.dnsbl.sorbs.net
safe.dnsbl.sorbs.net
babl.rbl.webiron.net
'

for dnsbl in $DNSBL_SERVICES
do host -t A ${rev_ip}.${dnsbl} >/dev/null &&
        echo "$ip on $dnsbl black list." &
done

wait




=== admin's crontab ===

*/5     *       *       *       *       ~/bin/spamd-dnsbl
0       0       *       *       *       ~/bin/spamdb-stats >> ~/spamdb.stats
0       0       *       *       *       spamdb > ~/spamdb.ark/$(date +\%m\%d)
*/15    *       *       *       *       spamdb | ~/bin/spamclusterd | doas pfctl -t spamd-cluster -T add -f - -q



=== /etc/doas.conf ===

permit nopass admin cmd pfctl args -t spamd-cluster -T add -f - -q
permit nopass admin cmd pfctl args -t spamd-cluster -T show
permit nopass admin cmd pfctl args -t spamd-cluster -vT show



=== spamdb-stats ===

#!/bin/sh
# Daily spamdb(8) stats via cron.

for i in WHITE GREY TRAP
do      spamdb | grep ^$i | wc -l
done | tr "\n" " "

date +"%t%m/%d"



=== spamdb.stats ===

    WHITE     GREY  TRAPPED       DATE
       82      119      573      11/05
       89       50      233      11/06
       97       73      172      11/07
      101      240      440      11/08
      112      161      343      11/09


=== /etc/pf.conf (spamd stuff only) ===

pass  out     on egress proto tcp to port smtp
pass  in      on egress proto tcp to port smtp divert-to LOCALHOST port spamd
pass  in  log on egress proto tcp from <spamd-white>   to port smtp rdr-to MAIL
pass  in  log on egress proto tcp from <spamd-cluster> to port smtp rdr-to MAIL

Re: spamd and network whitelisting

Clint Pachl
In reply to this post by Devin Reade
Devin Reade wrote on 12/19/16 12:59:
> You might also want to look at bgp-spamd.

Yes, this was on my radar for quite some time. However, my simple spamd
setup with assistance from the zen.spamhaus.org DNSBL has been extremely
effective. It's nice to know we've got more big guns if needed.


> With respect to dealing with SPF, the simple solution (permitting an
> IP if it is on the sending domain's SPF list) doesn't work too well
> in the general case since it appears many spammers publish SPF records.

You're right. When I ran ruby-spf against the TRAPPED IPs in my
spamdb, a surprising number passed SPF (like 15%). On the other hand,
one of the popular email domains from our customer DB is @att.net, which
doesn't even publish SPF. After some real life testing against our
client email DB, I determined SPF was not effective in filtering spam
for us. If it is used, it should be a small factor at best.

Re: spamd and network whitelisting

Craig Skinner-3
In reply to this post by Clint Pachl
Hello Clint,

On Fri, 16 Dec 2016 07:21:47 -0700 Clint Pachl wrote:
> I would like to share my 45-day experience with running spamd and my
> observations and how I'm allowing mail from SMTP clusters to bypass
> spamd. Feedback and discussion would be greatly appreciated.
>

spamd in greylisting mode is indeed truly awesome!

With over 10 years real world experience running this way,
with several domains, I've tried a lot of ideas & scripts too...

The original design is very good and doesn't need much assistance.

To solve the clustered round robin senders (Gmail, etc.) simply bump
the -G:greyexp: time from 4 hours to 4+ days - 100 hours is good.
Job done! No scripts needed.

When configured like this, most gmails come through in around 6 hours
to 1.5 days, with some a bit longer. The more inbound gmails, the
shorter the delay, down to a few minutes as volume increases.
Same for Outlook, Amazon (which are both worse than Gmail), etc.

Bumping the -G :whiteexp time to 40 days helps a bit too.


Aggressive stuttering and a shrunk window foils almost all zombies.

Add in a fake highlisting -M to the mix, and it is game over for the
zombies, which love to target a backup MX box, so give them a trap.
(This needs a constantly deferring MTA on that IP address too.)

spamd_flags='-G 25:100:960 -S 90 -s 5 -w 1 -M .... -y .... -Y ... -Y ... -Y ...'
spamlogd_flags='-I -W 960 -Y ... -Y ... -Y ...'

(AOL only retries for 25 minutes (not the RFC 4 days....), so if you
want to receive from AOL, the -G passtime: needs to be ~10 minutes.)


Some pf rate limiting kills off those zombies that understand the 'try
again later' SMTP code, then start hammering the server all at once:

The 2nd rule blocks (after almost 2 days) badly setup M$ Extrange
servers, which retry every minute....


set block-policy drop

# Normal & highlisting Internet inbound operation via spamd:
pass in on $ext_if inet proto tcp \
        from any port > 1023 \
        to {$ext_if:0, $ext_if:2} port smtp \
        divert-to localhost port spamd \
        keep state \
        (max-src-conn 30, max-src-conn-rate 50/90000, \
                overload <scanners> flush global)

pass in log on $ext_if inet proto tcp \
        from <spamd-white> port > 1023 \
        to {$ext_if:0, $ext_if:2} port smtp \
        user root \
        modulate state \
        (max-src-conn 80, max-src-conn-rate 150/15000, \
                overload <scanners> flush global)


block in log from <scanners>


EASY! SIMPLE! Nothing to break.

No special domain lookups or exception lists. No maintenance labour.





Bob's other tool I deployed for many years was his greyscanner (in
ports). Over the years, I modified this to do aggregate DNS black &
white listing too. When I realised that it was very rare for spam to
pass the extended stuttering, I stopped running greyscanner.




To revert to the default -G flags (4 hours grey expire) and still
promote round robin senders faster from grey to white, I wrote this
simple script. It runs unprivileged once every 4 hours from cron.

No pf tables/lists, no doas/sudo rules. No SPF checks.

It operates on an fgrep pattern of spamd HELO hostnames, as Gmail,
Outlook, etc. relay for many domains, but HELO from Google/Outlook.

The decision to upgrade from grey to whitelisted status is based on
an accumulated sliding score of multiple DNS list lookups.

See http://web.Britvault.Co.UK/products/ungrey-robins/ & logs there.




Also try Boudewijn's patch (see his continued blocking graph):
https://github.com/bdijkstra82/OpenBSD-spamlogd


>
> Thanks to all the developers who made spamd; an amazing, simple,
> clever tool.
>

Aye!
--
Craig Skinner | http://linkd.in/yGqkv7

Re: spamd and network whitelisting

Boudewijn Dijkstra-3
In reply to this post by Clint Pachl
On Tue, 20 Dec 2016 12:51:19 +0100, Clint Pachl
<[hidden email]> wrote:

> Devin Reade wrote on 12/19/16 12:59:
>> With respect to dealing with SPF, the simple solution (permitting an
>> IP if it is on the sending domain's SPF list) doesn't work too well
>> in the general case since it appears many spammers publish SPF records.
>
> You're right. When I ran ruby-spf against the the TRAPPED IPs in my  
> spamdb, a surprising number passed SPF (like 15%). On the other hand,  
> one of the popular email domains from our customer DB is @att.net, which  
> doesn't even publish SPF. After some real life testing against our  
> client email DB, I determined SPF was not effective in filtering spam  
> for us. If it is used, it should be a small factor at best.

SPF was never meant for making accept/reject decisions on arbitrary  
domains.  If you don't trust the sending domain, then SPF evaluation is  
pointless.


--
Made with Opera's e-mail program: http://www.opera.com/mail/

Re: spamd and network whitelisting

Boudewijn Dijkstra-3
In reply to this post by Clint Pachl
On Tue, 20 Dec 2016 12:31:05 +0100, Clint Pachl
<[hidden email]> wrote:
> [...]
> grep "^GREY" |
> tr "|" "\t" |
> [...]

I've learned to do all parsing of /var/db/spamd via the <db.h> interface  
as the envelope-from sometimes contains a "|" (pipe) character.
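
For scripts that stay with text parsing rather than the <db.h> interface, a pipe in the sender can be tolerated by counting fields from both ends of the line. This is only a sketch of that alternative: it assumes the GREY layout is GREY|IP|HELO|FROM|TO|FIRST|PASS|EXPIRE|BLOCK|PASS and that only FROM may contain extra pipes; grey_fields is a hypothetical name.

```shell
#!/bin/sh
# Sketch: split GREY lines from spamdb(8) output even when the
# envelope sender contains "|".
# ASSUMPTIONS: the line layout is
#   GREY|IP|HELO|FROM|TO|FIRST|PASS|EXPIRE|BLOCK|PASS
# and only FROM may contain extra pipes.
# Output is tab-separated: IP, HELO, FROM, TO.
grey_fields() {
        awk -F'|' '
        $1 == "GREY" && NF >= 10 {
                # Everything between HELO and the last six fields is FROM.
                from = $4
                for (i = 5; i <= NF - 6; i++) from = from "|" $i
                printf "%s\t%s\t%s\t%s\n", $2, $3, from, $(NF - 5)
        }'
}
```

Used as `spamdb | grey_fields`, this could stand in for a `grep | tr | cut` pipeline, at the cost of still failing if a HELO or recipient ever contains a pipe.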


--
Made with Opera's e-mail program: http://www.opera.com/mail/

Re: spamd and network whitelisting

Christopher Zimmermann-5
In reply to this post by Clint Pachl
On 2016-12-16 Clint Pachl <[hidden email]> wrote:

[...]
> What would be
> best is if we could blacklist these spammers upon first connection

I also wanted to make just-in-time decisions, but with dnswl lookups.
I wrote a program to intercept incoming, unknown smtp connections and
do a dnswl lookup to whitelist them just in time. You could do the same
for blacklisting, but only for lookups based on ip because the program
looks only at the initial syn packet.
For me this helped a lot to deliver mails faster which would otherwise
be delayed in the greytrap, or even get stuck, because they come from
smtp pools.


here are the pf rules:
pass in on egress inet proto tcp to (self) port smtp flags S/SA no state divert-packet port 25
pass in on egress inet proto tcp from <dnswl-grey> to (self) port smtp keep state rdr-to 127.0.0.1 port spamd
pass in log (to pflog1) on egress proto tcp from {<spamd-white> <dnswl-white>} to port smtp keep state

And here's the C program. It still has lots of dead debugging code:

#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/fcntl.h>
#include <sys/wait.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/ip6.h>
#include <netinet/tcp.h>
#include <net/if.h>
#include <net/pfvar.h>
#include <arpa/inet.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <poll.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <pwd.h>
#include <grp.h>
#include <err.h>
#include <assert.h>


#define DEBUG 0

#define DIVERT_PORT 25

#define NSTATES 10

struct dns_header {
    uint16_t id;
    uint16_t flags;
#define QR 0x8000
#define OPCODE_MASK 0x7800
#define OPCODE_SHIFT 11
#define AA 0x0400
#define TC 0x0200
#define RD 0x0100
#define RA 0x0080
#define AD 0x0020
#define CD 0x0010
#define RCODE_MASK 0x000f
#define RCODE_SHIFT 0
    uint16_t qdcount;
    uint16_t ancount;
    uint16_t nscount;
    uint16_t arcount;
};

struct dns_record {
    uint16_t type;
    uint16_t class;
    uint32_t ttl;
    uint16_t length;
};

struct state {
    union {
        struct in_addr in4;
        struct in6_addr in6;
        uint8_t octets[sizeof(struct in6_addr)];
    } addr;
    struct timespec timeout;
    int af;
    uint16_t dnskey;
} states[NSTATES];

void send_query(struct state *state, const char *question);
void process_response();

void enlist(struct state *state, int white);

int dnssock, pfdev;

const char *const whitelists[] = {
    "list.dnswl.org",
    "swl.spamhaus.org",
};

int main(int argc, char *argv[])
{
    int i, ret;
    time_t t;
    struct sockaddr_in sin4;
    struct sockaddr_in6 sin6;
    struct group *group;
    struct passwd *passwd;
    struct pollfd fds[3];

    tzset();

    pfdev = open("/dev/pf", O_RDWR);
    if (pfdev == -1) err(1, "open(\"/dev/pf\") failed");

    /* DNS */
    if (res_init() == -1) err(1, "res_init");
    assert(_res_ext.nsaddr_list[0].ss_family != 0);
    fds[0].fd = dnssock = socket(_res_ext.nsaddr_list[0].ss_family,
                       SOCK_DGRAM | SOCK_DNS, 0);
    if (fds[0].fd == -1) err(1, "socket");

    if (connect(fds[0].fd, (struct sockaddr *)&_res_ext.nsaddr_list[0],
                _res_ext.nsaddr_list[0].ss_len) != 0)
        err(1, "connect");

    /* IPv4 divert */
    memset(&sin4, 0, sizeof(sin4));
    sin4.sin_family = AF_INET;
    sin4.sin_port = htons(DIVERT_PORT);
    sin4.sin_addr.s_addr = INADDR_ANY;
    fds[1].fd = socket(AF_INET, SOCK_RAW, IPPROTO_DIVERT);
    if (fds[1].fd == -1) err(1, "socket");
    /* Set the divert direction only once the socket exists. */
    ret = IPPROTO_DIVERT_INIT;
    setsockopt(fds[1].fd, IPPROTO_IP, IP_DIVERTFL, &ret, sizeof(ret));
    if (bind(fds[1].fd, (struct sockaddr *) &sin4, sizeof(sin4)) != 0)
        err(1, "bind");

    /* IPv6 divert */
    memset(&sin6, 0, sizeof(sin6));
    sin6.sin6_family = AF_INET6;
    sin6.sin6_port = htons(DIVERT_PORT);
    sin6.sin6_addr = in6addr_any;
    fds[2].fd = socket(AF_INET6, SOCK_RAW, IPPROTO_DIVERT);
    if (fds[2].fd == -1) err(1, "socket");
    /* Same as above, for the IPv6 divert socket. */
    ret = IPPROTO_DIVERT_INIT;
    setsockopt(fds[2].fd, IPPROTO_IPV6, IP_DIVERTFL, &ret, sizeof(ret));
    if (bind(fds[2].fd, (struct sockaddr *) &sin6, sizeof(sin6)) != 0)
        err(1, "bind");

    group = getgrnam("_spamd");
    if (group == NULL) err(1, "getgrnam");
    endgrent();
    passwd = getpwnam("_spamd");
    if (passwd == NULL) err(1, "getpwnam");
    if (chroot("/var/empty") != 0) err(1, "chroot");
    if (setgroups(0, NULL) != 0) err(1, "setgroups");
    if (setgid(group->gr_gid) != 0) err(1, "setgid");
    if (setuid(passwd->pw_uid) != 0) err(1, "setuid");

    fds[0].events = POLLIN;
    fds[1].events = POLLIN;
    fds[2].events = POLLIN;

#if 0
    states[0].af = AF_INET;
    clock_gettime(CLOCK_MONOTONIC, &states[0].timeout);
    states[0].timeout.tv_sec++;
    states[0].addr.in4.s_addr = inet_addr("217.72.192.73");
    fds[0].events |= POLLOUT;
#endif

    while (1) {
        char src[48], dst[48];
        struct timespec timestamp;

#if DEBUG
        for (i=0; i < 3; i++)
            fprintf(stderr, "%d: fd:%d events:%hd revents:%hd\n",
                    i, fds[i].fd, fds[i].events, fds[i].revents);
        fprintf(stderr, "Polling");
#endif
        ret = -1;
        for (i=0; i < NSTATES; i++)
            if (states[i].af != 0 &&
                    (ret == -1 ||
                     timespeccmp(&states[i].timeout, &states[ret].timeout, <)))
                ret = i;
        if (ret == -1)
            ret = ppoll(fds, 3, NULL, NULL);
        else {
            if (clock_gettime(CLOCK_MONOTONIC, &timestamp) == -1)
                err(1, "clock_gettime");
            timespecsub(&states[ret].timeout, &timestamp, &timestamp);
            if (timestamp.tv_sec < 0) timestamp.tv_sec = timestamp.tv_nsec = 0;
            ret = ppoll(fds, 3, &timestamp, NULL);
        }
        if (ret == -1) err(1, "poll");
        if (clock_gettime(CLOCK_MONOTONIC, &timestamp) == -1)
            err(1, "clock_gettime");

#if DEBUG
        for(i=0; i < 3; i++)
            fprintf(stderr, "%d: fd:%d events:%hd revents:%hd\n",
                    i, fds[i].fd, fds[i].events, fds[i].revents);
#endif

        /* first check for DNS replies and timeouts to free up states. */
        if (fds[0].revents & POLLIN)
            process_response();

        /* timeouts */
        for (i=0; i < NSTATES; i++) {

            if (states[i].af != 0 &&
                    timespeccmp(&states[i].timeout, &timestamp, <))
                enlist(&states[i], 0);
        }

        /* send DNS queries ? */
        if (fds[0].revents & POLLOUT) {
            fds[0].events &= ~POLLOUT;
            for (i=0; i < NSTATES; i++) {
                if (states[i].af == 0) continue;
                if (states[i].dnskey == 0) {
                    arc4random_buf(&states[i].dnskey, sizeof(states[i].dnskey));
                    for (int j = 0; j < sizeof(whitelists) / sizeof(whitelists[0]); j++) {
                        send_query(&states[i], whitelists[j]);
                    }
                }
                if (states[i].dnskey == 0) {
                    fds[0].events |= POLLOUT;
                    break;
                }
            }
        }

        /* Then accept next smtp connects */
        if (fds[1].revents & POLLIN) {
            /* IPv4 */;
            char packet[IP_MAXPACKET];
            const struct ip * const ip = (struct ip *) packet;
            ret = recv(fds[1].fd, packet, sizeof(packet), MSG_DONTWAIT);
            if (ret == -1) err(1, "recv");
            if (ret < sizeof(struct ip)) {
                warnx("packet is too short");
                continue;
            }

            if (inet_ntop(AF_INET, &ip->ip_src, src,
                        sizeof(src)) == NULL)
                (void)strlcpy(src, "?", sizeof(src));

            if (inet_ntop(AF_INET, &ip->ip_dst, dst,
                        sizeof(dst)) == NULL)
                (void)strlcpy(dst, "?", sizeof(dst));

            t = time(NULL);
            fprintf(stderr, "%.19s: %s -> %s\n", ctime(&t), src, dst);

            ret = -1;
            for (i=0; i < NSTATES; i++) {
                if (states[i].addr.in4.s_addr == ip->ip_src.s_addr) {
                    ret = -2;
                    break;
                }
                if (states[i].af == 0) ret = i;
            }
            if (ret == -1)
                warnx("State table full");
            else if (ret == -2)
                warnx("Already seen");
            else {
                struct timespec timeout = { 0, 900000000 }; /* 0.9 s */
                states[ret].af = AF_INET;
                states[ret].addr.in4 = ip->ip_src;
                timespecadd(&timestamp, &timeout, &states[ret].timeout);

                /* queue dns */
                fds[0].events |= POLLOUT;

#if DEBUG
                fprintf(stderr, "Activated state %d for %s\n", ret, src);
#endif
            }
        }
        else if (fds[2].revents & POLLIN) {
            /* IPv6 */;
            char packet[IPV6_MAXPACKET];
            const struct ip6_hdr * const ip6 = (struct ip6_hdr *) packet;
            ret = recv(fds[2].fd, packet, sizeof(packet), MSG_DONTWAIT);
            if (ret == -1) err(1, "recv");
            if (ret < sizeof(struct ip6_hdr)) {
                warnx("packet is too short");
                continue;
            }

            if (inet_ntop(AF_INET6, &ip6->ip6_src, src,
                        sizeof(src)) == NULL)
                (void)strlcpy(src, "?", sizeof(src));

            if (inet_ntop(AF_INET6, &ip6->ip6_dst, dst,
                        sizeof(dst)) == NULL)
                (void)strlcpy(dst, "?", sizeof(dst));

            t = time(NULL);
            fprintf(stderr, "%.19s: %s -> %s\n", ctime(&t), src, dst);

            ret = -1;
            for (i=0; i < NSTATES; i++) {
                if (!memcmp(&states[i].addr.in6, &ip6->ip6_src,
                            sizeof(ip6->ip6_src))) {
                    ret = -2;
                    break;
                }
                if (states[i].af == 0) ret = i;
            }
            if (ret == -1)
                warnx("State table full");
            else if (ret == -2)
                warnx("Already seen");
            else {
                states[ret].af = AF_INET6;
                states[ret].addr.in6 = ip6->ip6_src;
                states[ret].timeout = timestamp;
                states[ret].timeout.tv_sec++; /* 1s timeout */

                /* queue dns */
                fds[0].events |= POLLOUT;

#if DEBUG
                fprintf(stderr, "Activated state %d for %s\n", ret, src);
#endif
            }
        }
    }
}

void send_query(struct state *state, const char *question)
{
    int ret;
    uint8_t msg[512];
    uint8_t *p = msg + sizeof(struct dns_header);
    struct dns_header *head = (struct dns_header *)msg;
    struct dns_record *record;
    char name[HOST_NAME_MAX];

    memset(msg, 0, sizeof(msg));

    head->id = htons(state->dnskey);
    head->flags = htons(RD);
    /* In practice only one question is supported by nameservers. */
    head->qdcount = htons(1);

    ret = snprintf(name, sizeof(name),
            "%hhu.%hhu.%hhu.%hhu.%s",
            state->addr.octets[3], state->addr.octets[2],
            state->addr.octets[1], state->addr.octets[0],
            question);
    if (ret >= sizeof(name)) errx(1, "truncated domain name");

    ret = dn_comp(name, p, sizeof(msg) - (p-msg), NULL, NULL);
    if (ret == -1) errx(1, "dn_comp");
    p += ret;

    record = (struct dns_record *)p;
    p += 4; /* no ttl or length in the question section */
    if (p - msg > sizeof(msg)) errx(1, "buffer too small");
    record->type = htons(1);
    record->class = htons(1);

    ret = send(dnssock, msg, p - msg, MSG_DONTWAIT); /* TODO: use poll */
    if (ret == -1) err(1, "send");
    /* A short send does not set errno, so use errx() rather than err(). */
    if (ret != p - msg) errx(1, "sent short datagram");
}

void process_response()
{
    int ret;
    uint8_t msg[512];
    const uint8_t *p;
    struct dns_header *head = (struct dns_header *)msg;
#if DEBUG
    struct dns_record *record;
    char name[HOST_NAME_MAX];
    char address[48]; /* target buffer for the inet_ntop() calls below */
#endif

    memset(msg, 0, sizeof(msg));
    ret = recv(dnssock, msg, sizeof(msg), MSG_DONTWAIT);
    if (ret == -1) err(1, "recv");
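    /* recv() never returns more than sizeof(msg); a full buffer may mean truncation. */
    if ((size_t)ret >= sizeof(msg)) warnx("Datagram possibly truncated.");
    if ((size_t)ret < sizeof(struct dns_header)) {
        warnx("DNS response is too short");
        return;
    }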

    msg[sizeof(msg) - 1] = '\0';

#if DEBUG
    fprintf(stderr, "Received DNS: id %#.4hx flags %#.4hx qdcount%hu "
        "ancount%hu nscount%hu arcount%hu\n",
        ntohs(head->id), ntohs(head->flags),
        ntohs(head->qdcount),
        ntohs(head->ancount),
        ntohs(head->nscount),
        ntohs(head->arcount));
#endif

    if ((ntohs(head->flags) & (QR|RCODE_MASK)) == QR) {
        /* lookup successful */
        /* states[] has NSTATES entries; skip unused slots (af == 0). */
        for (int i=0; i < NSTATES; i++)
            if (states[i].af != 0 &&
                    states[i].dnskey == ntohs(head->id)) {
                enlist(&states[i], 1);
            }
    }
    else if ((ntohs(head->flags) & RCODE_MASK) >> RCODE_SHIFT == 3)
        /* fprintf(stderr, "No entry found\n") */;
    else
        warnx("DNS response code not understood.");

    p = msg + sizeof(struct dns_header);

#if DEBUG
    /* Questions */
    fprintf(stderr, "QUESTIONS section:\n"); fflush(stderr);
    for (int i = 0; i < ntohs(head->qdcount); i++) {
        ret = dn_expand(msg, msg + sizeof(msg), p, name, sizeof(name));
        if (ret == -1) errx(1, "dn_expand");
        p += ret;

        record = (struct dns_record *)p;
        p += 4; /* no ttl or length in the question section */
        if (p - msg > sizeof(msg)) {
            warnx("end of buffer reached.");
            break;
        }

        /* type and class are only read here; the reply is not modified. */
        fprintf(stderr, "QNAME <%s> QTYPE %hu QCLASS %hu\n",
                name,
                ntohs(record->type), /* QTYPE */
                ntohs(record->class)); /* QCLASS */
    }

    /* Answers */
    fprintf(stderr, "ANSWERS section:\n"); fflush(stderr);
    for (int i = 0; i < ntohs(head->ancount); i++) {
        ret = dn_expand(msg, msg + sizeof(msg), p, name, sizeof(name));
        if (ret == -1) errx(1, "dn_expand");
        p += ret;

        record = (struct dns_record *)p;
        p += 10;
        if (p - msg > sizeof(msg)) {
            warnx("end of buffer reached.");
            break;
        }

        fprintf(stderr, "QNAME <%s> TYPE %hu CLASS %hu TTL %u length %hu data <%s> as "
                "IPv4 address: %s\n",
                name,
                ntohs(record->type), /* QTYPE */
                ntohs(record->class), /* QCLASS */
                ntohl(record->ttl), /* TTL */
                ntohs(record->length), /* RDATA length */
                p, inet_ntoa(*(struct in_addr *)p));

        p += ntohs(record->length);
    }

    /* Authorities */
    fprintf(stderr, "AUTHORITIES section:\n"); fflush(stderr);
    for (int i = 0; i < ntohs(head->nscount); i++) {
        ret = dn_expand(msg, msg + sizeof(msg), p, name, sizeof(name));
        if (ret == -1) errx(1, "dn_expand");
        p += ret;

        record = (struct dns_record *)p;
        p += 10;
        if (p - msg > sizeof(msg)) {
            warnx("end of buffer reached.");
            break;
        }

        if (ntohs(record->type) == 1 && ntohs(record->class) == 1 &&
            ntohs(record->length) == 4) {
            if (inet_ntop(AF_INET, p, address, sizeof(address)) == NULL)
                err(1, "inet_ntop");
        }

        fprintf(stderr, "QNAME <%s> TYPE %hu CLASS %hu TTL %u length %hu data <%s> as "
                "IPv4 address: %s\n",
                name,
                ntohs(record->type), /* QTYPE */
                ntohs(record->class), /* QCLASS */
                ntohl(record->ttl), /* TTL */
                ntohs(record->length), /* RDATA length */
                p, inet_ntoa(*(struct in_addr *)p));

        p += ntohs(record->length);
    }

    /* Additional */
    fprintf(stderr, "ADDITIONAL section:\n"); fflush(stderr);
    for (int i = 0; i < ntohs(head->arcount); i++) {
        ret = dn_expand(msg, msg + sizeof(msg), p, name, sizeof(name));
        if (ret == -1) errx(1, "dn_expand");
        p += ret;

        record = (struct dns_record *)p;
        p += 10;
        if (p - msg > sizeof(msg)) {
            warnx("end of buffer reached.");
            break;
        }

        if (ntohs(record->type) == 1 && ntohs(record->class) == 1 &&
            ntohs(record->length) == 4) {
            if (inet_ntop(AF_INET, p, address, sizeof(address)) == NULL)
                err(1, "inet_ntop");
        }

        fprintf(stderr, "QNAME <%s> TYPE %hu CLASS %hu TTL %u length %hu data <%s> as "
                "IPv4 address: %s\n",
                name,
                ntohs(record->type), /* QTYPE */
                ntohs(record->class), /* QCLASS */
                ntohl(record->ttl), /* TTL */
                ntohs(record->length), /* RDATA length */
                p, inet_ntoa(*(struct in_addr *)p));

        p += ntohs(record->length);
    }
#endif
}

void enlist(struct state *state, int white)
{
#if 0
    int ret;
    pid_t pid;
#endif
    time_t t;
    struct pfioc_table pfioc;
    struct pfr_addr pfaddr;
    char address[48];

    /* add to spamd-white/grey table */
    bzero(&pfioc, sizeof(pfioc));
    bzero(&pfaddr, sizeof(pfaddr));
    strlcpy(pfioc.pfrio_table.pfrt_name,
            white ? "dnswl-white" : "dnswl-grey",
            sizeof(pfioc.pfrio_table.pfrt_name));
    pfioc.pfrio_buffer = &pfaddr;
    pfioc.pfrio_esize = sizeof(pfaddr);
    pfioc.pfrio_size = 1;

    pfaddr.pfra_af = state->af;
    switch (state->af) {
        case AF_INET:
            pfaddr.pfra_ip4addr = state->addr.in4;
            pfaddr.pfra_net = 32;
            break;
        case AF_INET6:
            pfaddr.pfra_ip6addr = state->addr.in6;
            pfaddr.pfra_net = 128;
            break;
        default:
            errx(1, "unknown address family %d", state->af);
    }

    if (inet_ntop(pfaddr.pfra_af, &pfaddr.pfra_ip4addr, address,
                sizeof(address)) == NULL)
        err(1, "inet_ntop");

    if (ioctl(pfdev, DIOCRADDADDRS, &pfioc) == -1)
        err(1, "cannot add %s to table %s", address, pfioc.pfrio_table.pfrt_name);
    else {
        t = time(NULL);
        fprintf(stderr, "%.19s: added %d %s to table %s\n",
                ctime(&t), pfioc.pfrio_nadd, address, pfioc.pfrio_table.pfrt_name);
    }

    /* add to spamdb database by running the spamdb command. */
#if 0
    if (white) {
        pid = fork();
        if (pid == -1)
            err(1, "fork");
        else if (pid == 0) {
            execle("/usr/sbin/spamdb", "spamdb", "-a", address, (char *)NULL, (char **)NULL);
            err(1, "execle");
        }
        do {
            pid_t pid2;
            pid2 = waitpid(pid, &ret, 0);
            if (pid2 == -1) err(1, "waitpid");
            assert(pid2 == pid);
        } while (! WIFEXITED(ret) );

        if (WEXITSTATUS(ret) != 0)
            warnx("spamdb -a %s failed with %d", address, WEXITSTATUS(ret));
    }
#endif

    memset(state, 0, sizeof(*state));
}
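
For readers who want to try the query-name step on its own: send_query()
above builds the lookup name by reversing the four address octets and
appending the whitelist zone. Below is a minimal standalone sketch of that
step. The helper names dnswl_name_v4() and dnswl_name_v6() and the zone
"wl.example.org" are made up for illustration and are not part of the
program above. The IPv6 variant assumes the reversed-nibble convention that
DNS-based lists normally use for IPv6; send_query() above only uses the
first four octets, so treat that part as a suggestion, not as what the
program does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: turn a.b.c.d plus a zone into "d.c.b.a.zone".
 * Returns 0 on success, -1 if the result does not fit into out.
 */
int dnswl_name_v4(const struct in_addr *addr, const char *zone,
        char *out, size_t outlen)
{
    unsigned char o[4];
    int ret;

    assert(addr != NULL && zone != NULL && out != NULL);
    /* s_addr is in network byte order, so o[0] is the first octet. */
    memcpy(o, &addr->s_addr, sizeof(o));
    ret = snprintf(out, outlen, "%u.%u.%u.%u.%s",
            (unsigned)o[3], (unsigned)o[2],
            (unsigned)o[1], (unsigned)o[0], zone);
    return (ret < 0 || (size_t)ret >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: 32 hex nibbles in reverse order, each followed
 * by a dot, then the zone (assumed convention for IPv6 list lookups).
 */
int dnswl_name_v6(const struct in6_addr *addr, const char *zone,
        char *out, size_t outlen)
{
    static const char hex[] = "0123456789abcdef";
    size_t pos = 0;
    int i, ret;

    assert(addr != NULL && zone != NULL && out != NULL);
    if (outlen <= 64)           /* 16 bytes * 2 nibbles * 2 chars */
        return -1;
    for (i = 15; i >= 0; i--) {
        out[pos++] = hex[addr->s6_addr[i] & 0x0f];  /* low nibble first */
        out[pos++] = '.';
        out[pos++] = hex[addr->s6_addr[i] >> 4];
        out[pos++] = '.';
    }
    ret = snprintf(out + pos, outlen - pos, "%s", zone);
    return (ret < 0 || (size_t)ret >= outlen - pos) ? -1 : 0;
}
```

With the address 192.0.2.1 and the placeholder zone, dnswl_name_v4() writes
"1.2.0.192.wl.example.org". Both helpers report truncation through the
return value instead of exiting, so the caller can decide what to do.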

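A second sketch, this time for the decision made in process_response()
above, where a reply counts as a whitelist hit when the QR bit is set and
the response code is 0. The version below reads the header fields byte by
byte instead of casting the buffer to a struct, and it additionally
requires at least one answer record, which is my assumption and stricter
than the code above. The names dnswl_classify(), enum dnswl_verdict and the
DNS_* constants are hypothetical; the bit positions follow the standard DNS
header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN      12
#define DNS_FLAG_QR         0x8000u  /* set in responses */
#define DNS_RCODE_MASK      0x000fu  /* low four bits of the flags word */
#define DNS_RCODE_NXDOMAIN  3

enum dnswl_verdict { DNSWL_LISTED, DNSWL_NOT_LISTED, DNSWL_ERROR };

/* Read a 16-bit big-endian value without alignment assumptions. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: classify a raw DNS reply and return its id
 * through *id so the caller can match it against a pending state.
 */
enum dnswl_verdict dnswl_classify(const uint8_t *msg, size_t len,
        uint16_t *id)
{
    uint16_t flags, ancount;

    assert(msg != NULL && id != NULL);
    if (len < DNS_HEADER_LEN)
        return DNSWL_ERROR;         /* not even a full header */

    *id = rd16(msg);
    flags = rd16(msg + 2);
    ancount = rd16(msg + 6);

    if (!(flags & DNS_FLAG_QR))
        return DNSWL_ERROR;         /* a query, not a response */

    switch (flags & DNS_RCODE_MASK) {
    case 0:                         /* NOERROR */
        return ancount > 0 ? DNSWL_LISTED : DNSWL_NOT_LISTED;
    case DNS_RCODE_NXDOMAIN:
        return DNSWL_NOT_LISTED;
    default:
        return DNSWL_ERROR;         /* SERVFAIL, REFUSED, ... */
    }
}
```

A caller would pass the buffer and byte count returned by recv() and only
look up the matching state when the verdict is DNSWL_LISTED.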

--
http://gmerlin.de
OpenPGP: http://gmerlin.de/christopher.pub
2779 7F73 44FD 0736 B67A  C410 69EC 7922 34B4 2566
