spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Adam Wolk-2
Hi misc@

I upgraded my mail server to an amd64 snapshot from Sep 2nd and found
the server stuck delivering mail in the morning, with spamassassin
churning at 90% CPU usage.

A quick investigation led me to a huge bayes_toks file of 65.3G in
/var/spampd/.spamassasin/.

$ ls -alh
total 4738352
drwx------  2 _spampd  _spampd   512B Sep  4 10:00 .
drwxr-xr-x  3 _spampd  _spampd   512B Sep  3 15:57 ..
-rw-------  1 _spampd  _spampd    36B Sep  4 09:53 bayes.lock
-rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
-rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks

$ file bayes_toks
bayes_toks: Berkeley DB 1.85 (Hash, version 2, native byte-order)


Interestingly, I don't see that much space used in df (does anyone
know why?):

$ df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd0a     1008M   90.1M    868M     9%    /
/dev/sd0k      9.8G   80.3M    9.3G     1%    /home
/dev/sd0d      3.9G    118K    3.7G     0%    /tmp
/dev/sd0f      3.9G    1.0G    2.7G    28%    /usr
/dev/sd0g     1001M    212M    738M    22%    /usr/X11R6
/dev/sd0h      9.8G    572M    8.8G     6%    /usr/local
/dev/sd0j      3.9G    2.0K    3.7G     0%    /usr/obj
/dev/sd0i      2.0G    2.0K    1.9G     0%    /usr/src
/dev/sd0e      598G    4.3G    564G     1%    /var

I removed the file and disk usage dropped by 2.3G on /var.


Has anyone experienced issues with spamassassin/spampd similar to the
one reported above?

p5-Mail-SpamAssassin-3.4.1p2 (installed)
spampd-2.30p3 (installed)

After deleting the file and restarting the service, processing a single
email brought the DB to a reported size of 37.9M; a few emails later
it's already reported as 113M. I have a hunch that it will bloat again
really fast.

Regards,
Adam

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

koko
On Fri, 4 Sep 2015 10:20:01 +0200
Adam Wolk <[hidden email]> wrote:

> After deleting the file, restarting the service processing a single
> email brought the DB to reported size 37.9M, few emails later it's
> already reported as 113M I have a hunch that it will bloat again really
> fast.
>

Try disabling Bayes: set the parameter "use_bayes 0" in the
server-wide local.cf configuration file.

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Michael McConville-2
[hidden email] wrote:
> Adam Wolk <[hidden email]> wrote:
> > After deleting the file, restarting the service processing a single
> > email brought the DB to reported size 37.9M, few emails later it's
> > already reported as 113M I have a hunch that it will bloat again
> > really fast.
>
> try to disable bayes, set parameter "use_bayes 0" and placed into the
> server-wide local.cf configuration file.

I administer a mail server running Debian Jessie that uses the shell
script method of calling SpamAssassin from Postfix. SpamAssassin uses a
ton of CPU there too, so I don't think this is an OpenBSD problem.

That said, you probably shouldn't disable Bayesian filtering. IIUC,
that's the main point of using SpamAssassin, and it's necessary to block
almost all spam.

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Adam Wolk-2
On Fri, 4 Sep 2015 12:31:13 -0400
Michael McConville <[hidden email]> wrote:

> [hidden email] wrote:
> > Adam Wolk <[hidden email]> wrote:
> > > After deleting the file, restarting the service processing a
> > > single email brought the DB to reported size 37.9M, few emails
> > > later it's already reported as 113M I have a hunch that it will
> > > bloat again really fast.
> >
> > try to disable bayes, set parameter "use_bayes 0" and placed into
> > the server-wide local.cf configuration file.
>
> I administrate a mail server running Debian Jessie that uses the shell
> script method of calling SpamAssassin from Postfix. It uses a ton of
> CPU, so I don't think this is an OpenBSD problem.
>
> That said, you probably shouldn't disable Bayesian filtering. IIUC,
> that's the main point of using SpamAssassin, and it's necessary to
> block almost all spam.

Thanks. I initially suspected that something was misconfigured on my
previous snapshots, as I saw spamassassin being executed but it never
used a lot of CPU (though it did flag 1 email, literally one, as spam,
but that's the expected volume for a server with 2 accounts).

It's quite possible that Bayesian filtering started working for me only
since this snapshot. I would appreciate it if you could check the size
of your bayes_toks db and share some info on general growth per email
(it seems to be around 30-60M on my server), as that's the only thing I
think could be wrong with it atm. 65.3G accumulated in less than 24h
for a DB that serves around 11k emails *per month* seems a lot (and
most of that traffic is OpenBSD mailing lists).

Regards,
Adam

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Paul de Weerd
In reply to this post by Adam Wolk-2
On Fri, Sep 04, 2015 at 10:20:01AM +0200, Adam Wolk wrote:
| Hi misc@
|
| I upgraded my mail server to an amd64 snapshot from Sep 2nd and found
| the server stuck delivering mail in the morning with spamassasin
| churning at 90% CPU usage.
|
| Quick investigation lead me to a huge bayes_toks file of 65.3G in
| /var/spampd/.spamassasin/.
|
| $ ls -alh
| total 4738352
| drwx------  2 _spampd  _spampd   512B Sep  4 10:00 .
| drwxr-xr-x  3 _spampd  _spampd   512B Sep  3 15:57 ..
| -rw-------  1 _spampd  _spampd    36B Sep  4 09:53 bayes.lock
| -rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
| -rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks
|
| $ file bayes_toks
| bayes_toks: Berkeley DB 1.85 (Hash, version 2, native byte-order)
|
|
| Interestingly I don't see that much space used with df (anyone knows
| why?):

You should read up on sparse files.  Here's a quick trick from the
sparse files book of tricks:

# First we create a file 'bigfile' using dd:
[weerd@despair] $ dd if=/dev/zero of=bigfile bs=1048576 count=10 seek=1024
10+0 records in
10+0 records out
10485760 bytes transferred in 0.178 secs (58799094 bytes/sec)

# ls will tell us how big this file is:
[weerd@despair] $ ls -lh bigfile
-rw-r--r--  1 weerd  weerd   1.0G Sep  4 19:51 bigfile

# du will tell us how much space is in use by this file:
[weerd@despair] $ du -sh bigfile
10.1M   bigfile

# cp is even better at the sparse files game:
[weerd@despair] $ cp bigfile bigfile2

# bigfile2 is the same as bigfile:
[weerd@despair] $ ls -lh bigfile2
-rw-r--r--  1 weerd  weerd   1.0G Sep  4 19:54 bigfile2

# No, really .. exactly the same:
[weerd@despair] $ md5 bigfile*
MD5 (bigfile) = 5ec6988d232a445bc40b9dca003b95f7
MD5 (bigfile2) = 5ec6988d232a445bc40b9dca003b95f7

# However, it uses a lot less disk space:
[weerd@despair] $ du -sh bigfile2
48.0K   bigfile2


TL;DR: files with lots of emptiness (consecutive ranges of all 0 data)
are efficiently stored using "sparse files". ls reports the length of
the file, while du and df only count the blocks actually allocated.
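
To check a suspect file such as bayes_toks without eyeballing two
listings, the comparison can be scripted. This is only a minimal sketch:
the function names are made up for the example, and it assumes that
"ls -ln" prints the byte count in the fifth column and that "du -k"
reports allocated space in KiB. On a filesystem with transparent
compression the result may be misleading.

```sh
#!/bin/sh
# Sketch: compare a file's apparent size with the space actually allocated.
# All function names here are hypothetical helpers, not standard tools.

# Apparent size in KiB, taken from the byte count that ls -l prints.
apparent_kb() {
    ls -ln "$1" | awk '{ print int($5 / 1024) }'
}

# Allocated size in KiB, as counted by du.
allocated_kb() {
    du -k "$1" | awk '{ print $1 }'
}

# Succeeds if the file occupies less disk space than its length suggests.
is_sparse() {
    [ "$(allocated_kb "$1")" -lt "$(apparent_kb "$1")" ]
}
```

Something like "is_sparse bayes_toks && echo sparse" run in the
directory from the original post would then show whether the 65.3G is
mostly holes.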

| $ df -h
| Filesystem     Size    Used   Avail Capacity  Mounted on
| /dev/sd0a     1008M   90.1M    868M     9%    /
| /dev/sd0k      9.8G   80.3M    9.3G     1%    /home
| /dev/sd0d      3.9G    118K    3.7G     0%    /tmp
| /dev/sd0f      3.9G    1.0G    2.7G    28%    /usr
| /dev/sd0g     1001M    212M    738M    22%    /usr/X11R6
| /dev/sd0h      9.8G    572M    8.8G     6%    /usr/local
| /dev/sd0j      3.9G    2.0K    3.7G     0%    /usr/obj
| /dev/sd0i      2.0G    2.0K    1.9G     0%    /usr/src
| /dev/sd0e      598G    4.3G    564G     1%    /var
|
| I removed the file and disk usage dropped by 2.3G on /var.
|
|
| Did anyone experience issues with spamassasin/spampd similar to the
| one reported above?
|
| p5-Mail-SpamAssassin-3.4.1p2 (installed)
| spampd-2.30p3 (installed)
|
| After deleting the file, restarting the service processing a single
| email brought the DB to reported size 37.9M, few emails later it's
| already reported as 113M I have a hunch that it will bloat again really
| fast.
|
| Regards,
| Adam
|

--
>++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
+++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
                 http://www.weirdnet.nl/                 

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Chris Cappuccio
In reply to this post by Adam Wolk-2
Adam Wolk [[hidden email]] wrote:

> Hi misc@
>
> I upgraded my mail server to an amd64 snapshot from Sep 2nd and found
> the server stuck delivering mail in the morning with spamassasin
> churning at 90% CPU usage.
>
> Quick investigation lead me to a huge bayes_toks file of 65.3G in
> /var/spampd/.spamassasin/.
>
> $ ls -alh
> total 4738352
> drwx------  2 _spampd  _spampd   512B Sep  4 10:00 .
> drwxr-xr-x  3 _spampd  _spampd   512B Sep  3 15:57 ..
> -rw-------  1 _spampd  _spampd    36B Sep  4 09:53 bayes.lock
> -rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
> -rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks
>

What are your memory limits for the user/daemon class that runs spamassassin?

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Adam Wolk-2
On Fri, 4 Sep 2015 11:08:35 -0700
Chris Cappuccio <[hidden email]> wrote:

> Adam Wolk [[hidden email]] wrote:
> > Hi misc@
> >
> > I upgraded my mail server to an amd64 snapshot from Sep 2nd and
> > found the server stuck delivering mail in the morning with
> > spamassasin churning at 90% CPU usage.
> >
> > Quick investigation lead me to a huge bayes_toks file of 65.3G in
> > /var/spampd/.spamassasin/.
> >
> > $ ls -alh
> > total 4738352
> > drwx------  2 _spampd  _spampd   512B Sep  4 10:00 .
> > drwxr-xr-x  3 _spampd  _spampd   512B Sep  3 15:57 ..
> > -rw-------  1 _spampd  _spampd    36B Sep  4 09:53 bayes.lock
> > -rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
> > -rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks
> >
>
> What are your memory limits for the user/daemon class that runs
> spamassassin?

Touché, not set. It has been running like that since ~December last
year though, hence my question to misc@ whether anyone noticed it
behaving differently since the last release. In no way am I assuming
that something is wrong on the OS / software level; in fact I assumed
that I had set things up incorrectly. So far I've learned a ton of
useful info by asking on the list here, hope no one feels offended :)

$ cat /etc/login.conf | grep -i spam
$

Regards,
Adam

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Chris Cappuccio
Adam Wolk [[hidden email]] wrote:

> > > -rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
> > > -rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks
> > >
> >
> > What are your memory limits for the user/daemon class that runs
> > spamassassin?
>
> Touche, not set. Though it was running like that since ~December last
> year hence my question to misc@ if anyone noticed it behaving
> differently since the last release. In no way I'm assuming that
> something is wrong on the OS / software level - in fact I assumed that
> my setup was performed incorrectly by me. So far I learned a ton of
> useful info by asking on the list here, hope no one feels offended :)
>
> $ cat /etc/login.conf | grep -i spam
> $
>

Well, it still runs with some class, perhaps daemon?

I guess I'm really asking: is your login.conf modified? Post it and your rc.conf.local.

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Frank Brodbeck-2
In reply to this post by Adam Wolk-2
> $ cat /etc/login.conf | grep -i spam
> $

UUOC (useless use of cat):

grep -i spam /etc/login.conf

But that does not actually answer the question, as we don't know the login class you are using and what its limits are ;-)

You can get the login class by using id(1). For the limits I think you need to read login.conf.
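
Reading the limits out of login.conf by eye gets tedious because of the
backslash continuations, so here is a rough sketch that prints the
capabilities of a single class. login_class_caps is a made-up helper
name, and the sketch deliberately does not follow tc= references, so
inherited values have to be looked up by running it again on the
referenced class.

```sh
#!/bin/sh
# Sketch: list the capabilities of one class from a login.conf-style file.
# login_class_caps is a hypothetical helper; it does not resolve tc= chains.
# Usage: login_class_caps CLASS FILE
login_class_caps() {
    awk -v class="$1" '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            more = sub(/\\$/, "")            # drop a trailing backslash, note it
            entry = entry $0                 # glue continuation lines together
            if (more) next                   # entry continues on the next line
            n = split(entry, cap, ":")       # cap[1] is the class name
            if (cap[1] == class)
                for (i = 2; i <= n; i++) {
                    gsub(/^[ \t]+|[ \t]+$/, "", cap[i])
                    if (cap[i] != "") print cap[i]
                }
            entry = ""
        }
    ' "$2"
}
```

For example, "login_class_caps daemon /etc/login.conf" prints one
capability per line for the daemon class, which makes it easy to see
whether datasize is capped for daemons started from rc.d.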

Frank.

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Adam Wolk-2
In reply to this post by Chris Cappuccio
On Fri, 4 Sep 2015 11:37:09 -0700
Chris Cappuccio <[hidden email]> wrote:

> Adam Wolk [[hidden email]] wrote:
> > > > -rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
> > > > -rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks
> > > >
> > >
> > > What are your memory limits for the user/daemon class that runs
> > > spamassassin?
> >
> > Touche, not set. Though it was running like that since ~December
> > last year hence my question to misc@ if anyone noticed it behaving
> > differently since the last release. In no way I'm assuming that
> > something is wrong on the OS / software level - in fact I assumed
> > that my setup was performed incorrectly by me. So far I learned a
> > ton of useful info by asking on the list here, hope no one feels
> > offended :)
> >
> > $ cat /etc/login.conf | grep -i spam
> > $
> >
>
> Well it still runs with some class, perhaps as daemon ?
>
> I guess I'm really asking, is your login.conf modified? Post it and
> your rc.conf.local
>

Not modified by hand.

$ grep -i spam /etc/passwd
_spamd:*:62:62:Spam Daemon:/var/empty:/sbin/nologin
_spamdaemon:*:506:506:SpamAssassin:/var/db/spamassassin:/sbin/nologin
_spampd:*:746:746:spampd user:/var/spampd:/sbin/nologin
$ id _spamd
uid=62(_spamd) gid=62(_spamd) groups=62(_spamd)
$ id _spamdaemon
uid=506(_spamdaemon) gid=506(_spamdaemon) groups=506(_spamdaemon)
$ id _spampd
uid=746(_spampd) gid=746(_spampd) groups=746(_spampd)
$



$ cat /etc/login.conf
# $OpenBSD: login.conf,v 1.5 2015/07/20 18:53:18 sthen Exp $

#
# Sample login.conf file.  See login.conf(5) for details.
#

#
# Standard authentication styles:
#
# passwd        Use only the local password file
# chpass        Do not authenticate, but change users password (change
#               the YP password if the user has one, else change the
#               local password)
# lchpass       Do not login; change user's local password instead
# radius        Use radius authentication
# reject        Use rejected authentication
# skey          Use S/Key authentication
# activ         ActivCard X9.9 token authentication
# crypto        CRYPTOCard X9.9 token authentication
# snk           Digital Pathways SecureNet Key authentication
# tis           TIS Firewall Toolkit authentication
# token         Generic X9.9 token authentication
# yubikey       YubiKey authentication
#

# Default allowed authentication styles
auth-defaults:auth=passwd,skey:

# Default allowed authentication styles for authentication type ftp
auth-ftp-defaults:auth-ftp=passwd:

#
# The default values
# To alter the default authentication types change the line:
#       :tc=auth-defaults:\
# to be read something like: (enables passwd, "myauth", and activ)
#       :auth=passwd,myauth,activ:\
# Any value changed in the daemon class should be reset in default
# class.
#
default:\
        :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
        :umask=022:\
        :datasize-max=512M:\
        :datasize-cur=512M:\
        :maxproc-max=256:\
        :maxproc-cur=128:\
        :openfiles-cur=512:\
        :stacksize-cur=4M:\
        :localcipher=blowfish,8:\
        :ypcipher=old:\
        :tc=auth-defaults:\
        :tc=auth-ftp-defaults:

#
# Settings used by /etc/rc and root
# This must be set properly for daemons started as root by inetd as well.
# Be sure reset these values back to system defaults in the default class!
#
daemon:\
        :ignorenologin:\
        :datasize=infinity:\
        :maxproc=infinity:\
        :openfiles-cur=128:\
        :stacksize-cur=8M:\
        :localcipher=blowfish,9:\
        :tc=default:

#
# Staff have fewer restrictions and can login even when nologins are set.
#
staff:\
        :datasize-cur=1536M:\
        :datasize-max=infinity:\
        :maxproc-max=512:\
        :maxproc-cur=256:\
        :ignorenologin:\
        :requirehome@:\
        :tc=default:

#
# Authpf accounts get a special motd and shell
#
authpf:\
        :welcome=/etc/motd.authpf:\
        :shell=/usr/sbin/authpf:\
        :tc=default:

#
# Building ports with DPB uses raised limits
#
pbuild:\
        :datasize-max=infinity:\
        :datasize-cur=4096M:\
        :maxproc-max=1024:\
        :maxproc-cur=256:\
        :tc=default:

#
# Override resource limits for certain daemons started by rc.d(8)
#
bgpd:\
        :openfiles-cur=512:\
        :tc=daemon:

unbound:\
        :openfiles-cur=512:\
        :tc=daemon:

dovecot:\
        :openfiles-cur=512:\
        :openfiles-max=2048:\
        :tc=daemon:

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Stuart Henderson
In reply to this post by Adam Wolk-2
On 2015-09-04, Adam Wolk <[hidden email]> wrote:
> It's quite possible that Bayesian filtering started working for me only
> since this snapshot. I would appreciate it if you could check the size
> of your bayes_toks db & some info on general growth per email (seems to
> be around 30-60M on my server) as that's the only thing I think could
> be wrong with it atm. 65.3G accumulated in less than 24h for a DB that
> serves around 11k emails *per month* seems a lot (and most of that
> traffic are OpenBSD mailing lists).

That definitely seems wrong; my bayes_toks, with 500-1000 mails/day
through amavis+spamassassin, is around 5MB. I'm not sure where to start
looking though. I'd probably try wiping the db and starting again,
though the only time I remember having to do that myself was when
someone was relaying spam through a host in DNSWL, which got
auto-learned as ham (i.e. bogus data, not corruption).

$ sudo -u _vscan sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       4202          0  non-token data: nspam
0.000          0       1799          0  non-token data: nham
0.000          0     151022          0  non-token data: ntokens
0.000          0 1422052584          0  non-token data: oldest atime
0.000          0 1441425256          0  non-token data: newest atime
0.000          0 1441426503          0  non-token data: last journal sync atime
0.000          0 1441412100          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
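
To compare databases between servers it can help to pull single
counters out of that dump. The following is a small sketch, not part of
SpamAssassin: magic_field is a made-up helper, and it assumes the column
layout shown above (value in the third column, label after
"non-token data: ").

```sh
#!/bin/sh
# Sketch: extract one counter from "sa-learn --dump magic" output on stdin.
# magic_field is a hypothetical helper; the layout is assumed from the
# sample output in this thread.
# Usage: sa-learn --dump magic | magic_field ntokens
magic_field() {
    awk -v name="$1" '
        {
            label = $0
            sub(/^.*non-token data: /, "", label)   # keep only the label text
            if (label == name) print $3             # third column is the value
        }
    '
}
```

On the dump above, "magic_field ntokens" would print 151022. Running it
as the user that owns the Bayes files (presumably _spampd on the server
from the original post) would show whether the token count is anywhere
near that, or whether only the file size has exploded.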

Re: spamassasin large CPU usage on new snapshot and a huge bayes_toks file not reported in df

Chris Cappuccio
In reply to this post by Adam Wolk-2
Adam Wolk [[hidden email]] wrote:

> On Fri, 4 Sep 2015 11:37:09 -0700
> Chris Cappuccio <[hidden email]> wrote:
>
> > Adam Wolk [[hidden email]] wrote:
> > > > > -rw-------  1 _spampd  _spampd   9.8M Sep  3 22:52 bayes_seen
> > > > > -rw-------  1 _spampd  _spampd  65.3G Sep  3 22:55 bayes_toks
> > > > >
> > > >
> > > > What are your memory limits for the user/daemon class that runs
> > > > spamassassin?
> > >
> > > Touche, not set. Though it was running like that since ~December
> > > last year hence my question to misc@ if anyone noticed it behaving
> > > differently since the last release. In no way I'm assuming that
> > > something is wrong on the OS / software level - in fact I assumed
> > > that my setup was performed incorrectly by me. So far I learned a
> > > ton of useful info by asking on the list here, hope no one feels
> > > offended :)
> > >
> > > $ cat /etc/login.conf | grep -i spam
> > > $
> > >
> >
> > Well it still runs with some class, perhaps as daemon ?
> >
> > I guess I'm really asking, is your login.conf modified? Post it and
> > your rc.conf.local
> >
>
> Not modified by hand.
>

In that case, I wonder if you are hitting some kind of bug.
I have been having regular crashes under perl in 5.8/5.8 current,
I think from spamassassin (called via mailscanner). It looks
like I am hitting some occasional corruption within the sqlite
library after being called through the perl module.