               popsneaker 0.6 -- email filter
               http://www.ixtools.de/popsneaker
                        January 2002
===========================================================================

see the file "AUTHORS" for contact information.
see the file "COPYING" for license issues.
see the file "ChangeLog" for information about changes.
see the file "INSTALL" for installation (compiling) instructions.
see the file "TODO" for information about what has to be done.

There is HTML documentation in the popsneaker/doc directory.

===========================================================================

  The Popsneaker Handbook
  Stefan Baehre popsneaker@ixtools.de
  Version 0.6.2a, Tue Dec 28 18:25:55 CET 2002

  This Handbook describes Popsneaker Version 0.6.2
  ______________________________________________________________________

  Table of Contents


  1. Introduction
     1.1 Changes

  2. Installation
     2.1 How to obtain popsneaker
     2.2 Requirements
     2.3 Compilation and installation

  3. Usage
     3.1 General Usage
     3.2 Configuration
        3.2.1 popsneakerrc
        3.2.2 Startup
        3.2.3 Logfile
     3.3 Filters
        3.3.1 maxsize
        3.3.2 accept, assume and deny
        3.3.3 dupcheck
        3.3.4 Scoring
        3.3.5 Options
        3.3.6 Restricting headers
        3.3.7 Regular expressions
        3.3.8 Example

  4. Questions and Answers
  5. Copyright


  ______________________________________________________________________

  1.  Introduction


  This is the documentation of popsneaker, a mail filter for remote
  filtering of email accounts.  It is most useful for computers that
  have a dial-up connection to the internet.  You can define rules to
  select emails that you don't want to download to your local host.
  This is a simple and effective way to get rid of spam, advertising
  and other kinds of unwanted mail.  The filter rules are flexible and
  powerful, but still easy to handle.  The main rule types use regular
  expressions to deny mail, to accept it, or to state an assumption
  that the mail must fulfill.



  1.1.  Changes


  This is the 0.6 release of popsneaker.  It contains some redesigns in
  the networking code to make popsneaker more modular.  While prior
  versions supported only the POP3 protocol, it should now become
  possible to connect to servers using many different methods.  The
  following mail-retrieval protocols are currently supported: POP3
  (incl. APOP).

  Version 0.5 was the first implementation in C++ and exceeds the
  previous versions in performance and efficiency.  Versions 0.4 and
  earlier were implemented as Tcl scripts.  There is no longer any
  reason to use those versions.

  For more detailed information, read the file ChangeLog that comes
  with the archive.



  2.  Installation


  2.1.  How to obtain popsneaker


  The latest release can always be downloaded from ixtools.de.



  2.2.  Requirements


  In order to use popsneaker, you need a glibc-based system.  Most
  modern Linux distributions come with this library.

  Tcp4u is also required.  It is available at the following sites:

  ftp://papa.indstate.edu/winsock-l/WindowsNT/Develop and its mirrors,
  http://download.com and http://www.hotfiles.com.

  Tcp4u is distributed with Debian GNU/Linux 2.2.  If you know of any
  other distributions that include it, please let me know.

  You can also get prepackaged versions for Debian and Red Hat at
  ixtools.de.

  If you don't have this library and don't want to install it, you can
  get a statically linked binary of popsneaker from ixtools.de.



  2.3.  Compilation and installation


  In order to compile and install popsneaker on your system, type the
  following in the base directory of the popsneaker distribution:



       % ./configure
       % make
       % make install



  Since popsneaker uses autoconf, you should have no trouble compiling
  it.  Should you run into problems, please report them to me.



  3.  Usage


  3.1.  General Usage


  Popsneaker is intended to be used in conjunction with a mail retrieval
  tool like fetchmail.  The concept is to start popsneaker first to
  clean your mail account and then run fetchmail to download the mails
  that popsneaker considered good.



  3.2.  Configuration



  3.2.1.  popsneakerrc


  When started, popsneaker reads its configuration from one of the
  following files.  Popsneaker uses the first file that exists.



       ~/.popsneakerrc
       /etc/popsneakerrc
       /usr/etc/popsneakerrc
       /usr/local/etc/popsneakerrc



  So the first thing you have to do is to create a popsneakerrc in at
  least one of the above locations.  Be sure to make it accessible only
  by the user who starts popsneaker, because it contains the passwords
  for your mail accounts.
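
  The lookup order and the permission advice above can be checked with
  a small shell script.  This is only a sketch: the helper names
  first_config and make_private are made up for this example, and mode
  600 (read and write for the owner only) is an assumed, common way to
  keep the file private.

```shell
#!/bin/sh
# Sketch: report which popsneakerrc would be picked up first.
# first_config and make_private are hypothetical helper names.

# Print the first existing file from the documented search order.
first_config() {
    for f in "$HOME/.popsneakerrc" /etc/popsneakerrc \
             /usr/etc/popsneakerrc /usr/local/etc/popsneakerrc; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Restrict a file to its owner (read/write for the owner only).
make_private() {
    chmod 600 "$1"
}

if cfg=$(first_config); then
    echo "popsneaker would read: $cfg"
else
    echo "no popsneakerrc found"
fi
```

  To protect a per-user configuration, call make_private with the path
  of the file, for example "$HOME/.popsneakerrc".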

  To make this easier, there is an example configuration file in the
  archive, which is installed automatically.  This example contains
  many comments and has reasonable defaults, so setting it up should
  take little time.  Its last section contains the filter rules, which
  are explained in a later chapter.



  3.2.2.  Startup


  The next step is to make popsneaker start before fetchmail.  The best
  way to achieve this is to add a preconnect statement to your
  fetchmailrc. Here is an example for a fetchmailrc entry:



       poll pop.isp.com
          proto POP3
          user myremotename
          password mypassword
          is mylocalname
          options fetchall
          preconnect '/usr/local/bin/popsneaker pop.isp.com'



  Make sure you use the "fetchall" option of fetchmail, because when
  popsneaker has finished its job, all messages in your mailbox are
  marked as read.  Without this option, fetchmail will not fetch any
  mails at all.

  If you use a mail retrieval tool other than fetchmail, you can add a
  call to popsneaker to the same script that starts your retrieval
  tool.  Using fetchmail as the example again, the script would
  contain:



       /usr/local/bin/popsneaker pop.isp.com
       /usr/bin/fetchmail pop.isp.com



  This also works with any other retrieval tool.



  3.2.3.  Logfile


  It is advisable to read the logfile periodically, so that you can
  check which mail was deleted.  A handy way would be to let cron mail
  the log to you, for example once per week.  This can be done using
  either savelog or logrotate.

  With savelog, you can write a script like this:


  #!/bin/sh
  # Mail the popsneaker log to root, then rotate it with savelog.
  LOG=/var/log/popsneaker.log
  if [ -e "$LOG" ]; then
     mail -s "popsneaker log" root < "$LOG"
     savelog -p -c 1 "$LOG" >/dev/null
  fi



  With logrotate, this is even simpler.  This statement in logrotate's
  config file mails the log once per week:


  /var/log/popsneaker.log {
          missingok
          rotate 0
          mail root
          weekly
          notifempty
          copytruncate
  }



  You should use one of these methods, since nothing else trims the
  logfile and it would otherwise keep growing.  I prefer the logrotate
  way if it is available.  It is very easy to handle.



  3.3.  Filters


  There are 5 different filter types. I will describe them one by one
  now and then show you how to combine them.



  3.3.1.  maxsize



       Syntax: maxsize [options] <size>



  This filter deletes mail that exceeds a given size.

  For example you can say:



       maxsize 10000



  to delete every mail that is bigger than 10 kB.
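
  The examples in this handbook suggest that the size is given in
  bytes (1 MB is written as 1048576 in the sample popsneakerrc).  Under
  that assumption, the following sketch converts KB and MB limits into
  maxsize lines.  The helper names kb and mb are made up for this
  example.

```shell
#!/bin/sh
# Sketch: convert KB/MB limits to byte counts for maxsize.
# Assumes that maxsize takes its argument in bytes.
kb() { echo $(( $1 * 1024 )); }
mb() { echo $(( $1 * 1024 * 1024 )); }

echo "maxsize $(kb 64)"   # a 64 KB limit
echo "maxsize $(mb 1)"    # a 1 MB limit
```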



  3.3.2.  accept, assume and deny



       Syntax: accept [options] "<rule>"
               assume [options] "<rule>"
               deny   [options] "<rule>"



  With "accept" you can define mail that must never be deleted.  This
  prevents the deletion of important mail (maybe from a good friend or
  your employer).  It is like building a "positive" list.

  With "assume" you can define assumptions a mail must fulfill.  For
  example, you can delete mail which is not directly addressed to you.
  Or you can delete mail which is not plain text.

  "deny" is the opposite of "accept".  You can define which mail should
  be deleted.  Maybe you want to give some keywords here, or you want to
  avoid mail from particular persons.

  The rules themselves are POSIX 1003.2 extended regular expressions.
  These expressions are applied to every line of the mail header.  To
  understand how it works, let's look at some examples:

  Assume we want to stop all mail that comes from badguy@mytown.com. To
  make popsneaker delete such mail, put a



       deny "^From: .*badguy@mytown\.com"



  in your popsneakerrc. Or maybe you hate all people at mytown. Change
  this line to



       deny "^From: .*mytown\.com"



  to delete all mail from that domain.

  "\" is used to quote special characters. If you want to use a
  character which is special to regular expressions, you must put a "\"
  in front of it.

  In the above examples, we have to quote the "." in the domain name,
  because an unquoted "." matches any character in a regular expression.
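
  The effect of quoting the "." can be tried outside popsneaker with
  grep -E, which also works with POSIX extended regular expressions.
  This is only a sketch: the helper name matches is made up for this
  example, and it assumes that grep behaves like popsneaker's matcher
  for these simple patterns.

```shell
#!/bin/sh
# Sketch: try a rule's regular expression on a single header line.
# -E selects extended regular expressions, -i ignores case (like
# popsneaker's default), -q suppresses output.
matches() {
    printf '%s\n' "$2" | grep -E -i -q -e "$1"
}

# Quoted dot: only a literal "." is accepted.
matches '^From: .*mytown\.com' 'From: badguy@mytown.com' &&
    echo "quoted dot: match"
matches '^From: .*mytown\.com' 'From: joe@mytownXcom.org' ||
    echo "quoted dot: no match"

# Unquoted dot: any character is accepted in that position.
matches '^From: .*mytown.com' 'From: joe@mytownXcom.org' &&
    echo "unquoted dot: match (too broad)"
```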

  To learn more about regular expressions read



       man 7 regex



  There are also some books and documents which describe regular
  expressions in detail.



  3.3.3.  dupcheck



       Syntax: dupcheck -strict
               dupcheck -relaxed



  The dupcheck filter detects duplicated incoming mails and deletes all
  but one copy.

  In "strict" mode, this filter compares the message-ids of the incoming
  mails.  This deletes mails that were sent to several of your email
  addresses.  This should be safe.

  In "relaxed" mode, the filter compares the size, the subject and the
  sender of the mail.  This deletes mails that were sent to you more
  than once (intentionally or by mistake).  Be aware that there is a
  small risk of deleting too many mails (when the same sender sends you
  several different mails with the same subject and exactly the same
  size).

  If you are not sure which mode to use, use "strict".  Duplicated mails
  are deleted silently and no logging is done.
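
  The idea behind "strict" mode can be illustrated with a few lines of
  shell.  This is not popsneaker's code, only a sketch of the
  comparison: it reads one message-id per line and prints every id that
  has been seen before, so the first copy is kept.  The helper name
  find_duplicates is made up for this example.

```shell
#!/bin/sh
# Sketch: print every input line that already occurred earlier.
# The first occurrence of each line is not printed (it is "kept").
find_duplicates() {
    awk 'seen[$0]++'
}

printf '%s\n' '<a1@example.org>' '<b2@example.org>' '<a1@example.org>' |
    find_duplicates
```

  The same helper illustrates "relaxed" mode if each input line holds
  the size, the sender and the subject of a mail instead of its
  message-id.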


  3.3.4.  Scoring



       Syntax: score <value> [options] "<rule>"
               score_reset
               score_eval [condition] accept|deny



  Scoring is a feature introduced in version 0.6.2 of popsneaker.  It
  gives you finer-grained control than "accept" and "deny" alone.

  The "score" filter rule changes the current score of a message by a
  given value if the rule matches.



       score -10 "^subject: .*for free"



  The "score_reset" instruction sets the score value back to 0.  This
  can be useful if you have more than one block of independent score
  rules.

  The "score_eval" instruction makes a decision based upon the score
  value.  The condition is set via options.  Possible settings are:



       -lt <value>      True if the score is less than <value>.
       -le <value>      True if the score is less than or equal to <value>.
       -gt <value>      True if the score is greater than <value>.
       -ge <value>      True if the score is greater than or equal to <value>.



  A short example:



       score_reset
       score +10 "^References: .*mydomain\.net"
       score +10 "^In-Reply-To: .*mydomain\.net"
       score -5  "^Content-Type: text/html"
       score -5  "^Message-ID: .*@127\.0\.0\.1"
       score -10 "^(to|cc): .*,.*,.*,.*,.*,"
       score_eval -gt 0 accept
       score_eval -lt 0 deny

       score_reset
       score -10 -case "^Subject: .*FREE"
       score -10 -case "^Subject: .*BUY"
       score -10 -case "^Subject: .*CASH"
       score -5  -case "^Subject: .*free"
       score -5  -case "^Subject: .*buy"
       score -5  -case "^Subject: .*cash"
       score_eval -lt -5 deny
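
  How a block of score rules adds up can be emulated with grep.  The
  following is only a sketch and not popsneaker's implementation: the
  helper name score_headers and the rule file format (value, a tab,
  then the regular expression) are made up for this example, and
  grep -E -i stands in for popsneaker's default case-insensitive
  matching.

```shell
#!/bin/sh
# Sketch: add up the values of all rules that match a header file.
# Usage: score_headers <header-file> <rule-file>
# Each rule line has the form "value<TAB>regex" (hypothetical format).
TAB=$(printf '\t')

score_headers() {
    headers=$1
    rules=$2
    total=0
    while IFS=$TAB read -r value regex; do
        # A rule counts once if any header line matches it.
        if grep -E -i -q -e "$regex" "$headers"; then
            total=$(( total + $value ))
        fi
    done < "$rules"
    printf '%s\n' "$total"
}

hdr=$(mktemp)
rules=$(mktemp)
printf '%s\n' 'Subject: hello' 'Content-Type: text/html' \
    'In-Reply-To: <42@mydomain.net>' > "$hdr"
printf '%s\t%s\n' \
    10  '^In-Reply-To: .*mydomain\.net' \
    -5  '^Content-Type: text/html' \
    -10 '^Subject: .*for free' > "$rules"

score=$(score_headers "$hdr" "$rules")
echo "score: $score"
# Emulates "score_eval -gt 0 accept".
if [ "$score" -gt 0 ]; then echo "accept"; fi
rm -f "$hdr" "$rules"
```

  In this sample the reply bonus outweighs the HTML penalty, so the
  mail ends up with a positive score.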



  3.3.5.  Options


  The behaviour of some of the filter rules described above can be
  modified using the following options.



       -case            Matches the filter rule case-sensitively.
                        Case-insensitive matching is the default.  Only
                        useful for the accept, assume, deny and score
                        rules.

       -nocase          Matches the filter rule case-insensitively.
                        This is the default.  Only useful for the accept,
                        assume, deny and score rules.

       -verbose         The rule is verbose and writes deletions of mail
                        to the logfile. This is the default.

       -silent          The rule deletes mail silently without mentioning
                        it in the logfile.



  3.3.6.  Restricting headers


  You can speed up the processing of the filter rules by making use of
  the restrict statement.



       Syntax: restrict "header" ...



  This way, only those header lines that are mentioned in a restrict
  statement are used for comparisons.  If you use restrictions, make
  sure to set up a restrict statement for every header your filters
  use.  All other header lines are ignored.  Every restrict statement
  can name multiple headers, and you can also use multiple restrict
  statements.

  If no restrict statement is given, all header lines are used.  This is
  the default and a safe choice.



  3.3.7.  Regular expressions


  This is a short and very simplified summary of regular expressions.

  A regular expression is a string of atoms with some special
  characters.  An atom can be a single character, another regular
  expression enclosed in parentheses, or a range enclosed in brackets.
  The following special characters can be added to atoms:



  *    matches a sequence of 0 or more matches of the atom.
  +    matches a sequence of 1 or more matches of the atom.
  ?    matches the atom 0 or 1 times.



  There are also these special characters:


  ^    matches the beginning of a string.
  $    matches the end of a string.
  .    matches any character.
  ()   used to build more complex atoms.
  []   builds a range. Matches any character in the brackets.
  |    is the logical "or".



  In a range there are the following additional characters available:


  -    used to build sequences (for example [0-9]).
  ^    negates a sequence ([^0-9] matches anything other than 0-9)
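
  The special characters listed above can be tried out with grep -E,
  which uses the same kind of extended regular expressions.  This is
  only a sketch: the helper name re_match is made up for this example.
  Unlike popsneaker's default, the helper matches case-sensitively.

```shell
#!/bin/sh
# Sketch: succeed if the string in $2 matches the expression in $1.
re_match() {
    printf '%s\n' "$2" | grep -E -q -e "$1"
}

re_match '^ab*c$'     'ac'     && echo '*   : zero or more'
re_match '^ab+c$'     'abbc'   && echo '+   : one or more'
re_match '^ab?c$'     'abc'    && echo '?   : zero or one'
re_match '^(to|cc): ' 'cc: me' && echo '|   : alternation inside ()'
re_match '^[0-9]+$'   '2002'   && echo '[]  : range'
re_match '^[^0-9]+$'  'spam'   && echo '[^] : negated range'
re_match '^ab+c$'     'ac'     || echo 'no match: + needs at least one b'
```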



  3.3.8.  Example


  This is a commented example for a filter script:



  # ----------------------------------------------------------------
  # Start of popsneakerrc.
  #
  # Lines that start with a "#" are comments.
  #

  # Account information.
  #
  account -protocol pop3 "pop.myisp.com" "loginname" "secret"
  # This account supports the more secure APOP authentication.
  account -protocol apop "pop.mymail.org" "loginname" "secret"

  # This is my employer's domain. No mail from this domain should be
  # deleted.
  #
  accept "^From: .*business\.com"

  #
  # I have a very good friend. I want everything he sends.
  #
  accept "^From: .*my\.friend@isp\.com"

  #
  # From everyone else I don't want mails bigger than 1 MB.
  #
  maxsize 1048576

  #
  # These are also some friends.
  #
  accept "^From: .*friend1"
  accept "^From: .*friend2"

  #
  # Also accept replies to my mail.
  #
  accept "^Subject: Re: "

  #
  # A more advanced method of detecting replies.
  #
  accept "^References: .*my-domain\.com"
  accept "^In-Reply-To: .*my-domain\.com"

  #
  # Accept mailing lists:
  #
  accept "^X-Mailing-List: "
  accept "^List-Id: "

  #
  # No mail larger than 64 KB should pass this point.
  #
  maxsize 65536

  #
  # These are some spammers I get mail from.
  #
  deny -silent "^From: .*@buyers\.com"
  deny -silent "^From: .*@get-it-all\.com"

  #
  # These are some keywords I don't like to see.
  #
  deny       -silent "^Subject: .*\$\$\$"
  deny -case -silent "^Subject: .*MONEY"
  deny -case -silent "^Subject: .*CREDIT"
  deny -case -silent "^Subject: .*FREE"
  deny -case -silent "^Subject: .*CASH"

  #
  # Some checks on the Message-ID:
  #
  assume "^Message-ID: .*<.+@.+\..+>"
  deny   "^Message-ID: .*@127\.0\.0\.1"

  #
  # Get rid of HTML mails:
  #
  deny "^Content-Type: text/html"

  #
  # The mail should be addressed directly to me (many spammers don't do
  # this). Be careful: all possible addresses must be mentioned here or
  # mail gets lost accidentally. Note that mailing lists use their own
  # "To" field. They must be accepted before this point.
  #
  assume "^((To)|(Cc)): .*(\
  (my\.address@business\.com)|(my\.address@private\.net)|\
  (@mydomain\.net)|(anotherone@mail4free\.org))"

  #
  # End of popsneakerrc.
  # ----------------------------------------------------------------



  For a complete popsneakerrc with all possible options you should read
  the sample.popsneakerrc which comes with this archive.


  4.  Questions and Answers



  5.  Copyright


  Popsneaker Copyright 2000, 2001 Stefan Baehre, popsneaker@ixtools.de

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License version 2 as
  published by the Free Software Foundation.

  This program is distributed in the hope that it will be useful, but
  WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.



