USENIX 2003 Annual Technical Conference, FREENIX Track Abstract
Pp. 63-76 of the Proceedings |
Learning Spam: Simple Techniques for Freely-Available Software
Bart Massey, Mick Thomure, Raya Budrevich, and Scott Long, Portland State University
Abstract
The problem of automatically filtering out spam e-mail using a classifier based on machine learning methods is of great recent interest. This paper gives an introduction to machine learning methods for spam filtering, reviewing some of the relevant ideas and work in the open source community. An overview of several feature detection and machine learning techniques for spam filtering is given. The authors' freely-available implementations of these techniques are discussed. The techniques' performance on several different corpora are evaluated. Finally, some conclusions are drawn about the state of the art and about fruitful directions for spam filtering for freely-available UNIX software practitioners.
- View the full text of this paper in HTML and
PDF.
Until June 2004, you will need your USENIX membership identification in order to access the full papers. The Proceedings are published as a collective work, © 2003 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.
- If you need the latest Adobe Acrobat Reader, you can download it from Adobe's site.
|