Georgios Kakavelakis, Robert Beverly, and Joel Young
Proceedings of the 25th USENIX Large Installation System Administration Conference (LISA 2011), Boston, MA, December 2011.
Botnets are a significant source of abusive messaging (spam, phishing, etc.) and other types of malicious traffic. A promising approach to help mitigate botnet-generated traffic is signal analysis of transport-layer (i.e., TCP/IP) characteristics, e.g., timing, packet reordering, congestion, and flow control. Prior work [spamflow-ceas08] shows that machine learning analysis of such traffic features on an SMTP MTA can accurately differentiate between botnet and legitimate sources. We make two contributions toward the real-world deployment of such techniques: i) an architecture for real-time on-line operation; and ii) auto-learning of the unsupervised model across different environments without human labeling (i.e., training). We present a "SpamFlow" SpamAssassin plugin and the requisite auxiliary daemons to integrate transport-layer signal analysis with a popular open-source spam filter. Using our system, we detail results from a production deployment where our auto-learning technique achieves better than 95 percent accuracy, precision, and recall after reception of approximately 1,000 emails.
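For readers less familiar with the three metrics quoted above, the following is a minimal sketch of how accuracy, precision, and recall are conventionally computed from the counts of a binary spam/legitimate classifier. The function name and the example counts are hypothetical, chosen only for illustration; they are not taken from the paper or its deployment.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp: spam correctly flagged as spam
    fp: legitimate mail wrongly flagged as spam
    tn: legitimate mail correctly passed
    fn: spam wrongly passed as legitimate
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for 1,000 messages; NOT results from the paper.
acc, prec, rec = classification_metrics(tp=480, fp=20, tn=470, fn=30)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Reporting all three together matters for spam filtering because a high accuracy alone can hide a poor false-positive rate when the mail mix is skewed toward one class.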