Design and Evaluation of a Real-Time URL Spam Filtering Service
- 1 May 2011
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 447-462
- https://doi.org/10.1109/sp.2011.25
Abstract
On the heels of the widespread adoption of web services such as social networks and URL shorteners, scams, phishing, and malware have become regular threats. Despite extensive research, email-based spam filtering techniques generally fall short for protecting other web services. To better address this need, we present Monarch, a real-time system that crawls URLs as they are submitted to web services and determines whether the URLs direct to spam. We evaluate the viability of Monarch and the fundamental challenges that arise due to the diversity of web service spam. We show that Monarch can provide accurate, real-time protection, but that the underlying characteristics of spam do not generalize across web services. In particular, we find that spam targeting email qualitatively differs in significant ways from spam campaigns targeting Twitter. We explore the distinctions between email and Twitter spam, including the abuse of public web hosting and redirector services. Finally, we demonstrate Monarch's scalability, showing our system could protect a service such as Twitter -- which needs to process 15 million URLs/day -- for a bit under $800/day.
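To put the scalability figures from the abstract in perspective, the short sketch below converts them into an average request rate and a per-URL cost. Only the two inputs (15 million URLs/day and $800/day) come from the abstract. The function names are illustrative, and the calculation assumes load is spread evenly over the day, which a real service would not see. Because the abstract says "a bit under $800/day", the cost result is an upper bound.

```python
# Figures quoted in the abstract; $800/day is an upper bound ("a bit under").
URLS_PER_DAY = 15_000_000
COST_PER_DAY_USD = 800.0
SECONDS_PER_DAY = 24 * 60 * 60


def average_urls_per_second(urls_per_day: int) -> float:
    """Average arrival rate, assuming URLs are spread evenly over the day."""
    return urls_per_day / SECONDS_PER_DAY


def cost_per_million_urls(cost_per_day: float, urls_per_day: int) -> float:
    """Daily cost normalized to one million processed URLs."""
    return cost_per_day / urls_per_day * 1_000_000


if __name__ == "__main__":
    rate = average_urls_per_second(URLS_PER_DAY)
    cost = cost_per_million_urls(COST_PER_DAY_USD, URLS_PER_DAY)
    print(f"about {rate:.1f} URLs/s on average")
    print(f"under ${cost:.2f} per million URLs")
```

Under these assumptions the workload averages roughly 174 URLs per second, at a cost below about $53 per million URLs.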
This publication has 23 references indexed in Scilit:
- Detecting algorithmically generated malicious domain names. Published by Association for Computing Machinery (ACM), 2010
- The Koobface botnet and the rise of social malware. Published by Institute of Electrical and Electronics Engineers (IEEE), 2010
- So long, and no thanks for the externalities. Published by Association for Computing Machinery (ACM), 2009
- Beyond blacklists. Published by Association for Computing Machinery (ACM), 2009
- Feature hashing for large scale multitask learning. Published by Association for Computing Machinery (ACM), 2009
- Identifying suspicious URLs. Published by Association for Computing Machinery (ACM), 2009
- The Elements of Statistical Learning. Published by Springer Science and Business Media LLC, 2009
- You've been warned. Published by Association for Computing Machinery (ACM), 2008
- Cantina. Published by Association for Computing Machinery (ACM), 2007
- Detecting spam web pages through content analysis. Published by Association for Computing Machinery (ACM), 2006