SURF
- 17 October 2011
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 467-476
- https://doi.org/10.1145/2046707.2046762
Abstract
Search engine optimization (SEO) techniques are often abused to promote websites among search results. This is a practice known as blackhat SEO. In this paper we tackle a newly emerging and especially aggressive class of blackhat SEO, namely search poisoning. Unlike other blackhat SEO techniques, which typically attempt to promote a website's ranking only under a limited set of search keywords relevant to the website's content, search poisoning techniques disregard any term relevance constraint and are employed to poison popular search keywords with the sole purpose of diverting large numbers of users to short-lived traffic-hungry websites for malicious purposes. To accurately detect search poisoning cases, we designed a novel detection system called SURF. SURF runs as a browser component to extract a number of robust (i.e., difficult to evade) detection features from search-then-visit browsing sessions, and is able to accurately classify malicious search user redirections resulting from users clicking on poisoned search results. Our evaluation on real-world search poisoning instances shows that SURF can achieve a detection rate of 99.1% at a false positive rate of 0.9%. Furthermore, we applied SURF to analyze a large dataset of search-related browsing sessions collected over a period of seven months starting in September 2010. Through this long-term measurement study we were able to reveal new trends and interesting patterns related to a great variety of poisoning cases, thus contributing to a better understanding of the prevalence and gravity of the search poisoning problem.
This publication has 7 references indexed in Scilit:
- Design and Evaluation of a Real-Time URL Spam Filtering Service. Published by Institute of Electrical and Electronics Engineers (IEEE), 2011
- BLADE. Published by Association for Computing Machinery (ACM), 2010
- The WEKA data mining software. ACM SIGKDD Explorations Newsletter, 2009
- Tracking Web spam with HTML style similarities. ACM Transactions on the Web, 2008
- Detecting semantic cloaking on the web. Published by Association for Computing Machinery (ACM), 2006
- Detecting spam web pages through content analysis. Published by Association for Computing Machinery (ACM), 2006
- Identifying link farm spam pages. Published by Association for Computing Machinery (ACM), 2005