Detecting Link Spam Using Temporal Information
- 1 December 2006
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE International Conference on Data Mining (ICDM)
- No. 15504786, p. 1049-1053
- https://doi.org/10.1109/icdm.2006.51
Abstract
Protecting search ranking results against spam is an important issue for contemporary web search engines. This paper addresses the problem of combating one major type of web spam: 'link spam.' Most previous work on link spam detection used a single snapshot of web data, and thus did not take advantage of the fact that link spam tends to cause drastic changes in links over a short time period. To overcome this shortcoming, the paper proposes using temporal information on links, along with other information, to detect link spam. Specifically, it defines temporal features such as in-link growth rate (IGR) and in-link death rate (IDR) and uses them in a spam classification model (an SVM). Experimental results on web domain graph data show that link spam can be successfully detected with the proposed method.
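The abstract names the temporal features but does not give their formulas. As a rough illustration only, the sketch below assumes IGR is the relative change in a domain's in-link count between two snapshots, and IDR is the fraction of earlier in-links that have disappeared by the later snapshot. The function names, the zero fallback for domains with no earlier in-links, and the example domains are all hypothetical and are not taken from the paper.

```python
def in_link_growth_rate(old_links: set, new_links: set) -> float:
    """Assumed IGR: relative change in in-link count between two snapshots."""
    if not old_links:
        # Assumption: the rate is undefined with no earlier in-links; use 0.0.
        return 0.0
    return (len(new_links) - len(old_links)) / len(old_links)


def in_link_death_rate(old_links: set, new_links: set) -> float:
    """Assumed IDR: fraction of earlier in-links missing from the later snapshot."""
    if not old_links:
        # Assumption: same fallback as above.
        return 0.0
    return len(old_links - new_links) / len(old_links)


# Hypothetical in-link sets for one domain at two snapshot times.
old = {"a.example", "b.example", "c.example", "d.example"}
new = {"a.example", "b.example", "e.example",
       "f.example", "g.example", "h.example"}

igr = in_link_growth_rate(old, new)  # (6 - 4) / 4 = 0.5
idr = in_link_death_rate(old, new)   # 2 of 4 earlier links gone = 0.5
print(igr, idr)
```

Values like these could form part of a per-domain feature vector for a classifier such as an SVM, alongside whatever non-temporal features are available.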
This publication has 7 references indexed in Scilit:
- Site level noise removal for search engines. Published by Association for Computing Machinery (ACM), 2006
- Detecting spam web pages through content analysis. Published by Association for Computing Machinery (ACM), 2006
- Detecting phrase-level duplication on the world wide web. Published by Association for Computing Machinery (ACM), 2005
- Accurately interpreting clickthrough data as implicit feedback. Published by Association for Computing Machinery (ACM), 2005
- Detecting Search Engine Spam from a Trackback Network in Blogspace. Lecture Notes in Computer Science, 2005
- Spam, damn spam, and statistics. Published by Association for Computing Machinery (ACM), 2004
- Combating Web Spam with TrustRank. Published by Elsevier BV, 2004