hPSD: A Hybrid PU-Learning-Based Spammer Detection Model for Product Reviews

Abstract
Spammers, who manipulate online reviews to promote or suppress products, are flooding online commerce. To combat this trend, a great deal of research has focused on detecting review spammers, most of which designs diverse features and builds classifiers on them. The widespread growth of crowdsourcing platforms has created large numbers of deceptive review writers who behave more like normal users, so that they can more easily evade detection by classifiers based purely on fixed characteristics. In this paper, we propose a hybrid semisupervised learning model, hybrid PU-learning-based spammer detection (hPSD), which leverages both the users' characteristics and the user-product relations. Specifically, the hPSD model can iteratively detect multitype spammers by injecting different positive samples, and it allows classifiers to be constructed within a semisupervised hybrid learning framework. Comprehensive experiments on a movie dataset with shilling injection confirm the superior performance of hPSD over existing baseline methods. hPSD is then used to detect hidden spammers in real-life Amazon data. A set of spammers and their underlying employers (e.g., book publishers) are successfully discovered and validated. These results demonstrate that hPSD suits real-world application scenarios and can effectively detect potentially deceptive review writers.
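To make the PU-learning (positive-unlabeled learning) setting concrete, the following is a minimal sketch of a generic two-step PU heuristic, not the hPSD model itself. It assumes each user is a small numeric feature vector (the two features here are hypothetical, e.g. rating deviation and review burstiness), uses a simple nearest-centroid rule in place of a real classifier, and ignores user-product relations entirely. The function name `pu_two_step` and the parameter `neg_fraction` are illustrative only.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def pu_two_step(positives, unlabeled, neg_fraction=0.3, max_iter=20):
    """Flag unlabeled points that look like the known positives.

    Generic two-step PU heuristic (an assumption, not the paper's method):
    1. pick the unlabeled points farthest from the positive centroid
       as "reliable negatives";
    2. repeatedly relabel every unlabeled point by its nearest centroid
       until the labels stop changing.
    Returns a list of booleans aligned with `unlabeled`.
    """
    pos_center = centroid(positives)

    # Step 1: reliable negatives are the farthest points from the positives.
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: dist(unlabeled[i], pos_center),
        reverse=True,
    )
    k = max(1, int(neg_fraction * len(unlabeled)))
    neg_idx = set(ranked[:k])

    # Step 2: iterate nearest-centroid labelling until it stabilises.
    flags = [False] * len(unlabeled)
    for _ in range(max_iter):
        neg_center = centroid([unlabeled[i] for i in sorted(neg_idx)])
        flagged = [u for u, f in zip(unlabeled, flags) if f]
        pos_center = centroid(list(positives) + flagged)
        new_flags = [
            dist(u, pos_center) < dist(u, neg_center) for u in unlabeled
        ]
        if new_flags == flags:
            break
        flags = new_flags
        neg_idx = {i for i, f in enumerate(flags) if not f}
        if not neg_idx:  # everything was flagged; no negative centroid left
            break
    return flags


# Hypothetical toy data: two known spammers and five unlabeled users.
known_spammers = [(0.9, 0.9), (1.0, 0.8)]
unlabeled_users = [(0.95, 0.85), (0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.85, 1.0)]
print(pu_two_step(known_spammers, unlabeled_users))
```

In this toy run, the first and last unlabeled users sit close to the known spammers and are flagged, while the other three are treated as normal users. Injecting a different set of known positives would steer the same loop toward a different spammer type, which is the intuition behind the iterative multitype detection described in the abstract.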
Funding Information
  • National Basic Research Program of China (2016YFB1000901)
  • National Natural Science Foundation of China (71571093, 91646204, 71701089, 71801123)
  • National Center for International Joint Research on E-Business Information Processing (2013B01035)
  • National Natural Science Foundation of China (71725002, 71531001, U1636210, 71471009)
  • Fundamental Research Funds for the Central Universities
