hPSD: A Hybrid PU-Learning-Based Spammer Detection Model for Product Reviews

2 November 2018

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Cybernetics

Vol. 50 (4), 1595-1606
https://doi.org/10.1109/tcyb.2018.2877161

Abstract

Spammers, who manipulate online reviews to promote or suppress products, are flooding online commerce. To combat this trend, a great deal of research has focused on detecting review spammers, most of which designs diverse features and builds various classifiers on them. The widespread growth of crowdsourcing platforms has created a large population of deceptive review writers who behave more like normal users, so that they can more easily evade detection by classifiers that are based purely on fixed characteristics. In this paper, we propose a hybrid semisupervised learning model, titled hybrid PU-learning-based spammer detection (hPSD), that leverages both the users' characteristics and the user-product relations. Specifically, the hPSD model can iteratively detect multitype spammers by injecting different positive samples, and it allows the construction of classifiers in a semisupervised hybrid learning framework. Comprehensive experiments on a movie dataset with shilling injection confirm the superior performance of hPSD over existing baseline methods. hPSD is then used to detect hidden spammers in real-life Amazon data. A set of spammers and their underlying employers (e.g., book publishers) are successfully discovered and validated. These results demonstrate that hPSD suits real-world application scenarios and can effectively detect potentially deceptive review writers.
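
To make the positive-unlabeled (PU) setting mentioned in the abstract concrete, here is a minimal sketch of a generic two-step PU-learning loop. It is not the hPSD algorithm: the abstract does not specify its classifier or features, and this sketch ignores user-product relations entirely. The function names (`centroid`, `pu_two_step`), the nearest-centroid classifier, the `neg_fraction` parameter, and the two toy features are all assumptions made for illustration.

```python
from math import dist


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def pu_two_step(positives, unlabeled, neg_fraction=0.5, max_iter=10):
    """Generic two-step PU learning with a nearest-centroid classifier.

    positives: feature vectors of known spammers (the labeled positive seeds).
    unlabeled: feature vectors of users whose status is unknown.
    Returns one boolean per unlabeled user (True = predicted spammer).
    """
    pos_c = centroid(positives)
    # Step 1: take the unlabeled users farthest from the positive centroid
    # as "reliable negatives" (an illustrative heuristic, not from the paper).
    ranked = sorted(unlabeled, key=lambda x: dist(x, pos_c), reverse=True)
    k = max(1, int(len(ranked) * neg_fraction))
    neg_c = centroid(ranked[:k])

    labels = []
    for _ in range(max_iter):
        # Step 2: label every unlabeled user by its nearest centroid.
        new_labels = [dist(x, pos_c) < dist(x, neg_c) for x in unlabeled]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        neg_rows = [x for x, is_pos in zip(unlabeled, labels) if not is_pos]
        if not neg_rows:
            break  # nothing left to estimate a negative centroid from
        pos_rows = positives + [x for x, is_pos in zip(unlabeled, labels) if is_pos]
        pos_c, neg_c = centroid(pos_rows), centroid(neg_rows)
    return labels


# Hypothetical features: (mean rating given, share of extreme ratings).
seed_spammers = [[5.0, 0.9], [4.8, 0.95]]
unknown_users = [[4.9, 0.9], [1.0, 0.1], [1.2, 0.2], [0.8, 0.15], [5.1, 0.85]]
print(pu_two_step(seed_spammers, unknown_users))
```

Calling `pu_two_step` again with a different set of positive seeds gives a rough picture of what "injecting different positive samples" to target another spammer type could look like, though how hPSD actually does this is not described in the abstract.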

Funding Information

National Basic Research Program of China (2016YFB1000901)
National Natural Science Foundation of China (71571093, 91646204, 71701089, 71801123)
National Center for International Joint Research on E-Business Information Processing (2013B01035)
National Natural Science Foundation of China (71725002, 71531001, U1636210, 71471009)
Fundamental Research Funds for the Central Universities

Cited by 152 articles