DeepWalk

24 August 2014

conference paper
Published by Association for Computing Machinery (ACM)

pp. 701-710
https://doi.org/10.1145/2623330.2623732

Abstract

We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide $F_1$ scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real-world applications such as network classification and anomaly detection.

Comment: 10 pages, 5 figures, 4 tables
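The abstract's central idea, treating truncated random walks as sentences, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `truncated_random_walks`, the parameters `walks_per_vertex` and `walk_length`, the uniform choice of the next neighbor, and the toy graph are all assumptions made for illustration. The step that learns vector representations from the walks is left out.

```python
import random


def truncated_random_walks(adjacency, walks_per_vertex=2, walk_length=5, seed=0):
    """Generate fixed-length (truncated) random walks starting from every vertex.

    `adjacency` maps each vertex to a list of its neighbors. All names and
    default values here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    vertices = list(adjacency)
    walks = []
    for _ in range(walks_per_vertex):
        rng.shuffle(vertices)  # visit start vertices in a fresh random order each pass
        for start in vertices:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))  # uniform step to a neighbor
            walks.append(walk)
    return walks


# Hypothetical toy graph (undirected, stored as adjacency lists).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = truncated_random_walks(graph, walks_per_vertex=2, walk_length=5)
for walk in walks:
    print(" ".join(walk))  # each walk reads like a "sentence" of vertex ids
```

Each printed line plays the role of a sentence whose "words" are vertex identifiers. A corpus of such walks could then be passed to a language model that learns one vector per token, which is the analogy the abstract draws.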

Other Versions

Version 2, 2014-03-26, preprints

Funding Information

Google (Faculty Research Award)
Division of Information and Intelligent Systems (IIS-1017181)
Division of Biological Infrastructure (DBI-1060572)

Cited by 6059 articles