Empirical comparison of algorithms for network community detection

Abstract
Detecting clusters or communities in large real-world graphs such as large social or information networks is a problem of considerable interest. In practice, one typically chooses an objective f unction that captures the intuition of a network cluster as set of nod es with better internal connectivity than external connectivity, and then one applies approximation algorithms or heuristics to extract sets of nodes that are related to the objective function and that "lo ok like" good communities for the application of interest. In this paper, we explore a range of network community detec- tion methods in order to compare them and to understand their rela- tive performance and the systematic biases in the clusters t hey iden- tify. We evaluate several common objective functions that are used to formalize the notion of a network community, and we examine several different classes of approximation algorithms tha t aim to optimize such objective functions. In addition, rather tha n simply fixing an objective and asking for an approximation to the bes t clus- ter of any size, we consider a size-resolved version of the optimiza- tion problem. Considering community quality as a function of its size provides a much finer lens with which to examine communit y detection algorithms, since objective functions and approximation algorithms often have non-obvious size-dependent behavior.

This publication has 31 references indexed in Scilit: