Exploring the community structure of newsgroups
- 22 August 2004
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 783-787
- https://doi.org/10.1145/1014052.1016914
Abstract
We propose to use the community structure of Usenet for organizing and retrieving the information stored in newsgroups. In particular, we study the network formed by cross-posts, messages that are posted to two or more newsgroups simultaneously. We present what is, to our knowledge, by far the most detailed data that has been collected on Usenet cross-postings. We analyze this network to show that it is a small-world network with significant clustering. We also present a spectral algorithm which clusters newsgroups based on the cross-post matrix. The result of our clustering provides a topical classification of newsgroups. Our clustering gives many examples of significant relationships that would be missed by semantic clustering methods.
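The abstract does not spell out the spectral algorithm, so the following is only a minimal sketch of one common way to cluster from a cross-post matrix: spectral bisection using the Fiedler vector of the graph Laplacian. Everything in it is an assumption made for illustration, not the authors' method: the function names `fiedler_vector` and `bisect_newsgroups` are hypothetical, the cross-post counts are invented, and the newsgroup names are used purely as labels.

```python
import math

def fiedler_vector(weights, iterations=500):
    """Approximate the Fiedler vector of the Laplacian L = D - W.

    `weights` is a symmetric matrix (list of lists) where weights[i][j]
    is the number of messages cross-posted to newsgroups i and j.
    Assumes at least two newsgroups and a connected graph.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    # All eigenvalues of L lie in [0, 2 * max degree], so (shift*I - L)
    # is positive definite and power iteration on it converges to the
    # eigenvectors of L with the smallest eigenvalues.
    shift = 2.0 * max(degrees) + 1.0
    # Start orthogonal to the constant vector (the trivial eigenvector).
    v = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iterations):
        w = [(shift - degrees[i]) * v[i]
             + sum(weights[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n              # remove any drift toward the constant vector
        w = [x - mean for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def bisect_newsgroups(names, weights):
    """Split newsgroups into two clusters by the sign of the Fiedler vector."""
    v = fiedler_vector(weights)
    negative = [name for name, x in zip(names, v) if x < 0]
    non_negative = [name for name, x in zip(names, v) if x >= 0]
    return negative, non_negative

# Invented cross-post counts: two tightly linked topical groups joined
# by a single weak link between comp.programming and rec.music.misc.
names = ["comp.lang.c", "comp.lang.c++", "comp.programming",
         "rec.music.misc", "rec.music.classical", "rec.music.jazz"]
counts = [
    [0, 5, 5, 0, 0, 0],
    [5, 0, 5, 0, 0, 0],
    [5, 5, 0, 1, 0, 0],
    [0, 0, 1, 0, 5, 5],
    [0, 0, 0, 5, 0, 5],
    [0, 0, 0, 5, 5, 0],
]
group_a, group_b = bisect_newsgroups(names, counts)
print(sorted(group_a))
print(sorted(group_b))
```

The sketch uses plain power iteration only to stay dependency-free; on a matrix of realistic size a library eigensolver would be the more practical choice. Applying the same split recursively to each half would yield a hierarchy of clusters, which is one way a topical classification could be derived from cross-post data alone.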