Exploring social annotations for information retrieval
- 21 April 2008
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 715-724
- https://doi.org/10.1145/1367497.1367594
Abstract
Social annotation has gained increasing popularity in many Web-based applications, leading to an emerging research area in text analysis and information retrieval. This paper is concerned with developing probabilistic models and computational algorithms for social annotations. We propose a unified framework to combine the modeling of social annotations with the language modeling-based methods for information retrieval. The proposed approach consists of two steps: (1) discovering topics in the contents and annotations of documents while categorizing the users by domains; and (2) enhancing document and query language models by incorporating user domain interests as well as topical background models. In particular, we propose a new general generative model for social annotations, which is then simplified to a computationally tractable hierarchical Bayesian network. Then we apply smoothing techniques in a risk minimization framework to incorporate the topical information into language models. Experiments are carried out on a real-world annotation data set sampled from del.icio.us. Our results demonstrate significant improvements over the traditional approaches.
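The second step of the abstract (smoothing a document language model with a topical background model) can be illustrated with a minimal sketch. This is not the paper's model: it assumes a simple linear interpolation of three unigram distributions, and the toy documents, the hand-written tag-based topic distributions, the weights `lam` and `mu`, and all function names are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def smoothed_prob(word, doc_lm, topic_lm, coll_lm, lam=0.6, mu=0.3):
    """Interpolate document, topic, and collection models.

    lam and mu are assumed weights (not from the paper); the remaining
    mass 1 - lam - mu goes to the collection model. A tiny floor keeps
    the probability positive for words unseen everywhere.
    """
    return (lam * doc_lm.get(word, 0.0)
            + mu * topic_lm.get(word, 0.0)
            + (1.0 - lam - mu) * coll_lm.get(word, 1e-9))


def query_log_likelihood(query, doc_lm, topic_lm, coll_lm):
    """Log-probability of the query under the smoothed document model."""
    return sum(math.log(smoothed_prob(w, doc_lm, topic_lm, coll_lm))
               for w in query)


# Toy collection (hypothetical data).
docs = {
    "d1": "python tutorial for web programming".split(),
    "d2": "cooking recipes for pasta dinner".split(),
}

# Hypothetical topic models, standing in for topics that would be
# inferred from the documents' social annotations.
topics = {
    "d1": mle("programming code python software".split()),
    "d2": mle("food cooking recipes kitchen".split()),
}

coll_lm = mle([w for tokens in docs.values() for w in tokens])


def rank(query):
    """Return document ids sorted by descending query log-likelihood."""
    scores = {
        doc_id: query_log_likelihood(query, mle(tokens), topics[doc_id], coll_lm)
        for doc_id, tokens in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

In this toy setup the query `["software"]` matches no word in either document, yet `rank(["software"])` places `d1` first because the word carries probability mass in the topic model attached to `d1`. That is the effect the topical smoothing is meant to capture.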
This publication has 15 references indexed in Scilit:
- Probabilistic models for discovering e-communities. Published by Association for Computing Machinery (ACM), 2006
- Usage patterns of collaborative tagging systems. Journal of Information Science, 2006
- Information Retrieval in Folksonomies: Search and Ranking. Lecture Notes in Computer Science, 2006
- Probabilistic author-topic models for information discovery. Published by Association for Computing Machinery (ACM), 2004
- A study of smoothing methods for language models applied to information retrieval. ACM Transactions on Information Systems, 2004
- On the bursty evolution of blogspace. Published by Association for Computing Machinery (ACM), 2003
- SemTag and seeker. Published by Association for Computing Machinery (ACM), 2003
- Document language models, query models, and risk minimization for information retrieval. Published by Association for Computing Machinery (ACM), 2001
- The Semantic Web. Scientific American, 2001
- A language modeling approach to information retrieval. Published by Association for Computing Machinery (ACM), 1998