Understanding PubMed® user search behavior through log analysis
Open Access
- 1 January 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Database: The Journal of Biological Databases and Curation
- Vol. 2009, bap018
- https://doi.org/10.1093/database/bap018
Abstract
This article reports on a detailed investigation of PubMed users’ needs and behavior as a step toward improving biomedical information retrieval. PubMed provides researchers with free access to more than 19 million citations for biomedical articles from MEDLINE and life science journals, and it is accessed by millions of users each day. Efficient search tools are crucial for biomedical researchers to keep abreast of the biomedical literature relating to their own research. This study provides insight into PubMed users’ needs and their behavior. The investigation was conducted through the analysis of one month of log data, consisting of more than 23 million user sessions and more than 58 million user queries. Multiple aspects of users’ interactions with PubMed are characterized in detail with evidence from these logs. Despite having many features in common with general Web searches, biomedical information searches have unique characteristics that are made evident in this study. PubMed users are more persistent in seeking information and they reformulate queries often. The three most frequent types of search are search by author name, search by gene/protein, and search by disease. Abbreviations are used very frequently in queries. Factors such as result set size influence users’ decisions. Analysis of characteristics such as these plays a critical role in identifying users’ information needs and their search habits. In turn, such an analysis also provides useful insight for improving biomedical information retrieval.
Database URL: http://www.ncbi.nlm.nih.gov/PubMed