Integrating text mining into the MGI biocuration workflow

Open Access

1 January 2009

journal article
research article
Published by Oxford University Press (OUP) in Database: The Journal of Biological Databases and Curation

Vol. 2009, bap019
https://doi.org/10.1093/database/bap019

Abstract

A major challenge for functional and comparative genomics resource development is the extraction of data from the biomedical literature. Although text mining for biological data is an active research field, few applications have been integrated into production literature curation systems such as those of the model organism databases (MODs). Most available biological natural language processing (bioNLP) and information retrieval and extraction solutions are difficult to adapt to existing MOD curation workflows, and many also have high error rates or cannot process documents in the formats preferred by scientific journals.
