Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II
Open Access
- 1 January 2012
- journal article
- research article
- Published by Oxford University Press (OUP) in Database: The Journal of Biological Databases and Curation
- Vol. 2012, bas043
- https://doi.org/10.1093/database/bas043
Abstract
Manual curation of data from the biomedical literature is a rate-limiting factor for many expert curated databases. Despite the continuing advances in biomedical text mining and the pressing needs of biocurators for better tools, few existing text-mining tools have been successfully integrated into production literature curation systems such as those used by the expert curated databases. To close this gap and better understand all aspects of literature curation, we invited submissions of written descriptions of curation workflows from expert curated databases for the BioCreative 2012 Workshop Track II. We received seven qualified contributions, primarily from model organism databases. Based on these descriptions, we identified commonalities and differences across the workflows, the common ontologies and controlled vocabularies used and the current and desired uses of text mining for biocuration. Compared to a survey done in 2009, our 2012 results show that many more databases are now using text mining in parts of their curation workflows. In addition, the workshop participants identified text-mining aids for finding gene names and symbols (gene indexing), prioritization of documents for curation (document triage) and ontology concept assignment as those most desired by the biocurators.
Database URL: http://www.biocreative.org/tasks/bc-workshop-2012/workflow/