Combining literature text mining with microarray data: advances for system biology modeling
Open Access
- 15 June 2011
- journal article
- review article
- Published by Oxford University Press (OUP) in Briefings in Bioinformatics
- Vol. 13 (1), 61-82
- https://doi.org/10.1093/bib/bbr018
Abstract
A huge amount of important biomedical information is hidden in the bulk of research articles in biomedical fields. At the same time, the publication of databases of biological information and of experimental datasets generated by high-throughput methods is expanding rapidly, and a wealth of annotated gene databases and of chemical, genomic (including microarray datasets), clinical and other types of data repositories is now available on the Web. Thus, a current challenge of bioinformatics is to develop targeted methods and tools that integrate scientific literature, biological databases and experimental data, both to reduce the time of database curation and to access evidence, either in the literature or in the datasets, that is useful for the analysis at hand. Against this background, this article reviews the knowledge discovery systems that fuse information from the literature, gathered by text mining, with microarray data in order to enrich the lists of down- and upregulated genes with elements for biological understanding and to generate and validate new biological hypotheses. Finally, an easy-to-use and freely accessible tool, GeneWizard, which exploits text mining and microarray data fusion to support researchers in discovering gene–disease relationships, is described.