Bioinformatics services for analyzing massive genomic datasets
Open Access
- 31 March 2020
- journal article
- Published by Korea Genome Organization in Genomics & Informatics
- Vol. 18 (1), e8
- https://doi.org/10.5808/gi.2020.18.1.e8
Abstract
The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.
Funding Information
- National Research Foundation of Korea (2014M3C9A3064552, 2014M3C9A3065221, 2014M3C9A3064548, 2014M3C9A3068554, 2014M3C9A3068822, 2019M3C9A5069653)