Epi-Gene: An R-Package for Easy Pan-Genome Analysis
Open Access
- 20 September 2021
- journal article
- retracted article
- Published by Hindawi Limited in BioMed Research International
- Vol. 2021, 1-8
- https://doi.org/10.1155/2021/5585586
Abstract
The main aim of this study was to develop a set of functions that can analyze genomic data with less time and memory consumption. Epi-gene is presented as a solution to the problems of handling large sequence files and long computation times. It requires less time and fewer programming skills to work with a large number of genomes. In the current study, some features of the Epi-gene R package are described and illustrated using a dataset of 14 Aeromonas hydrophila genomes. The package also includes joining, relabeling, and conversion functions for handling FASTA-formatted sequences. Various Epi-gene functions were used to calculate the subsets of core genes, accessory genes, and unique genes. Heat maps and phylogenetic genome trees were also constructed. The whole procedure was completed in less than 30 minutes. The package works only on Windows operating systems. Functions from other packages available in the R computing environment, such as dplyr and ggtree, were also used.
Funding Information
- Priority Academic Program Development of Jiangsu Higher Education Institutions (CX(17)2027, D2017-3-1, 31372454)
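The abstract mentions splitting a pan-genome into core, accessory, and unique genes. As a rough illustration of that partition only, and not of Epi-gene's own R functions (whose names and signatures are not given here), the following Python sketch classifies genes from a presence/absence mapping. The function name `classify_pan_genome`, the gene labels, and the genome labels are all hypothetical, and the usual convention is assumed: a core gene occurs in every genome, a unique gene in exactly one, and an accessory gene in more than one but not all.

```python
def classify_pan_genome(presence):
    """Partition genes into core, accessory, and unique subsets.

    presence: dict mapping a gene label to the set of genome labels
    that carry it (a hypothetical input format, not Epi-gene's).
    """
    # Assumption: every genome carries at least one gene, so the union
    # of all carrier sets is the full set of genomes.
    genomes = set().union(*presence.values())
    n_genomes = len(genomes)

    result = {"core": [], "accessory": [], "unique": []}
    for gene in sorted(presence):
        count = len(presence[gene])
        if count == 0:
            continue  # gene absent everywhere: not part of the pan-genome
        if count == n_genomes:
            result["core"].append(gene)       # present in every genome
        elif count == 1:
            result["unique"].append(gene)     # present in exactly one genome
        else:
            result["accessory"].append(gene)  # present in some, not all
    return result


# Toy data with made-up labels, purely for illustration.
toy_presence = {
    "geneA": {"genome1", "genome2", "genome3"},
    "geneB": {"genome1", "genome2"},
    "geneC": {"genome3"},
}
subsets = classify_pan_genome(toy_presence)
```

With a single genome the "every genome" test is checked first, so all of its genes are reported as core; a real analysis would also need a clustering step to decide which sequences count as the same gene, which this sketch leaves out.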
This publication has 21 references indexed in Scilit:
- Evolution of Pan-Genomes of Escherichia coli, Shigella spp., and Salmonella enterica. Journal of Bacteriology, 2013
- Comparing clustering and pre-processing in taxonomy analysis. Bioinformatics, 2012
- Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes. BMC Genomics, 2012
- The genome diversity and karyotype evolution of mammals. Molecular Cytogenetics, 2011
- Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 2010
- Comparison of 61 Sequenced Escherichia coli Genomes. Microbial Ecology, 2010
- genoPlotR: comparative gene and genome visualization in R. Bioinformatics, 2010
- The Era of Genomic Epidemiology. Neuroepidemiology, 2009
- The ring of life provides evidence for a genome fusion origin of eukaryotes. Nature, 2004
- R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics, 1996