“METAGENOTE: a simplified web platform for metadata annotation of genomic samples and streamlined submission to NCBI’s sequence read archive”
Open Access
- 3 September 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 21 (1), 1-12
- https://doi.org/10.1186/s12859-020-03694-0
Abstract
Improvements in genomics methods, coupled with readily accessible high-throughput sequencing, have contributed to our understanding of microbial species, metagenomes, infectious diseases and more. To maximize the impact of these genomics studies, it is important that data from biological samples become publicly available with standardized metadata. The availability of data in public archives offers the hope that greater insights can be obtained through integration with multi-omics data, reproduction of published studies, or meta-analyses of large, diverse datasets. These datasets should include a description of the host, organism, environmental source of the specimen, spatial-temporal information and other relevant metadata. Unfortunately, these attributes are often missing and, when present, are inconsistent in their use of metadata standards and ontologies. METAGENOTE (https://metagenote.niaid.nih.gov) is a web portal that facilitates the annotation of samples from genomic studies and streamlines the submission of sequencing files and metadata to the Sequence Read Archive (SRA) (Leinonen R, et al, Nucleic Acids Res, 39:D19-21, 2011) for public access. This platform offers a wide selection of packages for different types of biological and experimental studies, with a special emphasis on the standardization of metadata reporting. These packages follow the guidelines of the MIxS standards developed by the Genomic Standards Consortium (GSC) and adopted by the three partners of the International Nucleotide Sequence Database Collaboration (INSDC) (Cochrane G, et al, Nucleic Acids Res, 44:D48-50, 2016): the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ).
METAGENOTE then compiles, validates and manages the submission through an easy-to-use web interface, minimizing submission errors and eliminating the need to submit sequencing files via a separate file transfer mechanism. METAGENOTE is a public resource that focuses on simplifying the annotation and submission of data with its corresponding metadata. Users of METAGENOTE will benefit from the easy-to-use annotation interface but, most importantly, will be encouraged to publish metadata following standards and ontologies that make public data available for reuse.
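As a rough illustration of the kind of pre-submission check the abstract describes (validating sample metadata before it reaches an archive), the sketch below flags missing attributes and malformed dates in a batch of sample records. It is not METAGENOTE's implementation: the function names, the short list of required fields and the date rule are assumptions chosen for the example, and a real MIxS package defines its own, much longer checklist.

```python
from datetime import date

# Hypothetical required attributes for this example only;
# a real metadata package defines its own checklist.
REQUIRED_FIELDS = ("sample_name", "organism", "collection_date", "geo_loc_name")


def validate_sample(record):
    """Return a list of human-readable problems found in one sample record."""
    problems = []
    # Flag every required attribute that is absent or blank.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing required field: {field}")
    # Simplified date rule (an assumption): accept only what
    # date.fromisoformat can parse, e.g. a full YYYY-MM-DD calendar date.
    raw_date = record.get("collection_date")
    if raw_date is not None and str(raw_date).strip():
        try:
            date.fromisoformat(str(raw_date).strip())
        except ValueError:
            problems.append(f"collection_date is not an ISO 8601 date: {raw_date}")
    return problems


def validate_batch(records):
    """Map each failing sample name (or row index) to its list of problems."""
    report = {}
    for index, record in enumerate(records):
        problems = validate_sample(record)
        if problems:
            key = record.get("sample_name") or f"row {index}"
            report[key] = problems
    return report
```

Calling `validate_batch` on a list of dictionaries, one per sample, returns an empty dictionary when every record passes. Collecting all problems per sample, rather than stopping at the first one, lets a submitter fix a whole spreadsheet in a single pass.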
Funding Information
- National Institute of Allergy and Infectious Diseases (BCBB Support Services Contract HHSN316201300006W/HHSN27200002 to MSC, Inc)
This publication has 19 references indexed in Scilit:
- MGnify: the microbiome analysis resource in 2020. Nucleic Acids Research, 2019
- Nephele: a cloud platform for simplified, standardized and reproducible microbiome data analysis. Bioinformatics, 2017
- Assessing Metadata Quality of a Federally Sponsored Health Data Repository. 2017
- The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 2016
- The International Nucleotide Sequence Database Collaboration. Nucleic Acids Research, 2015
- The Foundational Model of Anatomy in OWL 2 and its use. Artificial Intelligence in Medicine, 2013
- The environment ontology: contextualising biological and biomedical entities. Journal of Biomedical Semantics, 2013
- The sequence read archive: explosive growth of sequencing data. Nucleic Acids Research, 2011
- Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nature Biotechnology, 2011
- Modeling sample variables with an Experimental Factor Ontology. Bioinformatics, 2010