Systematic Guidelines for Effective Utilization of COVID-19 Databases in Genomic, Epidemiologic, and Clinical Research
Open Access
- 6 March 2023
- Vol. 15 (3), 692
- https://doi.org/10.3390/v15030692
Abstract
The pandemic has led to the production and accumulation of various types of data related to coronavirus disease 2019 (COVID-19). To understand the features and characteristics of COVID-19 data, we summarized representative databases and determined the data types, purpose, and utilization details of each database. In addition, we categorized COVID-19-associated databases into epidemiological data, genome and protein data, and drug and target data. We found that the data in these databases serve nine distinct purposes (clade/variant/lineage, genome browser, protein structure, epidemiological data, visualization, data analysis tool, treatment, literature, and immunity), depending on the type of data. Using the databases we investigated, we created four queries as integrative analysis methods aimed at answering important scientific questions related to COVID-19. Our queries make effective use of multiple databases to produce valuable results that can reveal novel findings through comprehensive analysis. This allows clinical researchers, epidemiologists, and clinicians to access COVID-19 data easily without requiring expert knowledge in computing or data science. We expect that users will be able to reference our examples to construct their own integrative analysis methods, which will act as a basis for further scientific inquiry and data searching.
Funding Information
- Korea government (NRF-2021M3H9A2097227, NRF-2022R1A2C3008162)
- Catholic Medical Center Research Foundation