Open Access

A resource to explore the discovery of rare diseases and their causative genes

Published: 4 May 2021
Scientific Data, Volume 8, pp 1-8; doi:10.1038/s41597-021-00905-y

Abstract: Here, we describe a dataset with information about monogenic, rare diseases with a known genetic background, supplemented with manually extracted provenance for the disease itself and the discovery of the underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked them to 3163 causative genes, annotated with OMIM and Ensembl identifiers and HGNC symbols. The PubMed identifiers of the scientific publications that first described the rare diseases, and of the publications that identified the genes causing them, were added using information from OMIM, PubMed, Wikipedia, and Google Scholar. The data are available under a CC0 license as a spreadsheet and as RDF in a semantic model modified from DisGeNET, and were added to Wikidata. This dataset relies on publicly available data and publications with a PubMed identifier, but our effort to make the data interoperable and linked is what now allows these data to be analysed. Our analysis revealed the timeline of rare disease and causative gene discovery and linked it to developments in methods.
Keywords: causative genes / rare diseases / monogenic / model / OMIM / added / PubMed
