Scientific Data

Journal Information
ISSN / EISSN : 2052-4463 / 2052-4463
Current Publisher: Springer Science and Business Media LLC (10.1038)
Total articles ≅ 1,664

Latest articles in this journal

, Mohammad H. Mahoor
Scientific Data, Volume 8, pp 1-14; doi:10.1038/s41597-021-00832-y

In recent years, fingerprint-based positioning has gained researchers’ attention since it is a promising alternative to the Global Navigation Satellite System and cellular network-based localization in urban areas. Despite this, the lack of publicly available datasets that researchers can use to develop, evaluate, and compare fingerprint-based positioning solutions constitutes a high entry barrier for studies. As an effort to overcome this barrier and foster new research efforts, this paper presents OutFin, a novel dataset of outdoor location fingerprints that were collected using two different smartphones. OutFin is comprised of diverse data types such as WiFi, Bluetooth, and cellular signal strengths, in addition to measurements from various sensors including the magnetometer, accelerometer, gyroscope, barometer, and ambient light sensor. The collection area spanned four dispersed sites with a total of 122 reference points. Each site is different in terms of its visibility to the Global Navigation Satellite System and reference points’ number, arrangement, and spacing. Before OutFin was made available to the public, several experiments were conducted to validate its technical quality.
, Pei Liu, , Huayan Zhao, Dalila Bensaddek, , Liming Xiong
Scientific Data, Volume 8, pp 1-1; doi:10.1038/s41597-021-00852-8

A Correction to this paper has been published:
, Saori C. Tanaka, Kaoru Amano, Ai Koizumi, , , Kazuhisa Shibata, Vincent Taschereau-Dumouchel, ,
Scientific Data, Volume 8, pp 1-9; doi:10.1038/s41597-021-00845-7

Decoded neurofeedback (DecNef) is a form of closed-loop functional magnetic resonance imaging (fMRI) combined with machine learning approaches, which holds some promise for clinical applications. Yet, currently only a few research groups have had the opportunity to run such experiments; furthermore, there is no existing public dataset for scientists to analyse and investigate some of the factors enabling the manipulation of brain dynamics. We release here the data from published DecNef studies, consisting of 5 separate fMRI datasets, each with multiple sessions recorded per participant. For each participant the data consists of a session that was used in the main experiment to train the machine learning decoder, and several (from 3 to 10) closed-loop fMRI neural reinforcement sessions. The large dataset, currently comprising more than 60 participants, will be useful to the fMRI community at large and to researchers trying to understand the mechanisms underlying non-invasive modulation of brain dynamics. Finally, the data collection size will increase over time as data from newly run DecNef studies are added.
, Shota Kiyomoto, Do Ngoc Khanh, Manabu Kanda
Scientific Data, Volume 8, pp 1-14; doi:10.1038/s41597-021-00850-w

Numerical weather prediction models are progressively used to downscale future climate in cities at increasing spatial resolutions. Boundary conditions representing rapidly growing urban areas are imperative to more plausible future predictions. In this work, 1-km global anthropogenic heat emission (AHE) datasets of the present and future are constructed. To improve present AHE maps, 30 arc-second VIIRS satellite imagery outputs such as nighttime lights and night-fires were incorporated along with the LandScan™ population dataset. A futuristic scenario of AHE was also developed while considering pathways of radiative forcing (i.e. representative concentration pathways), pathways of social conditions (i.e. shared socio-economic pathways), a 1-km future urbanization probability map, and a model to estimate changes in population distribution. The new dataset highlights two distinct features: (1) a more spatially-heterogeneous representation of AHE is captured compared with other recent datasets, and (2) future urban sprawl and climate change are considered in the futuristic AHE maps. Significant increases in projected AHE for multiple cities under a worst-case scenario strengthen the need for further assessment of futuristic AHE.
Yao Zhang, Xiangming Xiao, Xiaocui Wu, Sha Zhou, Geli Zhang, Yuanwei Qin, Jinwei Dong
Scientific Data, Volume 8, pp 1-1; doi:10.1038/s41597-021-00854-6

A Correction to this paper has been published:
, Alexander Olsson, Paulina Sager, Elin Andersson, Christian Cipriani, , Anders Björkman, Christian Antfolk
Scientific Data, Volume 8, pp 1-10; doi:10.1038/s41597-021-00843-9

Control of contemporary, multi-joint prosthetic hands is commonly realized by using electromyographic signals from the muscles remaining after amputation at the forearm level. Although this principle is trying to imitate the natural control structure where muscles control the joints of the hand, in practice, myoelectric control provides only basic hand functions to an amputee using a dexterous prosthesis. This study aims to provide an annotated database of high-density surface electromyographic signals to aid the efforts of designing robust and versatile electromyographic control interfaces for prosthetic hands. The electromyographic signals were recorded using 128 channels within two electrode grids positioned on the forearms of 20 able-bodied volunteers. The participants performed 65 different hand gestures in an isometric manner. The hand movements were strictly timed using an automated recording protocol which also synchronously recorded the electromyographic signals and hand joint forces. To assess the quality of the recorded signals several quantitative assessments were performed, such as frequency content analysis, channel crosstalk, and the detection of poor skin-electrode contacts.
, , Saeed Sadri, Kate Salmon, Zubair Maalick, Stuart Webster
Scientific Data, Volume 8, pp 1-12; doi:10.1038/s41597-021-00847-5

High-resolution simulations at 4.4 km and 1.5 km resolution have been performed for 12 historical tropical cyclones impacting Bangladesh. We use the European Centre for Medium-Range Weather Forecasts 5th generation Re-Analysis (ERA5) to provide a 9-member ensemble of initial and boundary conditions for the regional configuration of the Met Office Unified Model. The simulations are compared to the original ERA5 data and the International Best Track Archive for Climate Stewardship (IBTrACS) tropical cyclone database for wind speed, gust speed and mean sea-level pressure. The 4.4 km simulations show a typical increase in peak gust speed of 41 to 118 knots relative to ERA5, and a deepening of minimum mean sea-level pressure of up to −27 hPa, relative to ERA5 and IBTrACS data. The downscaled simulations compare more favourably with IBTrACS data than the ERA5 data, suggesting tropical cyclone hazards in the ERA5 deterministic output may be underestimated. The dataset is freely available from 10.5281/zenodo.3600201.
Elisabeth L. Rosvold,
Scientific Data, Volume 8, pp 1-7; doi:10.1038/s41597-021-00846-6

This article presents a new open source extension to the Emergency Events Database (EM-DAT) that allows researchers, for the first time, to explore and make use of subnational, geocoded data on major disasters triggered by natural hazards. The Geocoded Disasters (GDIS) dataset provides spatial geometry in the form of GIS polygons and centroid latitude and longitude coordinates for each administrative entity listed as a disaster location in the EM-DAT database. In total, GDIS contains spatial information on 39,953 locations for 9,924 unique disasters occurring worldwide between 1960 and 2018. The dataset facilitates connecting the EM-DAT database to other geographic data sources on the subnational level to enable rigorous empirical analyses of disaster determinants and impacts.
, Matthew Cornell, Evan L. Ray, Katie House, Khoa Le
Scientific Data, Volume 8, pp 1-11; doi:10.1038/s41597-021-00839-5

Forecasting has emerged as an important component of informed, data-driven decision-making in a wide array of fields. We introduce a new data model for probabilistic predictions that encompasses a wide range of forecasting settings. This framework clearly defines the constituent parts of a probabilistic forecast and proposes one approach for representing these data elements. The data model is implemented in Zoltar, a new software application that stores forecasts using the data model and provides standardized API access to the data. In one real-time case study, an instance of the Zoltar web application was used to store, provide access to, and evaluate real-time forecast data on the order of 10^8 rows, provided by over 40 international research teams from academia and industry making forecasts of the COVID-19 outbreak in the US. Tools and data infrastructure for probabilistic forecasts, such as those introduced here, will play an increasingly important role in ensuring that future forecasting research adheres to a strict set of rigorous and reproducible standards.
, Ricardo Stuck, Brent McPherson, Daniel Bullock, Lindsey Kitchell, , Derek Kellar, Hu Cheng, Sharlene Newman, Nicholas Port, et al.
Scientific Data, Volume 8, pp 1-17; doi:10.1038/s41597-021-00823-z

We describe a dataset of processed data, with an associated reproducible preprocessing pipeline, collected from two collegiate athlete groups and one non-athlete group. The dataset shares minimally processed diffusion-weighted magnetic resonance imaging (dMRI) data, three models of the diffusion signal in the voxel, full-brain tractograms, segmentation of the major white matter tracts as well as structural connectivity matrices. There is currently a paucity of similar datasets openly shared. Furthermore, major challenges are associated with collecting this type of data. The data and derivatives shared here can be used as a reference to study the effects of long-term exposure to collegiate athletics, such as the effects of repetitive head impacts. We use advanced anatomical and dMRI data processing methods publicly available as reproducible web services at