Journal PeerJ Computer Science-
PeerJ Computer Science, Volume 4; doi:10.7717/peerj-cs.146
Abstract:The relationship between a driver’s glance orientation and corresponding head rotation is highly complex due to its nonlinear dependence on the individual, task, and driving context. This paper presents expanded analytic detail and findings from an effort that explored the ability of head pose to serve as an estimator for driver gaze by connecting head rotation data with manually coded gaze region data using both a statistical analysis approach and a predictive (i.e., machine learning) approach. For the latter, classification accuracy increased as visual angles between two glance locations increased. In other words, the greater the shift in gaze, the higher the accuracy of classification. This is an intuitive but important concept that we make explicit through our analysis. The highest accuracy achieved was 83% using the method of Hidden Markov Models (HMM) for the binary gaze classification problem of (a) glances to the forward roadway versus (b) glances to the center stack. Results suggest that although there are individual differences in head-glance correspondence while driving, classifier models based on head-rotation data may be robust to these differences and therefore can serve as reasonable estimators for glance location. The results suggest that driver head pose can be used as a surrogate for eye gaze in several key conditions including the identification of high-eccentricity glances. Inexpensive driver head pose tracking may be a key element in detection systems developed to mitigate driver distraction and inattention.
PeerJ Computer Science, Volume 4; doi:10.7717/peerj-cs.147
Abstract:This article describes the motivation, design, and progress of the Journal of Open Source Software (JOSS). JOSS is a free and open-access journal that publishes articles describing research software. It has the dual goals of improving the quality of the software submitted and providing a mechanism for research software developers to receive credit. While designed to work within the current merit system of science, JOSS addresses the dearth of rewards for key contributions to science made in the form of software. JOSS publishes articles that encapsulate scholarship contained in the software itself, and its rigorous peer review targets the software components: functionality, documentation, tests, continuous integration, and the license. A JOSS article contains an abstract describing the purpose and functionality of the software, references, and a link to the software archive. The article is the entry point of a JOSS submission, which encompasses the full set of software artifacts. Submission and review proceed in the open, on GitHub. Editors, reviewers, and authors work collaboratively and openly. Unlike other journals, JOSS does not reject articles requiring major revision; while not yet accepted, articles remain visible and under review until the authors make adequate changes (or withdraw, if unable to meet requirements). Once an article is accepted, JOSS gives it a digital object identifier (DOI), deposits its metadata in Crossref, and the article can begin collecting citations on indexers like Google Scholar and other services. Authors retain copyright of their JOSS article, releasing it under a Creative Commons Attribution 4.0 International License. In its first year, starting in May 2016, JOSS published 111 articles, with more than 40 additional articles under review. JOSS is a sponsored project of the nonprofit organization NumFOCUS and is an affiliate of the Open Source Initiative (OSI).
PeerJ Computer Science, Volume 4; doi:10.7717/peerj-cs.145
Abstract:Finding useful patterns in datasets has attracted considerable interest in the field of visual analytics. One of the most common tasks is the identification and representation of clusters. However, this is non-trivial in heterogeneous datasets since the data needs to be analyzed from different perspectives. Indeed, highly variable patterns may mask underlying trends in the dataset. Dendrograms are graphical representations resulting from agglomerative hierarchical clustering and provide a framework for viewing the clustering at different levels of detail. However, dendrograms become cluttered when the dataset gets large, and the single cut of the dendrogram to demarcate different clusters can be insufficient in heterogeneous datasets. In this work, we propose a visual analytics methodology called MCLEAN that offers a general approach for guiding the user through the exploration and detection of clusters. Powered by a graph-based transformation of the relational data, it supports a scalable environment for representation of heterogeneous datasets by changing the spatialization. We thereby combine multilevel representations of the clustered dataset with community finding algorithms. Our approach entails displaying the results of the heuristics to users, providing a setting from which to start the exploration and data analysis. To evaluate our proposed approach, we conduct a qualitative user study, where participants are asked to explore a heterogeneous dataset, comparing the results obtained by MCLEAN with the dendrogram. These qualitative results reveal that MCLEAN is an effective way of aiding users in the detection of clusters in heterogeneous datasets. The proposed methodology is implemented in an R package available at https://bitbucket.org/vda-lab/mclean.
PeerJ Computer Science, Volume 4; doi:10.7717/peerj-cs.144
Abstract:We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-magnitude increases in data transfer performance, support new deployment architectures that decouple control logic from data storage, and reduce development and operations costs. We introduce the design pattern; explain how it leverages high-performance data enclaves and cloud-based data management services; review representative examples at research laboratories and universities, including both experimental facilities and supercomputer sites; describe how to leverage Python APIs for authentication, authorization, data transfer, and data sharing; and use coding examples to demonstrate how these APIs can be used to implement a range of research data portal capabilities. Sample code at a companion web site, https://docs.globus.org/mrdp, provides application skeletons that readers can adapt to realize their own research data portals.
PeerJ Computer Science, Volume 4; doi:10.7717/peerj-cs.143
Abstract:Despite most Spatial Data Infrastructures offering service-based visualization of geospatial data, requirements are often at a very basic level leading to poor quality of maps. This is a general observation for any geospatial architecture as soon as open standards as those of the Open Geospatial Consortium (OGC) are applied. To improve the situation, this paper does focus on improvements at the portrayal interoperability side by considering standardization aspects. We propose two major redesign recommendations. First to consolidate the cartographic theory at the core of the OGC Symbology Encoding standard. Secondly to build the standard in a modular way so as to be ready to be extended with upcoming future cartographic requirements. Thus, we start by defining portrayal interoperability by means of typical-use cases that frame the concept of sharing cartography. Then we bring to light the strengths and limits of the relevant open standards to consider in this context. Finally we propose a set of recommendations to overcome the limits so as to make these use cases a true reality. Even if the definition of a cartographic-oriented standard is not able to act as a complete cartographic design framework by itself, we argue that pushing forward the standardization work dedicated to cartography is a way to share and disseminate good practices and finally to improve the quality of the visualizations.
PeerJ Computer Science, Volume 3; doi:10.7717/peerj-cs.142
Abstract:Computer science offers a large set of tools for prototyping, writing, running, testing, validating, sharing and reproducing results; however, computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true. James Buckheit and David Donoho proposed more than two decades ago that an article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular in peer-reviews. Existing journals have been slow to adapt: source codes are rarely requested and are hardly ever actually executed to check that they produce the results advertised in the article. ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from other traditional scientific journals. ReScience resides on GitHub where each new implementation of a computational study is made available together with comments, explanations, and software tests.
PeerJ Computer Science, Volume 3; doi:10.7717/peerj-cs.141
Abstract:The continuous increase of unbound streaming data poses several challenges to established data stream processing engines. One of the most important challenges is the cost-efficient enactment of stream processing topologies under changing data volume. These data volume pose different loads to stream processing systems whose resource provisioning needs to be continuously updated at runtime. First approaches already allow for resource provisioning on the level of virtual machines (VMs), but this only allows for coarse resource provisioning strategies. Based on current advances and benefits for containerized software systems, we have designed a cost-efficient resource provisioning approach and integrated it into the runtime of the Vienna ecosystem for elastic stream processing. Our resource provisioning approach aims to maximize the resource usage for VMs obtained from cloud providers. This strategy only releases processing capabilities at the end of the VMs minimal leasing duration instead of releasing them eagerly as soon as possible as it is the case for threshold-based approaches. This strategy allows us to improve the service level agreement compliance by up to 25% and a reduction for the operational cost of up to 36%.
PeerJ Computer Science, Volume 3; doi:10.7717/peerj-cs.139
Abstract:Open-source lab equipment is becoming more widespread with the popularization of fabrication tools such as 3D printers, laser cutters, CNC machines, open source microcontrollers and open source software. Although many pieces of common laboratory equipment have been developed, software control of these items is sometimes lacking. Specifically, control software that can be easily implemented and enable user-input and control over multiple platforms (PC, smartphone, web, etc.). The aim of this proof-of principle study was to develop and implement software for the control of a low-cost, 3D printed microscope. Here, we present two approaches which enable microscope control by exploiting the functionality of the social media platform Twitter or player actions inside of the videogame Minecraft. The microscope was constructed from a modified web-camera and implemented on a Raspberry Pi computer. Three aspects of microscope control were tested, including single image capture, focus control and time-lapse imaging. The Twitter embodiment enabled users to send ‘tweets’ directly to the microscope. Image data acquired by the microscope was then returned to the user through a Twitter reply and stored permanently on the photo-sharing platform Flickr, along with any relevant metadata. Local control of the microscope was also implemented by utilizing the video game Minecraft, in situations where Internet connectivity is not present or stable. A virtual laboratory was constructed inside the Minecraft world and player actions inside the laboratory were linked to specific microscope functions. Here, we present the methodology and results of these experiments and discuss possible limitations and future extensions of this work.
PeerJ Computer Science, Volume 3; doi:10.7717/peerj-cs.140
Abstract:Betweenness, a widely employed centrality measure in network science, is a decent proxy for investigating network loads and rankings. However, its extremely high computational cost greatly hinders its applicability in large networks. Although several parallel algorithms have been presented to reduce its calculation cost for unweighted networks, a fast solution for weighted networks, which are commonly encountered in many realistic applications, is still lacking. In this study, we develop an efficient parallel GPU-based approach to boost the calculation of the betweenness centrality (BC) for large weighted networks. We parallelize the traditional Dijkstra algorithm by selecting more than one frontier vertex each time and then inspecting the frontier vertices simultaneously. By combining the parallel SSSP algorithm with the parallel BC framework, our GPU-based betweenness algorithm achieves much better performance than its CPU counterparts. Moreover, to further improve performance, we integrate the work-efficient strategy, and to address the load-imbalance problem, we introduce a warp-centric technique, which assigns many threads rather than one to a single frontier vertex. Experiments on both realistic and synthetic networks demonstrate the efficiency of our solution, which achieves 2.9× to 8.44× speedups over the parallel CPU implementation. Our algorithm is open-source and free to the community; it is publicly available through https://dx.doi.org/10.6084/m9.figshare.4542405 . Considering the pervasive deployment and declining price of GPUs in personal computers and servers, our solution will offer unprecedented opportunities for exploring betweenness-related problems and will motivate follow-up efforts in network science.
PeerJ Computer Science, Volume 3; doi:10.7717/peerj-cs.136
Abstract:With a code injection attack (CIA) an attacker can introduce malicious code into a computer program or system that fails to properly encode data that comes from an untrusted source. A CIA can have different forms depending on the execution context of the application and the location of the programming flaw that leads to the attack. Currently, CIAs are considered one of the most damaging classes of application attacks since they can severely affect an organisation’s infrastructure and cause financial and reputational damage to it. In this paper we examine and categorize the countermeasures developed to detect the various attack forms. In particular, we identify two distinct categories. The first incorporates static program analysis tools used to eliminate flaws that can lead to such attacks during the development of the system. The second involves the use of dynamic detection safeguards that prevent code injection attacks while the system is in production mode. Our analysis is based on nonfunctional characteristics that are considered critical when creating security mechanisms. Such characteristics involve usability, overhead, implementation dependencies, false positives and false negatives. Our categorization and analysis can help both researchers and practitioners either to develop novel approaches, or use the appropriate mechanisms according to their needs.