A machine learning approach for accelerating DNA sequence analysis

Abstract
The DNA sequence analysis is a data and computationally intensive problem and therefore demands suitable parallel computing resources and algorithms. In this paper, we describe an optimized approach for DNA sequence analysis on a heterogeneous platform that is accelerated with the Intel Xeon Phi. Such platforms commonly comprise one or two general purpose host central processing units (CPUs) and one or more Xeon Phi devices. We present a parallel algorithm that shares the work of DNA sequence analysis between the host CPUs and the Xeon Phi device to reduce the overall analysis time. For automatic worksharing we use a supervised machine learning approach, which predicts the performance of DNA sequence analysis on the host and device and accordingly maps fractions of the DNA sequence to the host and device. We evaluate our approach empirically using real-world DNA segments for human and various animals on a heterogeneous platform that comprises two 12-core Intel Xeon E5 CPUs and an Intel Xeon Phi 7120P device with 61 cores.

This publication has 20 references indexed in Scilit: