A GPU Implementation of Fast Parallel Markov Clustering in Bioinformatics Using EllPACK-R Sparse Data Format

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE) in 2010 Second International Conference on Advances in Computing, Control, and Telecommunication Technologies

p. 173-175
https://doi.org/10.1109/act.2010.10

Abstract

Massively parallel computing using graphics processing units (GPUs), which is based on tens of thousands of parallel threads running on hundreds of GPU streaming processors, has gained broad popularity and attracted researchers in a wide range of application areas, including finance, computer-aided engineering, computational fluid dynamics, game physics, numerics, science, medical imaging, life science, molecular biology, and bioinformatics. Meanwhile, the Markov clustering algorithm (MCL) has become one of the most effective and highly cited methods for detecting and analyzing communities/clusters within interaction network datasets in many real-world problems, such as social, technological, or biological networks, including protein-protein interaction networks. However, as datasets grow larger, the computation time of the MCL algorithm increases. Hence, GPU computing is an interesting and challenging alternative for improving MCL performance. In this poster paper we introduce our improvement of MCL performance based on the ELLPACK-R sparse data format using GPU computing with NVIDIA's Compute Unified Device Architecture (CUDA); we call this implementation CUDA-MCL. Because the results show a significant improvement in CUDA-MCL performance, and because low-cost GPU devices are widely available on the market today, this CUDA-MCL implementation allows large-scale parallel computation on off-the-shelf desktop machines. Moreover, GPU computing approaches may significantly change the way bioinformaticians and biologists compute and interact with their data.
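
To make the two ingredients named in the abstract concrete, the following is a minimal CPU-side sketch in pure Python, not the authors' CUDA code. It assumes the usual description of ELLPACK-R (padded per-row value and column-index arrays plus an array of row lengths) and the standard MCL iteration (expansion by matrix squaring, then inflation by element-wise power and column normalization). All function names and the default inflation parameter of 2.0 are hypothetical choices made for illustration.

```python
# Illustrative sketch only: names and parameters are assumptions, not from the paper.

def to_ellpack_r(dense):
    """Pack a dense row-major matrix into ELLPACK-R arrays (values, cols, row_len)."""
    row_len = [sum(1 for v in row if v != 0.0) for row in dense]
    width = max(row_len) if row_len else 0
    values, cols = [], []
    for row in dense:
        nz = [(v, j) for j, v in enumerate(row) if v != 0.0]
        pad = width - len(nz)
        values.append([v for v, _ in nz] + [0.0] * pad)  # pad short rows with zeros
        cols.append([j for _, j in nz] + [0] * pad)      # padded indices are never read
    return values, cols, row_len

def spmv(values, cols, row_len, x):
    """Compute y = A x. On a GPU, each row i would typically map to one thread."""
    y = [0.0] * len(values)
    for i in range(len(values)):
        for k in range(row_len[i]):  # row_len lets the loop skip the padding
            y[i] += values[i][k] * x[cols[i][k]]
    return y

def normalize_columns(m):
    """Scale every column so that it sums to 1 (all-zero columns stay zero)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def mcl_step(m, inflation=2.0):
    """One MCL iteration on a square column-stochastic matrix m."""
    n = len(m)
    values, cols, row_len = to_ellpack_r(m)
    # Expansion: column j of M*M equals M applied to column j of M.
    expanded_cols = [spmv(values, cols, row_len, [m[i][j] for i in range(n)])
                     for j in range(n)]
    expanded = [[expanded_cols[j][i] for j in range(n)] for i in range(n)]
    # Inflation: element-wise power followed by column normalization.
    inflated = [[v ** inflation for v in row] for row in expanded]
    return normalize_columns(inflated)
```

In this sketch the row-length array is what distinguishes ELLPACK-R from plain ELLPACK: the inner loop stops at the true number of nonzeros in each row instead of testing padded entries. A real GPU version would store the padded arrays column-major so that neighboring threads read neighboring memory locations.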


Cited by 2 articles