FPGA implementation of K-means algorithm for bioinformatics application: An accelerated approach to clustering Microarray data

1 June 2011

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 248-255
https://doi.org/10.1109/ahs.2011.5963944

Abstract

The Microarray is a technique used by biologists to perform many genome experiments simultaneously, producing very large datasets. Analyzing these datasets is a challenge for scientists, especially as the number of genome databases increases rapidly every year. K-means clustering is an unsupervised data mining technique widely used by bioinformaticians to analyze Microarray data. However, K-means can take from a few seconds to several days to process Microarray data, depending on the size of the dataset. This limits the complexity of the biological questions that bioinformaticians can ask, and hence may result in an incomplete solution to the problem. To overcome such problems, we propose a highly parallel hardware design that accelerates the K-means clustering of Microarray data by implementing the K-means algorithm in Field Programmable Gate Arrays (FPGAs). Our implementation is particularly suitable for server solutions, as it allows many different datasets to be processed simultaneously. We have designed and implemented five K-means cores on a Xilinx Virtex4 XC4VLX25 FPGA and tested them on a sample of real Yeast Microarray data. Our design achieved a speed-up of about 51.7× compared to a software model while being 206.8× more energy efficient.
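
For readers unfamiliar with the algorithm being accelerated, the following is a minimal software sketch of standard (Lloyd-style) K-means in plain Python. It is not the paper's software model or its hardware design; the function names `squared_distance` and `kmeans`, the parameters `max_iters` and `seed`, the random initialization, and the squared Euclidean distance metric are all assumptions chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` into `k` groups (illustrative sketch only).

    Returns (centroids, assignments), where assignments[i] is the
    index of the centroid closest to points[i].
    """
    rng = random.Random(seed)
    # Assumption: initialize centroids from k distinct random points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [None] * len(points)

    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        # Every point is handled independently of the others.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the result is stable.
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Keep the old centroid if a cluster is empty.
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return centroids, assignments


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of 2-D points.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

In the assignment step, the distance computation for one point does not depend on any other point, which is the kind of independent work that parallel hardware can in principle exploit. How the paper's cores actually partition the computation is not described in the abstract.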

Cited by 60 articles