A Heuristic Data Distribution Scheme for data mining applications on grid environments
- 1 June 2008
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2008 IEEE International Conference on Fuzzy Systems (IEEE World Congress on Computational Intelligence)
- p. 2398-2404
- https://doi.org/10.1109/fuzzy.2008.4630704
Abstract
Effective data distribution techniques can significantly reduce the total execution time of a program in grid computing environments, especially for data mining applications. In this paper, we describe a linear programming formulation for the data distribution problem on grids. Furthermore, a heuristic method, named HDDS (heuristic data distribution scheme), is proposed to solve this problem. We implement a parallel association rule mining method and conduct experiments on our grid testbed. Experimental results show that data mining programs using HDDS to distribute data execute more efficiently than those using traditional schemes.
This publication has 13 references indexed in Scilit:
- Meta-Systems: An Approach Combining Parallel Processing and Heterogeneous Distributed Computing Systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Scheduling divisible loads on star and tree networks: results and open problems. IEEE Transactions on Parallel and Distributed Systems, 2005
- Divisible Load Theory: A New Paradigm for Load Scheduling in Distributed Systems. Cluster Computing, 2003
- A novel data distribution technique for host-client type parallel applications. IEEE Transactions on Parallel and Distributed Systems, 2002
- The Grid: A New Infrastructure for 21st Century Science. Physics Today, 2002
- The Anatomy of the Grid: Enabling Scalable Virtual Organizations. The International Journal of High Performance Computing Applications, 2001
- A Grid-Enabled MPI: Message Passing in Heterogeneous Distributed Computing Systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 1998
- Globus: a Metacomputing Infrastructure Toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 1997
- Parallel mining of association rules. IEEE Transactions on Knowledge and Data Engineering, 1996
- Partitioning techniques for large-grained parallelism. IEEE Transactions on Computers, 1988