Acceleration of a Feature Selection Algorithm Using High Performance Computing

Open Access

1 September 2020

conference paper
Published by MDPI AG in Proceedings

Vol. 54 (1), 54
https://doi.org/10.3390/proceedings2020054054

Abstract

Feature selection is a subfield of data analysis that focuses on reducing the dimensionality of datasets, so that subsequent analyses can be performed in affordable execution times while keeping the same results. Joint Mutual Information (JMI) is a widely used feature selection method that removes irrelevant and redundant features. Nevertheless, it has high computational complexity. In this work, we present a multithreaded MPI parallel implementation of JMI that accelerates its execution on distributed memory systems, reaching speedups of up to 198.60 when running on 256 cores and allowing the analysis of very large datasets that do not fit in the main memory of a single node.
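To make the cost of JMI concrete, the following is a minimal serial sketch of the greedy JMI criterion for discrete features, in which each candidate feature is scored by the sum of its joint mutual information with every already selected feature and the class labels. It is not the parallel implementation presented in this work: the function names `mutual_information` and `jmi_select`, the column-wise list layout, and the use of plain empirical counts are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def jmi_select(features, labels, k):
    """Greedy JMI selection (illustrative sketch, hypothetical helper).

    features: list of columns, each a list of discrete values.
    labels:   list of discrete class labels, same length as each column.
    k:        number of features to select.
    Returns the indices of the selected columns, in selection order.
    """
    remaining = list(range(len(features)))

    # The first feature is the one with the highest I(X_f; Y).
    first = max(remaining, key=lambda f: mutual_information(features[f], labels))
    selected = [first]
    remaining.remove(first)

    while remaining and len(selected) < k:
        def score(f):
            # JMI score: sum over selected s of I((X_f, X_s); Y),
            # treating the pair (X_f, X_s) as one joint variable.
            return sum(
                mutual_information(list(zip(features[f], features[s])), labels)
                for s in selected
            )

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected
```

In this sketch every candidate is re-scored against every selected feature at each step, which is where the high computational cost comes from. The scores of different candidates within a step do not depend on one another, so that inner loop is the natural place to distribute work, although how the implementation described here actually partitions it is not specified in the abstract.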

Keywords

FEATURE SELECTION
MUTUAL INFORMATION
MACHINE LEARNING
HIGH PERFORMANCE COMPUTING
MPI