A Survey of Data Mining Activities in Distributed Systems

Abstract
Distributed systems, which can be used to perform computations, are being developed as a result of the rapid growth of resource sharing. Data mining, which has a wide range of real-world applications, provides important techniques for extracting meaningful and usable information from massive amounts of data. Traditional data mining methods, however, assume that the data is gathered centrally, fits in memory, and is static. Managing massive amounts of data and processing them with limited resources is difficult. Large volumes of data, for instance, are generated rapidly and stored in many locations, and it becomes increasingly costly to centralize them at a single site. Furthermore, traditional data mining methods typically suffer from several limitations, such as memory restrictions, limited processing power, and insufficient disk space. To overcome these issues, distributed data mining methods have emerged as a beneficial option in several applications. Drawing on the work of several authors, this paper surveys state-of-the-art distributed data mining methods, covering distributed frequent itemset mining, distributed frequent sequence mining, technical difficulties with distributed systems, distributed clustering, and privacy-preserving distributed data mining. Furthermore, each work is evaluated and compared with the others.