A Spark-Based Parallel Fuzzy $c$-Means Segmentation Algorithm for Agricultural Image Big Data

Abstract
With the explosive growth of image big data in agriculture, image segmentation algorithms face unprecedented challenges. The fuzzy C-means (FCM) algorithm, one of the most important image segmentation techniques, is widely used for agricultural image segmentation because it is computationally simple and produces high-quality segmentation. On large data sets, however, the total amount of computation is so great that the sequential FCM algorithm cannot finish the segmentation task within an acceptable time. This paper proposes a parallel FCM segmentation algorithm for agricultural image big data, built on the distributed in-memory computing platform Apache Spark. The input image is first converted from the RGB color space to the Lab color space, and point cloud data are generated from it. The point cloud data are then partitioned and stored on different computing nodes, where the membership degrees of the pixel points to the cluster centers are calculated and the cluster centers are updated iteratively in a data-parallel manner until the stopping condition is satisfied. Finally, the clustered point cloud data are restored to reconstruct the segmented image. Evaluated on the Spark platform, the parallel FCM algorithm reaches an average speedup of 12.54 on 10 computing nodes. Experimental results show that the Spark-based parallel FCM algorithm achieves a significant speedup and, on the agricultural image test set, delivers a 128% performance improvement over the Hadoop-based approach. These results indicate that the Spark-based parallel FCM algorithm segments agricultural image big data faster and has better scaleup and sizeup rates.
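The data-parallel iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python lists in place of Spark partitions, assumes the points are already Lab triples, uses the textbook FCM update rules with fuzzifier m = 2, and assumes the stopping condition is a maximum center shift below a threshold `eps`. All function names here are hypothetical.

```python
import math
from functools import reduce

def memberships(point, centers, m=2.0):
    """Textbook FCM membership degrees of one point to every center."""
    dists = [math.dist(point, c) for c in centers]
    for j, d in enumerate(dists):
        if d == 0.0:  # point coincides with a center: full membership there
            return [1.0 if k == j else 0.0 for k in range(len(centers))]
    exp = 2.0 / (m - 1.0)
    inv = [d ** -exp for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

def partial_sums(partition, centers, m=2.0):
    """Map step: weighted coordinate sums and weight sums for one partition."""
    dim = len(centers[0])
    num = [[0.0] * dim for _ in centers]
    den = [0.0] * len(centers)
    for p in partition:
        for j, u in enumerate(memberships(p, centers, m)):
            w = u ** m
            den[j] += w
            for d in range(dim):
                num[j][d] += w * p[d]
    return num, den

def merge(a, b):
    """Reduce step: element-wise sum of two partial results."""
    num = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    den = [x + y for x, y in zip(a[1], b[1])]
    return num, den

def parallel_fcm(partitions, centers, m=2.0, eps=1e-4, max_iter=100):
    """Iterate map/reduce center updates until the centers stop moving.

    The stopping rule (max center shift < eps) is an assumption of this sketch.
    """
    for _ in range(max_iter):
        num, den = reduce(merge, (partial_sums(p, centers, m) for p in partitions))
        new_centers = [
            [v / d for v in row] if d > 0.0 else list(old)  # keep empty clusters
            for row, d, old in zip(num, den, centers)
        ]
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return centers

def hard_labels(partition, centers, m=2.0):
    """Assign each point to its highest-membership cluster for reconstruction."""
    labels = []
    for p in partition:
        u = memberships(p, centers, m)
        labels.append(u.index(max(u)))
    return labels

# Toy data: two small groups of Lab-like points, one group per partition.
part_a = [(19.0, 0.0, 0.0), (21.0, 0.0, 0.0), (20.0, 1.0, 0.0), (20.0, -1.0, 0.0)]
part_b = [(79.0, 10.0, 10.0), (81.0, 10.0, 10.0), (80.0, 11.0, 10.0), (80.0, 9.0, 10.0)]
initial = [[30.0, 0.0, 0.0], [70.0, 5.0, 5.0]]
centers = parallel_fcm([part_a, part_b], initial)
```

Only the small per-cluster sums leave each partition, so the cost of combining results does not depend on the number of pixels. On Spark, `partial_sums` would plausibly run inside a per-partition map and `merge` inside a reduce, with the current centers shared to all workers on each iteration.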
Funding Information
  • National Natural Science Foundation of China (61602388)
  • Natural Science Basic Research Plan in Shaanxi Province of China (2017JM6059)
  • China Postdoctoral Science Foundation (2017M613216)
  • Shaanxi Province Postdoctoral Science Foundation (2016BSHEDZZ121)
  • Natural Science Foundation of Hubei Province (2017CFB592)
  • Fundamental Research Funds for the Central Universities (2452016081, 2452015194)