A Multi-Variable Sentinel-2 Random Forest Machine Learning Model Approach to Predicting Perennial Ryegrass Biomass in Commercial Dairy Farms in Southeast Australia

Abstract
Milk is one of the most valuable and nutritionally essential agricultural commodities worldwide. The European Union and New Zealand are the second- and third-largest exporting regions of milk products and rely heavily on pasture-based production systems comparable to the Australian systems investigated in this study. With herd sizes projected to decline, increased milk yield must come from a combination of animal genetics and feed efficiencies. Accurate pasture biomass estimation across all seasons would improve feed efficiency and increase the productivity of dairy farms; however, existing pasture measurement methods are manual and time-consuming, which limits improvements in pasture utilisation. In this study, Sentinel-2 (S2) band and spectral index (SI) information was coupled with broad season and management-derived datasets in a Random Forest (RF) machine learning (ML) framework to develop a perennial ryegrass (PRG) biomass prediction model accurate to ±500 kg DM/ha and capable of predicting pasture yield above 3000 kg DM/ha. Measurements of PRG biomass were taken on 11 working dairy farms across southeastern Australia over 2019–2021. Of the 68 possible variables investigated, multiple simulations identified 12 S2 bands, 9 SIs, management, and season as the most important; the Short-Wave Infrared (SWIR) bands were the most influential in predicting pasture biomass above 4000 kg DM/ha. Conditional Latin Hypercube Sampling (cLHS) was used to split the dataset 80%/20% for model calibration and internal validation, respectively, and an entirely independent dataset was used for further validation. The combined internal model validation showed R² = 0.90, Lin's concordance correlation coefficient (LCCC) = 0.72, RMSE = 439.49 kg DM/ha, and NRMSE = 15.08; the combined independent validation showed R² = 0.88, LCCC = 0.68, RMSE = 457.05 kg DM/ha, and NRMSE = 19.83.
The key findings of this study indicate that S2 band and SI data are appropriate for making accurate estimates of PRG biomass. Furthermore, including the SWIR bands significantly improved the model. Finally, with an RF ML model, a single ‘global’ model can automate PRG biomass prediction with high accuracy across extensive regions, all seasons, and all types of farm management.
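For readers less familiar with the validation statistics reported above, the following is a minimal sketch of how R², LCCC, RMSE, and NRMSE could be computed from paired observed and predicted biomass values. It is illustrative only and is not taken from the study: the function name `validation_metrics` is hypothetical, R² is assumed to be the squared Pearson correlation, and NRMSE is assumed to be the RMSE expressed as a percentage of the mean observed value (the abstract does not state which normalisation was used).

```python
import math


def validation_metrics(observed, predicted):
    """Return R2, LCCC, RMSE and NRMSE for paired biomass values (kg DM/ha).

    Hypothetical helper for illustration; the definitions of R2 and NRMSE
    below are assumptions, not the study's documented choices.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n

    # Population variances and covariance of observed vs predicted values.
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n

    # Root mean square error, in the same units as the inputs.
    rmse = math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / n)

    # Assumption: R2 taken as the squared Pearson correlation coefficient.
    r2 = cov ** 2 / (var_o * var_p)

    # Lin's concordance correlation coefficient: penalises both scatter
    # and systematic departure from the 1:1 line.
    lccc = 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

    # Assumption: NRMSE normalised by the mean observed value, in percent.
    nrmse = 100 * rmse / mean_o

    return {"r2": r2, "lccc": lccc, "rmse": rmse, "nrmse": nrmse}


# Made-up example values, not data from the study.
example = validation_metrics([1000, 2000, 3000, 4000],
                             [1200, 1900, 3100, 3600])
```

Because LCCC accounts for bias as well as correlation, it can be noticeably lower than R² for the same predictions, which is consistent with the pattern in the values reported above.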
Funding Information
  • Dairy Australia
  • Gardiner Dairy Foundation
  • Agriculture Victoria