Optimization of k value and lag parameter of k-nearest neighbor algorithm on the prediction of hotel occupancy rates

Abstract
Hotel occupancy rates are the most important factor in hotel business management. Prediction of the rates for the next few months determines the manager's decision to arrange and provide all the needed facilities. This study performs the optimization of lag parameters and k values of the k-Nearest Neighbor algorithm on hotel occupancy history data. Historical data were arranged in the form of supervised training data, with the number of columns per row according to the lag parameter and the number of prediction targets. The kNN algorithm was applied using 10-fold cross-validation and k-value variations from 1-30. The optimal lag was obtained at intervals of 14-17 and the optimal k at intervals of 5-13 to predict occupancy rates of 1, 3, 6, 9, and 12 months later. The obtained k-value does not follow the rule at the square root of the number of sample data.
Funding Information
  • Universitas Islam Nahdlatul Ulama Jepara (13/SP3R/LPPM/UNISNU/IV/2019)