BAYESIAN LINEAR REGRESSION MODEL WITH APPLICATION TO FLIGHT DELAY DATA

Abstract
Bayesian linear regression is an approach to linear regression in which the statistical analysis is based on Bayesian inference. For big data, the Bayesian model takes summary statistics of the data as input: the summary statistics are calculated for each subset of the data, and the summary statistics for the full dataset are then obtained by summing those of the subsets. Recent developments in data science and research produce datasets that are too large to be analyzed as a whole because of limits on computer memory or storage capacity. To overcome this limitation, the R package BayesSummaryStatLM was introduced; it fits the Bayesian linear regression model using a Markov chain Monte Carlo implementation. The R package ff is used to read the data from large datasets while the summary statistics are calculated. In this study, the Bayesian linear regression model is used with several choices of prior distribution for the unknown model parameters and is illustrated on simulated data and on a real dataset of U.S. flight delays in 2008. For both the simulated data and the flight delay data, the plots of the density functions for the β parameters resemble the density of a Normal distribution, whereas the plot of the density function for the σ² parameter resembles the density of an Inverse Gamma distribution. For the simulated data, the estimate of each parameter is close to the specified parameter value (True Value), which is also reflected in the narrow credible interval for each parameter.
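The additivity of summary statistics across subsets can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce the API of BayesSummaryStatLM or ff. The function names `summary_stats` and `combine` are hypothetical, as is the toy data. The sketch also assumes that the summary statistics in question are the usual linear-regression quantities X'X, X'y, y'y and the row count n, which the abstract does not list explicitly.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]
Stats = Tuple[Matrix, Vector, float, int]  # (X'X, X'y, y'y, n)


def summary_stats(X: Matrix, y: Vector) -> Stats:
    """Compute X'X, X'y, y'y and n for one chunk of rows (hypothetical helper)."""
    p = len(X[0])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    yty = 0.0
    for row, yi in zip(X, y):
        for j in range(p):
            xty[j] += row[j] * yi
            for k in range(p):
                xtx[j][k] += row[j] * row[k]
        yty += yi * yi
    return xtx, xty, yty, len(y)


def combine(a: Stats, b: Stats) -> Stats:
    """Add the summary statistics of two chunks element by element."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty, a[2] + b[2], a[3] + b[3]


# Toy data (assumed for illustration): an intercept column plus one predictor.
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 7.0]]
y = [1.0, 2.0, 4.0, 6.0]

# Statistics computed on the full dataset in a single pass.
full = summary_stats(X, y)

# Statistics computed per chunk and then summed, as if each chunk
# had been read into memory separately.
chunk_a = summary_stats(X[:2], y[:2])
chunk_b = summary_stats(X[2:], y[2:])
combined = combine(chunk_a, chunk_b)

print(combined == full)
```

Because each statistic is a sum over rows, only the small accumulated matrices and vectors need to stay in memory between chunks, regardless of how many rows the full dataset contains.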