Abstract
Building operation data are important for monitoring, analyzing, modeling, and controlling building energy systems. However, missing data is one of the major data quality issues, which makes data imputation techniques increasingly important. Two key research gaps remain for missing sensor data imputation in buildings: the lack of a customized and automated imputation methodology, and the difficulty of validating data imputation methods. This paper develops a framework to address both gaps. First, a validation data generation module based on pattern recognition creates a validation dataset for quantifying the performance of data imputation methods. Second, a pool of data imputation methods is tested on the validation dataset to find the optimal single imputation method for each sensor; the resulting per-sensor combination is termed an ensemble method. The method can reflect the specific mechanism and randomness of missing data for each sensor. The effectiveness of the framework is demonstrated on 18 sensors from a real campus building. The overall accuracy of data imputation for those sensors improves by 18.2% on average compared with the best single data imputation method.
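
The per-sensor selection step can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the three imputers, the function names (`mask_random`, `select_methods`), the RMSE metric, and the 20% masking fraction are all assumptions chosen for illustration. In particular, the sketch hides observed points uniformly at random, whereas the framework generates validation gaps from recognized missing-data patterns.

```python
import math
import random


def impute_mean(series):
    """Fill gaps (None) with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def impute_forward_fill(series):
    """Fill gaps with the last observed value (leading gaps use the first one)."""
    last = next(v for v in series if v is not None)
    filled = []
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def impute_linear(series):
    """Fill gaps by linear interpolation; edge gaps take the nearest observed value."""
    idx = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in idx if j < i), default=None)
        right = min((j for j in idx if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


# Hypothetical pool of candidate methods; a real pool would be larger.
METHODS = {"mean": impute_mean, "ffill": impute_forward_fill, "linear": impute_linear}


def mask_random(series, fraction, rng):
    """Hide a random fraction of observed points (assumed stand-in for pattern-based masking)."""
    observed = [i for i, v in enumerate(series) if v is not None]
    hidden = set(rng.sample(observed, max(1, int(fraction * len(observed)))))
    masked = [None if i in hidden else v for i, v in enumerate(series)]
    return masked, sorted(hidden)


def rmse(truth, estimate, indices):
    """Root-mean-square error evaluated only at the artificially hidden points."""
    return math.sqrt(sum((truth[i] - estimate[i]) ** 2 for i in indices) / len(indices))


def select_methods(sensors, fraction=0.2, seed=0):
    """Return, for each sensor, the name of the method with the lowest validation RMSE."""
    rng = random.Random(seed)
    choice = {}
    for name, series in sensors.items():
        masked, hidden = mask_random(series, fraction, rng)
        scores = {m: rmse(series, f(masked), hidden) for m, f in METHODS.items()}
        choice[name] = min(scores, key=scores.get)
    return choice
```

Calling `select_methods` with a dictionary that maps sensor names to lists of readings (with `None` marking real gaps) returns one method name per sensor. Applying each sensor's selected method to its real gaps gives the per-sensor combination that the abstract calls an ensemble method.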