Synoptic Map-Pattern Classification Using Recursive Partitioning and Principal Component Analysis

Abstract
A method for classifying synoptic-scale maps into discrete groups is introduced. Tree-based recursive partitioning models are used to develop mappings between synoptic-scale circulation fields and the leading linear and nonlinear principal components (PCs) of weather elements observed at a surface station. Statistically unique but climatically insignificant patterns are avoided by identifying map patterns based on their association with indices related to local weather conditions. The method requires few user-adjustable parameters and includes an algorithm that provides objective guidance for determining the appropriate number of map patterns to retain. The classification method is demonstrated using daily sea level pressure and 500-hPa geopotential height maps from a domain covering British Columbia and the northeastern Pacific Ocean. The linear and nonlinear weather element PCs are derived from daily measurements of surface temperature, dewpoint temperature, cloud opacity, and u and v wind components taken at Vancouver, British Columbia. Classification performance is tested by applying the method to precipitation and air quality scenarios. Results are compared with those from unsupervised map-pattern classifications based on the k-means clustering algorithm. Recursive partitioning models using linear weather element PCs as targets performed better than the k-means algorithm. Recursive partitioning trees using nonlinear PCs as targets performed slightly worse than those using linear PCs as targets. Interestingly, trees using gridpoint circulation data as inputs outperformed trees that used truncated PCs of the circulation data as inputs. The poorer results were found not to stem from information lost when the PCs were truncated. Instead, the way information is encoded in principal component analysis (PCA) may be responsible for the poor classification performance of the recursive partitioning models using circulation PCs as inputs.
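The core idea of supervised map-pattern classification can be illustrated with a minimal sketch. It is not the implementation used in this study: it is a generic regression tree written in plain Python, the data are synthetic, and all names (`sse`, `best_split`, `classify_maps`, `max_depth`, `min_leaf`) are hypothetical. The input `X` stands in for gridpoint circulation values on each day, and `y` stands in for a single leading weather element PC. Each terminal node of the tree is treated as one map pattern.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y, rows, min_leaf):
    """Return (gain, gridpoint, threshold, left_rows, right_rows) or None."""
    parent = sse([y[i] for i in rows])
    best = None
    for j in range(len(X[0])):
        vals = sorted({X[i][j] for i in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2.0
            left = [i for i in rows if X[i][j] <= t]
            right = [i for i in rows if X[i][j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse([y[i] for i in left]) - sse([y[i] for i in right])
            if best is None or gain > best[0]:
                best = (gain, j, t, left, right)
    return best

def classify_maps(X, y, max_depth=2, min_leaf=10):
    """Grow a regression tree on y and label each day by its terminal node."""
    labels = [None] * len(X)
    next_id = [0]

    def grow(rows, depth):
        split = best_split(X, y, rows, min_leaf) if depth < max_depth else None
        if split is None or split[0] <= 0.0:
            # Terminal node: all days in it share one map-pattern label.
            for i in rows:
                labels[i] = next_id[0]
            next_id[0] += 1
            return
        _, _, _, left, right = split
        grow(left, depth + 1)
        grow(right, depth + 1)

    grow(list(range(len(X))), 0)
    return labels

# Synthetic stand-in data (assumption): 200 days, 4 gridpoints per map.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
# Hypothetical target index that depends on the sign of the first gridpoint.
y = [(2.0 if row[0] > 0.0 else -2.0) + random.gauss(0.0, 0.3) for row in X]

labels = classify_maps(X, y)
n_patterns = len(set(labels))
within = sum(sse([y[i] for i in range(len(y)) if labels[i] == k])
             for k in set(labels))
print(n_patterns, within < sse(y))
```

Because the splits are chosen to reduce the spread of the weather-related target rather than the spread of the maps themselves, the resulting groups are tied to local conditions by construction, which is the contrast with unsupervised k-means clustering drawn above. The fixed `max_depth` here is a simplification; it does not reproduce the objective guidance for choosing the number of patterns described in the abstract.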