A Debiased Ranked Probability Skill Score to Evaluate Probabilistic Ensemble Forecasts with Small Ensemble Sizes

Open Access

15 May 2005

journal article
Published by American Meteorological Society in Journal of Climate

Vol. 18 (10), 1513-1523
https://doi.org/10.1175/jcli3361.1

Abstract

The ranked probability skill score (RPSS) is a widely used measure to quantify the skill of ensemble forecasts. The underlying score is defined by the quadratic norm and is comparable to the mean squared error (mse) but it is applied in probability space. It is sensitive to the shape and the shift of the predicted probability distributions. However, the RPSS shows a negative bias for ensemble systems with small ensemble size, as recently shown. Here, two strategies are explored to tackle this flaw of the RPSS. First, the RPSS is examined for different norms L (RPSS_L). It is shown that the RPSS_L₌₁ based on the absolute rather than the squared difference between forecasted and observed cumulative probability distribution is unbiased; RPSS_L defined with higher-order norms show a negative bias. However, the RPSS_L₌₁ is not strictly proper in a statistical sense. A second approach is then investigated, which is based on the quadratic norm but with sampling errors in climatological probabilities considered in the reference forecasts. This technique is based on strictly proper scores and results in an unbiased skill score, which is denoted as the debiased ranked probability skill score (RPSS_D) hereafter. Both newly defined skill scores are independent of the ensemble size, whereas the associated confidence intervals are a function of the ensemble size and the number of forecasts. The RPSS_L₌₁ and the RPSS_D are then applied to the winter mean [December–January–February (DJF)] near-surface temperature predictions of the ECMWF Seasonal Forecast System 2. The overall structures of the RPSS_L₌₁ and the RPSS_D are more consistent and largely independent of the ensemble size, unlike the RPSS_L₌₂. Furthermore, the minimum ensemble size required to predict a climate anomaly given a known signal-to-noise ratio is determined by employing the new skill scores. For a hypothetical setup comparable to the ECMWF hindcast system (40 members and 15 hindcast years), statistically significant skill scores were only found for a signal-to-noise ratio larger than ∼0.3.
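
The scores named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: all function names are hypothetical, the three equiprobable categories are an assumption, and the debiased variant is only one plausible reading of "sampling errors in climatological probabilities considered in the reference forecasts", namely scoring reference ensembles of the same size M that are drawn at random from climatology.

```python
import numpy as np


def rps(forecast_probs, obs_category, norm=2):
    """Ranked probability score of one forecast, generalized to norm L.

    norm=2 gives the usual quadratic score; norm=1 uses absolute differences.
    """
    cum_fc = np.cumsum(np.asarray(forecast_probs, dtype=float))
    # Cumulative observation vector: 0 below the observed category, 1 from it on.
    cum_ob = (np.arange(cum_fc.size) >= obs_category).astype(float)
    return float(np.sum(np.abs(cum_fc - cum_ob) ** norm))


def ensemble_probs(member_categories, n_categories):
    """Category probabilities estimated as the fraction of ensemble members."""
    counts = np.bincount(member_categories, minlength=n_categories)
    return counts / counts.sum()


def rpss(fc_probs, obs, clim_probs, norm=2):
    """Skill score against the exact climatological probabilities."""
    rps_fc = np.mean([rps(p, o, norm) for p, o in zip(fc_probs, obs)])
    rps_cl = np.mean([rps(clim_probs, o, norm) for o in obs])
    return 1.0 - rps_fc / rps_cl


def rpss_debiased(fc_probs, obs, clim_probs, ens_size, n_resamples=200, seed=0):
    """Quadratic skill score against resampled climatological ensembles.

    Assumption: the reference forecast is an ens_size-member ensemble drawn
    from climatology, so it carries the same sampling error as the forecast.
    """
    rng = np.random.default_rng(seed)
    k = len(clim_probs)
    rps_fc = np.mean([rps(p, o) for p, o in zip(fc_probs, obs)])
    ref_scores = []
    for o in obs:
        counts = rng.multinomial(ens_size, clim_probs, size=n_resamples)
        cum_ref = np.cumsum(counts, axis=1) / ens_size
        cum_ob = (np.arange(k) >= o).astype(float)
        ref_scores.append(np.mean(np.sum((cum_ref - cum_ob) ** 2, axis=1)))
    return 1.0 - rps_fc / np.mean(ref_scores)


def skill_of_climatological_ensemble(ens_size=5, n_forecasts=2000, seed=1):
    """Score forecasts with no real skill: members and observations are both
    drawn independently from the same climatology."""
    rng = np.random.default_rng(seed)
    clim = np.array([1 / 3, 1 / 3, 1 / 3])  # assumed equiprobable terciles
    obs = rng.choice(3, size=n_forecasts, p=clim)
    members = rng.choice(3, size=(n_forecasts, ens_size), p=clim)
    fc = [ensemble_probs(m, 3) for m in members]
    return {
        "rpss_l2": rpss(fc, obs, clim, norm=2),
        "rpss_l1": rpss(fc, obs, clim, norm=1),
        "rpss_d": rpss_debiased(fc, obs, clim, ens_size),
    }
```

Calling `skill_of_climatological_ensemble()` scores forecasts that have no skill by construction. In this setup the quadratic score is expected to come out clearly negative (about -1/M in expectation), while the L = 1 and resampled-reference variants should stay close to zero, which is the behaviour the abstract describes.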
