Title: Probabilistic Weather Forecasting Using Bayesian Model Averaging
1Probabilistic Weather Forecasting Using
Bayesian Model Averaging
- J. McLean Sloughter
- Adviser Tilmann Gneiting
- GSR Susan Joslyn
- Committee members Adrian Raftery Cliff Mass
- 8 May, 2009
This work was supported by MURI, JEFS, NSF
grants
2- Background
- Motivation
- Ensemble forecasting
- Bayesian model averaging
- Dissertation outline
- BMA for vector wind
- Data
- Decomposing the problem
- Bias-correction
- Error distributions
- Model
- Results
- Future directions
- References
- Acknowledgements
3Why probabilistic forecasting?
- Situations where certain ranges or thresholds are
of interest - Situations where knowing not just the most likely
outcome, but possible extremes are important - Situations involving a cost / loss analysis,
where probabilities of different outcomes need to
be known - Examples
- Wind energy
- Military
- Sailing
- Airports
- Winter road maintenance
4Ensemble Forecasting
48-hour forecasts for maximum wind speeds on 7
August 2003
5Ensemble Forecasting
- Single forecast model is run multiple times with
different initial conditions - Forecasts created on a 12-km grid, and bilinearly
interpolated to locations of interest - Ensemble mean tends to outperform individual
members - Spread-skill relationship spread of forecasts
tends to be correlated with magnitude of error
6Ensemble Forecasting
- Would like the ensemble to look like draws from
the same distribution as the observed values - Ensemble only captures one source of variability
uncertainty in initial conditions - Ensemble distribution is underdispersed relative
to observed values - Ensemble members agree with one another more than
they agree with observations
7Bayesian model averaging (BMA)
- Weighted average of multiple component models
- One component per ensemble member
- Each component a distribution of observed value
conditioned on an ensemble member forecast - Model fit based on training data past sets of
forecasts / observations - Use a sliding window of training data
- Weights determined by how well each member fits
the training data
8Bayesian model averaging
where
is the deterministic forecast from member k,
is the weight associated with member k, and
is the estimated density function for y given
member k
9- Background
- Motivation
- Ensemble forecasting
- Bayesian model averaging
- Dissertation outline
- BMA for vector wind
- Data
- Decomposing the problem
- Bias-correction
- Error distributions
- Model
- Results
- Future directions
- References
- Acknowledgements
10Dissertation Outline
- Precipitation forecasting
- Sloughter et al., 2007, MWR
- Extends BMA to a specific case of skewed and
non-continuous distributions - Wind speed forecasting
- Sloughter et al., 2009, JASA
- Extends methods of Sloughter et al. (2007) to
other forms of skewed and non-continuous
distributions - Examines robustness of BMA to details of model
selection - Vector wind forecasting
- This talk
- Extends BMA to multivariate distributions
11- Background
- Motivation
- Ensemble forecasting
- Bayesian model averaging
- Dissertation outline
- BMA for vector wind
- Data
- Decomposing the problem
- Bias-correction
- Error distributions
- Model
- Results
- Future directions
- References
- Acknowledgements
12BMA for vector wind
- Methods exist for using Bayesian Model Averaging
to create probabilistic forecasts for weather
quantities that can be expressed as a mixture of
normals (Raftery et al., 2005), such as
temperature and pressure. - Expanded to be applied to non-continuous and
skewed quantities such as precipitation and wind
speed in Sloughter et al. 2007, Sloughter et al.
2009. - A method is needed for modeling multivariate
quantites such as wind vectors.
13Knot
- A knot is a measure of speed used in nautical,
meteorological, and aviation settings - 1.852 kilometers per hour
- 1.151 miles per hour
- 0.514 meters per second
- Sailors would throw out the chip log (a board
designed to stay stationary in water) tied to a
rope with knots spaced 7 fathoms (42 feet) apart - They would then count how many knots were fed out
in 30 seconds
14Knot
- 4-6 knots is a light breeze leaves move, breeze
can be felt on ones face - 11-16 knots is a moderate breeze dust and paper
will be blown about, whitecaps will form on the
water - 20-21 knots is generally the threshold for
issuing a small craft advisory - 34-40 knots is a gale small branches break from
trees, walking becomes difficult
15Data
- This work uses wind data from the Pacific
Northwest for the full year 2003, plus November
and December 2002 (results for 2003 data, 2002
used only for training) - Instantaneous vector wind measurements
- Measured in knots
- Each forecast consists of 8 ensemble members
- Data were available for 343 days, missing for 83
days - A total of 38091 observations, averaging 111
observations per day - All work that follows is based on 30-day training
periods, with 2-day-ahead forecasting
16Data
- Data from Surface Airway Observation stations
- Airports in BC, Washington, Oregon, Idaho, and
California
17Decomposing the problem
- Wind has two dimensions, east/west direction and
north/south direction - BMA uses a mixture distribution with one
component per ensemble member - Consider each mixture component a bivariate
distribution parameterized in terms of a mean
vector and a covariance matrix - Assume that the mean of the distribution is some
function of the forecast vector, and that the
covariance matrix does not depend upon the
forecast (exploratory plots support these
assumptions)
18Decomposing the problem
- h(fk) is the mean (a bias-corrected forecast)
- BV(0, Q) is the distribution of the forecast
error - Model the distribution of the errors rather than
the observed values - Has the advantage of having constant parameters
across forecast values - Can then be decomposed into two separate
problems - bias-correcting the forecast
- modeling the error distribution
19Bias-correction
- For simplicity, consider affine bias corrections
- Two potential forms of bivariate bias correction
- Additive bias-correction
- Full affine bias-correction
- Where Y is the observed wind vector, fk is the
kth vector forecast, ak is an additive bias
vector, and Bk is a transformation matrix
20Bias-correction
Bivariate root mean squared error (in knots) for
one ensemble member
- Out-of-sample results using 30-day training
period - Similar results hold for other ensemble members
- Affine bias-correction shows a marked improvement
21Error distributions
- Now deal with the error field (observations minus
bias-corrected forecasts) - Exploratory work suggests that the distributions
are ellipsoidal, but have heavier tails than
normal distributions - Transform the error vector (rkcosqk, rksinqk)T by
raising the magnitude of the vector to the 4/5
power while preserving the angle - Model this as a bivariate normal distribution
22Model
- Thus, our final model is
- Where the gk are the distributions on y implied
by the distributions of the transformed error
vectors - Model parameters are estimated globally using all
observation locations
23Model
- Bias-correction fit via linear regression
(separate bias correction for each mixture
component) - Weights and covariance matrix estimated via
maximum likelihood using the EM algorithm - Use latent variables zkst which are indicators
that forecast k was the best forecast at station
s at time t
24Model
25Model
26Results
- We simulate a large number of forecasts from our
distribution - Can evaluate the forecast of either the wind
vector or derived quantities (marginal speed or
direction) from the empirical distribution of our
forecasts - Essentially creating a new, larger ensemble of
forecasts that should be better-calibrated than
the original ensemble
27Example
- To illustrate what the BMA distribution is doing,
consider the case of forecasting at Omak,
Washington on February 4th, 2003
28(No Transcript)
29Results
- Our goal is to maximize sharpness subject to
calibration (the Gneiting principle) - By calibration, we mean that we want our
probability distribution function to be correct
if we forecast an event as happening with
probability .9, we want it to happen 90 of the
time - By sharpness, we mean that we want predictive
intervals to be as narrow as possible
30Results
- For univariate quantities, the verification rank
histogram is a tool that can be used to assess
the calibration of an ensemble forecast - Find the rank of each forecast relative to the
ensemble members - If the ensemble is properly calibrated, the
observation and forecasts should be
interchangeable - If so, each potential rank of the forecast should
have equal probability - Thus, a histogram of the ranks should look flat
31Results
- For multivariate quantities, there is an
analogous multivariate rank histogram (MVRH),
again based on the assumption of exchangeability - Define if and only if in every dimension
- For each member of the combined set of the
observation and the forecasts, find the pre-rank -
- The multivariate rank is the rank of the
observation pre-rank, with any ties resolved at
random - If we have a set of 8 forecasts and 1
observation, there are 9 possible rankings of the
observation relative to the forecasts
32Results
- MVRH for the raw ensemble (left) and BMA forecast
distribution (right) - Raw ensemble is under-dispersed
- BMA forecast distribution is much
better-calibrated
33Results
- The energy score (ES) is a scoring rule for
multivariate probabilistic forecasts that takes
into account both calibration and sharpness - In the univariate case, it reduces to the
continuous ranked probability score (CRPS) - P is the predictive distribution, x the observed
wind vector, X and X independent random
variables with distribution P
34Results
- There may still be interest in a point forecast
as well - We can use the spatial median as a point forecast
- We can assess the quality of a multivariate point
forecast using the multivariate mean absolute
error (MMAE)
35Results
- BMA outperforms climatology and the raw ensemble
both in terms of the probabilistic forecast and
the deterministic forecast
36Results marginal speed and direction
- Again consider verification rank histograms to
assess calibration - Both speed (top) and direction (bottom) are much
improved by BMA
37Results marginal speed and direction
- CRPS is the scalar equivalent of the energy score
- DCRPS is the angular equivalent
- Scalar point forecasts can be assessed by the
MAE, and angular point forecasts by the mean
directional error (MDE) - Can also look at coverage and width of 77.8
prediction intervals for scalar forecasts
coverage assesses calibration, width assesses
sharpness
38Results marginal speed and direction
- Wind speed
- Wind direction
39Results marginal speed and direction
- We can see that for both speed and direction, BMA
improves the quality of both the probabilistic
and deterministic forecasts - BMA produces marginal distributions that are
better-calibrated than the raw ensemble and
sharper than climatology
40- Background
- Motivation
- Ensemble forecasting
- Bayesian model averaging
- Dissertation outline
- BMA for vector wind
- Data
- Decomposing the problem
- Bias-correction
- Error distributions
- Model
- Results
- Future directions
- References
- Acknowledgements
41Future Directions
- Develop a BMA method to explicitly model marginal
instantaneous wind speed and compare to the
performance of the forecasts from this model
(current BMA for marginal wind speed is for
maximum wind speeds, not instantaneous) - Incorporate spatial information, either through
explicitly modeling some spatial structure to our
parameters or by estimating parameters locally
rather than globally - Investigate using an exponential forgetting for
training data rather than a sliding window, which
could allow for faster computation through the
use of updating formulae for parameter estimates - Extend multivariate methods to jointly forecast
multiple weather quantities simultaneously
42References
- Raftery, A.E., Gneiting, T., Balabdaoui, F. and
Polakowski, M. (2005). Using Bayesian Model
Averaging to calibrate forecast ensembles.
Monthly Weather Review, 133, 1155-1174. - Sloughter, J. M., Raftery, A. E., Gneiting, T.
and Fraley, C. (2007). Probabilistic quantitative
precipitation forecasting using Bayesian model
averaging. Monthly Weather Review, 135,
3209-3220. - Sloughter, J. M., Gneiting, T., and Raftery, A.E.
(2009). Probabilistic wind speed forecasting
using ensembles and Bayesian model averaging.
Journal of the American Statistical Association,
accepted. - Mass, C., Joslyn, S., Pyle, P., Tewson, P.,
Gneiting, T., Raftery, A., Baars, J., Sloughter,
J. M., Jones, D., and Fraley, C. (2009).
PROBCAST A web-based portal to mesoscale
probabilistic forecasts. Bulletin of the American
Meteorological Society, in press. - http//probcast.com
43Acknowledgements
- Committee
- Tilmann Gneiting - adviser
- Adrian Raftery, Cliff Mass - committee members
- Susan Joslyn - GSR
- Statistics folks
- Veronica Berrocal, Chris Fraley, Thordis
Thorarinsdottir, Will Kleiber, Larissa Stanberry,
Matt Johnson, Robert Yuen, Michael Polakowski,
Nicholas Johnson - Atmospheric Sciences folks
- Jeff Baars, Eric Grimit, Jeff Thomason, Tony
Eckel - APL folks
- Patrick Tewson, John Pyle, David Jones, Janet
Olsonbaker, Scott Sandgathe - Psychology folks
- Limor Nadav-Greenberg, Buzz Hunt, Queena Chen,
Jared Le Clerc, Rebecca Nichols, Sonia Savelli