It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena


Provided by: mcle71

Learn more at: http://www.stat.cmu.edu

1
It Never Rains But It Pours: Modeling Mixed
Discrete-Continuous Weather Phenomena

J. McLean Sloughter

Based on research being conducted under Adrian E.
Raftery and Tilmann Gneiting
This work was supported by the DoD
Multidisciplinary University Research Initiative
(MURI) program administered by the Office of
Naval Research under Grant N00014-01-10745.
2
Ensemble Forecasting

Single forecast model is run multiple times with
different initial conditions
Ensemble mean tends to outperform individual
members
Spread-skill relationship: spread of forecasts
tends to be correlated with magnitude of error
Model is underdispersive (not calibrated)

3
Spread/Skill Plot
Spread = max forecast - min forecast
Skill = abs(forecast mean - observed)
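
As a rough illustration, the two quantities on this plot can be computed for a single forecast case as below. This is a minimal sketch that reads the slide as spread = max minus min of the member forecasts and skill = absolute error of the ensemble mean; the function name and the sample numbers are made up for the example.

```python
from statistics import mean

def spread_and_skill(member_forecasts, observed):
    """Return (spread, skill) for one forecast case.

    spread: range of the ensemble member forecasts
    skill:  absolute error of the ensemble mean
    """
    spread = max(member_forecasts) - min(member_forecasts)
    skill = abs(mean(member_forecasts) - observed)
    return spread, skill

# Hypothetical eight-member ensemble and its verifying observation.
forecasts = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0]
spread, skill = spread_and_skill(forecasts, observed=8.0)
print(spread, skill)
```

Plotting skill against spread over many cases would give a picture like the one on this slide.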
4
Ensemble Member Forecasts
48-hour forecasts for precipitation at 5pm Oct 20, 2003.
From http://www.atmos.washington.edu/emm5rt/ensemble.cgi
5
Bayesian Model Averaging

Weighted average of multiple models
Weights determined by posterior probabilities of
models
Posterior probabilities given by how well each
member fits the training data
Weights, then, give an indication of the relative
usefulness of ensemble members

6
BMA for ensembles
$$p(y \mid f_1, \dots, f_K) = \sum_{i=1}^{K} w_i \, g_i(y \mid f_i)$$
where
$f_i$ is the forecast from member i,
$w_i$ is the weight associated with member i, and
$g_i(y \mid f_i)$ is the estimated distribution function for Y
given member i
Picture taken from Raftery, Balabdaoui, Gneiting,
and Polakowski (2003), Calibrated Mesoscale
Short-Range Ensemble Forecasting Using
Bayesian Model Averaging.
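
To make the weighted-mixture idea concrete, here is a small sketch that evaluates a BMA predictive density and its mean. It assumes each member's distribution is a normal centred on that member's forecast with a shared standard deviation; that choice, the function names, and the numbers are illustrative assumptions, not taken from the slides.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bma_density(y, forecasts, weights, sd):
    """Weighted mixture of normal densities, one centred on each member forecast."""
    return sum(w * normal_pdf(y, f, sd) for w, f in zip(weights, forecasts))

def bma_mean(forecasts, weights):
    """Mean of the mixture: the weighted average of the member forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))

# Hypothetical three-member ensemble; weights must sum to one.
forecasts = [10.0, 12.0, 15.0]
weights = [0.5, 0.3, 0.2]
print(bma_mean(forecasts, weights))
print(bma_density(11.0, forecasts, weights, sd=2.0))
```

A member with a larger weight pulls both the predictive mean and the bulk of the density toward its own forecast.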
7
The Trouble With Our Models

Forecasts never predict zero: an artifact of the
differential equations used to create forecasts
Observed wind speed is often zero
Wind speed, even ignoring zeroes, is not normally
distributed

8
What Wind Speed Looks Like
Wind Speed Histogram: several exceptionally
high values make it harder to see clearly
Wind Speed Histogram truncated to only go up to
fifty: there is a spike at zero
Wind Speed Histogram without zeroes
9
What Forecasts Look Like
Forecast histogram on left (all eight forecasts
have similar histograms) and observed histogram
on right. Even after removing zeroes from the
actual histogram, the shape is still not quite
right: actual is more sharply skewed.
10
The Problem With Reality

As we saw, the histograms for forecasts do not
match the histogram for observations very well
Maximal observed value is 124.000
Maximal values for each model:

AVN     CMCG    Eta     GASP    JMA     NGPS    TCWB    UKMO
44.591  54.566  44.722  51.732  51.829  45.383  45.125  49.243
11
More Trouble With Reality
      AVN    CMCG   Eta    GASP   JMA    NGPS   TCWB   UKMO   Y
AVN   1.000  0.826  0.845  0.797  0.795  0.783  0.731  0.840  0.417
CMCG  0.826  1.000  0.822  0.807  0.819  0.797  0.770  0.805  0.402
Eta   0.845  0.822  1.000  0.789  0.793  0.781  0.747  0.800  0.406
GASP  0.797  0.807  0.789  1.000  0.788  0.779  0.747  0.786  0.394
JMA   0.795  0.819  0.793  0.788  1.000  0.788  0.756  0.783  0.400
NGPS  0.783  0.797  0.781  0.779  0.788  1.000  0.757  0.779  0.388
TCWB  0.731  0.770  0.747  0.747  0.756  0.757  1.000  0.721  0.384
UKMO  0.840  0.805  0.800  0.786  0.783  0.779  0.721  1.000  0.415
Y     0.417  0.402  0.406  0.394  0.400  0.388  0.384  0.415  1.000
Pairwise correlations: Y is observed value,
others are the various forecasts
12
(No Transcript)
13
Time Trends
Left: average observed wind speed per day. Right:
same, but smoothed to average over 3-week
interval
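
The smoothing in the right-hand panel could be done with a simple moving average along these lines. This sketch assumes a centred 21-day window that is truncated at the ends of the series; the slides do not say how the 3-week average was actually aligned or padded.

```python
def moving_average(values, window=21):
    """Centred moving average; windows are truncated at the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical daily averages, smoothed with a short window for display.
daily = [1, 2, 3, 4, 5]
print(moving_average(daily, window=3))
```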
14
Time Trend Troubles

Higher winds in summer, lower in winter (note
that this appears to be an odd trait of the
northwest)
Need model to reflect seasonal patterns
Would still like to just have a simple model
based on forecasts

15
A Recap Of All The Things That Make Life
Interesting and Miserable

Distribution not normal
Time
Forecasts not very highly correlated with
observations
Zeroes

16
What to do about distributions?

Model using another distribution: Gammas and
Weibulls are popular models for windspeed
Can apply a transformation to the data

Left: root of forecast windspeed from model 1.
Right: root of observed windspeed (excluding
zeroes for easier visualization)
17
What to do about time?

Rather than using all available data as training
set, only train on recent data
Trade-off between lower variance of estimates
with more data, and better picture of current
trends with less data
Previous research (on temperature and pressure)
indicates 40-day window of training data seems to
give a good balance
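
A sliding training window of this kind might be selected as follows. The sketch assumes days are indexed 0, 1, 2, ... and that the window covers the days immediately before the forecast day; the helper name is hypothetical.

```python
def training_window(forecast_day, window=40):
    """Indices of the days used for training when forecasting `forecast_day`.

    Uses the `window` days immediately before the forecast day, or fewer
    if the record does not go back that far.
    """
    start = max(0, forecast_day - window)
    return list(range(start, forecast_day))

print(training_window(100)[:3], len(training_window(100)))
```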

18
What to do about the forecasts?

We're not making the forecasts, just using them
We can apply a bias-correction by performing a
linear regression of observed on forecasts (this
is commonly done in forecasting already)
We can see from our weight terms which models
perform better and which perform worse, and
report that to the folks making the forecasts
We can hope that the science of meteorology
continues to move forward as it has thus far
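
The bias correction mentioned above can be sketched as an ordinary least-squares fit of observed on forecast for a single ensemble member. The training numbers below are invented so that the fit is exact; whether the original work regressed on each member separately is an assumption here.

```python
def fit_bias_correction(forecasts, observations):
    """Least-squares fit of observed = a + b * forecast; returns (a, b)."""
    n = len(forecasts)
    mean_f = sum(forecasts) / n
    mean_o = sum(observations) / n
    sxx = sum((f - mean_f) ** 2 for f in forecasts)
    sxy = sum((f - mean_f) * (o - mean_o)
              for f, o in zip(forecasts, observations))
    b = sxy / sxx
    a = mean_o - b * mean_f
    return a, b

# Hypothetical training pairs lying exactly on observed = 1 + 0.5 * forecast.
train_forecasts = [2.0, 4.0, 6.0, 8.0]
train_observed = [2.0, 3.0, 4.0, 5.0]
a, b = fit_bias_correction(train_forecasts, train_observed)

# Bias-corrected version of a new raw forecast of 10.0.
corrected = a + b * 10.0
print(a, b, corrected)
```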

19
What to do about zeroes?

We need a model that includes a point mass at
zero
Two main possibilities:
We could model a weighted average of eight
distributions, each of which is a normal plus a
point mass at zero
Or, we could first model probability of zero or
non-zero, then, conditioned on non-zero, the
weighted average of eight normals
We will pursue the second option for now
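
Written out, the two possibilities might look like the following. The notation is an assumption made for this sketch: $\delta_0$ is a point mass at zero, $\pi$ (or $\pi_i$) is a probability of zero, $w_i$ are mixture weights, and $\mu_i$, $\sigma_i^2$ are the mean and variance of the normal for member $i$.

```latex
% Option 1: each member's component carries its own point mass at zero
p(y) = \sum_{i=1}^{8} w_i \left[ \pi_i \, \delta_0(y)
       + (1 - \pi_i) \, \mathcal{N}(y \mid \mu_i, \sigma_i^2) \right]

% Option 2: one probability of zero, then a mixture conditioned on non-zero
p(y) = \pi \, \delta_0(y)
       + (1 - \pi) \sum_{i=1}^{8} w_i \, \mathcal{N}(y \mid \mu_i, \sigma_i^2)
```

In the second form the zero/non-zero part and the mixture part can be fitted separately, which is what makes it the simpler one to pursue first.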

20
So, let's get to it then

Probability of zero can be modeled by a logistic
regression on the eight forecasts
Then, the weighted average of normals can be
determined by the EM algorithm
Assume each normal has the bias-corrected
forecast as its mean, and has a constant variance
Alternate between predicting membership based on
weights and variances, and weights and variances
based on membership
Make sure to also include probability of being
non-zero when evaluating our functions
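
The alternating scheme above might look roughly like this for the non-zero observations. This is a sketch under stated assumptions rather than the code behind the slides: component means are held fixed at the bias-corrected forecasts, each component has its own sd that is constant over cases, the zero/non-zero probability is handled separately, and there are no safeguards against a component receiving zero mass.

```python
import math

def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def em_mixture(obs, member_means, max_iter=500, tol=1e-8):
    """EM for a normal mixture with fixed, case-specific component means.

    obs[t]             non-zero observation for case t
    member_means[t][j] bias-corrected forecast of member j for case t
    Returns (weights, sds, iterations_used).
    """
    n, k = len(obs), len(member_means[0])
    weights = [1.0 / k] * k
    # Start each sd at the root mean squared error of that member.
    sds = [math.sqrt(sum((y - ms[j]) ** 2
                         for y, ms in zip(obs, member_means)) / n)
           for j in range(k)]
    for iteration in range(1, max_iter + 1):
        # E-step: membership probabilities from current weights and sds.
        resp = []
        for y, ms in zip(obs, member_means):
            dens = [weights[j] * normal_pdf(y, ms[j], sds[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weights and sds from the membership probabilities.
        mass = [sum(r[j] for r in resp) for j in range(k)]
        new_weights = [m / n for m in mass]
        new_sds = [math.sqrt(sum(r[j] * (y - ms[j]) ** 2
                                 for y, ms, r in zip(obs, member_means, resp))
                             / mass[j])
                   for j in range(k)]
        change = max(abs(a - b) for a, b in zip(weights, new_weights))
        weights, sds = new_weights, new_sds
        if change < tol:
            break
    return weights, sds, iteration
```

The returned iteration count is the kind of quantity one would track when comparing how quickly different model variants converge.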

21
Let's try out a simple test case first

Generate a sample of 100,000 ordered triples
(x1, x2, y), x1 uniform over 30 to 50, x2 uniform
over 10 to 20
logistic regression coefficients of a = 10, b1 = -.2,
b2 = -.6
with probability determined by logistic
regression, y = 0
otherwise, with probability .6, y is normal
around x1 with sd of 1, and with probability .4
is normal around x2 with sd of 3.14
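
The test case can be generated along these lines. The sketch assumes the logistic regression gives the probability that y is zero, with linear predictor a + b1*x1 + b2*x2; the exact parameterisation used in the original experiment is not stated, and the function name and seed are arbitrary.

```python
import math
import random

def simulate(n, seed=0):
    """Generate n triples (x1, x2, y) following the recipe on this slide."""
    rng = random.Random(seed)
    a, b1, b2 = 10.0, -0.2, -0.6
    data = []
    for _ in range(n):
        x1 = rng.uniform(30.0, 50.0)
        x2 = rng.uniform(10.0, 20.0)
        # Assumed form: P(y = 0) is the logistic function of a + b1*x1 + b2*x2.
        p_zero = 1.0 / (1.0 + math.exp(-(a + b1 * x1 + b2 * x2)))
        if rng.random() < p_zero:
            y = 0.0
        elif rng.random() < 0.6:
            y = rng.gauss(x1, 1.0)    # first component: normal around x1
        else:
            y = rng.gauss(x2, 3.14)   # second component: normal around x2
        data.append((x1, x2, y))
    return data

sample = simulate(20000)
zero_fraction = sum(1 for _, _, y in sample if y == 0.0) / len(sample)
print(zero_fraction)
```

With these coefficients the linear predictor stays between -12 and -2, so zeroes are fairly rare in the simulated sample.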

22
How did we do?

Predicted logistic coefficients of a = 10.257,
b1 = -0.206, b2 = -0.600
weights of 0.598 and 0.402
sds of 1.000 and 3.414
Seems to be able to model pretty well under ideal
artificial conditions
So now let's try the real thing

23
How do we do?
RMSE from creating forecasts for 33 days, using
40-day training periods. Black is without
modeling zeroes, red is with modeling zeroes
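
For reference, the RMSE plotted here is the usual root mean squared error between forecasts and observations. A minimal sketch, with made-up numbers:

```python
import math

def rmse(forecasts, observations):
    """Root mean squared error between paired forecasts and observations."""
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical forecasts and observations for three cases.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 6.0]))
```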
24
Iterations to convergence: again, black is without
modeling zeroes, red is with
25
How do we feel about this?

Including modeling of zeroes doesn't appear to
help our error much (p = 0.4734), which is somewhat
disappointing
However, we get our model much faster
(p = 2.104e-08), which is a concern when having to
do a lot of these

26
What We Would Have Done Next Had We Not Been
Distracted By More Pressing Matters

Consider fitting different distributions rather
than using a transformation
Fit a model with point masses at zero for each
individual component
Try additional bias corrections
Compare results for different training windows
Investigate importance of starting values in EM
algorithm
Evaluate performances of prediction intervals
rather than just prediction means

27
Three Months Later

In the distraction interim, precipitation data
became available
Precipitation forecasts tend to be better than
wind forecasts
Weather people tend to be more interested in
precipitation forecasts
And so, we have abandoned (for now) wind speed,
and are looking at precipitation instead

28
How is rain different?

Distribution doesn't look normal, even under
transformations
Models do predict zero, but we still see point
masses at zero

29
Conditional Histograms
Observed given forecast from 1.5 to 4
Observed given forecast from 55 to 80
in .01
30
Fitting Gammas
Panels: Shape Parameter, Rate Parameter,
Coef. of Variation
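
One simple way to get the three quantities named on this slide from a sample is the method of moments, sketched below. This is a swapped-in technique for illustration only; the slides do not say how the gammas were actually fitted.

```python
import math
from statistics import mean, pvariance

def gamma_moments_fit(values):
    """Method-of-moments gamma fit: returns (shape, rate, coefficient of variation).

    For a gamma distribution, mean = shape / rate and variance = shape / rate**2,
    so shape = mean**2 / variance, rate = mean / variance, and CV = 1 / sqrt(shape).
    """
    m = mean(values)
    v = pvariance(values)
    shape = m * m / v
    rate = m / v
    cv = 1.0 / math.sqrt(shape)
    return shape, rate, cv

# Hypothetical positive amounts.
shape, rate, cv = gamma_moments_fit([1.0, 2.0, 3.0, 4.0, 5.0])
print(shape, rate, cv)
```

By construction the fitted mean shape / rate equals the sample mean, which gives a quick sanity check on any other fitting method.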
31
Something's Not Quite Right
Sample Mean Estimated Mean
32
At least something worked out nicely
Proportion of Zeroes
33
And Now?

First, find out what's going funny with our gamma
fitting
Then, try to come up with a way to do some sort
of gamma fitting in the EM algorithm
Then, look at all those things we wanted to look
at before
