It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena - PowerPoint PPT Presentation

1 / 33
About This Presentation
Title:

It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena

Description:

It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena J. McLean Sloughter Based on research being conducted under Adrian E. Raftery and ... – PowerPoint PPT presentation

Number of Views:123
Avg rating:3.0/5.0
Slides: 34
Provided by: mcle71
Learn more at: http://www.stat.cmu.edu
Category:

less

Transcript and Presenter's Notes

Title: It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena


1
It Never Rains But It PoursModeling Mixed
Discrete-Continuous Weather Phenomena
  • J. McLean Sloughter

Based on research being conducted under Adrian E.
Raftery and Tilmann Gneiting
This work was supported by the DoD
Multidisciplinary University Research Initiative
(MURI) program administered by the Office of
Naval Research under Grant N00014-01-10745.
2
Ensemble Forecasting
  • Single forecast model is run multiple times with
    different initial conditions
  • Ensemble mean tends to outperform individual
    members
  • Spread-skill relationship spread of forecasts
    tends to be correlated with magnitude of error
  • Model is underdispersive (not calibrated)

3
Spread/Skill Plot
Spread max forecast min forecast Skill
abs(forecast mean observed)
4
Ensemble Member Forecasts
48-hour forecasts for precipitation at 5pm Oct
20, 2003 From http//www.atmos.washington.edu/emm
5rt/ensemble.cgi
5
Bayesian Model Averaging
  • Weighted average of multiple models
  • Weights determined by posterior probabilities of
    models
  • Posterior probabilities given by how well each
    member fits the training data
  • Weights, then, give an indication of the relative
    usefulness of ensemble members

6
BMA for ensembles
where
is the forecast from member i,
is the weight associated with member i, and
is the estimated distribution function for Y
given member i
Picture taken from Raftery, Balabdaoui, Gneiting,
and Polakowski (2003), Calibrated
MesoscaleShort-Range Ensemble Forecasting Using
Bayesian Model Averaging.
7
The Trouble With Our Models
  • Forecasts never predict zero artifact of
    differential equations used to create forecasts
  • Observed wind speed is often zero
  • Wind speed, even ignoring zeroes, is not normally
    distributed

8
What Wind Speed Looks Like
Wind Speed Histogram Several exceptionally
high values make it harder to see clearly
Wind Speed Histogram truncated to only go up to
fifty there is a spike at zero
Wind Speed Histogram without zeroes
9
What Forecasts Look Like
Forecast histogram on left (all eight forecasts
have similar histograms) and observed histogram
on right even after removing zeroes from the
actual histogram, the shape is still not quite
right actual is more sharply skewed.
10
The Problem With Reality
  • As we saw, the histograms for forecasts do not
    match the histogram for observations very well
  • Maximal observed value is 124.000
  • Maximal values for each model

AVN CMCG Eta GASP JMA NGPS TCWB UKMO
44.591 54.566 44.722 51.732 51.829 45.383 45.125 49.243
11
More Trouble With Reality
AVN CMCG Eta GASP JMA NGPS TCWB UKMO Y
AVN 1.000 0.826 0.845 0.797 0.795 0.783 0.731 0.840 0.417
CMCG 0.826 1.000 0.822 0.807 0.819 0.797 0.770 0.805 0.402
Eta 0.845 0.822 1.000 0.789 0.793 0.781 0.747 0.800 0.406
GASP 0.797 0.807 0.789 1.000 0.788 0.779 0.747 0.786 0.394
JMA 0.795 0.819 0.793 0.788 1.000 0.788 0.756 0.783 0.400
NGPS 0.783 0.797 0.781 0.779 0.788 1.000 0.757 0.779 0.388
TCWB 0.731 0.770 0.747 0.747 0.756 0.757 1.000 0.721 0.384
UKMO 0.840 0.805 0.800 0.786 0.783 0.779 0.721 1.000 0.415
Y 0.417 0.402 0.406 0.394 0.400 0.388 0.384 0.415 1.000
Pairwise correlations Y is observed value,
others are the various forecasts
12
(No Transcript)
13
Time Trends
Left - average observed wind speed per day Right
same, but smoothed to average over 3-week
interval
14
Time Trend Troubles
  • Higher winds in summer, lower in winter (note
    that this appears to be an odd trait of the
    northwest)
  • Need model to reflect seasonal patterns
  • Would still like to just have a simple model
    based on forecasts

15
A Recap Of All The Things That Make Life
Interesting and Miserable
  • Distribution not normal
  • Time
  • Forecasts not very highly corellated with
    observations
  • Zeroes

16
What to do about distributions?
  • Model using another distribution Gammas and
    Weibulls are popular models for windspeed
  • Can apply a transformation to the data

Left Root of forecast windspeed from model
1 Right Root of observed windspeed (excluding
zeroes for easier visualization)
17
What to do about time?
  • Rather than using all available data as training
    set, only train on recent data
  • Trade-off between lower variance of estimates
    with more data, and better picture of current
    trends with less data
  • Previous research (on temperature and pressure)
    indicates 40-day window of training data seems to
    give a good balance

18
What to do about the forecasts?
  • Were not making the forecasts, just using them
  • We can apply a bias-correction by performing a
    linear regression of observed on forecasts (this
    is commonly done in forecasting already)
  • We can see from our weight terms which models
    perform better and which perform worse, and
    report that to the folks making the forecasts
  • We can hope that the science of meteorology
    continues to move forward as it has thus far

19
What to do about zeroes?
  • We need a model that includes a point mass at
    zero
  • Two main possibilities
  • We could model a weighted average of eight
    distributions, each of which is a normal plus a
    point mass at zero
  • Or, we could first model probability of zero or
    non-zero, then, conditioned on non-zero, the
    weighted average of eight normals
  • We will pursue the second option for now

20
So, lets get to it then
  • Probability of zero can be modeled by a logistic
    regression on the eight forecasts
  • Then, the weighted average of normals can be
    determined by the EM algorithm
  • Assume each normal has the bias-corrected
    forecast as its mean, and has a constant variance
  • Alternate between predicting membership based on
    weights and variances, and weights and variances
    based on membership
  • Make sure to also include probability of being
    non-zero when evaluating our functions

21
Lets try out a simple test case first
  • Generate a sample of 100,000 ordered triples
    (x1, x2, y), x1 uniform over 30 to 50, x2 uniform
    over 10 to 20
  • logistic regression coefficients of a10, b1-.2,
    b2-.6
  • with probability determined by logistic
    regression, y0
  • otherwise, with probability .6, y is normal
    around x1 with sd of 1, and with probability .4
    is normal around x2 with sd of 3.14

22
How did we do?
  • Predicted logisitic coefficients of a10.257,
    b1-0.206, b2-0.600
  • weights of 0.598 and 0.402
  • sds of 1.000 and 3.414
  • Seems to be able to model pretty well under ideal
    artificial conditions
  • So now lets try the real thing

23
How do we do?
RMSE from creating forecasts for 33 days, using
40 day training periods black is without
modeling zeroes, red is with modeling zeroes
24
Iterations to convergence again, black is without
modeling zeroes, red is with
25
How do we feel about this?
  • Including modeling of zeroes doesnt appear to
    help our error much (p0.4734), which is somewhat
    disappointing
  • However, we get our model much faster
    (p2.104e-08), which is a concern when having to
    do a lot of these

26
What We Would Have Done Next Had We Not Been
Distracted By More Pressing Matters
  • Consider fitting different distributions rather
    than using a transformation
  • Fit a model with point masses at zero for each
    individual component
  • Try additional bias corrections
  • Compare results for different training windows
  • Investigate importance of starting values in EM
    algorithm
  • Evaluate performances of prediction intervals
    rather than just prediction means

27
Three Months Later
  • In the distraction interim, precipitation data
    became available
  • Precipitation forecasts tend to be better than
    wind forecasts
  • Weather people tend to be more interested in
    precipitation forecasts
  • And so, we have abandoned (for now) wind speed,
    and are looking at precipitation instead

28
How is rain different?
  • Distribution doesnt look normal, even under
    transformations
  • Models do predict zero, but we still see point
    masses at zero

29
Conditional Histograms
Observed given forecast from 1.5 to 4
Observed given forecast from 55 to 80
in .01
30
Fitting Gammas
Shape Parameter Rate Parameter
Coef. Of Variation
31
Somethings Not Quite Right
Sample Mean Estimated Mean
32
At least something worked out nicely
Proportion of Zeroes
33
And Now?
  • First, find out whats going funny with our gamma
    fitting
  • Then, try to come up with a way to do some sort
    of gamma fitting in the EM algorithm
  • Then, look at all those things we wanted to look
    at before
Write a Comment
User Comments (0)
About PowerShow.com