An Overview of the Benefits of Calibration Using Reforecasts

1
An Overview of the Benefits of Calibration Using
Reforecasts
prepared for 2006 NCEP Ensemble Users Workshop,
Oct-Nov 2006
NOAA Earth System Research Laboratory
  • Tom Hamill and Jeff Whitaker
  • NOAA / ESRL, Physical Sciences Div.
  • tom.hamill_at_noaa.gov
  • Inspiration: Paul Dallavalle presentation,
    Norfolk WAF/NWP meeting, summer 1996

2
NOAA's reforecast data set
  • Reforecast definition: a data set of
    retrospective numerical forecasts using the same
    model as is used to generate real-time forecasts.
  • Model: T62L28 NCEP GFS, circa 1998.
  • Initial states: NCEP-NCAR Reanalysis II plus 7
    +/- bred modes.
  • Duration: 15-day runs every day at 00Z from
    19781101 to now
    (http://www.cdc.noaa.gov/people/jeffrey.s.whitaker/refcst/week2).
  • Data: selected fields (winds, hgt, temp on 5
    press levels, precip, t2m, u10m, v10m, pwat,
    prmsl, rh700, heating). NCEP/NCAR reanalysis
    verifying fields included (Web form to download
    at http://www.cdc.noaa.gov/reforecast).
  • Real-time probabilistic precipitation forecasts:
    http://www.cdc.noaa.gov/reforecast/narr

3
Main Points
  • Large improvement in probabilistic forecast skill
    and reliability by calibrating using large,
    stable data set of NWP forecasts / obs.
  • Generally:
  • smaller training sample size --> small benefit.
  • large training sample size --> large benefit.
  • Improvements are larger for surface variables
    (surface temperature, precipitation) than for
    upper-air variables (Z500).
  • Useful for bias correction, of course, but also
    for calibration of spread deficiencies and for
    statistical downscaling.

4
More background in January 2006 BAMS and other
articles. Reference list provided after
conclusions.
5
Wouldn't it be nice if we could calibrate with
only the past few forecasts?
But consider training with a short sample in a
climatologically dry region. How could you
calibrate this latest forecast?
You'd like enough training data to have
some similar events at a similar time of year to
this one.
6
Calibration principles
  • Would like f(O|F), that is, the pdf of the
    expected observed state given the forecast.
  • Calibration should implicitly:
  • adjust for model bias
  • adjust for any spread deficiency
  • downscale (coarse prediction grid --> predictable
    local detail in observations).

7
Analog high-resolution precipitation forecast
calibration technique
(actually run with 10 to 75 analogs)
8
Analog high-resolution precipitation forecast
calibration technique
Approximate O | F
(actually run with 10 to 75 analogs)
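A minimal sketch of the analog idea, not the authors' code: find the past reforecasts closest to today's forecast, then treat the observations from those dates as an ensemble approximating the distribution of observations given the forecast. The function name, the RMS-difference measure of closeness, and the array layout are assumptions made for illustration. The slides state only that 10 to 75 analogs are used.

```python
import numpy as np

def analog_probability(today_fcst, past_fcsts, past_obs, n_analogs, threshold):
    """Estimate P(obs > threshold) from observations on the analog dates.

    today_fcst : 1-D array, today's forecast pattern (hypothetical layout)
    past_fcsts : 2-D array, shape (n_dates, n_points), past forecast patterns
    past_obs   : 1-D array, shape (n_dates,), observation paired with each date
    """
    # RMS difference between today's pattern and each past forecast pattern
    rms = np.sqrt(np.mean((past_fcsts - today_fcst) ** 2, axis=1))
    # Dates of the n_analogs closest past forecasts
    closest = np.argsort(rms)[:n_analogs]
    # Observations on those dates form the calibrated ensemble
    analog_obs = past_obs[closest]
    return float(np.mean(analog_obs > threshold)), closest
```

The returned fraction is the share of analog-date observations that exceed the threshold. Changing `n_analogs` trades sampling noise against how closely the analogs resemble today's forecast.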
9
Reforecasts and statistical downscaling
Downscaling using PRISM / Mountain Mapper
technology (C. Daly, Oregon St.; NOAA RFCs;
OHD)
10
Recent OR-WA floods, 3-6 day forecast
11
Verified over 25 years of forecasts. Skill
scores use the conventional method of calculation,
which may overestimate skill (Hamill and Juras
2006).
12
Effect of training sample size
Colors of dots indicate which analog ensemble
size provided the largest amount of skill.
13
Calibration of Z500, T850, T2m
  • Errors are generally better behaved than for
    precipitation: more normally distributed.
  • However, the spread deficiency is worse for T2m.

14
Calibration techniques
  • Uncalibrated: PDF from raw ensemble.
  • Gross bias correction:
  • (1) Calculate mean bias B = mean(F - O).
  • (2) Corrected ens = raw ens - B.
  • Analog method:
  • Similar to method for precipitation, but now find
    forecast analogs using only the current grid
    point's data. 50 members.
  • Wilks and Hamill (2006, MWR, to appear) found
    that many other calibration methods (e.g.
    logistic regression, non-homogeneous Gaussian
    regression) were similar in performance.
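The gross bias correction can be sketched as below. This is an illustration only, assuming a single scalar bias (mean forecast minus observation) estimated over the whole training sample and subtracted from every member. The function and argument names are hypothetical.

```python
import numpy as np

def gross_bias_correction(train_fcst, train_obs, raw_ensemble):
    """Subtract the mean training-sample bias from each raw ensemble member."""
    train_fcst = np.asarray(train_fcst, dtype=float)
    train_obs = np.asarray(train_obs, dtype=float)
    # Step (1): mean bias B = mean(F - O) over the training sample
    bias = np.mean(train_fcst - train_obs)
    # Step (2): corrected ensemble = raw ensemble - B
    corrected = np.asarray(raw_ensemble, dtype=float) - bias
    return corrected, bias
```

This shifts the whole ensemble without changing its spread, so by itself it does not address any spread deficiency.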

15
Verification of Z500, T850, T2m
  • Northern Hemisphere (Z500, T850); 00Z North
    American surface obs with > 97% complete record
    from 1979 - 2004 (T2m).
  • Use continuous ranked probability skill score
    (CRPSS; 0 = no skill, 1 = perfect). Use method of
    calculation in Hamill and Juras (2006, Oct.
    QJRMS) to avoid overestimating skill when
    climatology varies.
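For reference, a sketch of the CRPS of one ensemble forecast and of the skill score built from it. Two assumptions apply. The ensemble CRPS uses the standard empirical form: mean absolute error of the members minus half the mean absolute difference between member pairs. The skill score is the conventional 1 - CRPS_forecast / CRPS_reference. This does not implement the Hamill and Juras (2006) calculation used for the results in these slides, and the function names are hypothetical.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Empirical CRPS of one ensemble forecast against one scalar observation."""
    members = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation
    term1 = np.mean(np.abs(members - obs))
    # Half the mean absolute difference over all member pairs
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2

def crpss(crps_forecast, crps_reference):
    """Skill score: 0 = no skill relative to the reference, 1 = perfect."""
    return 1.0 - crps_forecast / crps_reference
```

In practice the CRPS values would be averaged over many cases before forming the skill score, with climatology as the reference forecast.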

16
Z500 CRPSS
17
T850 CRPSS
18
T2m CRPSS
19
Issues (1): Should reanalyses be part of the
reforecast process?
  • Want homogeneous characteristics of forecasts:
    skill the same for 1980s forecasts as for 2006
    forecasts.
  • Part of the better skill of current forecasts is
    the better initial condition.
  • Reanalysis would improve skill of old forecasts.
  • Reanalyses should use the same or a similar model
    as used in reforecasts.

20
Issues (2): Are reforecasts still necessary with
improved models?
ECMWF produced a short reforecast data set.
Calibration using their week-2 reforecasts
produced a skill increase of 11; for our
reforecast, skill improvement was 16
(Whitaker and Vitart 2006).
21
Issues (3): NCEP proposes a single-member T126
reforecast. Is that enough?
Analog reforecast process repeated, as in prior
cartoon. But now, rather than matching the
ensemble-mean pattern, match today's control
forecast to past control forecasts. Grey area
measures degradation relative to the baseline
using the ensemble mean. Not much degradation in
skill, esp. at short leads! (And you don't even
have to run an ensemble to get a probabilistic
forecast.)
22
Conclusions
  • Large improvement in probabilistic forecast skill
    and reliability by calibrating using large,
    stable data set of NWP forecasts / obs.
  • The benefit you'll get from a much smaller
    training sample size is correspondingly much
    smaller.
  • Improvements are larger for surface variables
    (surface temperature, precipitation) than for
    upper-air variables (Z500).
  • Calibration achieves more if you do more than a
    bias correction for the mean error.

23
References
Hamill, T. M., J. S. Whitaker, and X. Wei, 2004:
Ensemble re-forecasting: improving medium-range
forecast skill using retrospective forecasts.
Mon. Wea. Rev., 132, 1434-1447.
http://www.cdc.noaa.gov/people/tom.hamill/reforecast_mwr.pdf

Hamill, T. M., J. S. Whitaker, and S. L. Mullen,
2006: Reforecasts, an important dataset for
improving weather predictions. Bull. Amer.
Meteor. Soc., 87, 33-46.
http://www.cdc.noaa.gov/people/tom.hamill/refcst_bams.pdf

Whitaker, J. S., F. Vitart, and X. Wei, 2006:
Improving week two forecasts with multi-model
re-forecast ensembles. Mon. Wea. Rev., 134,
2279-2284.
http://www.cdc.noaa.gov/people/jeffrey.s.whitaker/Manuscripts/multimodel.pdf

Hamill, T. M., and J. S. Whitaker, 2006:
Probabilistic quantitative precipitation forecasts
based on reforecast analogs: theory and
application. Mon. Wea. Rev., in press.
http://www.cdc.noaa.gov/people/tom.hamill/reforecast_analog_v2.pdf

Hamill, T. M., and J. Juras, 2006: Measuring
forecast skill: is it real skill or is it the
varying climatology? Quart. J. Royal Meteor.
Soc., in press.
http://www.cdc.noaa.gov/people/tom.hamill/skill_overforecast_QJ_v2.pdf

Wilks, D. S., and T. M. Hamill, 2006: Comparison
of ensemble-MOS methods using GFS reforecasts.
Mon. Wea. Rev., in press.
http://www.cdc.noaa.gov/people/tom.hamill/WilksHamill_emos.pdf

Hamill, T. M., and J. S. Whitaker, 2006: White
Paper. Producing high-skill probabilistic
forecasts using reforecasts: implementing the
National Research Council vision. Available at
http://www.cdc.noaa.gov/people/tom.hamill/whitepaper_reforecast.pdf
24
Daily Max Temp CRPSS
  • Notes:
  • (1) Skill much lower than T850: station data, Tmax
    trained on 00Z temp, worse model biases?
  • (2) Consistent 1-day impact of large sample size

Wilks 45-d?
25
back
26
Prior 45 days?
27
(No Transcript)
28
Bias correction using forecast and observed CDFs?
29
T2m CRPSS, low and high climatological spread
30
850 hPa temperature bias for a grid point in
the central U.S.
Spread of yearly bias estimates from
31-day running mean F-O. Note: the spread is
often larger than the bias, especially for long
leads.
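A sketch of how such a bias estimate and its year-to-year spread might be computed, assuming daily forecast and observed values at one grid point arranged as (years, days). The function names, the use of a standard deviation as the measure of spread, and the handling of window edges are all assumptions for illustration.

```python
import numpy as np

def running_mean_bias(fcst, obs, window=31):
    """Running mean of forecast-minus-observation (F - O) over `window` days."""
    err = np.asarray(fcst, dtype=float) - np.asarray(obs, dtype=float)
    kernel = np.ones(window) / window
    # "valid" keeps only positions where the full window fits inside the series
    return np.convolve(err, kernel, mode="valid")

def yearly_bias_spread(fcst_by_year, obs_by_year, window=31):
    """Mean and standard deviation across years of the running-mean bias.

    Inputs are 2-D arrays shaped (n_years, n_days), same calendar days per row.
    """
    biases = np.array([running_mean_bias(f, o, window)
                       for f, o in zip(fcst_by_year, obs_by_year)])
    return biases.mean(axis=0), biases.std(axis=0)
```

Where the standard deviation across years exceeds the magnitude of the mean bias, a bias estimate from any single year's window is unreliable, which is the point the slide makes.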
31
Comparison against NCEP medium-range T126
ensemble, ca. 2002.
The improvement is a little increased
reliability and a lot of increased resolution.