1
Quality Control and the application of Cross
Validation in the Real-Time Mesoscale
Analysis (RTMA) system
Sei-Young Park
KMA/NWPD, NCEP/EMC
Manuel Pondeca, Jim Purser, David Parrish, Geoff
DiMego, John Derber, Xiujuan Su, Wan-Shu Wu, Geoff
Manikin
NCEP/EMC
sypark_at_kma.go.kr
2
Contents
  • Introduction of RTMA
  • Quality control in RTMA
  • Gross error check
  • Variational QC
  • Use list vs. Reject list
  • Cross Validation
  • Hilbert curves
  • Summary and conclusion

3
Real-Time Mesoscale Analysis (RTMA)
  • The RTMA is a fast-track, proof-of-concept effort
    intended to
  • leverage and enhance existing analysis
    capabilities in order to generate experimental
    CONUS-scale hourly NDFD-matching analyses
  • establish a real-time process that delivers a
    sub-set of fields to allow preliminary
    comparisons to NDFD forecast grids
  • also provide estimates of analysis uncertainty
  • establish benchmark for future AOR (Analysis Of
    Record) efforts
  • build constituency for subsequent AOR development
    activities

4
Real-Time Mesoscale Analysis (RTMA)
  • Procedure
  • Temperature and dew point at 2 m; wind at 10 m
  • RUC forecast/analysis (13 km) is downscaled by
    GSD to 5 km NDFD grid
  • Downscaled RUC used as first guess in NCEP's
    2DVar analysis of ALL surface observations
  • Estimate of analysis error/uncertainty
  • Precipitation: NCEP Stage II analysis
  • Sky cover: NESDIS GOES sounder effective cloud
    amount
  • Logistics
  • Hourly within 30 minutes
  • 5 km NDFD grid in GRIB2
  • Operational at NCEP Q3 FY2006
  • Distribution of analyses and estimate of analysis
    error/uncertainty via AWIPS SBN as part of OB7.2
    upgrade end of CY2006
  • Archived at NCDC

5
Quality control in RTMA
QC is essential when using new, high-density data
that have not yet been verified. This is one of the
reasons why the Mesonets have not been used,
despite their high data density. Applying
reasonable QC methods is therefore the first
step toward using these data in the analysis system.
  • 1. Gross error check decided by the observation
    increment (residual)
  • 2. Variational QC
  • 3. Use list vs. Reject list of Mesonet Wind
  • The analysis here concentrates on the Mesonet wind.

Figure panels: temperature, wind
6
1. Gross Error Check
Obs vs. Anal
Obs vs. Guess
Limit: (o-a)/R = 10
Limit: (o-a)/R = 5
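
The gross error check can be sketched in a few lines of Python. This is a minimal illustration, assuming the check compares the absolute departure from the analysis or first guess, divided by R, against the limit. The function name, the sample values, and the choice of r are hypothetical and are not taken from the RTMA code.

```python
def passes_gross_check(obs, reference, r, limit):
    """Return True when |obs - reference| / r does not exceed limit.

    reference is the analysis or first-guess value at the observation
    location; r and limit are illustrative (assumed) inputs.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return abs(obs - reference) / r <= limit


# Hypothetical wind components (m/s) paired with first-guess values.
samples = [(3.2, 2.9), (14.0, 1.5), (-6.0, 4.5)]
kept = [(o, g) for o, g in samples
        if passes_gross_check(o, g, r=2.0, limit=5)]
```

With the looser limit of 10, the second and third samples would also pass, which is why the choice of limit matters for dense, unverified networks.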
7
2. Variational QC
The distributions of the departures (y - H(x_b))
often reveal a more frequent occurrence of large
departures than expected from the corresponding
Gaussian (normal) distribution with the same mean
and standard deviation, which shows up as wide tails.
By Erik Andersson, 1999, 2006
8
Variational QC
By Erik Andersson, 1999, 2006
9
Variational QC
By Erik Andersson, 1999, 2006
10
Var QC weight function vs. IV (A = 0.08 for
Metar, Synoptic sea and land)
A = 0.08 (288)
A = 0.1 (288)
A = 0.06 (288)
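
How the weight depends on A can be sketched as follows. This is a minimal sketch, assuming the Gaussian-plus-flat formulation of variational QC, in which A is the prior probability of gross error and the weight is one minus the posterior probability of gross error. The half-width d of the flat distribution and the function name are assumptions chosen for illustration, not values from the slides.

```python
import math


def varqc_weight(departure, sigma_o, a, d=10.0):
    """Weight given to an observation under Gaussian-plus-flat VarQC.

    a: assumed prior probability of gross error (the slide's A).
    d: assumed half-width of the flat distribution, in units of sigma_o.
    """
    j_n = 0.5 * (departure / sigma_o) ** 2  # Gaussian cost of the departure
    gamma = a * math.sqrt(2.0 * math.pi) / ((1.0 - a) * 2.0 * d)
    return 1.0 - gamma / (gamma + math.exp(-j_n))


# Weights at normalized departures of 0, 2 and 4 for the three values of A.
for a in (0.06, 0.08, 0.1):
    weights = [round(varqc_weight(x, 1.0, a), 3) for x in (0.0, 2.0, 4.0)]
    print(a, weights)
```

Small departures keep a weight close to one, large departures are smoothly down-weighted toward zero, and a larger A moves the transition to smaller departures.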
11
Distribution of the innovation (VarQC)
Obs vs. Anal
Obs vs. Guess
Limit: (o-a)/R = 10
Limit: (o-a)/R = 5
12
3. Uselist of Mesonet wind
  • Mesonets comprise the majority of obs, but they are
    not as good as other conventional sfc obs sources
  • 5/6 of all Mesonet data are from AWS (which
    includes most school sites) and APRSWXNET (citizens'
    network)
  • No mesonet winds are used in the current RUC (or NAM)
    due to a slow wind bias.
  • GSD has constructed a Uselist of acceptable
    networks based on overall siting strategies etc.
  • The Uselist is keyed on the Mesonet
    provider name.
  • The GSD Uselist was applied in the RTMA and has been
    running on the parallel system.
  • Continuing need for scrutiny of mesonet quality

Provider name       Description
OK-Meso             Oklahoma Mesonet
WT-Meso             West Texas Mesonet
APG                 U.S. Army Aberdeen Proving Grounds
CODOT               Colorado Department of Transportation
FLDOT               Florida Dep of Transportation
INDOT               Indiana Dep of Transportation
MNDOT               Minnesota Dep of Transportation
DCNet               DCNet
GoMOOS              Gulf of Maine Ocean Observing System
GPSMET              ESRL/GSD Ground-Based GPS
NOS-PORT            National Ocean Service Physical Oceanographic Real-Time System
RAWS                Remote Automated Weather Stations
MesoWestAGRIMET     U.S. Bureau of Reclamation
MesoWestAQ          NOAA Air Resources Laboratory Special Operations and Resource Division
MesoWestARL FRD     NOAA Air Resources Laboratory Field Research Division
MesoWestARL SORD    NOAA Air Resources Laboratory Special Operations and Resource Division
MesoWestDOERD       Department of Energy Office of Repository Development
MesoWestDUGWAY      U.S. Army Dugway Proving Grounds
MesoWestITD         Idaho Transportation Department
MesoWestMT DOT      Montana Dep. of Transportation
MesoWestTOOELE      U.S. Army Desert Chemical Depot, Tooele County
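
Because the Uselist works on provider names, applying it amounts to a membership test per report. The sketch below assumes each report is a dictionary with a "provider" field. That record layout, the station identifiers, and the wind values are hypothetical, and only a few of the provider names listed above are included.

```python
# A few provider names from the list above; the full Uselist is longer.
USELIST = {"OK-Meso", "WT-Meso", "APG", "CODOT", "RAWS"}


def filter_by_uselist(reports, uselist=USELIST):
    """Keep only the reports whose provider name is on the uselist."""
    return [r for r in reports if r["provider"] in uselist]


# Hypothetical mesonet wind reports (u component in m/s).
reports = [
    {"station": "STN_A", "provider": "OK-Meso", "u": 3.1},
    {"station": "STN_B", "provider": "AWS", "u": 0.4},
    {"station": "STN_C", "provider": "RAWS", "u": -2.2},
]
accepted = filter_by_uselist(reports)
```

A provider-level list is coarse: it removes every station of a network that is not listed, which is one reason a station-level list is a useful complement.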
13
Number distribution of wind data (U)
For Var QC: Var_pg = 0.05, wgtlim = 0.25, Gross = 10 m/s
Figure panels: With uselist; Without uselist
Values shown: 250, 4000, 1000, 4500
14
Number distribution of wind data (V)
For Var QC: Var_pg = 0.05, wgtlim = 0.25, Gross = 10 m/s
Figure panels: With uselist; Without uselist
Values shown: 250, 4000, 1000, 4500
15
Verification of the Uselist
  • 2006.5.23.00 to 2006.6.14.23 (23 days, hourly)

        without uselist   with uselist
BIAS    -0.687            0.046
RMSE    2.046             1.737
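
The two verification statistics can be computed as in the sketch below. It assumes the bias is the mean of analysis minus observation and the RMSE is the root mean square of the same differences. The sign convention is an assumption, since the slides do not state it, and the sample numbers are made up.

```python
import math


def bias_and_rmse(analysis, observed):
    """Mean difference and root-mean-square difference of paired values."""
    diffs = [a - o for a, o in zip(analysis, observed)]
    if not diffs:
        raise ValueError("need at least one pair")
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Hypothetical wind speeds (m/s): analysis values and matching observations.
bias, rmse = bias_and_rmse([2.0, 4.0, 6.0], [3.0, 4.0, 8.0])
```

A negative bias with this convention means the analysis is slower than the observations on average, while the RMSE also reflects the scatter around that mean.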
16
Uselist of Mesonet wind
VarQC
CASE 1 2006.3.14.15 UTC
With uselist
Without uselist
All obs data
All obs data
17
VarQC
Uselist of Mesonet wind
CASE 1 2006.3.14.15 UTC
With uselist
Without uselist
18
Uselist of Mesonet wind
VarQC
CASE 2 2006.11.25.12 UTC
With uselist
Without uselist
All obs data
All obs data
19
Uselist of Mesonet wind
VarQC
CASE 2 2006.11.25.12 UTC
With uselist
Without uselist
20
4. Reject list of Mesonet wind
  • Reject list constructed from the data rejected by
    the gross error check and VarQC
  • - created and updated hourly
  • - keyed on the station
    name

No.  station name        lat      lon
 1   MLGC1         x   32.880  243.570
 2   FHCC1         x   32.990  243.930
 3   LTHC1             34.020  243.810
 4   BPNC1         x   34.380  242.310
 5   PIVC1         x   35.450  241.720
 6   INTC1         x   36.120  242.910
 7   QTWA3         x   36.580  246.270
 8   TS037         x   36.620  241.790
 9   QBRA3         x   36.790  246.240
10   BADU1         x   37.150  246.050
11   HP001             32.890  243.580
12   GDSN2         x   35.810  244.530
13   A36           x   36.540  244.460
14   AR221         x   34.190  243.290
15   AR745             34.500  242.680
16   C6728             34.840  240.920
17   H0099         x   34.380  242.400
18   PHELN             34.450  242.370
19   HSPRA             34.450  242.680
20   APPLE             34.510  242.820
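
One way to maintain such a station-level list is to accumulate, hour by hour, how often each station was rejected by the gross error check or VarQC, and to list the stations rejected most often. The sketch below is an assumption about the bookkeeping, not the RTMA implementation: the 0.5 threshold, the function names, and the station identifiers are hypothetical.

```python
from collections import defaultdict


def update_counts(counts, hourly_qc):
    """Add one hour of QC results; hourly_qc maps station -> rejected flag."""
    for station, rejected in hourly_qc.items():
        total, bad = counts[station]
        counts[station] = (total + 1, bad + int(rejected))


def build_reject_list(counts, min_fraction=0.5):
    """Stations rejected in at least min_fraction of their reports."""
    return sorted(station for station, (total, bad) in counts.items()
                  if total > 0 and bad / total >= min_fraction)


counts = defaultdict(lambda: (0, 0))
hours = [
    {"STN_A": True, "STN_B": False},
    {"STN_A": True, "STN_B": False},
    {"STN_A": False, "STN_B": True},
]
for hourly_qc in hours:
    update_counts(counts, hourly_qc)
reject_list = build_reject_list(counts)
```

Rebuilding the list after every hourly update lets a station drop off again once its reports start passing QC.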
21
Distribution of rejected data
2006.6.8 to 2006.6.20 (13 days)
0-25 (81.4)
25-50 (16.4)
50-75 (1.8)
75-100 (0.6)
22
Distribution of rejected data
2006.11.21.12.
2006.11.21.19.
23
Verification of the reject list
24
Verification of the reject list
2006.11.10 to 2006.11.17 (8 days)
25
Verification of the reject list
2006.11.10 to 2006.11.17 (8 days)
26
Estimates of RTMA Analysis Accuracy
Cross-Validation (CV)
  • NWP data assimilation gauges the quality of
    initial conditions via model forecast skill.
  • Cross-validation is really the only way to verify
    an analysis for the analysis's sake
  • Withhold a small percentage of observations from
    the analysis (10%)
  • Validate analysis at those withheld obs
  • Measure ability of analysis system to reproduce
    their values
  • Now built into GSI
  • Can withhold and internally compare analysis
  • Baseline CV also computed internally based on a
    simple single-pass Cressman analysis scheme
  • Future performance metrics will be based on
    improvement over this Baseline

Ordering of each type: ps, t, q, uv, spd
10% withheld
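
The withhold-and-validate procedure, together with a single-pass Cressman baseline, can be sketched as below. This is a simplified illustration under stated assumptions: it analyses observed values directly rather than increments relative to a first guess, it withholds every tenth observation in list order, and the synthetic field, influence radius, and function names are hypothetical rather than taken from GSI.

```python
import math


def cressman(x, y, obs, radius):
    """Single-pass Cressman estimate at (x, y) from (xo, yo, value) tuples."""
    num = den = 0.0
    for xo, yo, value in obs:
        r2 = (x - xo) ** 2 + (y - yo) ** 2
        if r2 < radius ** 2:
            w = (radius ** 2 - r2) / (radius ** 2 + r2)  # Cressman weight
            num += w * value
            den += w
    return num / den if den > 0 else None


def cross_validate(obs, analyse, every=10, offset=0):
    """Withhold every `every`-th ob, analyse the rest, score at withheld obs."""
    test = obs[offset::every]
    train = [o for i, o in enumerate(obs) if i % every != offset]
    errors = []
    for x, y, value in test:
        estimate = analyse(x, y, train)
        if estimate is not None:
            errors.append(estimate - value)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors)) if errors else None
    return len(train), len(test), rmse


# Synthetic observations of a smooth field on hypothetical grid coordinates.
obs = [(i % 10, i // 10, 0.5 * (i % 10) + 0.2 * (i // 10)) for i in range(100)]
n_train, n_test, rmse = cross_validate(
    obs, lambda x, y, train: cressman(x, y, train, 2.5))
```

Changing `offset` from 0 to 9 gives ten non-overlapping test sets, so every observation is withheld exactly once across the ten runs.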
27
Surface Obs Stations
28
A. CTRL
Number of training set 10 (10) 1st set
Number of test set 10 (10) 1st set
29
What are Hilbert Curves?
  • A Hilbert curve is an example of a "space-filling"
    curve, first described by David Hilbert in 1891.
  • In the limit, it passes through every point in a
    square. Like all good fractals, it is generated
    in iterations.

Iteration 0, iteration 1, iteration 2
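
For a finite iteration of the curve, each cell of an n x n grid (n a power of two) has a position along the curve. The sketch below uses the widely published bit-manipulation conversion from cell coordinates to curve position. It is offered as an illustration of the idea, and the function name is an assumption rather than anything used in the RTMA.

```python
def hilbert_index(n, x, y):
    """Position of grid cell (x, y) along the Hilbert curve on an n x n grid.

    n must be a power of two; x and y are integers in [0, n).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate or flip the quadrant so the pattern repeats
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Cells of a 4 x 4 grid listed in the order the curve visits them.
cells = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(4, *c))
```

Consecutive cells in `cells` are always grid neighbours, which is the locality property that makes the curve useful for ordering scattered stations.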
30
Why use Hilbert curves for Cross
Validation?
  • "A space-filling Hilbert curve provides an
    efficient and convenient tool for arranging
    randomly located data in a serial ordering from
    which it is then possible to draw multiple
    non-overlapping subsets of data, each subset
    tending to be more evenly distributed in space
    than the complete dataset.  Each such subset can
    be used as independent validating data for a
    corresponding analysis that uses only the
    complement of that subset. In this way, a
    cross-validation of the parameters defining the
    covariance models can be carried out and the
    parameters optimized"
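
One simple way to draw non-overlapping subsets from the serial ordering described in the quotation is to sort the stations by their position along the curve and deal them out in turn. The sketch below assumes each station already has a curve position attached. The interleaving rule, the function names, the station labels, and the positions are illustrative assumptions and may differ from the scheme actually used.

```python
def hilbert_subsets(keyed_obs, n_subsets):
    """Split (curve_position, station) pairs into interleaved subsets.

    Sorting by curve position gives the serial ordering; taking every
    n_subsets-th element yields non-overlapping subsets.
    """
    ordered = [station for _, station in sorted(keyed_obs)]
    return [ordered[j::n_subsets] for j in range(n_subsets)]


def complement(all_stations, subset):
    """Stations left for the analysis when `subset` is withheld."""
    withheld = set(subset)
    return [s for s in all_stations if s not in withheld]


# Hypothetical stations with made-up positions along the curve.
keyed_obs = [(0.91, "S9"), (0.12, "S1"), (0.47, "S5"), (0.30, "S3"),
             (0.75, "S7"), (0.05, "S0"), (0.58, "S6"), (0.22, "S2")]
subsets = hilbert_subsets(keyed_obs, 4)
training = complement([s for _, s in keyed_obs], subsets[0])
```

Because neighbours along the curve are also neighbours in space, members of one subset are separated by members of the others, so each subset tends to be spread across the domain rather than clustered.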

31
Random and tanh(x) distribution
By Jim Purser
32
Number of test set 10 (10) 1st set
Number of training set 10 (10) 1st set
A. CTRL
33
Comparison of the three test sets
A. CTRL
B. Hilbert_curve
34
Var QC vs. No Var QC for CV
35
Reject list vs. no-list (No CV)
2006.11.20 to 2006.11.26 (7 days)
Ctrl (without reject list)   Exp (with reject list)
-0.275                       -0.266
2.6973                       2.7005
36
Reject list with CV
2006.11.20 to 2006.11.26 (7 days)
Ctrl (without reject list)   Exp (with reject list)
-0.099                       -0.093
1.8941                       2.0049
37
Anisotropic background error parameter
Rltop: function correlation length. In the
anisotropic background error covariance model,
the pattern is very sensitive to this parameter.
Rltop = 500
Rltop = 250
Rltop = 900
isotropic
38
Anisotropic background error parameter tests
BIAS distribution for 10 test sets (WIND)
  • Considered Parameters
  • Scale length
  • Function correlation length

RMSE distribution for 10 test sets (WIND)
39
Anisotropic background error parameter tests
Mean bias of 10 test sets in experiments
 1  isotropic
 2  w1.0_t1.0_w900_t100
 3  w1.3_t1.0_w900_t100
 4  w1.6_t1.0_w900_t100
 5  w1.0_t1.0_w500_t100
 6  w1.3_t1.0_w500_t100
 7  w1.6_t1.0_w500_t100
 8  w1.0_t1.0_w500_t500
 9  w1.3_t1.0_w500_t500
10  w1.6_t1.0_w500_t500
iso
exp9
exp7
40
Analysis Increment (U-wind)
anl-ges (iso)
anl-ges (exp9)
anl(iso)-anl(exp9)
Shaded: smoothed terrain; Solid: analysis
increment
41
Analysis Increment (U-wind)
anl-ges (iso)
Shaded: smoothed terrain; Solid: analysis
increment
anl-ges (exp9)
anl-ges (exp7)
anl(iso)-anl(exp9)
anl(iso)-anl(exp7)
42
Summary and conclusion
  • RTMA - Phase I of AOR
  • - leverage and enhance
    existing analysis capabilities in order to
    generate experimental CONUS-scale hourly
    NDFD-matching analyses
  • - establish a real-time
    process that delivers a sub-set of fields to
    allow preliminary comparisons to NDFD forecast
    grids
  • QC in RTMA: With the gross error check, VarQC, and
    the reject list, efficient QC of Mesonet wind
    data was achieved.
  • Cross Validation: As an accurate validation
    method, cross-validation was built into GSI and
    tested in the RTMA.
  • Hilbert curves: To obtain homogeneous test
    sets, Hilbert curves were introduced and
    successfully implemented in the RTMA system.
  • Anisotropic background error: It was shown that
    CV can be used to determine suitable
    parameters.

43
Thank you !!