Benchmark database based on surrogate climate records


1
Benchmark database based on surrogate climate records

Victor Venema

2
Goals of COST-HOME working group 1

Literature survey
Benchmark dataset
Known inhomogeneities
Test the homogenisation algorithms (HA)

3
Benchmark dataset

Real (inhomogeneous) climate records
Most realistic case
Investigate if various HA find the same breaks
Good meta-data
Synthetic data
For example, Gaussian white noise
Insert known inhomogeneities
Test performance
Surrogate data
Empirical distribution and correlations
Insert known inhomogeneities
Compare to synthetic data: test of assumptions

4
Creation of the benchmark: outline of the talk

Start with homogeneous data
Multiple surrogate and synthetic realisations
Mask surrogate records
Add global trend
Insert inhomogeneities in station time series
Published on the web
Homogenised by COST participants and third parties
Analyse the results and publish

5
1) Start with homogeneous data

Monthly mean temperature and precipitation (France)
Later also daily data
Later maybe other variables
Homogeneous
No missing data
Detrended
20 to 30 years is enough for good statistics
Longer surrogates are based on multiple copies
Larger scale correlations are small
Distribution well defined with 30a data
Generated networks are 50, 100 and 200 a long

6
2) Multiple surrogate realisations

Multiple surrogate realisations
Temporal correlations
Station cross-correlations
Empirical distribution function
Annual cycle removed before, added at the end
Number of stations between 5 and 20
Cross correlation varies as much as possible
Show plot temporal structure of surrogates
Show plot cross correlations
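
The step "annual cycle removed before, added at the end" can be sketched as follows. This is only an illustration: it assumes the cycle is estimated as the mean of each calendar month and that the series covers whole years starting in January. The function names are hypothetical and not taken from the benchmark software.

```python
import numpy as np

def remove_annual_cycle(monthly):
    """Return (anomalies, cycle) for a monthly series of whole years.

    Assumption: the annual cycle is the mean of each calendar month.
    """
    y = np.asarray(monthly, dtype=float).reshape(-1, 12)  # rows = years
    cycle = y.mean(axis=0)                                # 12 monthly means
    return (y - cycle).reshape(-1), cycle

def add_annual_cycle(anomalies, cycle):
    """Add the stored 12-value cycle back onto a monthly anomaly series."""
    a = np.asarray(anomalies, dtype=float).reshape(-1, 12)
    return (a + cycle).reshape(-1)
```

The surrogates would then be generated from the anomalies, and the cycle added back to each realisation afterwards.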

7
One station with annual cycle
8
One station anomalies
9
Multiple stations 10 year zoom
10
Multiple stations 10 year zoom
11
IAAFT algorithm smoothes jumps
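
For readers unfamiliar with IAAFT (iterative amplitude adjusted Fourier transform), a minimal univariate sketch is given here. It reproduces the empirical distribution exactly and the Fourier amplitudes (hence the temporal autocorrelation) approximately. The multivariate version needed to preserve station cross-correlations is not shown, and the function name, iteration count and seed are illustrative assumptions.

```python
import numpy as np

def iaaft(x, n_iter=100, seed=0):
    """Univariate IAAFT surrogate of x (illustrative sketch).

    The result contains exactly the values of x, reordered so that its
    Fourier amplitude spectrum approximates that of x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    sorted_x = np.sort(x)                 # target empirical distribution
    target_amp = np.abs(np.fft.rfft(x))   # target Fourier amplitudes
    s = rng.permutation(x)                # start from a random shuffle
    for _ in range(n_iter):
        # Step 1: impose the target amplitudes, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # Step 2: impose the target distribution by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s
```

Because the last step is the rank remapping, the distribution is matched exactly while the spectrum is only approximated.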
12
3) Mask surrogate records

Beginning of records jagged (rough)
Linear increase in number of stations
Last station starts after 25 % of the full time
End of record all stations are measuring
Influence of jagged edge on detection and correction
But trend is also increasing in time (i.e. different)!
Is this a problem?
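
A possible way to build such a mask is sketched below. It reads the slide's "25" as 25 % of the record length (an assumption), spaces the station start dates linearly, and leaves all stations measuring at the end. The function and parameter names are hypothetical.

```python
import numpy as np

def jagged_start_mask(n_time, n_stations, last_start_fraction=0.25):
    """Boolean mask (time x station); True where a station is measuring.

    Start dates increase linearly from time step 0 to
    last_start_fraction * n_time (assumed reading of the slide).
    """
    starts = np.linspace(0, last_start_fraction * n_time, n_stations)
    starts = np.round(starts).astype(int)
    t = np.arange(n_time)[:, None]
    return t >= starts[None, :]

# Example: 100 years of monthly data, 5 stations.
mask = jagged_start_mask(n_time=1200, n_stations=5)
```

Applying it is then a matter of `np.where(mask, data, np.nan)` for a `data` array of the same shape.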

13
3) Mask surrogate records
14
4) Add global trend

NASA GISS: GISS Surface Temperature Analysis (GISTEMP) by J. Hansen
Global mean surface temperature
Last year of any surrogate network is 1999
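
Adding the global trend could look like the sketch below. It assumes annual global-mean anomalies are supplied by the caller (no GISTEMP values are included here), that each annual value is added to all 12 months of that year, and that the network ends in December 1999. All names are hypothetical.

```python
import numpy as np

def add_global_trend(monthly, global_annual, global_years, last_year=1999):
    """Add annual global-mean anomalies to a detrended monthly network.

    monthly: array (n_years * 12, n_stations)
    global_annual, global_years: anomaly values and their calendar years
    """
    monthly = np.asarray(monthly, dtype=float)
    n_years = monthly.shape[0] // 12
    years = range(last_year - n_years + 1, last_year + 1)
    lookup = {int(y): float(a) for y, a in zip(global_years, global_annual)}
    # Raises KeyError if the global record does not cover a needed year.
    trend = np.array([lookup[y] for y in years])
    return monthly + np.repeat(trend, 12)[:, None]
```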

15
5) Insert inhomogeneities in stations

Random breaks (implemented)
Frequency of breaks 1/20a, 1/40a
Size constants for temperature 0.25, 0.5, 1.0 °C
Size factors for rain 0.8, 0.9, 1.1, 1.2
Simultaneous breaks
Frequency of breaks 1/50a
In 10 to 50 % of network
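
A sketch of random break insertion for one station follows. The slide gives only the break frequency and the size constants, so the rest is assumed for illustration: each month is a break date with probability rate/12, each break picks one of the listed constants, temperature breaks get a random sign and are additive, precipitation breaks are multiplicative, and a break shifts all values from its date onwards.

```python
import numpy as np

def insert_random_breaks(series, rate_per_year=1 / 20,
                         sizes=(0.25, 0.5, 1.0),
                         multiplicative=False, seed=0):
    """Return (inhomogeneous series, break indices) for a monthly series."""
    rng = np.random.default_rng(seed)
    out = np.array(series, dtype=float)
    # Assumption: independent break probability per month.
    is_break = rng.random(len(out)) < rate_per_year / 12
    is_break[0] = False                      # no break before the first value
    break_idx = np.flatnonzero(is_break)
    for i in break_idx:
        size = rng.choice(sizes)
        if multiplicative:
            out[i:] *= size                  # e.g. precipitation factors
        else:
            out[i:] += size * rng.choice([-1.0, 1.0])
    return out, break_idx
```

For precipitation one would call it with `sizes=(0.8, 0.9, 1.1, 1.2)` and `multiplicative=True`.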

16
5) Insert inhomogeneities in stations

Outliers
Frequency 1 to 3 %
Size 99 and 99.9 percentiles
Local trends (only temperature)
Linear increase or decrease in one station
Duration 30, 60a
Maximum size 0.2 to 1.5 °C
Frequency once in 10 % of the stations
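
The local trend could be inserted along the lines of the sketch below. It assumes a linear ramp from zero to the maximum size over the stated duration, and that the series stays at the final level after the trend ends. The slide does not say what happens after the trend, so this is a guess, and the names are hypothetical. A negative `max_size` gives a decreasing trend.

```python
import numpy as np

def insert_local_trend(series, start, duration_years=30, max_size=1.0):
    """Add a linear local trend to a monthly series, starting at index start."""
    out = np.array(series, dtype=float)
    n_ramp = duration_years * 12
    end = min(start + n_ramp, len(out))      # trend may run past the record
    ramp = np.linspace(0.0, max_size, n_ramp)
    out[start:end] += ramp[:end - start]
    # Assumption: the perturbation persists at its final level afterwards.
    out[end:] += ramp[end - start - 1]
    return out
```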

17
6) Published on the web

Inhomogeneous data will be published on the COST-HOME homepage
Everyone is welcome to download and homogenize the data

18
7) Homogenize by participants

Return homogenised data
Should be in COST-HOME file format (next slide)
Return break detections
BREAK
OUTLI
BEGTR
ENDTR
Multiple breaks at one date possible

19
7) Homogenize by participants

COST-HOME file format: http://www.meteo.uni-bonn.de/venema/themes/homogenisation/costhome_fileformat.pdf
For benchmark and COST homogenisation software
One data and one quality-flag file per station
Filename: variable, resolution, quality, station
ASCII network-file with station names
ASCII break-file with dates and station names

20
COST-HOME file format monthly data
21
COST-HOME file format network file
22
8) Analyse the results

Detailed analysis will be performed in the working groups
Detection
Correction
Daily data homogenisation
Synthetic and surrogate data
RMS Error
Number of breaks detected (as a function of size)
Application: reduction in the scatter in the trends
Performance difference between synthetic (Gaussian, white noise) and surrogate data
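
Two of the measures listed above, the RMS error and the scatter in the station trends, might be computed as in this sketch. It assumes monthly series, a least-squares linear trend expressed per decade, and the standard deviation across stations as the measure of scatter. None of these choices are stated on the slide, and the function names are hypothetical.

```python
import numpy as np

def rmse(homogenised, truth):
    """Root-mean-square error between homogenised and true homogeneous data."""
    diff = np.asarray(homogenised, float) - np.asarray(truth, float)
    return float(np.sqrt(np.nanmean(diff ** 2)))

def trend_per_decade(series, steps_per_year=12):
    """Least-squares linear trend of one series, in units per decade."""
    y = np.asarray(series, float)
    t = np.arange(len(y)) / steps_per_year       # time in years
    ok = ~np.isnan(y)                            # skip missing values
    slope = np.polyfit(t[ok], y[ok], 1)[0]
    return 10.0 * slope

def trend_scatter(network, steps_per_year=12):
    """Standard deviation of station trends; network is (time x station)."""
    network = np.asarray(network, float)
    trends = [trend_per_decade(network[:, j], steps_per_year)
              for j in range(network.shape[1])]
    return float(np.std(trends))
```

A successful homogenisation would be expected to lower both `rmse` against the original homogeneous data and `trend_scatter` within a network.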

23
Work in progress

Monthly precipitation
Implement some inhomogeneity types
Daily data: other inhomogeneities
Synthetic data (Gaussian white noise)
More input data!
Agree on the details of the benchmark
Next meeting?
Set deadline for the availability of the benchmark
Deadline for the return of the homogenised data

24
Questions

Ideas for a better benchmark
For example, for other inhomogeneities, constants
Types of inhomogeneities for daily data
Automatic processing
In the order of 100 networks

26
7) Homogenize by participants

COST-HOME file format: http://www.meteo.uni-bonn.de/venema/themes/homogenisation/costhome_fileformat.pdf
For benchmark and COST homogenisation software
Regular ASCII matrix (columns)
One data and one quality-flag file per station
Yearly, daily, subdaily data: columns for time, one for data
Monthly data: year column, 12 columns for data
Filename: variable, resolution, quality, station
ASCII network-file with station names
ASCII break-file with dates and station names
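
Reading a monthly data file laid out as described (a regular ASCII matrix with a year column followed by 12 data columns) could be sketched as follows. Details such as header lines, missing-value codes and the exact file naming are defined in the file format document and are not reproduced here. The function name and the example values are made up.

```python
import io
import numpy as np

def read_monthly_matrix(source):
    """Read rows of 'year v1 ... v12' into (years, chronological series)."""
    table = np.loadtxt(source, ndmin=2)
    if table.shape[1] != 13:
        raise ValueError("expected 13 columns: year + 12 months")
    years = table[:, 0].astype(int)
    values = table[:, 1:].reshape(-1)    # Jan..Dec of each year, in order
    return years, values

# Made-up two-year example; real files would be opened by file name.
example = io.StringIO(
    "1998 " + " ".join(str(m) for m in range(1, 13)) + "\n"
    "1999 " + " ".join(str(m + 12) for m in range(1, 13)) + "\n"
)
years, values = read_monthly_matrix(example)
```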