Release of Statistical Data and Metadata Exchange (SDMX) Standards, Version 1.0

1
Release of Statistical Data and Metadata Exchange
(SDMX) Standards, Version 1.0
  • Arofan Gregory
  • Aeon LLC

2
Outline
  • Background
  • Business Scope and Requirements
  • Technical Design Approach
  • SDMX Version 2.0 and Beyond

3
Background
  • SDMX is a joint initiative of seven international
    and regional organizations
      • Bank for International Settlements (BIS)
      • European Central Bank (ECB)
      • Statistical Office of the European Communities
        (Eurostat)
      • International Monetary Fund (IMF)
      • Organisation for Economic Co-operation and
        Development (OECD)
      • United Nations (UN)
      • The World Bank
  • Initial meeting in June, 2002
  • Version 1.0 Standards Released September 30, 2004
  • Will be put forward to ISO for international
    standardization

4
Goal
  • To explore common e-standards and ongoing
    standardization activities that could allow us to
    gain efficiency and avoid duplication of effort
    in our own work and possibly for the work of
    others in the field of statistical information.

5
Initial Projects
  • Case Study: Examination of XML, registry, and
    web services technologies
  • Batch Data Exchange: EDIFACT and XML formats for
    exchange of large databases
  • Metadata Common Vocabulary: Harmonized
    definitions of common terms and statistical
    concepts
  • Metadata Repositories: Metadata reporting and
    resources on the web

6
Clarification
  • "Statistical information" refers primarily to
    aggregated statistical data and metadata
      • Not survey results
      • Not quality assurance test results
  • In future, this may be expanded to include
    microdata/raw data
      • Not currently in scope

7
Examples
  • Financial and economic data
  • Trade data
  • Development data
  • Health data
  • Education
  • Environmental
  • Population
  • And so on

8
Statistical Information Flows
(Diagram: 180 countries; web sites www.x.org, www.y.org, www.z.org, and www.hub.org; Internet, search, navigation)

9
Challenges
  • Duplication of data
  • Multiplicity of formats
  • Timeliness of data reporting
  • Quality of reported data
  • Pushing data is inefficient
  • Inconsistent or missing metadata
  • Lack of semantic agreement

10
Solutions
  • Step One: Standardize an information model and
    formats
  • Step Two: Standardize architecture, services, and
    metadata

11
SDMX v 1.0 Package
  • Framework Document
  • Information Model (UML Conceptual Design)
  • SDMX-ML
  • SDMX-EDI
  • Implementor's Guide for Format Standards
  • Web Services Guidelines
  • 200 pages of comments from public review in
    summer of 2004

12
Overall Technical Approach
  • Model-driven
      • The SDMX information model is a meta-model
      • All formats are derived from the information
        model, and are equivalent
  • Requirements-driven
      • Different formats created for different use cases
      • Different but consistent formats for each domain

13
Meta-Models
  • Each domain uses one or more models for its
    data
  • The data is just tables of numbers
  • Common structure
  • Each model describes how metadata is attached to
    that domain's structure
  • A meta-model describes how the domain models can
    be described

14
Key Families
  • The information model describes how metadata is
    attached to multi-dimensional cubes of data
  • The structural description is termed a key
    family
  • Each axis has an associated concept and
    representation (dimensions)
  • Additional metadata can be attached and
    represented at different levels
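
As a rough illustration of the key family idea above, the following Python sketch models a key family as a list of dimensions (each pairing a concept with a codelist) plus attributes attached at a named level. The class names, the level names, the codes other than the concept names Frequency and Reference Country, and the validate_key helper are all assumptions made for illustration; they are not taken from the SDMX information model itself.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    """One axis of the cube: a concept plus its representation (a codelist)."""
    concept: str
    codes: dict[str, str]  # code value -> human-readable label


@dataclass
class Attribute:
    """Additional metadata attached at some level of the cube."""
    concept: str
    level: str  # hypothetical level names: "dataset", "series", "observation"


@dataclass
class KeyFamily:
    """Structural description of a multi-dimensional cube of data."""
    name: str
    dimensions: list[Dimension]
    attributes: list[Attribute] = field(default_factory=list)

    def validate_key(self, key: dict[str, str]) -> list[str]:
        """Return a list of problems; an empty list means the key is valid."""
        problems = []
        for dim in self.dimensions:
            value = key.get(dim.concept)
            if value is None:
                problems.append(f"missing dimension {dim.concept}")
            elif value not in dim.codes:
                problems.append(f"{dim.concept}: unknown code {value!r}")
        known = {dim.concept for dim in self.dimensions}
        for extra in sorted(key.keys() - known):
            problems.append(f"unexpected dimension {extra}")
        return problems


# Hypothetical key family with two dimensions and one series-level attribute.
frequency = Dimension("Frequency", {"A": "Annual", "Q": "Quarterly", "M": "Monthly"})
country = Dimension("ReferenceCountry", {"DE": "Germany", "FR": "France"})
kf = KeyFamily(
    name="EXAMPLE_KF",
    dimensions=[frequency, country],
    attributes=[Attribute("UnitMultiplier", level="series")],
)
```

Calling kf.validate_key({"Frequency": "Q", "ReferenceCountry": "DE"}) returns an empty list, while a key with an unknown code or a missing dimension returns one message per problem.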

15
Key Family Example
16
SDMX Formats
  • SDMX-EDI
      • EDIFACT format for describing key families,
        codelists, and concepts
      • EDIFACT format for generically describing data
  • SDMX-ML
      • XML schema for key families, concepts, codelists
      • XML schema for generically describing data
      • XML schema for shared constructs
      • XML schema for common administrative data
      • XML schema for data and metadata queries
      • PLUS

17
Key-Family Specific SDMX-ML Formats
  • Utility Schemas (typical XML schemas for
    validation and guided tools)
  • Compact Data Schemas (Large databases, full and
    partial datasets, incremental updates)
  • Cross-Sectional Schemas (non-time-series data)
  • Each meets a different use case

18
Key-Family-Specific Schemas
  • Each domain model (key family) is mapped to a
    namespace which is owned by the creator of the
    key family
  • Mappings are made from a standard XML expression
    of the model in a standard fashion
  • If you can process the key family XML, you can
    predict exactly what each derived schema will
    look like
  • If you can predict what each schema looks like,
    you can generate a lot of the code needed to
    process it
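
The predictability described above can be sketched in a few lines of Python: read the dimensions from an XML description of a key family, then generate the shape of the Key element that a derived format would use. The element and attribute names (KeyFamily, Dimension, name, Key) are simplified assumptions modeled on the short example in this presentation, not the actual SDMX-ML schemas, and the "?" placeholder values are purely illustrative.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical key family description (not real SDMX-ML).
KEY_FAMILY_XML = """
<KeyFamily id="X">
  <Dimension name="Frequency"/>
  <Dimension name="OtherDim1"/>
</KeyFamily>
"""


def dimension_names(key_family_xml: str) -> list[str]:
    """Read the dimension names, in order, from the key family XML."""
    root = ET.fromstring(key_family_xml)
    return [dim.attrib["name"] for dim in root.findall("Dimension")]


def utility_key_template(names: list[str]) -> str:
    """Utility-style key: one child element per dimension."""
    key = ET.Element("Key")
    for name in names:
        ET.SubElement(key, name).text = "?"
    return ET.tostring(key, encoding="unicode")


def compact_key_template(names: list[str]) -> str:
    """Compact-style key: one XML attribute per dimension."""
    key = ET.Element("Key", {name: "?" for name in names})
    return ET.tostring(key, encoding="unicode")


names = dimension_names(KEY_FAMILY_XML)
print(utility_key_template(names))
print(compact_key_template(names))
```

Because both templates are computed from the same list of dimension names, a tool that understands the key family XML needs no further input to know what each derived format will contain.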

19
(Diagram: Key Family X, expressed in Generic Structure XML, is the source from which the Compact, Utility, and Cross-Sectional Data XML Schemas for X are derived. Each schema structures the corresponding data XML instance of X. The Compact, Utility, and Cross-Sectional instances are equivalent to one another and to the data in Generic Data XML.)

20
An Example
KEY FAMILY
<Dimension name="Frequency"/>
UTILITY INSTANCE
<Key> <Frequency>Q</Frequency> </Key>
COMPACT INSTANCE
<Key Frequency="Q" OtherDim1="x" />
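
Since the utility and compact forms carry the same key, converting between them is mechanical. The Python sketch below shows one way this could be done with the standard library, assuming the simplified Key element from the example above; the function names are hypothetical, and real SDMX-ML instances also involve namespaces and further structure that this ignores.

```python
import xml.etree.ElementTree as ET


def utility_to_compact(utility_key_xml: str) -> str:
    """Turn child elements (utility style) into XML attributes (compact style)."""
    utility_key = ET.fromstring(utility_key_xml)
    compact_key = ET.Element(utility_key.tag)
    for child in utility_key:
        compact_key.set(child.tag, (child.text or "").strip())
    return ET.tostring(compact_key, encoding="unicode")


def compact_to_utility(compact_key_xml: str) -> str:
    """Turn XML attributes (compact style) into child elements (utility style)."""
    compact_key = ET.fromstring(compact_key_xml)
    utility_key = ET.Element(compact_key.tag)
    for name, value in compact_key.attrib.items():
        ET.SubElement(utility_key, name).text = value
    return ET.tostring(utility_key, encoding="unicode")


utility = "<Key> <Frequency>Q</Frequency> <OtherDim1>x</OtherDim1> </Key>"
compact = utility_to_compact(utility)
print(compact)
print(compact_to_utility(compact))
```

The round trip loses nothing here because each dimension has exactly one value in a key, which is what makes the two representations equivalent.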
21
Other Info About SDMX-ML
  • Venetian Blind style generally used
  • Type-rich schemas
  • A pinch of Garden of Eden
  • OASIS UBL Naming and Design Rules
  • Not slavishly followed, but used in most cases
  • XML Namespaces used to package schema modules
  • We did use substitution groups, though
  • Emphasis on simplicity
  • As simple as possible, and no simpler, for each use case

22
Early Adopters - examples
  • Federal Reserve (many financial data sets)
  • UN/TRADECOM (commodity trade data)
  • NAWWE (national accounts data)
  • External Debt Joint Hub (external debt)

23
Web Services Guidelines
  • Suggested set of services for
      • Obtaining metadata
      • Obtaining data
  • Advocates use of WS-Interoperability profiles for
      • SOAP
      • WSDL
  • Will be expanded in version 2.0

24
Starter Toolkit (v 1.0)
  • Simple freeware tools
  • Key family creation and management
  • SDMX-ML ↔ SDMX-EDI transforms
  • Key Family → Standard schema transforms
  • Transforms between different types of XML for a
    single key family
  • Data publishing tools (to HTML, CSV)
  • Data validation tools
  • Data creation tools
  • Conformance testing tools

25
Version 2.0 and Beyond
  • SDMX Content Standards
      • SDMX Core Statistical Concepts: A set of
        universal concepts and rules for their use in key
        families (e.g., Frequency, Reference Country), plus
        a system for describing domain core concept
        sets
      • Metadata Common Vocabulary: Harmonized
        definitions of terms and concepts
      • SDMX Core Statistical Subject-Matter Domains: A
        harmonized categorization of all statistical
        domains
      • These will be published and maintained by SDMX,
        not put forward to ISO
26
Version 2.0 and Beyond (cont.)
  • SDMX Registry Services: Standard service
    interfaces for registration, navigation, and
    querying of SDMX registries
  • SDMX Web Services: Specifications for creating
    interoperable web services using SDMX standards
  • SDMX Pure Metadata Reporting: Formats for
    metadata reporting independent of data reporting
    flows
  • Enhancements to existing data and metadata
    formats
  • Will also have a starter toolkit, including a
    registry implementation based on freebXML
    Registry/Repository

27
SDMX Reference Implementation
(Diagram: several Creditor Data sources, a Debtor Database, the Joint External Debt Hub, and an SDMX Registry)

28
Target Timeline
  • SDMX version 1.0 standards: available now
  • Toolkit for v 1.0: over the next 6 months
  • Version 2.0 standards: Q2/Q3 of 2005
  • Version 2.0 toolkit: Q3/Q4 of 2005

29
Summary
  • Increased access to data, and more usable data
  • Increased efficiency in processing
  • Greater transparency through metadata
  • Fewer reporting errors, higher quality
  • Version 2.0
      • Process efficiency gain (pull, not push)
      • Greater visibility through registry

30
For More Information
  • SDMX website: http://www.sdmx.org
  • Sign up for e-alerts on the site
  • Join the contact group for public reviews
  • Questions: stuart.feder@bis.org,
    agregory@aeon-llc.com