Title: Release of Statistical Data and Metadata Exchange (SDMX) Standards, Version 1.0
1Release of Statistical Data and Metadata Exchange
(SDMX) Standards, Version 1.0
2Outline
- Background
- Business Scope and Requirements
- Technical Design Approach
- SDMX Version 2.0 and Beyond
3Background
- SDMX is a joint initiative of seven international and regional organizations:
- Bank for International Settlements (BIS)
- European Central Bank (ECB)
- European Statistical Commission (EUROSTAT)
- International Monetary Fund (IMF)
- Organization for Economic Cooperation and Development (OECD)
- United Nations (UN)
- The World Bank
- Initial meeting in June, 2002
- Version 1.0 Standards Released September 30, 2004
- Will be put forward to ISO for international
standardization
4Goal
- To explore common e-standards and ongoing
standardization activities that could allow us to
gain efficiency and avoid duplication of effort
in our own work and possibly for the work of
others in the field of statistical information.
5Initial Projects
- Case Study: Examination of XML, registry, and web services technologies
- Batch Data Exchange: EDIFACT and XML formats for exchange of large databases
- Metadata Common Vocabulary: Harmonized definitions of common terms and statistical concepts
- Metadata Repositories: Metadata reporting and resources on the web
6Clarification
- Statistical Information refers primarily to aggregated statistical data and metadata
- Not survey results
- Not quality assurance test results
- In future, this may be expanded to include microdata/raw data
- Not currently in scope
7Examples
- Financial and economic data
- Trade data
- Development data
- Health data
- Education
- Environmental
- Population
- And so on
8 Statistical Information Flows
[Diagram: statistical information flows among 180 countries and the sites www.x.org, www.y.org, www.z.org, and www.hub.org; labels: Internet, Search, Navigation]
9Challenges
- Duplication of data
- Multiplicity of formats
- Timeliness of data reporting
- Quality of reported data
- Pushing data is inefficient
- Inconsistent or missing metadata
- Lack of semantic agreement
10Solutions
- Step One: Standardize an information model and formats
- Step Two: Standardize architecture, services, and metadata
11SDMX v 1.0 Package
- Framework Document
- Information Model (UML Conceptual Design)
- SDMX-ML
- SDMX-EDI
- Implementor's Guide for Format Standards
- Web Services Guidelines
- 200 pages of comments from public review in
summer of 2004
12Overall Technical Approach
- Model-driven
- The SDMX information model is a meta-model
- All formats are derived from the information model, and are equivalent
- Requirements-driven
- Different formats created for different use cases
- Different but consistent formats for each domain
13Meta-Models
- Each domain uses one or more models for its data
- The data is just tables of numbers
- Common structure
- Each model describes how metadata is attached to that domain's structure
- A meta-model describes how the domain models can be described
14Key Families
- The information model describes how metadata is attached to multi-dimensional cubes of data
- The structural description is termed a key family
- Each axis has an associated concept and representation (dimensions)
- Additional metadata can be attached and represented at different levels
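As a rough illustration of the bullets above, a key family can be pictured as a list of dimensions (each pairing a concept with a representation, here a set of allowed codes) plus attributes attached at a chosen level. This is only a sketch: the class names, the `EXAMPLE_KF` key family, its codes, and the attachment-level names are hypothetical and are not taken from the SDMX information model itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dimension:
    concept: str        # the concept this axis stands for, e.g. "Frequency"
    codes: frozenset    # representation: the set of allowed codes (assumed)


@dataclass
class KeyFamily:
    name: str
    dimensions: list    # ordered axes of the data cube
    # attribute name -> attachment level (level names are illustrative)
    attributes: dict = field(default_factory=dict)

    def validate_key(self, key: dict) -> list:
        """Return a list of problems with a key (empty list means valid)."""
        problems = []
        for dim in self.dimensions:
            if dim.concept not in key:
                problems.append(f"missing dimension: {dim.concept}")
            elif key[dim.concept] not in dim.codes:
                problems.append(
                    f"invalid code for {dim.concept}: {key[dim.concept]}"
                )
        known = {dim.concept for dim in self.dimensions}
        for extra in sorted(set(key) - known):
            problems.append(f"unknown dimension: {extra}")
        return problems


# Hypothetical key family, for illustration only.
EXAMPLE = KeyFamily(
    name="EXAMPLE_KF",
    dimensions=[
        Dimension("Frequency", frozenset({"A", "Q", "M"})),
        Dimension("ReferenceCountry", frozenset({"US", "FR"})),
    ],
    attributes={"UnitMultiplier": "series", "ObservationStatus": "observation"},
)
```

Calling `EXAMPLE.validate_key({"Frequency": "Q", "ReferenceCountry": "US"})` returns an empty list, while a key with an unknown code or a missing dimension returns one message per problem.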
15Key Family Example
16SDMX Formats
- SDMX-EDI
- EDIFACT format for describing key families, codelists, and concepts
- EDIFACT format for generically describing data
- SDMX-ML
- XML schema for key families, concepts, codelists
- XML schema for generically describing data
- XML schema for shared constructs
- XML schema for common administrative data
- XML schema for data and metadata queries
- PLUS
17Key-Family Specific SDMX-ML Formats
- Utility Schemas (typical XML schemas for validation and guided tools)
- Compact Data Schemas (large databases, full and partial datasets, incremental updates)
- Cross-Sectional Schemas (non-time-series data)
- Each meets a different use case
18Key-Family-Specific Schemas
- Each domain model (key family) is mapped to a namespace which is owned by the creator of the key family
- Mappings are made from a standard XML expression of the model in a standard fashion
- If you can process the key family XML, you can predict exactly what each derived schema will look like
- If you can predict what each schema looks like, you can generate a lot of the code needed to process it
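The predictability argument above can be sketched in a few lines of Python. The mapping used here (every dimension becomes a required string attribute of a `KeyType` in the key family owner's namespace) is an assumption chosen for illustration; it is not the actual SDMX-ML derivation rule, and the function name and `urn:example:kf-x` namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def derive_compact_key_type(dimensions, target_namespace):
    """Build a toy XSD in which each dimension becomes a required attribute.

    The same input always yields the same schema, which is what makes
    downstream code generation possible.
    """
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema", {"targetNamespace": target_namespace})
    key_type = ET.SubElement(schema, f"{{{XS}}}complexType", {"name": "KeyType"})
    for name in dimensions:
        ET.SubElement(
            key_type,
            f"{{{XS}}}attribute",
            {"name": name, "type": "xs:string", "use": "required"},
        )
    return schema


# Hypothetical key family with two dimensions and an illustrative namespace.
schema = derive_compact_key_type(["Frequency", "OtherDim1"], "urn:example:kf-x")
xsd_text = ET.tostring(schema, encoding="unicode")
```

Because the derivation is a pure function of the key family, a tool that reads the key family can know the shape of the derived schema without ever loading it.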
19Key Family X
[Diagram: Key Family X in Generic Structure XML and data in Generic Data XML, shown with the Compact, Utility, and Cross-Sectional data XML schemas for X and their XML instances of X; arrows are labeled "derived", "structures", and "equivalent"]
20An Example
KEY FAMILY
<Dimension name="Frequency"/>
UTILITY INSTANCE
<Key> <Frequency>Q</Frequency> </Key>
COMPACT INSTANCE
<Key Frequency="Q" OtherDim1="x" />
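Since the utility and compact forms of a key in this example carry the same information, one can be rewritten mechanically into the other. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name is hypothetical, and the sketch ignores XML namespaces and everything outside the `Key` element, which a real transform would need to handle.

```python
import xml.etree.ElementTree as ET


def utility_key_to_compact(utility_xml: str) -> str:
    """Rewrite a utility-style key (child elements) as a compact-style key
    (one attribute per dimension)."""
    utility_key = ET.fromstring(utility_xml)
    compact_key = ET.Element(utility_key.tag)
    for child in utility_key:
        # Element name becomes the attribute name, text becomes its value.
        compact_key.set(child.tag, (child.text or "").strip())
    return ET.tostring(compact_key, encoding="unicode")


compact = utility_key_to_compact(
    "<Key> <Frequency>Q</Frequency> <OtherDim1>x</OtherDim1> </Key>"
)
```

The reverse direction works the same way, turning each attribute back into a child element, which is one reason the formats can be treated as equivalent.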
21Other Info About SDMX-ML
- Venetian Blind style generally used
- Type-rich schemas
- A pinch of Garden of Eden
- OASIS UBL Naming and Design Rules
- Not slavishly followed, but used in most cases
- XML Namespaces used to package schema modules
- We did use substitution groups, though
- Emphasis on simplicity
- As simple as possible, and no simpler, by use case
22Early Adopters - examples
- Federal Reserve (many financial data sets)
- UN/TRADECOM (commodity trade data)
- NAWWE (national accounts data)
- External Debt Joint Hub (external debt)
23Web Services Guidelines
- Suggested set of services for
- Obtaining metadata
- Obtaining data
- Advocates use of WS-Interoperability profiles for
- SOAP
- WSDL
- Will be expanded in version 2.0
24Starter Toolkit (v 1.0)
- Simple freeware tools
- Key family creation and management
- SDMX-ML ↔ SDMX-EDI transforms
- Key Family → Standard schema transforms
- Transforms between different types of XML for a single key family
- Data publishing tools (to HTML, CSV)
- Data validation tools
- Data creation tools
- Conformance testing tools
25Version 2.0 and Beyond
- SDMX Content Standards
- SDMX Core Statistical Concepts: A set of universal concepts and rules for their use in key families (e.g., Frequency, Reference Country) plus a system for describing domain core concept sets
- Metadata Common Vocabulary: Harmonized definitions of terms and concepts
- SDMX Core Statistical Subject-Matter Domains: A harmonized categorization of all statistical domains
- These will be published and maintained by SDMX, not put forward to ISO
26Version 2.0 and Beyond (cont.)
- SDMX Registry Services: Standard services interfaces for registration, navigation, and querying of SDMX registries
- SDMX Web Services: Specifications for creating interoperable web services using SDMX standards
- SDMX Pure Metadata Reporting: Formats for metadata reporting independent of data reporting flows
- Enhanced formats for existing data and metadata formats
- Will also have a starter toolkit, including a registry implementation based on FreebXML Registry/Repository
27SDMX Reference Implementation
[Diagram labels: Creditor Data (four sources), Debtor Database, Joint External Debt Hub, SDMX Registry]
28Target Timeline
- SDMX version 1.0 standards available now
- Toolkit for v 1.0 over next 6 months
- Version 2.0 standards Q2/3 of 2005
- Version 2.0 toolkit Q3/4 of 2005
29Summary
- Increased access to data; more usable
- Increased efficiency in processing
- Greater transparency through metadata
- Reduced reporting errors, higher quality
- Version 2.0
- Process efficiency gain (pull, not push)
- Greater visibility through registry
30For More Information
- SDMX website: http://www.sdmx.org
- Sign-up for e-alerts on site
- Join contact group for public reviews
- Questions: stuart.feder_at_bis.org, agregory_at_aeon-llc.com