The Data Documentation Initiative (DDI)

1
The Data Documentation Initiative (DDI)
A Metadata Specification for Social Science Data
  • Ron Nakao
  • Social Science Data and Software (SSDS)
  • Stanford University Libraries
  • With input from Gretchen Gano, Sanda Ionescu, Jim
    Jacobs, Nancy McGovern, Wendy Thomas, Mary
    Vardigan
  • Presented to the DLF Fall Forum 2007 -
    Philadelphia, PA

2
Presentation Overview
  • What is the DDI?
  • What is the DDI Alliance?
  • A taste of the DDI specification
  • Futures

3
What is the DDI?
The Data Documentation Initiative (DDI) is an
effort to establish an international XML-based
standard for the content, presentation,
transport, and preservation of documentation for
datasets in the social and behavioral sciences.
4
What is the DDI Alliance?
  • Host Institutions
  • Member Institutions
  • Organization Structure
  • Director
  • Steering Committee
  • Expert Committee
  • Working Groups

5
DDI Alliance Host Institutions and Associations
  • Inter-university Consortium for Political and
    Social Research (ICPSR) - b.1962
  • Roper Center for Public Opinion Research -
    b.1946
  • Council of European Social Science Data Archives
    (CESSDA) - b.1976
  • International Federation of Data Organizations
    (IFDO) - b.1977
  • International Association of Social Science
    Information, Service, and Technology (IASSIST) -
    b.1974

6
DDI Alliance Member Institutions (30)
  • University of Alberta, Canada
  • University of California, Berkeley --
    Computer-Assisted Survey Methods Program and
    UCDATA
  • University of California, California Digital
    Library
  • Centre for Survey Research and Methodology (ZUMA)
  • Centro De Investigaciones Sociologicas (CIS),
    Spain

7
DDI Alliance Member Institutions (30)
  • CEPS/INSTEAD -- Luxembourg
  • Danish Data Archive
  • Data Archiving and Networked Services (DANS), The
    Netherlands
  • Emory University
  • Finnish Social Science Data Archive
  • German Socio-Economic Panel Study (SOEP)
  • University of Guelph, Canada
  • Harvard-MIT Data Center

8
DDI Alliance Member Institutions (30)
  • Inter-university Consortium for Political and
    Social Research (ICPSR)
  • Massachusetts Institute of Technology (MIT)
  • University of Minnesota
  • National Opinion Research Center (NORC)
  • Norwegian Social Science Data Service (NSD)
  • Open Data Foundation, Tucson, Arizona
  • Princeton University
  • Roper Center
  • Stanford University

9
DDI Alliance Member Institutions (30)
  • University of Surrey, United Kingdom
  • Swedish Social Science Data Service (SSD)
  • Swiss Data Archive for the Social Sciences
    (SIDOS)
  • United Kingdom Data Archive (UKDA)
  • University of Wisconsin
  • World Bank, Development Data Group (DECDG)
  • Yale University
  • Zentralarchiv fuer Empirische Sozialforschung,
    University of Koeln

10
DDI Alliance Structure
Technical Implementers Committee
Director
Controlled Vocabularies
Steering Committee
Expert Committee (Voting members & observers)
Qualitative Data
Outreach and Usability
Aggregate, Geography & Time
Comparative Data/Families of Datasets
Instrument Documentation
11
What is the DDI specification?
  • A word about social science data codebooks
  • DDI 1 & 2
  • Data Life Cycle: DDI 3

12
Here's Some Data
00100 1 D10 99990049241004701500497830050237005
10840052982005469900556410057759005778500587170059
27900623000064131006485900687180071076007231300750
33007655300797310080552009763501397030165893019077
50227186023402202644340325362000000076292 00100
1A D10 99990033952003440300347790035055003531300
35740003639400370300037711003842100390550039919004
02370040842004116900420870042662004290100430550041
49200406990040805004109200445230044754004967200503
59004810500473330049092000000076292 00100 1B D10
9999000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000034140006909001011300112370011405001090
400108220011315001213576292 00100 1C D10
99990001560000165800016720001713000177700018910001
84500018800002278000231300025570003250000357000041
59000379500039400004197000541900063730005748000648
80006726000769700068960006867000744100108290014778
00206060021973001933376292 00100 1D D10
99990013729001095400133320013469001399400153510016
46000167310017770001705100171050016110001849300191
30001989500226910024217002399300256050029313003254
40033021004543200813750104159012242501545930160235
01856730242982000000076292 00100 1 SS0
99990049241004701500497830050237005108400529820054
69900556410057759005778500587170059279006230000641
31006485900687180071076007231300750330076553007973
10080552009763501286790152799015814201855550199905
02276000267854000000076292 00100 1A SS0
99990033952003440300347790035055003531300357400036
39400370300037711003842100390550039919004023700408
42004116900420870042662004290100430550041492004069
90040805004109200410120041220004117500411330041092
00407390040416000000076292 00100 1B SS1
99990000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000003414200636340093148009314900931490093148
00931500093147009315076292 00100 1C SS1
99990015603001657900167150017129001777400189140018
44800187960022777002312800255730032498003570200415
87003795000394040041972005418500637250057480006488
30067257007696800635140063247006168100884440126240
01773610180887014839476292 00100 1D SS0
99990013729001095400133320013469001399400153510016
46000167310017770001705100171050016110001849300191
30001989500226910024217002399300256050029313003254
40033021004543200749520095939010148401262630136874
01598100200035000000076292 0010334 I62
99990000000000000000000000001672000176100018370001
95300020640002163000213300022350002390000251000026
81000287000031310003447000373900040220004302000469
20005100000548900061410006972000794100088590010000
00115010012998000000076292 0010334 X51
99990000000000000000000000000000000005300000430000
06300000570000048-00001400000480000069000005000000
68000007000000910000101000008500000760000070000009
10000087000007600001190000135000013900001160000129
00001500000130000000076292 0010664 62
99990000000000000000000000000000000000000033120003
35000033950003497000363400037880003913000402700041
32000428100044550004659000488800051370005354000559
10005878000623000065980006982000765100088210010000
00111020012344000000076292 0010664 X51
99990000000000000000000000000000000000000000000000
01100000130000030000003900000420000033000002900000
26000003600000410000046000004900000510000042000004
40000051000006000000590000058000009600001530000134
00001100000112000000076292 0010770 D32
99990005345000546300056610007738000744500075540007
77700084850009454001015000096630010255001145900119
76001251800137430015396001671200183600019273002152
40024664002837200317350037681005242500773110079612
00906610102778000000076292 0010771 D32
99990006061000595700059030008174000805100076720007
96600089170009847001082200101320010723001201600125
75001327600144600016219001761600193890020342002265
30025817002971100332080038920005361100785560081405
00923040105775000000076292 0010774 D12
99990000000000000000000000000000000000000040390003
97900039840004087000418100040620004023000407300040
90000407000041180004202000426600043580004367000433
50004483000473000049890005426000667100091760010000
00101620011100000000076292 0010775 D12
99990000000000000000038320004818000467700043400004
28800043290004444000457500042740004156000419700041
73000413000041810004270000432300043880004385000434
20004435000477100050060005398000657900092430010000
00101060011001000000076292 0010776AXD61
19750000000000000000000000000000000000000000000000
00000000000000000000057000005190000512000051200004
92000048200005120000547000053600005530000518000051
40000555000057500005420000618000095200012160001000
00011230001353000128976292 0010776IAD61
19750000206000020300002430000277000020300001660000
15900001580000170000025100001710000145000015300001
42000014500004150000286000010300000910000099000009
70000164000018300002210000362000047000014610001000
00005640000395000038176292 0010776IAZ 2
99990000423000041600004980000567000041700003410000
32600003240000348000051500003500000297000031400002
91000029800008500000587000021200001860000203000019
80000337000037500004530000743000096300029960002050
00011570000810000078176292
13
And Here's the Codebook
14
A digital Codebook (pdf)
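
To illustrate why the codebook matters, here is a minimal Python sketch of how a codebook's column layout turns a fixed-width record into named variables. The layout, the variable names, and the sample record are all hypothetical; none of them are taken from the codebook or the data shown on these slides.

```python
# Hypothetical codebook layout: variable name -> (start column, width), 0-based.
# These names and positions are illustrative assumptions only.
LAYOUT = {
    "area_code": (0, 5),
    "series": (5, 3),
    "year": (8, 4),
    "value": (12, 7),
}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of named string fields."""
    return {
        name: line[start:start + width].strip()
        for name, (start, width) in layout.items()
    }


# A made-up 19-character record, not copied from the data above.
record = parse_record("00100D1019990049241")
print(record)
```

Without the layout, the record is just a run of digits; with it, each slice has a name that a codebook (or a DDI file) can describe further.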
15
Evolution of the DDI
  • Concept of DDI and definition of needs grew out
    of the data archival community
  • 1995 - DDI efforts initiated by ICPSR
  • 1997 - XML DTD released
  • 2000 - DDI 1.0 released
  • 2003 - DDI 2.0 released
  • 2003 - DDI Alliance formed
  • 2007 - DDI 3.0 Candidate Draft Release
  • 2008 - DDI 3.0 Final Release

16
DDI Early Development
  • 2000 DDI 1.0
  • Simple survey
  • Archival data formats
  • Microdata only
  • 2003 DDI 2.0
  • Aggregate data (based on matrix structure)
  • Added geographic material to aid geographic
    search systems and GIS users

17
DDI versions 1 & 2
  • Document Description
  • Study Description
  • Data Files Description
  • Variable Description
  • Other Study-Related Materials
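
As a rough illustration of how these five sections map onto an XML document, the Python sketch below builds an empty skeleton. The element names (codeBook, docDscr, stdyDscr, fileDscr, dataDscr, otherMat) are the ones commonly documented for DDI 2; they do not appear on the slide, so treat them as an assumption and check them against the published DTD before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed DDI 2 element names paired with the section titles from the slide.
SECTIONS = [
    ("docDscr", "Document Description"),
    ("stdyDscr", "Study Description"),
    ("fileDscr", "Data Files Description"),
    ("dataDscr", "Variable Description"),
    ("otherMat", "Other Study-Related Materials"),
]


def build_skeleton():
    """Return a codeBook element with one empty child per section."""
    root = ET.Element("codeBook")
    for tag, label in SECTIONS:
        section = ET.SubElement(root, tag)
        # Record the human-readable section title as an XML comment.
        section.append(ET.Comment(label))
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

The skeleton is not a valid DDI instance on its own; a real codebook needs the required child elements inside each section.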

18
DDI 3: The Data Life Cycle
19
Capturing the Data Life Cycle
  • Study Unit

- Research question
- Funding
- Concepts
- Background research
20
Capturing the Data Life Cycle
  • Study Unit
  • Data Collection

- Instrument
- Data collection process
- Questionnaire
21
Capturing the Data Life Cycle
  • Study Unit
  • Data Collection
  • Logical Product

- Intellectual content of data
- Relationship to questions and concepts
- Relationship to processing (recodes, weighting,
  derivations, imputations)
22
Capturing the Data Life Cycle
  • Study Unit
  • Data Collection
  • Logical Product
  • Physical Data Product

- Describes the structure (microdata, tabular,
  aggregate, Ncube)
23
Capturing the Data Life Cycle
  • Study Unit
  • Data Collection
  • Logical Product
  • Physical Data Product
  • Physical instance

- Each describes a single data file (e.g., Census
data by state...each state is an instance)
24
Capturing the Data Life Cycle
  • Study Unit
  • Data Collection
  • Logical Product
  • Physical Data Product
  • Physical instance
  • Instance (METS-inspired)
  • An instance module wraps the other modules.
    Like a table of contents for a group of studies,
    files, and modules, it brings everything
    together.

25
Capturing the Data Life Cycle
  • Study Unit
  • Data Collection
  • Logical Product
  • Physical Data Product
  • Physical instance
  • Instance
  • Archive

- Each archive can add its own local information
with an archive module.
26
Capturing the Data Life Cycle
  • Group module
  • Describe concepts, questions, and variables that
    occur in several studies.
  • Describe a series (e.g., CPS, Eurobarometer)
  • Describe a collection of studies (not a series)
    and identify the common comparable concepts,
    questions and variables.

27
Capturing the Data Life Cycle
  • Group module
  • Comparative module
  • The Comparative module contains information for
    comparing concepts, questions, and variables
    between or among Study Units that have been
    housed in a Group.

28
DDI 3.0 Geography Example
<?xml version="1.0" encoding="UTF-8"?>
<r:Coverage xmlns:r="ddi:reusable:0_1"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns:xhtml="http://www.w3.org/1999/xhtml"
            xsi:schemaLocation="ddi:reusable:0_1 Schemas/reusable.xsd">
  <r:SpatialCoverage>
    <r:Identification>
      <r:ID>GEOCOV</r:ID>
      <r:IdentifyingAgency>MPC</r:IdentifyingAgency>
      <r:Version>1.0</r:Version>
    </r:Identification>
    <r:BoundingBox>
      <r:WestLongitude>-177.1</r:WestLongitude>
      <r:EastLongitude>-61.48</r:EastLongitude>
      <r:SouthLatitude>13.71</r:SouthLatitude>
      <r:NorthLatitude>76.63</r:NorthLatitude>
    </r:BoundingBox>
    <r:Description translated="false" translatable="true">
      <xhtml:p>United States, Region, Division, State, County, County
      Subdivision, Place, Tract/Block Numbering Area
      within Place/Remainder within County
      Subdivision.</xhtml:p>
    </r:Description>
    <r:SpatialObject>Polygon</r:SpatialObject>
    <r:GeographicStructure>
      <r:Geography>
        <r:Identification>
          <r:ID>G001</r:ID>
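
The geography example on the slide is cut off, so it cannot be parsed as it stands. As a hedged sketch of how a consumer might read the bounding box, the Python below parses a small, self-contained fragment that repeats only the four coordinates from the slide. It assumes the namespace URI reads ddi:reusable:0_1 once the stripped colons are restored; the function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI, reconstructed from the slide's garbled text.
R_NS = "ddi:reusable:0_1"

# Self-contained fragment holding only the bounding box values from the slide.
FRAGMENT = """<r:SpatialCoverage xmlns:r="ddi:reusable:0_1">
  <r:BoundingBox>
    <r:WestLongitude>-177.1</r:WestLongitude>
    <r:EastLongitude>-61.48</r:EastLongitude>
    <r:SouthLatitude>13.71</r:SouthLatitude>
    <r:NorthLatitude>76.63</r:NorthLatitude>
  </r:BoundingBox>
</r:SpatialCoverage>"""

EDGES = ("WestLongitude", "EastLongitude", "SouthLatitude", "NorthLatitude")


def read_bounding_box(xml_text):
    """Return the first bounding box in the document as a dict of floats."""
    root = ET.fromstring(xml_text)
    # iter() also checks the root element itself, so nesting depth is irrelevant.
    box = next(root.iter(f"{{{R_NS}}}BoundingBox"))
    return {edge: float(box.find(f"{{{R_NS}}}{edge}").text) for edge in EDGES}


bbox = read_bounding_box(FRAGMENT)
print(bbox)
```

A geographic search system could index these four numbers directly, which is the kind of use the geographic additions to DDI were meant to support.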
29
DDI - User Community
  • Data archives and libraries world-wide (e.g.,
    ICPSR, CESSDA)
  • Health Canada
  • Statistics Canada
  • World Bank
  • WHO (World Health Surveys)
  • Gallup-Europe
  • Metadata Management Toolkit (IHSN)

30
International Household Survey Network (IHSN)
  • To coordinate and improve survey collecting
    operations in developing countries
  • Developed to support the survey collection
    activities of the International Household Survey
    Network (IHSN)
  • Sponsors: 18 organizations, such as ILO, UNESCO,
    World Bank, UNICEF, WHO, UNDP, Eurostat
  • Goal: improve the quality of collected data and
    encourage more dissemination and long-term
    preservation
  • 100% DDI compliant

31
Futures
  • Continued development of DDI
  • Outreach, train, promote
  • Expand Alliance membership
  • Foster tools development
  • Build ties & interoperability with other metadata
    specifications
  • Funding
  • ISO Standard status

32
That's all folks! Thanks!
  • http://www.ddialliance.org/