1
Introduction to DDI 3.0
  • Sanda Ionescu
  • ICPSR
  • CESSDA Expert Seminar, September 2007

2
DDI Version 3.0
  • Radically different.
  • More complex
  • (but certainly doable!)
  • Brings important benefits.

3
Workshop Schedule
  • 14:30-15:10 Overview (40 min)
  • 15:10-15:35 Structure and Technical Mechanisms (25 min)
  • 15:35-15:45 Break (10 min)
  • 15:45-16:10 Study Unit Modules Content (25 min)
  • 16:10-16:30 Variable Markup Example (20 min)
  • 16:30-16:40 Break (10 min)
  • 16:40-17:10 Grouping Modules Content and Examples (30 min)
  • 17:10-17:30 Getting Started (20 min)

4
DDI 3.0
  • Overview

5
DDI Background: Development History
  • 1995: A grant-funded project initiated and
    organized by ICPSR proposes to create a new
    standard for documenting social science data, to
    replace OSIRIS tagged codebooks.
  • First drafts used SGML and were then converted to
    Web-friendly XML.
  • 2000: DDI Version 1.0 published as a mainly
    document- and codebook-centric standard.

6
DDI Background: Development History
  • 2003: DDI Version 2.0 published with extended
    scope:
  • Aggregate data coverage (based on matrix
    structure)
  • Additional geographic representation to assist
    geographic search systems and GIS users
  • Versions 1.0 through 2.1 (latest published) are
    backwards compatible, and based on the same
    structure.

7
DDI Background: Development History
  • February 2003: Formation of the DDI Alliance, a
    self-sustaining membership organization whose
    members have a voice in the development of the
    DDI specification.
  • http://www.ddialliance.org/

8
DDI Background: Development History
  • Version 3.0
  • 2004-2006: Planning and Development
  • November 2006: Internal Review
  • February 2007: Public Review
  • July 2007: Candidate Draft Release
  • http://www.ddialliance.org/ddi3/index.html

9
Benefits of using DDI as an XML-based standard
  • Interoperability
  • Enables seamless exchange and reuse by other
    systems.
  • Repurposing
  • Provides a core document from which different
    types of outputs can be generated.
  • Value-added documentation
  • Tagging carries intelligence in the document by
    describing content.
  • Enhanced Data Discovery
  • Increases precision and granularity of searches.
  • Support for Data Analysis
  • Variables description is accepted as input by
    online analysis systems.
  • Multiple presentation formats
  • ASCII text, PDF, HTML, RTF.
  • Preservation-friendly
  • Non-proprietary format.

10
Why DDI 3.0?
  • DDI 3.0 presents new features in response to:
  • Perceived needs of:
  • -Data users
  • -Data producers
  • -Data archivists/librarians
  • Developments in documenting and archiving data
  • Advances in XML technology

11
DDI 3.0 and the Data Life Cycle Model
  • DDI Versions 1/2 were codebook-centric
  • Closely followed the structure of traditional
    print codebooks.
  • Captured data documentation at a single, frozen
    point in time: archiving.

12
DDI 3.0 and the Data Life Cycle Model
  • Version 3.0 is Life Cycle oriented
  • -Designed to cover all stages in the life cycle
    of a data collection:
  • pre-production, production,
    post-production, secondary use

13
Life Cycle Coverage in DDI 3.0
  • Planning for the Study
  • Proposal / Design: Study Purpose / Outline, Concepts, Study
    Population, Author(s), Funding Sources
  • Version 3.1: Survey / Sample Design, Pre-testing
14
Life Cycle Coverage in DDI 3.0
  • Proposal becomes reality

  • Data Collection methodology: sampling, time,
    etc.
  • Instrument characteristics: Questionnaire
  • Data cleaning, weighting, coding, etc.
15
Life Cycle Coverage in DDI 3.0
  • Publishing the data

  • Physical representation: Data format, Record
    structure, Statistics.
  • Intellectual content: Variables, Categories,
    Codes.
16
Life Cycle Coverage in DDI 3.0
  • Archiving / (Re)Distributing the data collection

Processing checks
Holdings, availability and access conditions
17
Life Cycle Coverage in DDI 3.0
  • DDI becomes visible to the outside world

  • DDI Instance:
  • Pulls together all life cycle stages
  • Acquires its own identity as an object
  • Becomes a tool for data discovery and analysis
18
Life Cycle Coverage in DDI 3.0
  • Secondary use of data: new conceptual framework

  • New DDI Instance: New Purpose, New Logical
    Product, New Physical Description of Data
19
DDI 3.0 and the Data Life Cycle Model
  • Advantages of Life Cycle orientation
  • Allows capture and preservation of metadata
    generated by different agents at different points
    in time.
  • Facilitates tracking changes and updates in both
    data and documentation.

20
DDI 3.0 and the Data Life Cycle Model
  • Advantages of Life Cycle orientation
  • Enables investigators, data collectors and
    producers to document their work directly in DDI,
    thus increasing the metadata's visibility and
    usability.
  • Benefits data users, who need information from
    the full data life cycle for optimal discovery,
    evaluation, interpretation, and re-use of data
    resources.

21
New / Extended Functionalities in DDI 3.0: Questionnaire
  • Versions 1/2
  • No instrument coverage.
  • Question text only as part of variable
    description.
  • No documentation for question flow / conditions.
  • Version 3.0
  • Full description of instrument as a separate
    entity.
  • Documents specific use of questions: flow,
    conditions, loops.
  • Compatible with Computer Assisted Interviewing
    software.

22
New / Extended Functionalities in DDI 3.0: Complex Data
  • Versions 1/2
  • Inadequate representation of complex /
    hierarchical data


  • Version 3.0
  • Detailed documentation for complex / hierarchical
    data
  • Logical structure of records
  • Record Types and Relationships
  • Relevant variables: key-link, case
    identification, record type locator
  • Physical layout of records
  • Single hierarchical file for all records,
    multiple rectangular files, relational database,
    etc.

23
New / Extended Functionalities in DDI 3.0: Aggregate Data
  • Versions 1/2
  • Initially designed for microdata only
  • Aggregate data section added in V 2.1 to support
    limited representation (Census-type data,
    delimited files)
  • Version 3.0
  • Adds support for tabular, spreadsheet-type,
    representation of aggregate data
  • Aggregate data transport option: cell content may
    be included inline with the data item description

24
New / Extended Functionalities in DDI 3.0: Data Transport
  • Versions 1/2
  • -None
  • Version 3.0
  • -In-line inclusion enabled for both aggregate
    data and microdata

25
New / Extended Functionalities in DDI 3.0:
Longitudinal / Time Series / Cross-national
Data, Comparability
  • Versions 1/2
  • -None
  • Version 3.0
  • -Grouping structure documents studies related on
    one or several dimensions (time, geography,
    language, etc.) as well as their comparability

26
New / Extended Functionalities in DDI 3.0: Increased Multilingual Support
  • Versions 1/2
  • Limited
  • <anytag xml:lang="">
  • Version 3.0
  • Support for multiple language use and
    translations
  • <InternationalStringType xml:lang=""
    translated="" translatable="">
  • <Variable>
  • <Label xml:lang="ger" translated="false" translatable="true">Geburtsjahr</Label>
  • <Label xml:lang="eng" translated="true">Year of Birth</Label>
  • </Variable>
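
As a rough illustration of how a consumer might use the language attributes on this slide, the sketch below picks a variable label by language. The element and attribute names follow the slide; the namespace-free sample XML and the helper `label_for` are assumptions made for the example, not part of the DDI schema.

```python
import xml.etree.ElementTree as ET

# xml:lang belongs to the reserved XML namespace; ElementTree exposes it
# under this expanded attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified sample modelled on the slide (no DDI namespaces).
SAMPLE = """
<Variable>
  <Label xml:lang="ger" translated="false" translatable="true">Geburtsjahr</Label>
  <Label xml:lang="eng" translated="true">Year of Birth</Label>
</Variable>
"""

def label_for(variable, lang):
    """Return the text of the first Label in the requested language, or None."""
    for label in variable.findall("Label"):
        if label.get(XML_LANG) == lang:
            return label.text
    return None

variable = ET.fromstring(SAMPLE)
print(label_for(variable, "eng"))  # Year of Birth
print(label_for(variable, "ger"))  # Geburtsjahr
```

A language with no matching label returns `None`, which leaves any fallback policy to the caller.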

27
DDI 3.0 Specification: Schema-based
  • Versions 1/2
  • DTD-based
  • Version 3.0
  • Schema-based
  • Data typing supports machine actionability
  • Use of namespaces supports
  • Modularity
  • Extensibility and reuse
  • Alignment with / use of other standards

28
DDI 3.0 Specification: Machine-actionable
  • Versions 1/2
  • Machine-readable
  • Version 3.0
  • Machine-actionable
  • 1. Data typing: increased use of controlled
    vocabularies and standard codes
  • 2. Larger set of required elements
  • Predictable content: a more consistent
    base for programming

29
DDI 3.0 Modular Structure
  • Versions 1/2
  • Single file, hierarchical design
  • Version 3.0
  • Modular design
  • Facilitates reuse
  • Facilitates versioning and maintenance
  • Supports life cycle model
  • Allows flexibility in organizing the DDI
    Instance
  • Supports grouping and comparing studies
  • Supports creation of metadata registries

30
DDI 3.0 Alignment with other metadata standards
  • Versions 1/2
  • MARC, Dublin Core (bibliographic standards)
  • Version 3.0
  • MARC, DC, but also
  • SDMX (Statistical Data and Metadata Exchange)
  • ISO 11179 (Metadata Registries)
  • FGDC (Digital Geospatial Metadata)
  • ISO 19115 (Geographic Information Metadata)

31
DDI 1/2 or DDI 3.0?
  • DDI 3.0 will not supersede DDI 2.1.
  • Both versions will
  • coexist
  • continue to be maintained
  • be used according to specific needs.
  • Not all DDI 1/2 markup will have to be migrated
    to Version 3.0.

32
DDI 3.0
  • Structure and Mechanisms

33
DDI 3.0 Modular Structure
  • Building blocks of DDI 3.0
  • Modules
  • Schemes

34
DDI 3.0 Modular Structure
  • Modules
  • Document different aspects of a study, or group
    of studies, following the data through their life
    cycle (Conceptual Components, Data Collection,
    Logical Product, Physical Instance, etc.)
  • Schemes
  • Include collections of sibling objects that are
    traditionally components of a variable
    description: Concepts, Universes, Questions,
    Variable Labels and Names, Categories, Codes.

35
DDI 3.0 Modular Structure
  • Modules
  • Can live independently (have their own schemas)
    or connected to one another within a hierarchical
    structure.
  • Schemes
  • Can live semi-independently (need a higher-level
    wrapper as they do not have their own schemas) or
    in-line within a Study Unit or Group module.

36
DDI 3.0 Modular Structure
  • DDI 3.0 model: a multi-branched hierarchy
  • Module level

[Diagram labels: DDI Instance; Resource Package; Group; Subgroup; Study Unit; Conceptual Components; Data Collection; Archive; Organizations]
37
DDI 3.0 Modular Structure
  • DDI 3.0 model: a multi-branched hierarchy
  • Within modules

[Diagram labels: Data Collection; Question Scheme (Question Item, Question Item); Methodology (Sampling, Time Method); Processing (Weighting, Coding)]
38
DDI 3.0 Modular Structure
  • Relationships are established through:
  • In-line inclusion
  • (Relational order is explicit)
  • Referencing: Internal or External
  • (Relational order is implicit)

39
DDI 3.0 Structural mechanisms
  • Enable modular design and help actualize its
    benefits.
  • Inheritance
  • Referencing
  • Identification

40
DDI 3.0 Inheritance
  • Inheritance is based on the hierarchical
    structure of the model.
  • In DDI 3.0 a number of elements are reused at
    different levels of the hierarchy.
  • When the same element is present at multiple
    levels, lower levels inherit content from the
    upper levels, and only need to specify
    differences (local overrides).

41
DDI 3.0 Inheritance: Example
  • Instance: Coverage / Spatial: 50 US states
  • -Study Unit A: no Spatial Coverage defined;
    will be inherited from Instance
  • -Study Unit B: Coverage / Spatial: 48
    coterminous states; supersedes
    definition in Instance
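
The lookup rule in this example can be sketched in a few lines of Python. This is only an illustration of "inherit unless locally overridden"; the `Node` class and `effective_spatial_coverage` function are hypothetical names, not DDI constructs.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """One level of the hierarchy (Instance, Study Unit, ...)."""
    name: str
    spatial_coverage: Optional[str] = None  # None means "not defined locally"
    parent: Optional["Node"] = None

def effective_spatial_coverage(node):
    """Walk up the hierarchy until some level defines the value."""
    current = node
    while current is not None:
        if current.spatial_coverage is not None:
            return current.spatial_coverage
        current = current.parent
    return None

instance = Node("Instance", "50 US states")
study_a = Node("Study Unit A", parent=instance)
study_b = Node("Study Unit B", "48 coterminous states", parent=instance)

print(effective_spatial_coverage(study_a))  # 50 US states
print(effective_spatial_coverage(study_b))  # 48 coterminous states
```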

42
DDI 3.0 Referencing
  • DDI 3.0 modular structure is dependent upon
    creating relationships by reference.
  • Referencing implies bringing up the content of a
    DDI object within, or in association with,
    another object, by specifying its Unique
    Identifier.
  • Identifiers are the key links between DDI objects.

43
DDI 3.0 Referencing: Example
  • Data Collection Module
  • Question Scheme: Question ID: Q1
  • Text: How many days in the past week did you
    watch the national network news on TV?
  • Conceptual Components Module
  • Concept Scheme: Concept ID: C1
  • Description: Exposure to national TV news
  • Logical Product Module
  • Variable Scheme: Variable ID: V1, Name: V043014,
    Label: Days past week watch natl news on TV
  • Question Reference ID: Q1
  • Concept Reference ID: C1
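
A minimal sketch of how the references in this example could be followed to assemble one variable description. Modelling each scheme as a dictionary keyed by ID is an assumption for illustration; real DDI references also carry agency and version information.

```python
# Each scheme is modelled as a dict keyed by object ID (illustrative only).
questions = {
    "Q1": "How many days in the past week did you watch "
          "the national network news on TV?",
}
concepts = {"C1": "Exposure to national TV news"}
variables = {
    "V1": {
        "name": "V043014",
        "label": "Days past week watch natl news on TV",
        "question_ref": "Q1",
        "concept_ref": "C1",
    }
}

def describe(variable_id):
    """Assemble a full variable description by following its references."""
    var = variables[variable_id]
    return {
        "name": var["name"],
        "label": var["label"],
        "question": questions[var["question_ref"]],
        "concept": concepts[var["concept_ref"]],
    }

print(describe("V1")["concept"])  # Exposure to national TV news
```

The question and concept texts are stored once and only pointed to from the variable, which is the reuse benefit the slide describes.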
44
DDI 3.0 Referencing: Example
45
DDI 3.0 Identification
  • Consistency in building and using identifiers is
    needed for
  • Proper functioning of reference systems, enabling
    a smooth exchange and reuse of existing metadata.
  • Machine-actionability of DDI instances, allowing
    them to serve as a basis for running programs and
    processes.

46
DDI 3.0 Identification
  • Element types used in the Identification system

47
DDI 3.0 Identification: Element Types
  • Non-identified elements
  • Require context, which is provided by containing
    parents.
  • Example: codes within code schemes
  • Are not reusable.
  • Example: variable and category statistics

48
DDI 3.0 Identification: Element Types
  • Identifiables
  • Carry their own ID
  • May be referenced / reused
  • Cannot be versioned or maintained, except as part
    of a complex parent element
  • (Example: Variable; a change implies a new
    version of the entire scheme).

49
DDI 3.0 Identification: Element Types
  • Versionables
  • Carry their own ID
  • Carry their own Version: content changes are
    important to note
  • (Example: Concept; may be independently
    versioned within a scheme).

50
DDI 3.0 Identification: Element Types
  • Maintainables
  • Are higher level DDI objects
  • Are both identifiable and versionable
  • Can also be published and maintained as separate
    entities
  • (Example: all modules, schemes, comparison maps)

51
DDI 3.0 Identification Structure
  • Maintainable elements
  • URN and / or ID, Identifying Agency,
    Versioning Information
  • Versioning Information: Version, Version Date,
    Version Responsibility, Version Rationale
  • Versionable elements
  • URN and / or ID, Versioning Information
  • Identifiable elements
  • URN and / or ID

52
DDI 3.0 Identification Structure: Non-specified
Identification information is inherited from the
levels above.
  • Example 1
  • Inheritance is assumed.
  • Maintainable: Variable Scheme
  • ID: VarScheme_A, Identifying Agency: ICPSR
  • Version: 1.0
  • Identifiable: Variable
  • ID: Var_1
  • Identifying Agency: (not specified)
  • Version: (not specified)

53
DDI 3.0 Identification Structure: Non-specified
Identification information is inherited from the
levels above.
  • Example 2
  • Inheritance is applied by default
  • Maintainable: Logical Product
  • ID: LogicalProd_Y
  • Identifying Agency: ICPSR
  • Version: 1.0
  • Maintainable: Variable Scheme
  • ID: VarScheme_A
  • Identifying Agency: (not specified)
  • Version: (not specified)
  • Example 1
  • Inheritance is assumed
  • Maintainable: Variable Scheme, ID: VarScheme_A
  • Identifying Agency: ICPSR
  • Version: 1.0
  • Identifiable: Variable
  • ID: V1
  • Identifying Agency: (not specified)
  • Version: (not specified)
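
One way to picture the rule in these two examples is a function that fills in any unspecified agency or version from the level above. This is a sketch under that reading of the slide; `resolve_identification` and the dictionary layout are hypothetical.

```python
def resolve_identification(local, parent):
    """Fill in agency and version from the parent when not given locally."""
    return {
        "id": local["id"],
        "agency": local.get("agency") or parent["agency"],
        "version": local.get("version") or parent["version"],
    }

# Example 2 followed by Example 1: Logical Product -> Variable Scheme -> Variable.
logical_product = {"id": "LogicalProd_Y", "agency": "ICPSR", "version": "1.0"}
var_scheme = resolve_identification({"id": "VarScheme_A"}, logical_product)
variable = resolve_identification({"id": "Var_1"}, var_scheme)

print(variable)  # {'id': 'Var_1', 'agency': 'ICPSR', 'version': '1.0'}
```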

54
DDI 3.0 Identification Structure: IDs
  • Uniqueness of Identifiers is necessary for both
    internal and external referencing
  • 1) All IDs MUST be unique within a
    maintainable
  • 2) All maintainables MUST have unique IDs
    across an Agency
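
Both uniqueness rules reduce to the same check applied at two scopes. The sketch below is a hypothetical validator, not a DDI tool; the sample IDs are made up.

```python
from collections import Counter

def duplicate_ids(ids):
    """Return the sorted IDs that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)

# Rule 1: IDs inside one maintainable (here, a variable scheme).
scheme_ids = ["Var_1", "Var_2", "Var_2"]
print(duplicate_ids(scheme_ids))  # ['Var_2']

# Rule 2: maintainable IDs across one agency.
agency_maintainables = ["Instance_A", "Instance_B", "VarScheme_A"]
print(duplicate_ids(agency_maintainables))  # []
```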

55
DDI 3.0 Identification Structure: Creating
Unique Identifiers
  • A DDI Instance may include multiple
    maintainables at different hierarchical levels
  • Instance (maintainable): unique ID within
    Identifying Agency
  • Study Unit (maintainable): unique ID within
    Identifying Agency
  • Logical Product (maintainable): unique ID
    within Identifying Agency
  • Variable Scheme (maintainable):
    unique ID within Identifying Agency

56
DDI 3.0 Identification Structure: Creating
Unique Identifiers
Markup
  • Instance_A (unique at ICPSR)
  • StudyUnit_1
  • LogicalProduct_1
  • VariableScheme_1
  • Variable_1
  • Instance_B (unique at ICPSR)
  • StudyUnit_1
  • LogicalProduct_1
  • VariableScheme_1
  • Variable_1

Post-markup Variable IDs:
  • Instance_A StudyUnit_1 LogicalProduct_1 VariableScheme_1 Variable_1
  • Instance_B StudyUnit_1 LogicalProduct_1 VariableScheme_1 Variable_1
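
The idea of deriving a unique identifier after markup, by qualifying a local ID with its containers, might look like the sketch below. The "." separator is a placeholder assumption: the slide text does not show which delimiter the post-markup identifier uses, and `qualified_id` is a hypothetical helper.

```python
def qualified_id(path, sep="."):
    """Join container IDs from the Instance down to the object.

    The separator is an assumption made for illustration only.
    """
    return sep.join(path)

# The same local IDs are reused under two different instances.
tail = ["StudyUnit_1", "LogicalProduct_1", "VariableScheme_1", "Variable_1"]
id_a = qualified_id(["Instance_A"] + tail)
id_b = qualified_id(["Instance_B"] + tail)

print(id_a)  # Instance_A.StudyUnit_1.LogicalProduct_1.VariableScheme_1.Variable_1
print(id_a != id_b)  # True
```

Local IDs such as `Variable_1` can repeat across instances because the qualifying path differs.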
57
DDI 3.0 Identification Structure: URNs
  • Have a fixed structure and MUST include object
    ID, Identifying Agency, and Version.
  • For versionable and identifiable elements, the
    containing maintainable is specified.
  • Take precedence when both a URN and the
    Identification sequence are used for the same
    object.
  • May be constructed post-markup from the
    Identification sequence.

58
DDI 3.0 Identification: URN Structure
URN components: Object name, Identifying Agency, Object ID, Object Version
  • Examples
  • Maintainables
  • urn:ddi:3.0:StudyUnit ddialliance.org StudyUnit_ID 1.0
  • Versionables
  • urn:ddi:3.0:ConceptScheme ddialliance.org ConceptScheme_ID 1.0
    Concept Concept_ID 2.1
  • Identifiables
  • urn:ddi:3.0:VariableScheme ddialliance.org VariableScheme_ID 1.0
    Variable Variable_ID
59
DDI 3.0 Referencing
  • Reference structure
  • URN, and/or
  • Referenced object's ID, Identifying Agency,
    Version
  • Containing Module ID
  • Containing Scheme ID

60
DDI 3.0 Reuse of Information
  • Mechanisms for REUSE: Referencing, Inheritance
  • Reuse of Information
  • Facilitates development of documentation
    throughout the study life cycle
  • Promotes interoperability and standardization
    across organizations
  • Saves markup time and effort
  • Reduces the risk of human entry error
  • Provides a basic level of implicit comparability

61
DDI 3.0 Modules
  • Content, Markup Examples

62
DDI Version 3.0 Modules -- Structural Overview --
[Diagram labels: DDI Instance; Study Unit; Group; Resource Package; Subgroup; Concepts; Data Coll.; Logical Pr.; etc.]
63
Other specialized DDI 3.0 modules
  • Aggregate Data
  • NCube Logical Product
  • Inline NCube Record Layout
  • NCube Record Layout
  • Tabular NCube Record Layout
  • Inline Microdata
  • Dataset
  • User-specific Markup Templates
  • DDI Profile

64
DDI Version 3.0 Modules -- Structural Overview --
[Diagram labels: DDI Instance; Study Unit; Group; Conceptual Component; Data Collection; Logical Product; Physical Data Product; Physical Instance; Archive; Comparative; Organizations]
65
DDI 3.0
  • Modules used to mark up a simple study

66
DDI 3.0 modules for documenting a single,
survey-type study
[Diagram: structural overview of DDI 3.0 modules, repeated]
67
DDI 3.0 modules for documenting a single,
survey-type study
  • Instance
  • Study Unit
  • Conceptual Component
  • Data Collection
  • Logical product
  • Physical Data Product
  • Physical Instance
  • Archive
  • Organizations
  • Reusable
  • XHTML

68
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
69
DDI Instance -- wrapper for all modules --
  • Identification
  • URN
  • Identification Sequence
  • Name
  • Citation (optional DC Elements)
  • Coverage
  • Topical
  • Spatial
  • Temporal
  • Group (module) - repeatable
  • Resource Package (module) - repeatable
  • Study Unit (module) - repeatable
  • Other Material(s)
  • Note(s)
  • Translation Information

70
Coverage in DDI 3.0
  • Study: American National Election Study (ANES),
    2004
  • Topical Coverage
  • Subject
  • Historical and Contemporary Electoral Processes
  • Keyword
  • Electoral campaigns
  • Political attitudes
  • Political participation
  • Spatial Coverage
  • Description: United States
  • Top level: nation
  • Lowest level: congressional district
  • Temporal Coverage
  • Date: 2004

71
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
72
Study Unit -- documents a single study --
  • Identification, Other Material(s), Note(s)
  • Citation
  • Abstract
  • Universe Reference
  • Funding Information
  • Purpose
  • Coverage
  • Analysis Unit
  • Embargo
  • Conceptual Component (module)
  • Data Collection (module)
  • Logical Product (module)
  • Physical Data Product (module)
  • Physical Instance (module)
  • Archive (module)
  • Organizations (module)

73
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
74
Conceptual Component -- lists concepts and
universes --
  • Identification, Other Material(s), Notes
  • Coverage
  • Concept Scheme or Reference to External Scheme
  • Vocabulary: describes vocabulary used
  • Concept
  • Label
  • Description
  • Similar Concept
  • Difference
  • Concept Group
  • Concept Reference (nestable)
  • Universe Scheme or Reference to External
    Scheme
  • Universe
  • Human Readable
  • Machine Readable
  • Subuniverse
  • Subuniverse

75
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
76
Data Collection
  • Identification, Other Material(s), Note(s)
  • Coverage
  • Methodology
  • Time Method
  • Sampling
  • Collection Event
  • Data Collector
  • Data Source
  • Collection Date(s)
  • Mode of data collection
  • Question Scheme: lists actual questions
  • Instrument: documents question flow, conditions
  • Processing Event
  • Control and cleaning operations
  • Weighting
  • Data Appraisal Information
  • Coding

77
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
78
Logical Product -- documents intellectual content
of data --
  • Identification, Other Material(s), Note(s)
  • Coverage
  • Category Scheme or Reference to external
    category scheme
  • Category
  • Label
  • Derivation (if applicable)
  • Definition
  • Code Scheme or Reference to external code
    scheme
  • Category Scheme Reference
  • Hierarchy Type
  • Level (in the hierarchy)
  • Code
  • Category Reference
  • Value
  • Code (nestable)
  • Variable Scheme or Reference to external
    variable scheme
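
The split between a Category Scheme (labels) and a Code Scheme (values that point at categories) can be illustrated with a small lookup. The category IDs and the `decode` helper are hypothetical; the values 8 and 9 and their labels DK and RF are taken from the survey question used elsewhere in this deck.

```python
# Category scheme: category ID -> label and missing-data flag (IDs are made up).
categories = {
    "CAT_DK": {"label": "DK", "missing": True},
    "CAT_RF": {"label": "RF", "missing": True},
}

# Code scheme: stored value -> category reference.
codes = {"8": "CAT_DK", "9": "CAT_RF"}

def decode(value):
    """Return (label, is_missing); values without a code pass through as-is."""
    ref = codes.get(value)
    if ref is None:
        return value, False
    category = categories[ref]
    return category["label"], category["missing"]

print(decode("8"))  # ('DK', True)
print(decode("3"))  # ('3', False)
```

Because codes only reference categories, the same category scheme can back several code schemes with different stored values.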

79
Logical Product: Variable Scheme: Variable
  • Variable or Reference to an externally
    documented variable
  • Identification
  • Name
  • Label
  • Definition
  • Universe Reference
  • Concept Reference
  • Question Reference
  • Embargo Reference
  • Response Unit
  • Analysis Unit
  • Representation
  • Imputation
  • Derivation
  • Coding Instructions
  • Value Representation
  • Text

80
Logical Product: Variable Scheme: Variable Group
  • Variable Group
  • Type
  • Label
  • Definition
  • Universe Reference
  • Concept Reference
  • Variable Reference (lists variables in the group)
  • Variable Group Reference (allows nesting of
    groups)
  • Variable Group Reference (use for externally
    documented Variable Group)

81
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
82
Physical Data Product -- describes physical
layout of data --
  • Identification, Other Material(s), Note(s)
  • Logical Product Reference
  • Gross Record Structure
  • Records Per Case
  • Variable Quantity
  • Logical Record Reference
  • Physical Record Reference
  • Related Logical Records
  • Record Layout
  • Data Item
  • Variable Reference
  • Physical Location
  • Value Location
  • StartPosition
  • Width

83
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
84
Physical Instance -- documents a specific data
file --
  • Identification, Other Material(s), Note(s)
  • Citation
  • Coverage
  • Physical Data Product Reference
  • Data File Identification
  • Location
  • URI
  • Gross File Structure
  • Creation Software
  • Case Quantity
  • Overall Record Count
  • Statistics
  • Logical Product Reference
  • Variable Statistics
  • Variable Reference
  • Total Responses
  • Summary Statistics
  • Category Statistics
  • Value
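
Category statistics of the kind listed here are per-value counts for one variable in one data file. A minimal sketch with made-up responses follows; `category_statistics` is a hypothetical helper.

```python
from collections import Counter

def category_statistics(values):
    """Count how many cases fall under each category value."""
    return dict(sorted(Counter(values).items()))

responses = ["7", "0", "3", "7", "8", "7"]  # made-up data for illustration
stats = category_statistics(responses)

print(len(responses))  # 6
print(stats)           # {'0': 1, '3': 1, '7': 3, '8': 1}
```

These counts describe a specific file, which is why the slide places them in the Physical Instance rather than with the variable definition.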

85
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
86
Archive
  • Identification, Other Material(s), Note(s)
  • Archive Specific
  • Item
  • Location
  • Call Number
  • URI
  • Format
  • Media
  • Availability Status
  • Access
  • Confidentiality Statement
  • Access Permission
  • Restrictions
  • Citation Requirement
  • Deposit Requirement
  • Access Conditions
  • Disclaimer
  • Contact
  • Funding Information

87
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
88
Organizations
  • Identification
  • Organization
  • URL
  • Individual
  • Individual
  • Organization
  • Title
  • Language
  • Role
  • Entity Reference
  • Organization Reference
  • Individual Reference
  • Description
  • Period
  • Relation
  • Organization Reference
  • Individual Reference
  • Description
  • Period
  • Name
  • Description
  • Location
  • Telephone
  • E-mail
  • Relation

89
DDI 3.0 Markup Example
  • A Survey Variable

90
Version 2.1 vs. Version 3.0 Example: A survey
variable
  • ASCII codebook

91
Version 2.1 vs. Version 3.0 Example: A survey
variable in Version 2.1
Data Description: Variable
92
Version 2.1 vs. Version 3.0 Example: A survey
variable in Version 2.1
name="V043015"
93
Version 2.1 vs. Version 3.0 Example: A survey
variable in Version 3.0
  • Logical Product: Variable Scheme
  • Conceptual Component: Concept Scheme, Universe
    Scheme
  • Data Collection: Question Scheme
  • Logical Product: Code Scheme
  • Logical Product: Category Scheme
  • Physical Instance: Statistics
94
Version 2.1 vs. Version 3.0 Example: A survey
variable in Version 3.0
  • Logical Product: Category Scheme ID, Category ID
  • Conceptual Component: Concept Scheme, Concept
    ID; Universe Scheme, (Sub)Universe ID
  • Logical Product: Variable Scheme ID, Variable ID
  • Logical Product: Code Scheme ID, Code
  • Data Collection: Question Scheme ID, Question ID
  • Physical Instance: Statistics, Variable
    Statistic, Category Statistics
95
DDI 3.0 Markup: A Survey Variable - Concept
  • Concept: Attention to Presidential Campaign
    on National TV

Conceptual Component: Concept Scheme: Concept
96
DDI 3.0 Markup: A Survey Variable - Concept
97
DDI 3.0 Markup: A Survey Variable - Universe
Conceptual Component: Universe Scheme: (Sub)Universe
(A7. How many days in the PAST WEEK did you watch
the NATIONAL network news on TV? 0-7, 8 DK, 9 RF)
98
DDI 3.0 Markup: A Survey Variable - Universe
99
DDI 3.0 Markup: A Survey Variable - Question ID,
Question Text
Data Collection: Question Scheme: Question Item
100
DDI 3.0 Markup: A Survey Variable - Question ID,
Question Text
  • Other Response Domains

101
DDI 3.0 Markup: A Survey Variable - Variable name,
label, type of physical representation
Logical Product: Variable Scheme: Variable
102
DDI 3.0 Markup: A Survey Variable - Variable name,
label, type of physical representation
  • Other types of Representation

103
DDI 3.0 Markup: A Survey Variable - Category
labels, missing data information
Logical Product: Category Scheme: Category
104
DDI 3.0 Markup: A Survey Variable - Category
labels, missing data information
missing="true"
105
DDI 3.0 Markup: A Survey Variable - Category Values
Logical Product: Code Scheme: Code
106
DDI 3.0 Markup: A Survey Variable - Category Values
107
DDI 3.0 Markup: A Survey Variable - Statistics
Physical Instance: Statistics: Variable
Statistics: Category Statistic
108
DDI 3.0 Markup: A Survey Variable - Statistics
109
DDI 3.0 Markup: A Survey Variable - Logical
Product Module
110
DDI 3.0 Markup: Modules used in a full variable
description
  • Concept
  • Universe
  • Question
  • Values
  • Value Labels
  • Variable name
  • Variable label
  • Statistics
  • Location
  • Physical Data Product

111
DDI 3.0 Modular Approach: Advantages
  • Modules and schemes can be independently
    maintained.
  • Pieces of information can be reused without being
    repeated.

112
DDI 3.0 Modular Approach: Reusing information
113
Variable Markup in Version 2 -- carries redundant
information --
114
Variable Markup in Version 3.0: Modular Approach,
Reusing Information
115
DDI 3.0
  • Grouping

116
DDI 3.0 Groups
  • Entirely new feature in DDI 3.0.
  • Designed to document and compare related studies.

117
DDI 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
118
DDI Version 3.0 Modules -- Structural Overview --
[Diagram: structural overview of DDI 3.0 modules, repeated]
119
Group -- documents families of studies --
  • Identification, Other Material(s), Note(s)
  • Citation
  • Abstract
  • Universe
  • Funding Information
  • Purpose
  • Coverage
  • Universe Reference
  • Conceptual Component (module)
  • Data Collection (module)
  • Logical Product (module)
  • Archive (module)
  • Organizations (module)
  • Study Unit (module)
  • Group (module)
  • Comparative (module)

120
DDI 3.0 Grouping Attributes
  • A set of mandatory attributes indicates the nature
    of the relationships among group members
  • Group parameters:
  • Time
  • Instrument
  • Panel (population of respondents)
  • Geography
  • Datasets
  • Language

121
DDI 3.0 Grouping Attributes: Example
122
DDI 3.0 Types of Groups
  • Groups of studies may be
  • Formal (by design)
  • Designed to be compared (longitudinal,
    time-series, or cross-national studies)
  • Documented and compared through use of
    Inheritance
  • Informal (ad-hoc)
  • Decision to group and compare is taken
    post-production, or after the fact.
  • Comparability documented in the Comparative
    module

123
Formal Groups: Inheritance
  • Example 1: Time-series. Same questions repeated
    over time, same resulting variables.

Group (Studies A-C): Temporal Coverage (G_1) 1991-1993; Data Collection: Question Scheme; Logical Product: Variable Scheme
Study A: Temporal Coverage 1991 (Replace Ref G_1); Physical Data Product; Physical Instance; Statistics
Study B: Temporal Coverage 1992 (Replace Ref G_1); Physical Data Product; Physical Instance; Statistics
Study C: Temporal Coverage 1993 (Replace Ref G_1) ....... Physical Data Product; Physical Instance
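
Example 1 can be read as a lookup with local override: each study supplies its own Temporal Coverage and falls back to the group for everything else. The sketch below mimics that behavior with Python's `collections.ChainMap`. The dictionary keys and values are illustrative stand-ins, not DDI element names, and this is not the standard's actual resolution mechanism.

```python
from collections import ChainMap

# Metadata entered once at the group level (Studies A-C).
group = {
    "temporal_coverage": "1991-1993",
    "question_scheme": "shared question scheme",
    "variable_scheme": "shared variable scheme",
}

# Each study records only what differs: its own year replaces the group's range.
study_a = ChainMap({"temporal_coverage": "1991"}, group)
study_b = ChainMap({"temporal_coverage": "1992"}, group)
study_c = ChainMap({"temporal_coverage": "1993"}, group)

for name, study in [("A", study_a), ("B", study_b), ("C", study_c)]:
    # The local value wins; everything else is inherited from the group.
    print(name, study["temporal_coverage"], study["variable_scheme"])
```

The group dictionary is never modified, so the full 1991-1993 range remains available at the group level while each study reports its own year.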
124
Formal Groups: Inheritance Attributes Add,
Replace, Delete
  • In a complex grouping structure, inheritance paths
    may become quite intricate.
  • ID attributes ADD, REPLACE and DELETE are
    introduced to resolve potential inheritance
    ambiguities:
  • ADD: empty -> flags element as a new addition.
  • REPLACE: ReferenceType -> referenced element
    is being replaced at the lower level (local
    override).
  • DELETE: ReferenceType -> referenced element is
    being deleted at the lower level.
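
The three attributes can be pictured as operations applied to whatever a lower level inherits. The following is a minimal sketch under simplifying assumptions: elements are keyed by a single ID, and a replacement reuses the ID of the element it overrides. The function name `resolve` and the tuple format are hypothetical and are not part of DDI 3.0.

```python
def resolve(inherited, local_changes):
    """Apply local ADD / REPLACE / DELETE changes to inherited elements.

    inherited: dict mapping element ID -> content from the higher level.
    local_changes: list of (action, element_id, content) tuples.
    """
    result = dict(inherited)  # never mutate the parent level
    for action, element_id, content in local_changes:
        if action == "ADD":
            if element_id in result:
                raise ValueError(f"{element_id} already exists; use REPLACE")
            result[element_id] = content
        elif action == "REPLACE":
            if element_id not in result:
                raise KeyError(f"{element_id} is not inherited; use ADD")
            result[element_id] = content
        elif action == "DELETE":
            if element_id not in result:
                raise KeyError(f"{element_id} is not inherited")
            del result[element_id]
        else:
            raise ValueError(f"unknown action: {action}")
    return result


group_level = {"Q1": "Age?", "Q2": "Income in DM?", "Q3": "Household size?"}
study_level = resolve(group_level, [
    ("REPLACE", "Q2", "Income in Euro?"),
    ("DELETE", "Q3", None),
    ("ADD", "Q4", "Health status?"),
])
print(study_level)
```

Rejecting an ADD for an existing ID, or a REPLACE or DELETE for a missing one, is one way to surface the inheritance ambiguities that the explicit attributes are meant to prevent.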

125
Formal Groups: Inheritance
  • Example 2: Time-series. Same core questions
    repeated over time, different topical modules
    added to each iteration.

Group (Studies A-C): Data Collection: Core Questions (Q1-Q50); Logical Product: Core Variables (V1-V50)
Study A: Topical Module: Health Status; Data Collection: ADD Questions (Q51A-Q80A); Logical Product: ADD Variables (V51A-V80A)
Study B: Topical Module: Gun Control; Data Collection: ADD Questions (Q51B-Q80B); Logical Product: ADD Variables (V51B-V80B)
etc.
126
Formal Groups: Inheritance
  • Example 3: Any group by design. Some questions
    are not asked in some iterations.

Group (Studies A-E): Data Collection: All Questions (Q1-Q100); Logical Product: All Variables (V1-V100)
Group (Studies C-E): Data Collection: DELETE Questions Q60-Q69; Logical Product: DELETE Variables V60-V69
Study B: Data Collection: DELETE Question Q55; Logical Product: DELETE Variable V55
Study A
Study C
Study D
Study E
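
Example 3 involves more than one level of inheritance, so the set of questions in effect for a study depends on every ancestor above it. The sketch below walks such a hierarchy, assuming the sub-group for Studies C-E sits under the top-level group. The `Node` class and the helper `q` are hypothetical illustrations, not DDI constructs.

```python
class Node:
    """A group or study in a (hypothetical) grouping hierarchy."""

    def __init__(self, name, parent=None, own=(), deleted=()):
        self.name = name
        self.parent = parent
        self.own = set(own)          # items introduced at this level
        self.deleted = set(deleted)  # inherited items dropped at this level

    def items(self):
        """Items in effect here: inherited items minus local deletions, plus own."""
        inherited = self.parent.items() if self.parent else set()
        return (inherited - self.deleted) | self.own


def q(first, last):
    """Question IDs Q<first>..Q<last>, inclusive."""
    return {f"Q{i}" for i in range(first, last + 1)}


top = Node("Group A-E", own=q(1, 100))
sub = Node("Group C-E", parent=top, deleted=q(60, 69))
study_a = Node("Study A", parent=top)
study_b = Node("Study B", parent=top, deleted={"Q55"})
study_c = Node("Study C", parent=sub)

for study in (study_a, study_b, study_c):
    print(study.name, len(study.items()))
```

Study C records nothing locally, yet it ends up without Q60-Q69 because the deletion is stated once on its parent group.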
127
Formal Groups: Inheritance
  • Example 4 (SOEP, Germany): Longitudinal. Same
    variables, with a different name each year.

(No name)
ADD Name only
128
Formal Groups: Inheritance
  • Example 5 (SOEP, Germany): Longitudinal. In 2002,
    the variable Income changes currency from DM to
    Euro, which changes the question wording.

(No question)
ADD question only
129
Formal Groups: Inheritance
  • Example 5 (SOEP, Germany), continued: These
    variables also change names every year.

130
Formal Groups: Inheritance
  • Example 5 (SOEP, Germany), the final picture:
    information is inherited down the hierarchy.

131
Inheritance in Formal Groups
  • Simplification of DDI Instances: common metadata
    is entered only once.
  • More efficient means of documentation: for new
    additions, only differences need to be specified.
  • Relational information embedded in the
    inheritance structure: comparison becomes
    machine-actionable.

132
DDI Version 3.0 Modules -- Structural Overview --
[Diagram of modules: DDI Instance; Study Unit; Group; Conceptual Component; Data Collection; Logical Product; Physical Data Product; Physical Instance; Archive; Comparative; Organizations]
133
Comparative -- documents comparability in ad-hoc
groups --
  • Identification, Note(s)
  • Comparison Description (human-readable)
  • Concept Map
  • Source Scheme Reference
  • Target Scheme Reference
  • Item Map
  • Source Item
  • Target Item
  • Map Type
  • Difference
  • Variable Map
  • Question Map
  • Category Map
  • Code Map
  • Universe Map
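
To make the shape of a comparison map more concrete, the sketch below models a Concept Map as a pair of scheme references plus a list of item-level mappings. It assumes that Item Map entries (Source Item, Target Item, Map Type, Difference) are nested under a map such as Concept Map. The class names, identifiers, and map-type labels are hypothetical and do not reproduce the DDI 3.0 schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemMap:
    """One source-to-target correspondence, with the fields named on the slide."""
    source_item: str
    target_item: str
    map_type: str         # free-text label here; the real schema may constrain it
    difference: str = ""  # human-readable note on how the items differ


@dataclass
class ConceptMap:
    source_scheme: str
    target_scheme: str
    item_maps: list

    def targets_for(self, source_item):
        """Return every target item mapped from the given source item."""
        return [m.target_item for m in self.item_maps if m.source_item == source_item]


concept_map = ConceptMap(
    source_scheme="ConceptScheme_StudyX",
    target_scheme="ConceptScheme_StudyY",
    item_maps=[
        ItemMap("C_income", "C_earnings", "close", "Y excludes transfer income"),
        ItemMap("C_age", "C_age", "identical"),
    ],
)
print(concept_map.targets_for("C_income"))
```

The same structure would carry over to the Variable, Question, Category, Code, and Universe maps by swapping the kind of item being referenced.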

134
DDI 3.0: Using the Comparative Module
  • Instructions on how to use the Comparative
    Module and build comparison maps:
  • DDI 3.0 User Guide, pp. 45-49.
  • http://www.ddialliance.org/DDI/ddi3

135
Producing DDI 3.0 markup
  • Getting started

136
DDI 3.0 Tools projects
  • DDI Toolkit
  • Core library for developing open source tools
  • Version 1/2 <-> Version 3.0 converters
  • DDI 3.0 URN resolution tool
  • DDI 3.0 validation tool
  • Version 3.0 stylesheets with display and editing
    layers
  • Grouping tool
  • Concept management tool
  • Registry applications

137
Producing DDI 3.0 markup -- Getting started --
  • Software to assist in document creation
  • DeXtris
  • XML browser
  • Converts DDI 1/2 to DDI 3.0
  • http://www.opendatafoundation.org/tools/dextris

138
DDI 3.0 Tools Using Dextris
147
Producing DDI 3.0 markup -- Getting started --
  • Software to assist in document creation
  • SPSS system to DDI 3.0 converter
  • (See description and link on DDI 3.0 Proof of
    Concept page)
  • http://www.ddialliance.org/DDI/ddi3/proof.html

148
Producing DDI 3.0 markup -- Getting started --
  • XML editors
  • oXygen
  • Create new DDI instance
  • Edit/update DDI instance
  • Validate DDI instance
  • View schemas

149
DDI 3.0 Viewing Schemas in oXygen
151
Producing DDI 3.0 markup -- Getting started --
  • Other tools to assist in producing DDI 3.0
    markup
  • DDI core template
  • Version 3.0 documentation
  • Module descriptions
  • Field level documentation
  • DDI Help Center
  • http://www.ddialliance.org/ddi3/index.html

152
Producing DDI 3.0 markup -- Using multiple modules --
  • Resource
  • Getting Started with DDI 3.0
  • http://www.ddialliance.org/DDI/ddi3/getting-started.html

153
DDI Version 3.0: Displaying Markup
  • Stylesheets
  • Basic
  • Web presentation in XHTML
  • Enhanced
  • Adds graphics for presenting frequencies
  • Automated calculation of valid percentages
  • http://www.ddialliance.org/DDI/ddi3/proof.html
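
For readers unfamiliar with the term, a valid percentage is usually computed with missing-data codes excluded from the denominator. The sketch below shows that arithmetic in Python under that assumption. It is not the stylesheet's own code, and the function name, category codes, and counts are made up for illustration.

```python
def valid_percentages(frequencies, missing_codes):
    """Percent of each valid category, with missing codes excluded from the base."""
    valid = {code: n for code, n in frequencies.items() if code not in missing_codes}
    base = sum(valid.values())
    if base == 0:
        return {code: 0.0 for code in valid}
    return {code: round(100 * n / base, 1) for code, n in valid.items()}


# Made-up counts for a yes/no question; code 9 stands for "no answer".
counts = {1: 120, 2: 80, 9: 50}
print(valid_percentages(counts, missing_codes={9}))
```

With 50 of the 250 responses coded as missing, the base is 200, so the two valid categories come to 60.0 and 40.0 percent.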

154
DDI Version 3.0: Questions? Comments?
  • Sanda Ionescu sandai_at_umich.edu
  • DDI Users Listserv
  • ddi-users_at_icpsr.umich.edu
  • http://www.ddialliance.org/codebook/listserv.html

155
The End