Developing a Distributed Data Dictionary Service Using LDAP and ISO11179 to Support STEP-based Data Integration and Re-Use

1
Developing a Distributed Data Dictionary Service
Using LDAP and ISO11179 to Support STEP-based
Data Integration and Re-Use
  • Jim U'Ren
  • Jet Propulsion Laboratory
  • California Institute of Technology
  • January 22, 2003

2
Agenda
  • Overview of STEP, ISO's family of product and
    engineering data standards
  • Infrastructure services needed to bring
    STEP-based data to the desktop
  • A prototype data dictionary service based on
    ISO11179

3
What is STEP?
  • STandard for the Exchange of Product Data
  • An ISO standard (ISO 10303, developed by ISO
    TC184/SC4)
  • Designed to cover all information through a
    product's life cycle
  • Includes standard formats (APs), a language
    (EXPRESS) and APIs (SDAI)

4
  • An ISO standard (10303) that consists of
    distributed parts


[Diagram labels: application layers, logical layers, physical layers]

5
STEP Development Process
  • A structured approach to developing standardized
    data models
  • Develop an activity model that scopes the
    development using IDEF0 formalisms (AAM:
    Application Activity Model)
  • Develop a user-based model using EXPRESS (ARM:
    Application Reference Model)
  • Develop a STEP model using an integrated resource
    library (AIM: Application Interpreted Model)
  • Develop a test suite to validate the data model
    (ATS: Abstract Test Suite)

6
STEP in Spacecraft Development
How the family of STEP Data Standards can be
applied to Spacecraft Development
This slide provides high-level information about
how STEP and other standards can be applied to
the engineering domains that are part of the
spacecraft development process.
De-emphasized boxes indicate data models that
are in development.
JAU 2002-10-25
SLIDE_STEP-in-Spacecraft-Development-Ver12d.ppt

7
High-level Diagram of Infrastructure Services to
Support Information Access, Reuse and Integration
  • User Level - Visualization Service(s),
    Information Service(s), Tool Service(s)
  • Transaction Level - Directory Service(s),
    Translation Service(s), Validation Service(s),
    Modeling Service(s)
  • Data Level - Repository Service(s), Part Library
    Service(s)
  • Infrastructure services should have
      • Standard interfaces
      • Distributed capabilities
      • Platform independence

DIAGRAM_STEP-Services-2002-08-30.ppt
JAU 2002-08-30

8
Problem
  • 1. Data dictionaries mean different things to
    different people
      • Vocabularies - human-readable collections of
        terms and definitions pertaining to a domain
      • Data elements - machine-interpretable parts
        used to build data models
      • Data models (information models) - structured,
        machine-interpretable collections of data
        elements that include the structured
        relationships between data elements
  • 2. Dictionaries do not communicate with each other

9
What is Needed
  • A mechanism that can be used to access, publish,
    update, relate and integrate data dictionaries
    (vocabularies, data elements, and data models)
  • Mechanism must be able to span domains and
    subdomains, e.g., engineering, science, and
    administrative
  • Mechanism must have both manual and automated
    interfaces
  • Mechanism should follow the distributed service
    model (e.g., the Internet Domain Name System
    (DNS), X.500 Directory, etc.)

10
A Solution
  • Develop a distributed data dictionary service
    using
      • LDAP (Lightweight Directory Access Protocol)
        Internet service protocol
      • ISO11179 attributes from a standard set of data
        elements
      • DSML (Directory Services Markup Language) XML
        DTD/Schema
      • Dublin Core Meta-data
  • The service will store and relate vocabulary,
    data elements, and data model information

11
Advantages of LDAP
  • LDAP has many advantages, including
      • Universal Access - Internet directory standard,
        widely adopted and implemented by numerous
        vendors and open source software solutions
      • Simple - a relatively simple, high-level protocol
        with a straightforward API
      • Extensible - easily extended and adapted
      • Access Control and Security - connections can be
        authenticated and secured using layered Internet
        security mechanisms
      • Multi-Platform Development - C/C++, Perl, Java,
        JavaScript, Python, PHP and other APIs are
        available, making LDAP services accessible from
        virtually any language, platform, or development
        environment

12
What is LDAP?
  • An Internet Standard from an IETF working
    group
      • RFC 1777: Lightweight Directory Access Protocol
      • RFC 1778: String Representation of Standard
        Attribute Syntaxes
      • RFC 1779: String Representation of Distinguished
        Names
      • RFC 1959: LDAP URL Format
      • RFC LDAP API
  • A distributed, hierarchical database
  • Uses a multi-part naming convention to create
    unique records (distinguished names)
      • cn=behaviour, dc=vocabulary, dc=Part233,
        dc=10303, dc=ISO
      • cn=requirement_set, dc=data-element, dc=Part233,
        dc=10303, dc=ISO
      • cn=TBR-apha1, dc=schema, dc=Part233, dc=10303,
        dc=ISO
  • Includes ability to implement multiple levels of
    security
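
The multi-part naming convention above can be sketched in a few lines of Python. This is only an illustration, not the prototype's code: the helper names `build_dn` and `ldap_url` and the host `dd.example.org` are assumptions, and the sketch does not escape special characters inside name values. The URL layout follows the general `ldap://host/dn?attributes?scope?filter` shape of the LDAP URL format.

```python
from urllib.parse import quote


def build_dn(name, *components):
    """Join a common name and domain components into a distinguished name.

    Components are given most specific first, as in the slide's examples.
    Values are assumed to contain no characters that need DN escaping.
    """
    parts = [f"cn={name}"] + [f"dc={c}" for c in components]
    return ",".join(parts)


def ldap_url(host, dn, attributes=(), scope="base", filter_="(objectClass=*)"):
    """Build an LDAP URL: ldap://host/dn?attributes?scope?filter."""
    # Keep '=' and ',' readable in the DN; percent-encode anything else unsafe.
    encoded_dn = quote(dn, safe="=,")
    return f"ldap://{host}/{encoded_dn}?{','.join(attributes)}?{scope}?{filter_}"


# Hypothetical host; the entry name is taken from the slide.
dn = build_dn("behaviour", "vocabulary", "Part233", "10303", "ISO")
print(dn)
print(ldap_url("dd.example.org", dn, ["description"]))
```

Listing the components most specific first mirrors how the distinguished names are written on the slide, so the same tuple of components can be reused for every entry in one branch.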

13
Example of an LDAP tree
ISO
    10303
        237
        235
        . . .
        233
            Vocabulary
            Schema
            Data Elements
        203
        210
        209
    14496
    9000
    . . .

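
One way to picture how such a tree maps onto distinguished names is to walk it and emit one name per leaf. The sketch below is an assumption-laden illustration: the nested dictionary holds only one branch, and the component spellings (`Part233`, `vocabulary`, `schema`, `data-element`) are borrowed from the distinguished-name examples on the previous slide rather than from the diagram itself.

```python
# Nested dict mirroring one branch of the tree; leaves are empty dicts.
TREE = {
    "ISO": {
        "10303": {
            "Part233": {"vocabulary": {}, "schema": {}, "data-element": {}},
        },
    },
}


def leaf_dns(tree, suffix=()):
    """Yield one dc=... name per leaf, most specific component first."""
    for name, children in tree.items():
        path = (name,) + suffix  # prepend so the root ends up last
        if children:
            yield from leaf_dns(children, path)
        else:
            yield ",".join(f"dc={part}" for part in path)


for dn in leaf_dns(TREE):
    print(dn)
```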
14
Advantages of ISO 11179
  • An established international standard
  • Widely supported - US Census Bureau, NIST,
    Defense Information Systems Agency, Environmental
    Security, DoE, DoJ, Bureau of Labor Statistics,
    DoT, EPA, etc.
  • Flexible use of elements within the schema
  • Easily implemented in an LDAP directory service -
    flexible and easily configured LDAP servers are
    well suited to the flexible 11179 schema
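
To make the pairing of ISO 11179 attributes with an LDAP entry concrete, here is a minimal sketch that renders one data element as LDIF text. The object class `dataElement` and the attribute names `deDefinition`, `deDatatype` and `deVersion` are hypothetical stand-ins for the kinds of attributes (definition, datatype, version) that ISO 11179 describes; they are not the prototype's schema, and the values are placeholders.

```python
import base64


def ldif_line(attr, value):
    """Format one attribute line, base64-encoding values LDIF cannot carry as plain text."""
    plain = (
        value.isascii()
        and not any(ch in value for ch in "\r\n\0")
        and not value.startswith((" ", ":", "<"))
        and not value.endswith(" ")
    )
    if plain:
        return f"{attr}: {value}"
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attr}:: {encoded}"


def to_ldif(dn, attributes):
    """Render one entry: the dn line first, then one line per attribute value."""
    lines = [ldif_line("dn", dn)]
    for attr, values in attributes.items():
        lines.extend(ldif_line(attr, v) for v in values)
    return "\n".join(lines) + "\n"


entry = to_ldif(
    "cn=requirement_set,dc=data-element,dc=Part233,dc=10303,dc=ISO",
    {
        "objectClass": ["top", "dataElement"],       # "dataElement" is hypothetical
        "cn": ["requirement_set"],
        "deDefinition": ["(definition text here)"],  # placeholder, not real content
        "deDatatype": ["string"],                    # assumed for illustration
        "deVersion": ["1"],                          # assumed for illustration
    },
)
print(entry)
```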

15
Data Dictionary Components for a given namespace
16
A Distributed Data Dictionary Service using
Standards-based Technology
  • LDAP Protocol
  • ISO 11179 meta-data schema
  • DSML
  • Dublin Core
Prototype service viewable at http://step.jpl.nasa.gov/ldap
Supporting Automated Processes
Supporting Validation Scenarios
Supporting Data Modeling Activities
Supporting Terminology Lookups

17
A Proposed Data Element Naming Convention
  • A structured, multi-part naming system
      • similar to IP addressing and URLs
      • dot-delimited names
      • follows convention used by Dublin Core Meta-data
        Initiative
  • Short-name aliases could be supported in the
    planned distributed data dictionary service
      • e.g., author = DC.Creator, keyword = DC.Subject,
        etc.
  • Names would consist of domains, descriptors and
    qualifiers.

18
Examples of the Data Element Naming Convention
within JPL Domains
  • Dublin Core Meta-data Initiative (a JPL-adopted
    standard)
      • DC.Date
      • DC.Date.Created
      • DC.Date.LastModified
  • JPL's Planetary Data System (PDS)
      • PDS.Target_Name
      • PDS.Sampling_Factor
  • JPL's Product Data Management System (PDMS)
      • PDMS.Version
      • PDMS.ReferenceDesignator
  • JPL New Business System (NBS)
      • NBS.HR.start_date
      • NBS.HR.employee_status
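
A small sketch of how a client might split these dot-delimited names and resolve short-name aliases follows. The two aliases come from the naming-convention slide; everything else (the `ElementName` type, the `parse_name` function, and the choice to keep everything after the domain as an undifferentiated tuple) is an assumption, since the slides do not say where descriptors end and qualifiers begin.

```python
from dataclasses import dataclass

# Short-name aliases mentioned on the naming-convention slide.
ALIASES = {"author": "DC.Creator", "keyword": "DC.Subject"}


@dataclass(frozen=True)
class ElementName:
    domain: str   # leading part, e.g. "DC", "PDS", "NBS"
    parts: tuple  # remaining descriptors/qualifiers, in order


def parse_name(name):
    """Resolve an alias if one exists, then split the name on dots."""
    full = ALIASES.get(name, name)
    domain, *rest = full.split(".")
    if not domain or not rest or not all(rest):
        raise ValueError(f"not a valid multi-part name: {name!r}")
    return ElementName(domain, tuple(rest))


print(parse_name("DC.Date.Created"))
print(parse_name("author"))
print(parse_name("NBS.HR.start_date"))
```

Keeping the trailing parts as a plain tuple avoids guessing at a boundary the convention has not yet fixed; a later version could replace it with named fields once the descriptor and qualifier rules are settled.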

19
Terminology Lookup Scenarios
  • Resolving Ambiguous Terminology - an end user,
    needing to clarify the use and meaning of a word
    used in a specific context, performs a
    multi-domain vocabulary lookup across multiple DD
    services, looking for the published vocabulary of
    the referenced domain
  • Finding the Correct Acronym - an end user,
    confronted with a number of new acronyms used in
    a presentation, accesses a local DD service to
    look up the acronyms within probable domains,
    thereby eliminating the alternative meanings
    (e.g., searching for STEP standards work versus
    the JPL STEP project)
  • Enabling Improved Search Engine Performance - as
    a search engine scans through a document, it
    discovers a keyword list and finds a reserved
    word; the document includes a reference to a
    domain-specific vocabulary list in a DD service;
    the search engine uses this vocabulary to be
    certain it is indexing the keywords in the right
    context
  • Building Glossaries for Technical Papers - an
    engineer or scientist writing a technical paper
    needs to include a glossary of relevant terms in
    the paper; by performing a multi-service search,
    terms and definitions that relate to the topic of
    the paper are quickly found and inserted into the
    paper with the corresponding attributions
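
The acronym scenario can be sketched as a lookup that fans out over several dictionary services and optionally narrows to probable domains. In-memory dictionaries stand in for real DD services here; the service names, the domain labels and the `lookup` function are all hypothetical, and the second definition is deliberately a placeholder because the slides do not expand the JPL project's name.

```python
def lookup(term, services, domains=None):
    """Return (service, domain, definition) for every match of term.

    If domains is given, only vocabularies in those domains are searched.
    """
    hits = []
    for service_name, vocabularies in services.items():
        for domain, vocabulary in vocabularies.items():
            if domains is not None and domain not in domains:
                continue
            if term in vocabulary:
                hits.append((service_name, domain, vocabulary[term]))
    return hits


# Stand-ins for two DD services, each publishing one domain vocabulary.
SERVICES = {
    "dd-standards": {
        "ISO.10303": {"STEP": "STandard for the Exchange of Product Data"},
    },
    "dd-local": {
        "JPL.Projects": {"STEP": "JPL STEP project (expansion not given here)"},
    },
}

print(lookup("STEP", SERVICES))                         # both meanings
print(lookup("STEP", SERVICES, domains={"ISO.10303"}))  # narrowed to one domain
```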

20
Validation Scenarios
  • Validating Units of Measure - a system integrator
    receives an MCAD geometry model (e.g., STEP AP203
    Part 21 file) of a component to be integrated
    into an assembly; automatically, a standard
    validation routine is performed against the
    schema located in a referenced data dictionary
    that checks for use of the units of measure
    called for in the contract and identified in the
    exchange file
  • Enabling Automated Repository Check-In - as a
    STEP model is checked into a PDM system, an
    automated validation routine checks the model
    using the schema (located in the DD service) that
    is identified in the Part 21 data file
  • Improving Quality of Data Handoffs - an MCAD
    geometry model is sent from design to thermal
    analysis and validation is performed using the
    correct schema version as referenced in the
    model; validation is an automated process that
    occurs before any work is done with the model as
    it is transferred between domains
  • Validating for Adequacy and Range - the PDS
    (NASA's Planetary Data System) central node
    receives a dataset description in template format
    to be ingested into the dataset catalogue
    database; automatically, a standard validation
    routine is performed that checks for required
    keywords, keyword values and value types in the
    templated dataset description against a
    corresponding structure stored in the PDS domain
    of the data dictionary service
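
The last scenario, checking a dataset description for required keywords and value types, might look roughly like the sketch below. The keyword names come from the naming-convention examples in this deck, but the value types assigned to them, the sample values and the `validate` function are assumptions made for illustration; a real check would fetch the structure from the PDS domain of the dictionary service and would also test allowed values.

```python
# Hypothetical structure for the PDS domain: keyword -> accepted Python types.
# The types shown are assumptions for illustration, not PDS definitions.
PDS_STRUCTURE = {
    "PDS.Target_Name": (str,),
    "PDS.Sampling_Factor": (int, float),
}


def validate(description, structure):
    """Return a list of (keyword, problem) pairs; an empty list means the check passed."""
    problems = []
    for keyword, accepted_types in structure.items():
        if keyword not in description:
            problems.append((keyword, "missing"))
        elif not isinstance(description[keyword], accepted_types):
            problems.append((keyword, "wrong type"))
    return problems


# Illustrative dataset descriptions (values are made up).
good = {"PDS.Target_Name": "MARS", "PDS.Sampling_Factor": 2}
bad = {"PDS.Target_Name": 5}

print(validate(good, PDS_STRUCTURE))
print(validate(bad, PDS_STRUCTURE))
```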

21
Data Modeling Scenarios
  • Data Reuse in Modelling Activities - a data
    modeller, charged with developing an information
    model for a new application, uses data elements
    published in several DD services (much like a
    parts library), ensuring that the new information
    model will have compatible interfaces with data
    sets that share the same data elements or
    collection of elements
  • Creating a TDP (technical data package) - an
    application performs a schema check against
    objects about to be wrapped into a TDP (e.g.,
    STEP AP232 or PDM Schema TDP) to ensure their
    correct structure and meta-data content
  • Data Integration Enabled - an analyst, charged
    with integrating data from two or more data sets,
    accesses the correct version of each schema, as
    referenced in the data set, from the DD service
    space, allowing them to identify/map interfaces
    between the data sets, e.g., MCAD-ECAD-cost data
  • Extending a schema - to solve a "local" problem,
    a data modeller uses data elements from a
    published collection of data items to extend an
    existing official schema; the new schema is
    published in the DD service with traces/links
    back to the official schema

22
Data Archive Scenarios
  • Inputting information into an archive - a project
    in a post-launch phase would like to archive data
    to an institutional archive service; using a
    translation service, data in proprietary data
    formats is translated into a standard, neutral
    format based on an open data model.
  • Retrieving information from an archive - an
    engineer retrieves a dataset from an archive and
    would like to validate the well-formedness of
    the data before attempting to pull the assembly
    into a design.
  • Maintaining/updating an archive - a standard data
    model is updated to a new version level; a
    portion of the data in an archive service is in
    the older format; the decision is made to update
    the data in the archive service to the new
    format; an application checks the data out of the
    archive service, updates the data using the
    new data model, and checks it back into the
    archive.

23
What's next? (Completing the prototype)
  • Architecture development
      • UML Model (50%)
      • Naming Convention (50%)
      • Linking ontology (25%)
  • Server configuration
      • 2nd and 3rd DD test nodes (33%)
      • Wrapping existing DD DBs (10%)
  • Client configurations
      • LDAP URL (75%)
      • Java (33%)
      • Python (33%)
      • Perl (33%)
      • C/C++ (75%)
      • Unix Shell (25%)
      • PHP (25%)
      • Native clients (25%)
  • Security Configuration
      • Government (25%)
      • Commercial (25%)