1
Gold Compatibility Criteria and Review Process
  • Robert Freimuth, Salvatore Mungal, Scott Oster,
    Lynne Wilkens
  • Daniela Smith, Michael Keller

caBIG Annual Meeting, Washington, D.C., June 25, 2008
2
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A

3
Working Groups
  • Programming and Messaging Interfaces
      • Tahsin Kurc, OSU
      • Patrick McConnell, Duke University
      • Scott Oster, OSU
      • Andy Pople, University of Pittsburgh
  • Common Data Elements
      • Dianne Reeves, NCI CBIIT
      • Baris Suzek, Georgetown
      • Lynne Wilkens, University of Hawaii
  • Information Models
      • Bob Freimuth, Mayo Clinic
      • Lewis Frey, University of Utah
      • Rakesh Nagarajan, Washington University
  • Vocabularies
      • Jim Buntrock, Mayo Clinic
      • Sal Mungal, Duke University
      • Craig Stancl, Mayo Clinic
      • Stuart Turner, UC-Davis
      • Larry Wright, NCI/OC
  • Working Group Facilitators
      • Brian Davis, 3rd Millennium
      • Michael Keller, BAH
      • Daniela Smith, BAH
  • NCICB Facilitators
      • George Komatsoulis
      • Avinash Shanbhag

4
Introduction: Levels of Maturity
  • Legacy
      • System does not meet any of the requirements for
        interoperability
      • Implies no interoperability with an external
        system or resource
  • Bronze
      • Minimum requirements for a basic degree of
        interoperability
  • Silver
      • Rigorous set of requirements that significantly
        reduce the barrier to use of a resource by a
        remote party who was not involved in the
        development of that resource
  • Gold
      • Full semantic interoperability of disparate
        systems
      • Formalized grid architecture and data standards
      • Advertising, discovery, and use of all federated
        caBIG resources

5
Introduction
  • The need to incorporate additional criteria into
    the Gold maturity level of the caBIG
    Compatibility Guidelines is being driven by:
      • The release of the caGrid 1.0 Infrastructure
      • The experience of the caGrid 1.0 reference
        implementation projects
      • The experience of early adopters of caGrid 1.0
      • The work done on UML model harmonization
      • The work done on vocabulary standards
      • The work done on CDE re-use

6
Introduction: Compatibility Guidelines v3.0
  • Released May 1, 2008
  • Gold Compatibility requirements
      • Information Models
      • Data Elements
      • Vocabularies/Terminologies and Ontologies
      • Programming and Messaging Interfaces
  • Clarification/revision of Silver Compatibility
    requirements
  • Available through the caBIG web site:
    https://cabig.nci.nih.gov

7
Compatibility Guidelines: https://cabig.nci.nih.gov
8
Introduction: Development of Gold Criteria
  • Goal
      • Development and release of the Gold compatibility
        criteria and review process
  • Approach
      • Kick off Working Group composed of XCWS
        participants (Jan 2008)
      • Sub-groups focused on Interfaces, CDEs,
        Vocabularies and Information Models
      • Each sub-group generated review criteria based on
        Compatibility Guidelines v3.0
      • Sub-Group Leads are identifying overlap and
        harmonizing the checklists (June)
      • The Sub-Group Leads will also develop a review
        process (June)
      • The review process and review criteria will be
        piloted (July)
      • Lessons learned will be captured
      • Review criteria and process will be updated as
        needed

9
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A

10
Information Models: From Silver to Gold
  • Silver level
      • Criteria for UML modeling
      • Criteria for semantic annotation
  • Gold level
      • Criteria to meet infrastructure requirements
      • Criteria to promote interoperability

11
Information Models: Criteria for Semantic Annotation
  • Criteria for semantic annotation are the same as
    for Silver level
  • UML names should accurately convey their meaning
    and be consistent with the definition.
  • This might be upgraded to "absolute" (TBD)
  • UML definitions should accurately describe what
    they represent and be sufficiently clear to
    enable accurate semantic annotation.
  • The concepts assigned to each class and attribute
    must be synonymous (consistent) with the
    developer-derived UML definition.

12
Information Models: Overview of Criteria
  • Criteria to meet infrastructure requirements
    (caGrid)
  • The entire IM must be fully represented in the
    XML schema
  • Names for classes and attributes (TBD)
  • Criteria to promote interoperability
  • Model harmonization and reuse
  • Analytical services: emphasis on reuse of whole
    classes for input objects, output objects, and
    parameters
  • Data services: emphasis on reuse of components
    from the backbone model
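
The requirement that the entire information model be fully represented in the XML schema lends itself to an automated check. The following is a minimal sketch only, not caGrid tooling: it assumes the UML model has been exported to a plain dictionary of class names and attribute names, and that each class maps to one named complexType in the XSD. The function name `missing_from_schema` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace, ElementTree notation


def missing_from_schema(model, xsd_text):
    """Return model classes/attributes that have no counterpart in the XSD.

    model: dict mapping class name -> set of attribute names (a hypothetical
    export of the UML model). Assumes one named complexType per class.
    """
    root = ET.fromstring(xsd_text)
    schema = {}
    for ctype in root.iter(XS + "complexType"):
        name = ctype.get("name")
        if name is None:
            continue  # skip anonymous types
        members = {e.get("name") for e in ctype.iter(XS + "element")}
        members |= {a.get("name") for a in ctype.iter(XS + "attribute")}
        schema[name] = members

    missing = []
    for cls, attrs in sorted(model.items()):
        if cls not in schema:
            missing.append(cls)  # whole class is absent from the schema
            continue
        for attr in sorted(attrs):
            if attr not in schema[cls]:
                missing.append(f"{cls}.{attr}")
    return missing


# Made-up example data for illustration only.
XSD_EXAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Gene">
    <xs:sequence>
      <xs:element name="symbol" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string"/>
  </xs:complexType>
</xs:schema>"""

MODEL_EXAMPLE = {"Gene": {"symbol", "id", "fullName"}, "Protein": {"name"}}

print(missing_from_schema(MODEL_EXAMPLE, XSD_EXAMPLE))
```

An empty result would indicate that, under these assumptions, every class and attribute in the model appears in the schema.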

13
Information Models: Overview of Criteria
  • Criteria for reuse
  • Classes and attributes
  • Focus on the backbone model
  • Including associations, if applicable
  • Datatypes
  • Reused, or caBIG-approved
  • Enumerated value domains
  • Included in the model
  • Modeled same way, if reused
  • Reused components must be appropriate in the
    context of the information model and accurately
    capture the semantic meaning of the underlying
    data.

14
Information Models: Overview of Criteria
  • Preferred order of reuse
      • CDEs from the Backbone Model
      • Standard CDEs that are not in the Backbone Model
      • CDEs from existing Gold level applications
      • CDEs from existing Silver level applications
      • CDEs registered in the caDSR
  • For the Information Models criteria, "CDE"
    refers to the corresponding class/attribute pair
    in the UML model
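
The preferred order of reuse can be read as a simple ranking over the source of each candidate CDE. The sketch below illustrates that reading under stated assumptions: the source labels, the `CandidateCde` type, and the sample candidates are all hypothetical and are not part of the Compatibility Guidelines.

```python
from dataclasses import dataclass

# Source categories in the preference order listed on the slide
# (most preferred first). The string labels are hypothetical.
REUSE_PREFERENCE = [
    "backbone_model",
    "standard_cde",
    "gold_application",
    "silver_application",
    "cadsr_registered",
]


@dataclass(frozen=True)
class CandidateCde:
    name: str    # label for a class/attribute pair in the UML model
    source: str  # one of REUSE_PREFERENCE


def rank_candidates(candidates):
    """Return candidates sorted from most to least preferred source."""
    return sorted(candidates, key=lambda c: REUSE_PREFERENCE.index(c.source))


# Made-up candidates for illustration only.
candidates = [
    CandidateCde("Person.birthDate (caDSR)", "cadsr_registered"),
    CandidateCde("Person.birthDate (backbone)", "backbone_model"),
    CandidateCde("Patient.birthDate (Silver app)", "silver_application"),
]

ranked = rank_candidates(candidates)
print([c.source for c in ranked])
```

A ranking like this only orders the options; whether a given component is appropriate in context still needs human review.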

15
Information Models: Overview of Criteria
  • Full reuse of classes and attributes from the
    backbone model is required
  • Justification and examples of interoperability
    are required otherwise
  • If a new CDE is required, partial reuse should be
    maximized
  • See order of preference
  • Justification and examples of interoperability
    are required in some cases
  • Developers will contribute to the evolution of
    the backbone model
  • Expansion with new attributes
  • Extension with new classes/attributes
  • Requires data to be a specialization of an
    existing class in the backbone model
  • Justification is required if similar, existing
    classes in the backbone model cannot be expanded
    or extended
  • Contributions will occur indirectly and through
    a controlled process

16
Information Models: Changes to the Submission Package
  • CDE reuse report
  • Breakdown by the source of each component
  • UML model reuse report
  • Identifies reuse of and deviations from the
    backbone model
  • Includes a list of "whole class" reuse for
    analytical services
  • Examples of interoperability (grid joins)
  • Illustrates use cases for how the application
    will interoperate with existing applications
  • Link to the XML schema

17
Information Models: Issues for Future Discussion
  • Review the need for additional criteria
  • Content of enumerated VDs (PVs)
  • Non-enumerated VDs (number ranges, string
    character limits, etc)
  • Currently there is no way to represent this in
    the UML model
  • Semantics of associations
  • List of approved datatypes
  • Use of approved datatypes is required at Gold
    level
  • Evolution of the backbone model
  • When the backbone model is extended, do the new
    attributes belong in the base class or in a child
    class?
  • Process for versioning the backbone model
  • Clarify the requirements for existing Gold
    applications when the backbone model is revised

18
Information Models: Resource Requirements
  • Tooling needs
  • Enumerated VDs
  • Should be included in the model under review, and
    should also map to an existing VD in the caDSR
  • This will require two models, one for review and
    one for loading
  • Non-enumerated VDs (number ranges, string
    character limits, etc)
  • Include this information as tagged values?
  • Creation of reports for the submission package
  • CDE reuse report, UML model reuse report,
    vocabulary report, etc
  • Correspondence checks
  • UML model, XMI, caDSR, API, API docs, XML
    schema
  • Process requirements
  • May require more engagement by mentors to ensure
    that the backbone model is considered for reuse
    early in development

19
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A

20
Common Data Elements
  • Basis of CDE criteria
  • Elements must be well-formed and defined in order
    for data to be joined and shared
  • Silver level compatibility criteria
  • Ensured that ISO 11179 criteria are met
  • Pairing of Data Element Concept (Object/Property)
    and Value Domain
  • Registration in caDSR, an ISO 11179 CDE
    repository
  • Required that all administered elements have good
    semantics
  • Names are appropriate
  • Definitions exist and are clear
  • NCI Thesaurus codes in caDSR
  • Consistency between UML model and caDSR
    definitions
  • Gold level compatibility criteria
  • Focus on re-use of CDE standards and existing
    CDEs to facilitate interoperability

21
Common Data Elements
  • CDE Criteria in Gold Compatibility Matrix
  • CDEs designated as caBIG Standards by the VCDE
    workspace must be used as appropriate.
  • CDEs generated from the Backbone Model must be
    re-used as appropriate.
  • Existing validated CDEs in the caDSR must be
    re-used or otherwise justified before any new
    data elements are created.
  • Data elements must be expressed in caGrid
    standard metadata format
  • The data elements used by the service as part of
    its operations must be fully described in the
    caGrid metadata to facilitate effective
    discovery, advertisement and interoperability.

22
Common Data Elements
  • caBIG Standards by the VCDE workspace
      • About 20 CDE packages including 140 CDEs have
        been approved by VCDE
        (https://gforge.nci.nih.gov/frs/?group_id=109)
      • Registration status of "Standard" in caDSR
      • Examples: sex/gender, age, address, performance
        status
      • Re-use is an absolute requirement
      • Full re-use requires use of all administered
        elements: object/class, property/attribute,
        value domain, permissible values
      • Violation requires strong justification
        (regulatory requirements)
  • caBIG Backbone Model
      • CDEs in Backbone Model will be registered in
        caDSR
      • Will be promoted as standards
      • Re-use is an absolute requirement
  • Validated CDEs
      • Workflow status of "Released" in caDSR
      • Provide list of those reviewed
      • Provide explanation of lack of re-use
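
The distinction between full and partial re-use can be illustrated by comparing the four administered elements named above. This is a rough sketch under simplifying assumptions: each element is compared by exact equality, and the `Cde` type, field names, and sample values are hypothetical rather than taken from caDSR.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Cde:
    object_class: str
    property_name: str
    value_domain: str
    permissible_values: frozenset


def reuse_level(candidate, standard):
    """Classify re-use of a standard CDE as 'full', 'partial', or 'none'.

    Returns the level and the list of administered elements that match.
    """
    names = [f.name for f in fields(Cde)]
    shared = [n for n in names if getattr(candidate, n) == getattr(standard, n)]
    if len(shared) == len(names):
        return "full", shared
    if shared:
        return "partial", shared
    return "none", shared


# Made-up values for illustration only.
gender_values = frozenset({"Male", "Female", "Unknown"})
standard = Cde("Person", "gender", "GenderCode", gender_values)
candidate = Cde("Patient", "gender", "GenderCode", gender_values)

print(reuse_level(candidate, standard))
```

In this example the candidate differs only in object class, so it would count as partial re-use and, per the criteria above, would need justification if a standard CDE applies.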

23
Common Data Elements
  • Other Re-use Issues
  • For CDEs that would logically have a list of
    permissible values (PVs), enumeration or
    enumeration by reference is required
  • List of PVs with definitions must be available to
    facilitate re-use
  • For newly created CDEs, partial re-use of
    administered elements of standards is encouraged
    to facilitate interoperability
  • Data Element Concept re-use (Object/Class and
    Attribute/Property)
  • Value Domain re-use

24
Common Data Elements
  • caGrid service metadata
  • XML document must be provided
  • Concept codes for object, properties, value
    domain and permissible values in XML document
    must agree with caDSR mapping
  • Requirement is absolute
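
The agreement check between the service metadata document and the caDSR mapping could be sketched as follows. The XML layout here (`item` elements with `concept` children) is invented for illustration and is not the actual caGrid service metadata schema; the concept codes are placeholders, and the caDSR mapping is assumed to have been fetched separately into a dictionary.

```python
import xml.etree.ElementTree as ET


def concept_code_mismatches(metadata_xml, cadsr_codes):
    """Compare concept codes in a metadata document against a caDSR mapping.

    cadsr_codes: dict mapping item name -> ordered list of concept codes.
    Returns a dict of item name -> (codes found in XML, codes expected).
    Assumes code order is significant, so lists are compared as-is.
    """
    root = ET.fromstring(metadata_xml)
    found = {
        item.get("name"): [c.get("code") for c in item.findall("concept")]
        for item in root.iter("item")
    }
    mismatches = {}
    for name in sorted(set(found) | set(cadsr_codes)):
        in_xml = found.get(name, [])
        expected = cadsr_codes.get(name, [])
        if in_xml != expected:
            mismatches[name] = (in_xml, expected)
    return mismatches


# Hypothetical document layout and placeholder codes, for illustration only.
METADATA_EXAMPLE = """<serviceMetadata>
  <item name="Person"><concept code="C0001"/></item>
  <item name="Person.gender"><concept code="C0002"/></item>
</serviceMetadata>"""

CADSR_EXAMPLE = {"Person": ["C0001"], "Person.gender": ["C0003"]}

print(concept_code_mismatches(METADATA_EXAMPLE, CADSR_EXAMPLE))
```

Because the requirement is absolute, a reviewer would expect this kind of check to return no mismatches.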

25
Common Data Elements
  • Tooling needs
  • Tooling to compare concept codes between service
    metadata, UML model and caDSR
  • Report of use of concept codes in application
    under review within caDSR
  • Report of use of similar concepts based on name
  • Helpful if caDSR could identify like CDEs
  • In particular, CDEs with the same attributes
    except that object class of person is substituted
    with patient or participant
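
The "like CDEs" report described above could be approximated by grouping CDEs on property and value domain while treating person, patient, and participant as interchangeable object classes. The sketch below is an illustration only: the tuple layout and sample data are hypothetical, and it does not call any caDSR API.

```python
from collections import defaultdict

# Object classes treated as near-equivalent, per the example on the slide.
EQUIVALENT_OBJECT_CLASSES = {"person", "patient", "participant"}


def find_like_cdes(cdes):
    """Group CDEs that differ only in an equivalent object class.

    cdes: iterable of (object_class, property, value_domain) tuples.
    Returns a dict of (property, value_domain) -> sorted object classes,
    keeping only groups with more than one object class.
    """
    groups = defaultdict(set)
    for object_class, prop, value_domain in cdes:
        if object_class.lower() in EQUIVALENT_OBJECT_CLASSES:
            groups[(prop, value_domain)].add(object_class)
    return {key: sorted(objs) for key, objs in groups.items() if len(objs) > 1}


# Made-up CDEs for illustration only.
cdes = [
    ("Person", "birthDate", "Date"),
    ("Patient", "birthDate", "Date"),
    ("Participant", "gender", "GenderCode"),
    ("Specimen", "birthDate", "Date"),
]

print(find_like_cdes(cdes))
```

Each group returned is a candidate for re-use review rather than proof of duplication.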

26
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A

27
Vocabularies
  • Vocabulary Criteria will map to:
  • Full adoption of caBIG vocabulary standards as
    approved by the VCDE workspace.
  • Concept identification in systems must use the
    caBIG Identifier and Resolution Scheme
  • Metadata of vocabularies must be accessed through
    a standard caGrid Vocabulary API
  • Vocabularies must be discovered through a
    standard caGrid Vocabulary API

28
Vocabularies
  • Questions/Challenges to Address
  • Will many vocabularies really satisfy our
    vocabulary review criteria for caBIG Vocabulary
    standards?
  • How can we develop review criteria around Concept
    IDs and their resolution on caGrid as this is
    still under development?
  • What is the relationship of vocabulary
    standardization process to the Silver/Gold
    Compatibility Guidelines?

29
Changes in the Compatibility Guidelines: Silver
Vocabulary and Ontologies Matrix
  • Differentiated from bronze and silver where all
    data collection fields and attributes of data
    objects are approved by caBIG VCDE Workspace
  • Vocabularies used in data elements should be
    compatible with caBIG Identifier and Resolution
    Scheme
  • Approved vocabularies will provide a minimum set
    of core metadata
  • Approved vocabularies will be classified based on
    scope, intent, and purpose

30
Changes in the Compatibility Guidelines: Silver
Vocabulary and Ontologies Text
  • Updated to reflect review checklist
  • Vocabularies/Ontologies will be assessed via
    LexEVS on the grid (LexBIG and EVS will merge
    under the name of LexEVS)
  • Added description of caBIG Identifier Scheme for
    semantic classes
  • Added description of the caBIG Identifier
    Resolution Scheme for resolving identifiers

31
Changes in the Compatibility Guidelines: Gold
Vocabulary and Ontologies Matrix
  • Differentiated from Silver based on usage of
    common identifier scheme and common vocabulary
    API
  • Full adoption of approved caBIG vocabulary
    standards
  • Vocabularies will utilize the caBIG Identifier
    and Resolution Scheme
  • Vocabularies will be accessible through a
    standard vocabulary API
  • Compatible systems will reference standard
    vocabularies approved for use by gold systems

32
Changes in the Compatibility Guidelines: Gold
Vocabulary and Ontologies Text
  • Added detail on gold compatibility
  • Approved caBIG vocabulary standards are enabled
  • Registered terminologies approved as caBIG
    standards for caBIG usage are accessed via
    terminology metadata and discovered through a
    caGrid vocabulary service (caGrid Vocabulary API)
  • Vocabulary is accessible through a standard
    caGrid vocabulary API
  • The current caGrid Vocabulary API is EVS. LexBIG
    and EVS will merge under the name of LexEVS

33
Gold Vocabularies
  • Tooling needs
  • Tooling to confirm correspondence between concept
    IDs, names, and/or definitions in the Information
    Model with the source terminology.
  • Tooling to confirm correspondence between concept
    IDs, names, and/or definitions in the Service
    Metadata with the source terminology.
  • These tools will not be implemented until the
    Resolution Scheme is fully implemented

34
Changes in the Compatibility Guidelines: Gold
Vocabulary and Ontologies Text
  • Recap: Gold Vocabulary and Ontologies are
  • Accessed and discovered via caGrid services
  • Provided with a standard set of metadata
  • Mapped and implemented with caBIG Identifier and
    Resolution Scheme
  • Classified based on scope, purpose and intent
  • Creation of tools needed to help the vocabulary
    review process

35
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A

36
Interfaces
  • Interface Criteria will map to:
  • APIs are exposed as operations of a grid service;
    object-oriented client APIs are available for
    invoking those operations
  • Service operations use XML as data exchange
    format, and are invoked using standardized
    protocols and communication channels
  • Services provide public access to caGrid
    standardized service metadata and have capability
    to register it with a caGrid Index Service
  • Data-oriented services provide query access using
    the caGrid standardized query interface and
    language
  • Secure services must use the caGrid standardized
    mechanisms for authentication, trust management,
    and communication channel protection
  • Questions/Challenges to Address
  • What is the distinction between reviewing a
    Silver API vs. a Gold API?
  • Tooling for tedious consistency checking of
    various artifacts
  • How will schemas in the GME be mapped to UML
    models?

37
Changes in the Compatibility Guidelines: Gold API
(Grid Services)
  • The primary change for Gold APIs is the move to
    grid services and APIs, where data is transported
    over the grid as well-defined XML
  • Tooling exists to make the development experience
    very similar to any existing Silver API which is
    client/server based
  • However, Gold compliance additionally requires:
  • Adherence to several standards and specifications
  • Standardized approaches to metadata and security
  • Specific (additional) constraints for data query
    capabilities
  • Review process will focus on checking for
    standards compliance and consistency between
    existing artifacts (UML models, APIs, etc) and
    new grid-specific artifacts (WSDLs, XSDs, service
    metadata, etc)

38
Changes in the Compatibility Guidelines: Gold API
(Metadata)
  • Gold compliance introduces the concept of
    service metadata to all systems which are
    exposed as grid services
  • Provides programmatic runtime access to metadata
    about the API, information model, CDEs, and
    vocabulary
  • Tooling exists to automate the development
    experience such that most information is
    extracted from existing systems (e.g. caDSR) and
    metadata is created automatically
  • Review process will focus on checking for
    existence, syntax compliance, accuracy,
    registration, and consistency of metadata

39
Changes in the Compatibility Guidelines: Gold API
(Security)
  • Gold compliance unifies standards, technologies,
    and methodologies for authentication,
    authorization, message transport, and trust
  • Built upon X.509, HTTPS, and web/grid service
    standards
  • Tooling exists to simplify accrual and use of
    credentials, management of trust, and service
    security configuration
  • Review process will focus on appropriate use of
    the authentication process, integration with the
    caBIG trust fabric, and use of standards and
    technologies for transport

40
Changes in the Compatibility Guidelines: Gold API
(Applications)
  • Gold compliant applications are expected to
    correctly leverage (secure) grid services, make
    use of the discoverable nature of the grid,
    present data using registered semantics, and
    build on existing tooling/languages/APIs when
    possible
  • Many high-level APIs and frameworks exist for
    application developers to leverage
  • Review process will focus on use of the grid APIs
    and tools, presentation of data, and integration
    with security infrastructure

41
Changes in the Compatibility Guidelines: Gold API
Review Challenges
  • Gold compliance introduces many new artifacts
    (grid service, metadata, WSDL, XSDs, etc)
  • Must be checked for consistency with each other
    and existing Silver artifacts (UML models, APIs,
    etc)
  • Magnitude of items to check practically requires
    automation
  • Review process is complicated if existing Silver
    review has been performed, or if both Silver and
    Gold compliance are eventually sought
  • Large systems with many components seeking
    reviews may have fuzzy boundaries (e.g. an
    application consisting of a UI and 1 or more grid
    services)
  • Some criteria are most ideally realized by
    emerging technology or software in development
    (e.g. caDSR/GME binding, identifiers, distributed
    vocabulary services, etc)
  • Gold criteria groups have more overlap/dependencies
    than Silver criteria

42
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A

43
Summary: Gold Criteria
  • Information Models
  • Model harmonization and reuse
  • UML model represented in the XML schema
  • CDEs
  • Reuse of standards
  • Concept codes in the XML document
  • Vocabularies
  • Use of vocabulary standards
  • caBIG Identifier and Resolution Scheme
  • Program Interfaces
  • Grid services and security
  • Service metadata

44
Next Steps
  • Kick-off Working Groups
  • Finalize individual checklists
  • Sub-Group Leads harmonize checklists
  • Sub-Group Leads develop review process (based on
    Silver)
  • Working Group signs off on harmonized checklists
    and process
  • Pilot review process with gridPIR (July 2008)
  • Develop lessons learned
  • Modify review criteria and process as needed
  • Present to Architecture and VCDE WS for review
    and approval

45
Agenda
  • Introduction
  • Overview of Gold Criteria
  • Information Models
  • Common Data Elements
  • Vocabularies
  • Program Interfaces
  • Summary
  • Q&A