The Earth System Grid Discovery and Semantic Web Technologies

1
The Earth System Grid Discovery and Semantic Web
Technologies
  • Line Pouchard
  • Oak Ridge National Laboratory
  • Computer Science and Mathematics Division
  • Luca Cinquini, Gary Strand
  • National Center for Atmospheric Research

Scientific Web Technologies for Searching and
Retrieving Scientific Data. ISWCII, Sanibel
Island, FL, October 20, 2003
2
  • A geographically distributed team of climate and
    computer scientists
  • Climate scientists are our target users
  • 20-100 simultaneous users
  • Scientists providing expertise and leadership to
    the Intergovernmental Panel on Climate Change
    (IPCC)
  • A computing and data Grid collaboratory
    sponsored by the US Department of Energy
  • A distributed system for storage, access, and
    discovery of post-processing data resulting from
    climate simulations on supercomputers

3
ESG Collaboration Network
Grid and Network Infrastructure
4
Current Status of Climate Data
  • Data sizes (estimated to be produced in the next
    3-4 years for IPCC), types of storage, location
    of storage
  • NCAR (Boulder, CO): 8.961 TB; NERSC
    (Berkeley, CA): 3.514 TB; ORNL (Oak Ridge, TN):
    6.443 TB. Total: 18.912 TB.
  • Stored on mass storage archives, disk caches and
    tapes.
  • Data replicated at 3 locations in the US.
  • Data format conventions and simulation output
    formats
  • Minimal metadata produced or associated by
    current simulations.
  • Multiple output formats.
  • Many complex standards.
  • Discovery and retrieval
  • Datasets are not described in detail.
  • Metadata resides in the data manager's head.
  • Largely manual access.
  • Different access mechanisms at different sites.

Far from seamless automated data discovery and
access
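
As a quick check on the per-site sizes quoted in this slide, the figures can be tallied. This is only a minimal sketch: the name `site_sizes_tb` is made up for illustration, and the values are copied from the slide as transcribed.

```python
from decimal import Decimal

# Per-site data sizes in terabytes, copied from the slide as transcribed.
site_sizes_tb = {
    "NCAR": Decimal("8.961"),
    "NERSC": Decimal("3.514"),
    "ORNL": Decimal("6.443"),
}

# Decimal avoids binary floating-point rounding in the sum.
total_tb = sum(site_sizes_tb.values(), Decimal("0"))
print(f"Total: {total_tb} TB")
```

The per-site figures as transcribed add up to 18.918 TB, which differs slightly from the total quoted on the slide, so one of the digits may have been mistyped somewhere along the way.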
5
ESG goals for search and retrieval
  • Enable searches and downloads through a seamless
    process
  • Data search across multiple sites and storage
    locations.
  • Access to all ESG functionality from the desktop
    through a single point of entry (a Web Data
    portal).
  • Some degree of access control (authentication,
    certificates).
  • Keep track of datasets, particularly on deep
    storage (archives, caches, tapes)
  • Data formats
  • Find related datasets: campaigns, ensembles
  • Simulation model descriptions and configurations
  • Related simulations: parent, child, sibling
  • Browsable, searchable, and extensible metadata
  • Several levels of users
  • Easy-to-use, integrated tools (otherwise, no one
    will use them)
  • Collaborate with other groups: CCLRC e-Science
    Center and the British Atmospheric Data Center.

6
Discovery Ontology and Metadata Services
7
Motivations for a prototype ontology
  • Development of an ESG metadata schema
  • Help structure and guide the development efforts
  • Provide a context
  • Trust
  • Provenance and logistic information
  • Data quality and curation
  • Prepare for a federation of data sources and
    interoperability between metadata schemas
  • The ability to perform searches across these
    sources from a single point of entry.

8
ESG ontology concepts and relationships
  • Datasets
  • File names (which tell a lot)
  • Formats and conventions
  • Coverage (space, time, multi-dimensional physical
    grids)
  • Calendar years
  • Parameters
  • Related datasets
  • Campaigns
  • ESG Service
  • Used_by
  • Pedigree
  • Participants, roles in ESG
  • Provenance: traces origins and transformations
  • Is_generated_by
  • Storage location
  • Scientific use: simulations
  • has_parent, has_child, has_sibling
  • Input_type
  • hardware_type
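
The relationships listed above can be pictured as (subject, predicate, object) statements. The following is a minimal sketch only: the dataset and run names are hypothetical, and reading has_child as the inverse of has_parent and has_sibling as "shares a parent" is an assumption, not something the slide states.

```python
# Hypothetical statements that reuse relationship names from the slide.
triples = [
    ("dataset_1", "is_generated_by", "run_b"),
    ("run_b", "has_parent", "run_a"),
    ("run_c", "has_parent", "run_a"),
]

def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def children(parent):
    """has_child read as the inverse of has_parent (an assumption)."""
    return [s for s, p, o in triples if p == "has_parent" and o == parent]

def siblings(run):
    """Runs that share a parent with `run` (assumed meaning of has_sibling)."""
    found = set()
    for parent in objects(run, "has_parent"):
        found.update(c for c in children(parent) if c != run)
    return sorted(found)

print(siblings("run_b"))
```

Storing only has_parent and deriving the other two keeps the statements from contradicting each other; whether the ESG ontology does this is not stated in the slides.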

9
Guiding principles for the development of an ESG
ontology
  • Separate entities describing things from
    entities describing processes.
  • Decouple concepts specific to a domain area from
    those common to other (Grid) projects.
  • Keep terminology intuitive to users.
  • Make relationships between XML elements explicit.
  • Ontology tools were used to analyze current ESG
    schemas at every stage of development.

10
(Diagram: ESG ontology classes. Cardinalities are given as min,max.)
LEGEND: AbstractClass, Class; isA = inheritance; other labeled edges = association
  • Object: 1 id
  • Person: 0,1 firstName; 0,1 lastName; 0,1 contact
  • Institution: 0,1 name; 0,1 type; 0,1 contact
  • Activity: 0,1 name; 0,1 description; 0,1 rights;
    0,n date (type); 0,n note; 0,n participant (role);
    0,n reference (uri)
  • Project: 0,n topic (type); 0,1 funding
  • Service: 0,1 name; 0,1 description
  • Campaign
  • Investigation
  • Ensemble
  • Experiment
  • Analysis
  • Observation
  • Simulation: 0,n simulationInput (type); 0,n
    simulationHardware; hasParent, hasChild, hasSibling
  • Dataset: 0,1 type; 0,1 conventions; 0,n date
    (type); 0,n format (type, uri); 0,1 timeCoverage;
    0,1 spaceCoverage
  • Edge labels: isA, isPartOf, worksFor, participant
    role, serviceId, generatedBy
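
One way to read the cardinalities in the diagram is that `1` marks a required field, `0,1` an optional one, and `0,n` a possibly empty list. The sketch below models two of the classes under that reading. The field types, the single-valued `has_parent`, the choice of Simulation as the target of `generated_by`, and the direct inheritance of `id` from Object are all assumptions, since the transcript does not preserve which boxes the edges connect.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Object:
    id: str  # cardinality 1: required

@dataclass
class Simulation(Object):
    # 0,n cardinalities become lists that may be empty.
    simulation_input: List[str] = field(default_factory=list)
    simulation_hardware: List[str] = field(default_factory=list)
    has_parent: Optional["Simulation"] = None  # assumed single-valued

@dataclass
class Dataset(Object):
    # 0,1 cardinalities become optional fields.
    type: Optional[str] = None
    conventions: Optional[str] = None
    time_coverage: Optional[str] = None
    space_coverage: Optional[str] = None
    formats: List[str] = field(default_factory=list)
    generated_by: Optional[Simulation] = None  # assumed target class

# Hypothetical instances, for illustration only.
control = Simulation(id="sim-001", simulation_hardware=["hypothetical-cluster"])
ds = Dataset(id="ds-001", conventions="hypothetical-convention",
             generated_by=control)
print(ds.generated_by.id)
```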
11
(No Transcript)
12
Discovery Services Architecture
(Diagram)
  • Components: ESG Portal, Discovery Service,
    Metadata Catalog Service, Replica Location
    Services, Storage
  • Labels on connections: Searches, Metadata,
    Logical File Names, Physical File Names, Storage
    Location, Download
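
The components named in this slide suggest a two-step lookup: a metadata search yields logical file names, which are then resolved to physical file names and storage locations before download. The sketch below illustrates that idea with in-memory dictionaries standing in for the Metadata Catalog Service and the Replica Location Services. Every name, attribute, and URL in it is hypothetical, and it does not reflect the actual ESG or Globus APIs.

```python
# In-memory stand-ins; the real services are remote and have their own APIs.
metadata_catalog = {
    "lfn:run_a.monthly.nc": {"parameter": "temperature", "campaign": "campaign_x"},
    "lfn:run_a.daily.nc": {"parameter": "precipitation", "campaign": "campaign_x"},
}
replica_locations = {
    "lfn:run_a.monthly.nc": [
        "gsiftp://site-a.example.org/archive/run_a.monthly.nc",
        "gsiftp://site-b.example.org/cache/run_a.monthly.nc",
    ],
    "lfn:run_a.daily.nc": ["gsiftp://site-a.example.org/archive/run_a.daily.nc"],
}

def search_metadata(**criteria):
    """Return logical file names whose metadata matches every criterion."""
    return sorted(
        lfn for lfn, attrs in metadata_catalog.items()
        if all(attrs.get(key) == value for key, value in criteria.items())
    )

def locate(lfn):
    """Return the physical file names registered for a logical file name."""
    return replica_locations.get(lfn, [])

def discover(**criteria):
    """Search metadata, then resolve each hit to its physical replicas."""
    return {lfn: locate(lfn) for lfn in search_metadata(**criteria)}

hits = discover(parameter="temperature")
for lfn, replicas in hits.items():
    print(lfn, "->", replicas)
```

Keeping logical names separate from physical locations means a file can be replicated or moved between sites without touching the metadata records that describe it.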
13
Leveraging Semantic Web efforts in Grid projects
  • The Semantic Web
  • Highlighted the need for sharing information
    based on content.
  • Provided web-based languages for knowledge
    acquisition and reasoning.
  • Offers directions for ontology reconciliation.
  • Ontologies exist in the Earth Sciences.
  • Challenges presented by ESG
  • Real-life complexity.
  • Scientists, both beginners and expert users,
    demand usability
  • Measures of success.
  • Changing a scientist's work habits requires an
    immediate and visible payoff
  • Data sizes: scalability of the approach.