OPeNDAP and THREDDS: Access and Discovery of Distributed Scientific Data

1
OPeNDAP and THREDDS: Access and Discovery of Distributed Scientific Data
  • Yuan Ho
  • Ethan Davis
  • UCAR Unidata

2
Access and Discovery of Distributed Scientific Data
  • OPeNDAP: access to scientific data, but no
    standard inventory or discovery mechanisms
  • THREDDS: cataloging, describing, and discovery
    of scientific data

3
What is OPeNDAP?
  • OPeNDAP (Open source Project for a Network Data
    Access Protocol) is a protocol for accessing
    distributed scientific data (aka DODS DAP).
  • OPeNDAP is a generic data exchange mechanism that
    lies at the core of a variety of discipline-specific
    data systems.
  • OPeNDAP is two reference implementations of the
    protocol (C++ and Java).
  • OPeNDAP is a software framework that simplifies
    all aspects of scientific data networking,
    allowing simple access to remote data.
  • OPeNDAP is a community of users and developers.
  • OPeNDAP is a non-profit corporation called
    OPeNDAP, Inc.

4
Design Principles
  • The user should be able to share their data via
    OPeNDAP over the network (server).
  • The user should be able to use their application
    package to examine or analyze the data of
    interest (client).

5
Client/Server Interaction
  • Data access (client)
    • Access to remote data in the user's normal
      application
      • IDL (win32)
      • Matlab
      • Ferret
      • GrADS
      • Any netCDF application
      • Excel
    • Don't need to know the format in which the
      data is stored
    • Can access data subsets
  • Data publishing (server)
    • Network interface via HTTP
    • DAP provides a common network representation
      for data
    • Can serve data in various formats
      • netCDF
      • HDF
      • SQL
      • FreeForm
      • JGOFS
      • DSP
    • Allows subsetting of data
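
As a rough illustration of subset access over HTTP, the sketch below builds request URLs in the DAP 2 style used by DODS servers: a suffix selects the response (`.dds`, `.das`, `.dods`) and a bracketed index range after `?` selects part of a variable. The server address and the helper names `hyperslab` and `dap_url` are hypothetical, and the slides that follow describe the XML-based DAP 4.0 form instead, so treat this as a sketch rather than the protocol definition.

```python
def hyperslab(name, *ranges):
    """Format one projected variable, e.g. u[0:1:5][3].

    Each range is either a single index or a tuple such as
    (start, stop) or (start, stride, stop).
    """
    parts = []
    for r in ranges:
        if isinstance(r, int):
            parts.append(f"[{r}]")
        else:
            parts.append("[" + ":".join(str(v) for v in r) + "]")
    return name + "".join(parts)


def dap_url(dataset_url, response, projections=()):
    """Append the response suffix and an optional projection list."""
    url = f"{dataset_url}.{response}"
    if projections:
        url += "?" + ",".join(projections)
    return url


# Hypothetical server; only the URL is built, nothing is fetched.
base = "http://server/dods/fnoc1.nc"
print(dap_url(base, "dds"))
print(dap_url(base, "dods", [hyperslab("u", (0, 1, 5), (0, 15), (0, 16))]))
```

The client never names the storage format (netCDF, HDF, and so on); it only names the dataset, the response it wants, and the index ranges.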

6
OPeNDAP Status
  • OPeNDAP/DODS 3.4 release
  • OPeNDAP Java 1.1.3
  • OPeNDAP Data Connector (ODC) 2.3X
  • OPeNDAP DAP Specification 4.0

7
OPeNDAP Data Object
  • Three important OPeNDAP data objects:
  • DDX
    • The DDX is an XML representation of the
      structure of all or part of a data set, as well
      as a description of the variables within that
      data set.
  • Blob
    • Binary data transferred from the data source to
      the client. The Blob contains the serialized
      data represented by the DDX.
  • ErrorX
    • The ErrorX object is an XML document containing
      information about any errors that may have been
      encountered by the server while processing a
      request.

8
DDX Example
    <Dataset name="fnoc1.nc"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.opendap.org/ns/OPeNDAP"
        xsi:schemaLocation="http://www.opendap.org/ns/OPeNDAP
            http://dods.coas.oregonstate.edu:8080/opendap/opendap.xsd">
      <Attribute name="Description" type="String">
        <value>Fleet Numerical Wind Data</value>
      </Attribute>
      <Array name="u">
        <Attribute name="long_name" type="String">
          <value>U_Wind_Vector</value>
        </Attribute>
        <Float32/>
        <dimension size="16" name="latitude"/>
        <dimension size="17" name="longitude"/>
        <dimension size="21" name="time"/>
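
A client can read an array's shape directly from a DDX. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the example; the closing `</Array>` and `</Dataset>` tags are added here as an assumption, since the slide is cut off, and the helper name `array_shapes` is hypothetical.

```python
import xml.etree.ElementTree as ET
from math import prod

# Namespace taken from the xmlns declaration in the example DDX.
NS = {"dap": "http://www.opendap.org/ns/OPeNDAP"}

# Shortened copy of the example; closing tags are assumed.
DDX = """<Dataset name="fnoc1.nc" xmlns="http://www.opendap.org/ns/OPeNDAP">
  <Array name="u">
    <Attribute name="long_name" type="String">
      <value>U_Wind_Vector</value>
    </Attribute>
    <Float32/>
    <dimension size="16" name="latitude"/>
    <dimension size="17" name="longitude"/>
    <dimension size="21" name="time"/>
  </Array>
</Dataset>"""


def array_shapes(ddx_text):
    """Map each top-level Array name to its (dimension name, size) list."""
    root = ET.fromstring(ddx_text)
    shapes = {}
    for array in root.findall("dap:Array", NS):
        shapes[array.get("name")] = [
            (d.get("name"), int(d.get("size")))
            for d in array.findall("dap:dimension", NS)
        ]
    return shapes


for name, dims in array_shapes(DDX).items():
    # Number of values the Blob would carry for the whole array.
    print(name, dims, prod(size for _, size in dims))
```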

9
Variables and Attributes
  • Each variable consists of a name, a type, a value,
    and a collection of attributes.
  • Atomic variables: atomic data types are
    indivisible.
    • Integer, floating-point, string, and binary
      images.
    • Example:
        <Float64 name="Depth"/>
        <Binary name="sound_sample" size="17623"/>
  • Constructor variables: a constructor variable is
    assembled from collections of other variables,
    including both atomic and constructor types.
    • Array, structure, grid, and sequence.
    • Example:
        <Array name="temp">
          <Byte/>
          <dimension size="5" name="lon"/>
          <dimension size="3" name="lat"/>
        </Array>

10
Variables and Attributes
  • An attribute is composed of a name, a type, and
    a value.
  • Each variable may have zero or more attributes.
  • Types: Boolean, Byte, IntXX, UIntXX, FloatXX,
    String, URL.
  • Example:
      <Dataset name="test">
        <Structure name="measurement">
          <Attribute name="data" type="String">
            <value> 18 Mar 03</value>
          </Attribute>
          <Attribute name="other" type="Structure">
            <Attribute name="satellite_name" type="String">
              <value>GOES</value>
            </Attribute>
            <Attribute name="experiment number" type="int32">
              <value>898976</value>
            </Attribute>
          </Attribute>
          <Float64 name="value"/>
          <Array name="time_series">
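
Because an attribute of type Structure holds further attributes, reading them is naturally recursive. The following sketch walks the attributes of the example's `measurement` structure with Python's standard library; the fragment is closed off by hand (an assumption, as the slide is truncated) and `attributes_of` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Attribute part of the example, with assumed closing tags.
FRAGMENT = """<Structure name="measurement">
  <Attribute name="data" type="String">
    <value> 18 Mar 03</value>
  </Attribute>
  <Attribute name="other" type="Structure">
    <Attribute name="satellite_name" type="String">
      <value>GOES</value>
    </Attribute>
    <Attribute name="experiment number" type="int32">
      <value>898976</value>
    </Attribute>
  </Attribute>
</Structure>"""


def attributes_of(element):
    """Return the attributes directly attached to a variable element.

    Plain attributes become (type, [values]); attributes of type
    Structure become nested dictionaries.
    """
    result = {}
    for attr in element.findall("Attribute"):
        if attr.get("type") == "Structure":
            result[attr.get("name")] = attributes_of(attr)
        else:
            values = [(v.text or "").strip() for v in attr.findall("value")]
            result[attr.get("name")] = (attr.get("type"), values)
    return result


print(attributes_of(ET.fromstring(FRAGMENT)))
```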

11
Requests/Responses
  • Responses: four categories of information pass
    from the server to the client
    • Information about the data: the DDX
    • The data: the Blob
    • Error messages: the ErrorX object
    • Information about the server: version messages
      and the server capabilities document
  • Requests: a constraint expression provides a way
    for a client to request certain information from a
    dataset, such as certain variables or parts of
    certain variables.
    • Projection clause: a collection of one or more
      Project elements
    • Selection clause: one or more Select elements
    • Example:
        <Constraint>
          <Project variable="/sample/temp"/>
          <Project variable="/sample/salt"/>
          <Select condition="/sample/salt>34.0"
                  target="sample"/>
        </Constraint>
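
A client could assemble such a constraint programmatically. The sketch below builds the same document with Python's standard `xml.etree.ElementTree`; the element and attribute names come from the example, while the function name `build_constraint` and its argument layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_constraint(projections, selections=()):
    """Build a Constraint element.

    projections: variable paths to return (the projection clause).
    selections:  (condition, target) pairs (the selection clause).
    """
    root = ET.Element("Constraint")
    for variable in projections:
        ET.SubElement(root, "Project", {"variable": variable})
    for condition, target in selections:
        ET.SubElement(root, "Select",
                      {"condition": condition, "target": target})
    return root


constraint = build_constraint(
    ["/sample/temp", "/sample/salt"],
    [("/sample/salt>34.0", "sample")],
)
# The serializer handles any escaping the condition string needs.
print(ET.tostring(constraint, encoding="unicode"))
```

The projection picks which variables come back; the selection filters the rows of `sample` to those where the condition holds.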

12
Problems of Searching and Retrieving Datasets
from an OPeNDAP Server
  • Metadata
    • Use metadata: metadata at the data level
    • Search metadata: metadata at the directory level
  • OPeNDAP has been built up from the data level, with
    high functionality at the data acquisition level.
  • OPeNDAP AIS (Ancillary Information Service) adds
    metadata to the OPeNDAP data stream. The role of
    ancillary data is to support translation of and
    access to data.
  • ODC is more of a directory service, with limited
    data searching functionality.

13
Summary of OPeNDAP
  • The OPeNDAP data delivery architecture provides
    remote access to data via the Internet.
  • OPeNDAP uses HTTP (or FTP, GridFTP, Telnet, et
    cetera) to transport its data objects.
  • OPeNDAP has proved very versatile.
  • XML is used for the persistent form of the data
    objects.
  • OPeNDAP is a data access tool; it needs a data
    discovery tool to complement it.

14
THREDDS Project
  • Develop a framework to bridge the gap between
    data providers and data users, to make scientific
    data discoverable and usable as well as
    referenceable from scientific publications and
    educational materials.
  • The framework should be:
  • Scalable for large and small projects
  • Easy to use yet powerful and flexible
  • Capable of supporting various user interfaces

15
THREDDS Catalogs
THREDDS catalogs are for communicating
information about a collection of datasets:
  • Hierarchical structure of datasets
  • Dataset access methods
  • Structure on which to hang (reference) metadata

17
THREDDS Catalogs
    <catalog version="0.6">
      <dataset name="Unidata IDD Model Data">
        <dataset name="NCEP Eta 80km CONUS model data">
          <metadata metadataType="DublinCore"
                    xlink:href="http://server/dods/eta.xml" />
          <dataset name="NCEP Eta 80km CONUS 2003-09-24 12Z">
            <access serviceType="DODS"
                    urlPath="http://server/dods/2003092412_eta.nc" />
          </dataset>
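
To show how a client might use the hierarchy, the sketch below walks a completed copy of this catalog and lists each access method together with the dataset path that leads to it. The closing tags and the `xmlns:xlink` declaration (the standard XLink namespace) are added here as assumptions so the text parses; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute name as ElementTree reports it for xlink:href.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Completed copy of the slide's catalog (closing tags assumed).
CATALOG = """<catalog version="0.6"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <dataset name="Unidata IDD Model Data">
    <dataset name="NCEP Eta 80km CONUS model data">
      <metadata metadataType="DublinCore"
                xlink:href="http://server/dods/eta.xml"/>
      <dataset name="NCEP Eta 80km CONUS 2003-09-24 12Z">
        <access serviceType="DODS"
                urlPath="http://server/dods/2003092412_eta.nc"/>
      </dataset>
    </dataset>
  </dataset>
</catalog>"""


def walk(dataset, path=()):
    """Yield (dataset path, service type, URL) for each access element."""
    here = path + (dataset.get("name"),)
    for access in dataset.findall("access"):
        yield here, access.get("serviceType"), access.get("urlPath")
    for child in dataset.findall("dataset"):
        yield from walk(child, here)


def list_access(catalog_text):
    """Collect every access method in the catalog."""
    root = ET.fromstring(catalog_text)
    found = []
    for top in root.findall("dataset"):
        found.extend(walk(top))
    return found


def metadata_links(catalog_text):
    """Collect the URLs of referenced metadata documents."""
    root = ET.fromstring(catalog_text)
    return [m.get(XLINK_HREF) for m in root.iter("metadata")]


for path, service, url in list_access(CATALOG):
    print(" / ".join(path), service, url)
print(metadata_links(CATALOG))
```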
22
THREDDS Catalogs
    <dc:title>NCEP Eta 80km CONUS model data</dc:title>
    <dc:creator>NOAA/NCEP</dc:creator>
    <dc:subject>NCEP Eta Model data Real-time data</dc:subject>
    <dc:description>
      This collection of real-time NOAA/NCEP Eta model data
      contains five days worth of data. The data is on
      a 80km CONUS grid (GRIB grid 211). Daily 00Z and
      12Z runs are available where each dataset
      includes analysis data and forecast data from a
      single Eta run. Each dataset contains forecasts
      for every 6 hours going out two and a half days
      (60hrs) from the run time.
    </dc:description>
24
THREDDS DQC (Dataset Query Capabilities)
  • THREDDS DQC documents describe how a subset of a
    data collection can be requested.
  • Large and time-varying data collections are
    cumbersome to view as a hierarchical structure.
  • THREDDS DQC documents describe the set of
    requests that can be made to one or more DQC
    services and the form of those requests.
  • THREDDS DQC documents are an abstract
    representation of a collection of datasets.

25
THREDDS DQC: Subsetting Large Collections
26
THREDDS DQC
    <?xml version="1.0" encoding="UTF-8"?>
    <queryCapability name="Unidata IDD NEXRAD Level 3 Radar Data"
                     version="0.2">
      <query base="http://motherlode.ucar.edu/cgi-bin/thredds/RadarServer.pl"
             construct="append" returns="catalog"/>
      <selectStation id="station" title="Stations"
                     multiple="true" required="true">
        <station name="ANCHORAGE/Bethel AK" value="ABC">
          <location latitude="60.78" longitude="-161.87"/>
        </station>
      </selectStation>
      <selectList id="product" title="Products"
                  multiple="true" required="true">
        <choice name=".5 reflectivity .54nm res" value="N0R"
                description=".5 reflectivity .54nm res 16 levels id 19/r"/>
      </selectList>
      <selectList id="time" title="Times" required="true">
        <choice name="Latest" value="latest"/>
      </selectList>
    </queryCapability>
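
A DQC document tells a client which selections it may make, so a client can validate a request before sending it. The sketch below reads the selectors from a copy of this document and checks a user's selection against them. It deliberately stops short of building the request URL, because the slide does not say how `construct="append"` combines the values; the function names and the selection format are assumptions.

```python
import xml.etree.ElementTree as ET

# Copy of the example document, without the XML declaration.
DQC = """<queryCapability name="Unidata IDD NEXRAD Level 3 Radar Data"
    version="0.2">
  <query base="http://motherlode.ucar.edu/cgi-bin/thredds/RadarServer.pl"
         construct="append" returns="catalog"/>
  <selectStation id="station" title="Stations"
                 multiple="true" required="true">
    <station name="ANCHORAGE/Bethel AK" value="ABC">
      <location latitude="60.78" longitude="-161.87"/>
    </station>
  </selectStation>
  <selectList id="product" title="Products"
              multiple="true" required="true">
    <choice name=".5 reflectivity .54nm res" value="N0R"
            description=".5 reflectivity .54nm res 16 levels id 19/r"/>
  </selectList>
  <selectList id="time" title="Times" required="true">
    <choice name="Latest" value="latest"/>
  </selectList>
</queryCapability>"""


def selectors(dqc_text):
    """Map each selector id to its rules and allowed values."""
    root = ET.fromstring(dqc_text)
    out = {}
    for sel in root:
        if sel.tag == "selectStation":
            values = [s.get("value") for s in sel.findall("station")]
        elif sel.tag == "selectList":
            values = [c.get("value") for c in sel.findall("choice")]
        else:
            continue  # e.g. the <query> element
        out[sel.get("id")] = {
            "required": sel.get("required") == "true",
            "multiple": sel.get("multiple") == "true",
            "values": values,
        }
    return out


def check_selection(selection, dqc_text):
    """Return a list of problems; an empty list means the request is valid."""
    problems = []
    for sel_id, rule in selectors(dqc_text).items():
        chosen = selection.get(sel_id, [])
        if rule["required"] and not chosen:
            problems.append(f"{sel_id}: a value is required")
        if len(chosen) > 1 and not rule["multiple"]:
            problems.append(f"{sel_id}: only one value allowed")
        for value in chosen:
            if value not in rule["values"]:
                problems.append(f"{sel_id}: unknown value {value!r}")
    return problems


print(check_selection(
    {"station": ["ABC"], "product": ["N0R"], "time": ["latest"]}, DQC))
```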
27
THREDDS Services
  • THREDDS catalogs are sources of information about
    a collection of data on top of which complex
    services can be built. For instance, tools that:
    • Provide interoperability with GIS systems
    • Supply external discovery systems with needed
      information (e.g., Dublin Core, DIF, FGDC)
    • Supply information to improve data display and
      analysis, e.g., geolocation information

28
THREDDS and Discovery Systems
  • To supply external discovery services with the
    information they require, we need:
    • The proper information added to a catalog, e.g.,
      title and description of a dataset, spatial and
      temporal ranges, parameters, dataset ID
    • A service to provide metadata in the desired
      encoding
    • A service to feed information to the discovery
      system
  • Use discovery systems to search for data
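
A service that provides metadata in a desired encoding could be as small as the sketch below, which turns catalog-level information into Dublin Core elements. The namespace is the standard Dublin Core element set; the wrapper element `metadata`, the function name, and the input dictionary layout are assumptions for illustration and not part of THREDDS.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a record from {"title": [...], "creator": [...], ...}."""
    record = ET.Element("metadata")  # hypothetical wrapper element
    for key in ("title", "creator", "subject", "description"):
        for value in fields.get(key, []):
            ET.SubElement(record, f"{{{DC_NS}}}{key}").text = value
    return record


record = dublin_core_record({
    "title": ["NCEP Eta 80km CONUS model data"],
    "creator": ["NOAA/NCEP"],
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A harvester could run such a generator over every dataset in a catalog and hand the results to the discovery system's repository.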

29
THREDDS and Discovery Systems
Communicate with Discovery Systems
[Diagram. Components: THREDDS services with data
server, catalog, metadata harvester, Dublin Core
generator, metadata repository, discovery system
(e.g., DLESE), data server. Arrow labels: searches,
reads, writes, references.]
30
Search and Discovery Services
31
THREDDS Status
  • Working on new versions of the catalog and DQC
    schemas
  • Working on updating existing tools to use new
    schemas
  • Working with UCAR DMWG and NCAR CDP on enhancing
    descriptive metadata
  • Working with OPeNDAP developers on integrating
    THREDDS and OPeNDAP

32
OPeNDAP and THREDDS
  • Enhance the OPeNDAP C++ implementation to serve
    THREDDS catalogs
  • THREDDS DQC to replace OPeNDAP File Servers

33
OPeNDAP and THREDDS: More Information
  • OPeNDAP Web page:
    http://www.unidata.ucar.edu/packages/dods/
  • OPeNDAP Email list: dods@unidata.ucar.edu,
    subscribe at
    http://www.unidata.ucar.edu/packages/dods/home/mailLists/
  • THREDDS Email list: thredds@unidata.ucar.edu,
    subscribe at
    http://www.unidata.ucar.edu/projects/THREDDS/maillists/
  • THREDDS Web page:
    http://www.unidata.ucar.edu/projects/THREDDS/
  • Support questions: support@unidata.ucar.edu