Title: A new Generation of end-to-end Cyberinfrastructure and Data Services for Earth System Science Education and Research
1A new Generation of end-to-end Cyberinfrastructure
and Data Services for Earth System Science
Education and Research
- EGU General Assembly 2005
- Vienna, Austria
- 29 April 2005
- Dr. Mohan Ramamurthy
- Director, Unidata Program Center
- UCAR Office of Programs
- Boulder, CO
2Principal Drivers
- Technology
- Science
- Education and Pedagogy
- Data Volume and complexity
- Social, cultural, and organizational evolution
(global community, collaboration, views on
sharing)
3Technology Evolution Enabling a New Generation
of Data Services
- Internet and the World Wide Web
- Commodity microprocessors
- Object-oriented programming
- Open standards
- Web services
- Extensible Markup Language (XML)
- Global, high-bandwidth and wireless networks
- Digital libraries
- Collaboratories
- Grid Computing/e-Science
- Data Portals and Federated, distributed Servers
- Geographic Information Systems
- Knowledge environments
- Ontologies and Semantic web
- Data mining and knowledge discovery
4Science Drivers
- Environmental problems like global change and the water
cycle transcend disciplinary as well as
geographic boundaries, requiring
multidisciplinary approaches and global teams for
solving them
- Rapid advances in observational technologies,
especially in remote sensing
- Increasing use of complex, coupled modeling
systems
Research studies on societal impact of
hurricane-related flooding involve integrating
data from atmospheric sciences, oceanography,
hydrology, geology, geography, and social
sciences.
5Science Drivers: Examples
6Education Drivers
- A holistic Earth-system science approach to
education
- Active, student-centered learning, i.e., learning
science by doing science
- Observations (data)
- Tools (models, visualization)
- Discovery
7Data Services: An Evolution
- An evolution from proprietary data systems
towards more open Data Services
- The transition is not without challenges
8Web Services
- They are self-contained, self-describing, modular
applications that can be published, located, and
invoked across the Web.
- XML-based Web services are emerging as tools
for creating next-generation distributed systems
that facilitate program-to-program interaction
without requiring user-to-program interaction.
- Because they treat heterogeneity as a fundamental
ingredient and are independent of platform and
environment, Web services can be packaged and
published on the Internet and can communicate
with other systems using common protocols.
- Emerging Web services standards such as SOAP,
WSDL, UDDI, and BPEL4WS are enabling much easier
system-to-system integration.
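To make the program-to-program idea concrete, here is a minimal sketch of a SOAP 1.1 request being built and read back using only the Python standard library. The service namespace, the `GetObservation` operation, and its parameters are hypothetical names chosen for illustration; they do not describe any actual Unidata service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (from the SOAP 1.1 specification).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, used only for this illustration.
SERVICE_NS = "http://example.org/obs"


def build_soap_request(operation, params):
    """Wrap an operation name and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_request(xml_text):
    """Recover the operation name and parameters, as a receiving service would."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    op = body[0]  # first child of Body is the requested operation
    operation = op.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return operation, params


# Hypothetical request: one program asks another for an observation.
request_xml = build_soap_request(
    "GetObservation", {"station": "KDEN", "variable": "temperature"}
)
```

Because the message is self-describing XML, the receiving program needs no knowledge of the sender's platform or language, only of the agreed envelope and operation names.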
9Challenges and Opportunities
- GOES-R (2012)
- Hyperspectral Environmental Suite (1600
channels)
- NPOESS (2009)
- Both NPOESS and GOES-R will have data rates 30-60
times the current rates
- Global, coupled models at a grid spacing of 1-5
km, integrated for multiple decades
- Heterogeneity and complexity of distributed
observing, modeling, data, and communication
systems
- Nature of data: coverage, diversity, and multiple
spatial and temporal scales
- Use of legacy and contemporary technologies
- Lack of standards and interoperability
- User community not monolithic
- Political, technological, cultural, and
regulatory barriers
- Integration with GIS and Decision Support Systems
10Data Service Attributes
- User-friendly interface (e.g., portal)
- Transparency (format, protocol, etc.)
- Customization
- Server-side subsetting, subsampling, etc.
- Aggregation
- Rich Metadata
- Integration across data types, formats, and
protocols
- Intelligent client-server approaches
- Interoperability across services
- Flexibility and Scalability
- Service chaining
- Support an array of tools
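Several of these attributes (server-side subsetting, subsampling, aggregation, and service chaining) can be illustrated with a small sketch. The functions below are hypothetical stand-ins operating on plain nested lists; they are not the API of any real data server, and a production service would work on array files rather than in-memory lists.

```python
from functools import reduce


def subset(rows, cols):
    """Return a service that keeps rows [r0, r1) and columns [c0, c1)."""
    (r0, r1), (c0, c1) = rows, cols
    return lambda grid: [row[c0:c1] for row in grid[r0:r1]]


def subsample(stride):
    """Return a service that keeps every `stride`-th row and column."""
    return lambda grid: [row[::stride] for row in grid[::stride]]


def chain(*services):
    """Service chaining: feed the output of each service into the next."""
    return lambda grid: reduce(lambda acc, service: service(acc), services, grid)


def aggregate(time_slices, request):
    """Aggregation: apply one request to many per-time grids and stack the results."""
    return [request(grid) for grid in time_slices]


# Toy 4 x 6 grid whose value encodes its position (row * 10 + column).
grid = [[r * 10 + c for c in range(6)] for r in range(4)]

# The client asks for rows 1-3, all columns, at every second point.
request = chain(subset((1, 4), (0, 6)), subsample(2))
result = request(grid)  # [[10, 12, 14], [30, 32, 34]]

# The same request applied across two time steps.
stacked = aggregate([grid, grid], request)
```

The point of doing this on the server is that only the six requested values, not the full grid, would travel over the network.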
11Broad Data Categories
- Future Data Systems must provide seamless,
end-to-end services for accessing, utilizing and
integrating data across the following data types:
- Real-time data
- Archived data
- Field and Demonstration Project and Regional
Campaign data
- Episodic (Case Study)
- Data from other disciplines (hydrology,
oceanography, cryosphere, chemistry, and biosphere:
soil, vegetation, canopy, evapotranspiration)
- GIS databases
- Multimedia educational materials
- In addition to data, a broad set of tools and
support services should be provided for the most
effective use of data
12Ideal Data Services Will Need to use Hybrid
Access Methods
- Given the very high data rates from each GOES-R
satellite, the university community will need a
hybrid solution that couples a satellite-based
reception system with a terrestrial,
Internet-based data access system
- Both local and remote data access mechanisms will
be required
- Solutions using remote data access protocols such
as OpenADDE and OPeNDAP already exist
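As a rough illustration of remote access, an OPeNDAP (DAP2) request names a variable and an index range per dimension in the URL, written as `[start:stride:stop]` with an inclusive stop. The sketch below only builds such a URL; the server address, dataset path, and variable name are hypothetical, and some clients require the brackets to be percent-encoded.

```python
def hyperslab(start, stop, stride=1):
    """Format one dimension's index range in DAP2 order: [start:stride:stop]."""
    return f"[{start}:{stride}:{stop}]"


def opendap_url(dataset_url, variable, slabs, response="ascii"):
    """Build a DAP2 request URL with a constraint expression.

    `slabs` holds one (start, stop) or (start, stop, stride) tuple per dimension.
    `response` is the DAP2 response suffix, e.g. "ascii", "dods", "dds", or "das".
    """
    constraint = variable + "".join(hyperslab(*slab) for slab in slabs)
    return f"{dataset_url}.{response}?{constraint}"


# Hypothetical dataset and variable: first time step, every 2nd latitude
# index from 10 to 20, every 5th longitude index from 0 to 99.
url = opendap_url(
    "http://example.org/dods/model.nc",
    "temp",
    [(0, 0), (10, 20, 2), (0, 99, 5)],
)
```

The subsetting happens on the server, so a client on a modest network connection receives only the requested slab.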
13Internet Data Distribution
[Figure: Internet Data Distribution diagram; labels: Model, Satellite, Radar]
14OPeNDAP/DODS Servers
15Thematic Real-time Environmental Distributed Data
Servers (THREDDS)
- To make it possible to publish, locate, analyze,
visualize, and integrate a variety of
environmental data
- Combines IDD push with several forms of pull
and digital library (DL) discovery
- About 25 data providers are partners in THREDDS
- Connecting People with Documents and Data
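The "pull" side of THREDDS rests on XML catalogs that clients read to locate datasets. The following is a minimal sketch of that idea: it parses a small hand-written catalog and resolves each dataset to an access URL. The catalog content, host name, and paths are invented for illustration, and the code handles only a single service, which is a simplification of what real catalogs allow.

```python
import xml.etree.ElementTree as ET

# THREDDS InvCatalog 1.0 namespace.
NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hand-written sample catalog; names and paths are hypothetical.
SAMPLE_CATALOG = f"""
<catalog xmlns="{NS}" name="Example catalog">
  <service name="dap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Model output">
    <dataset name="Run 00Z" urlPath="model/run00.nc"/>
    <dataset name="Run 12Z" urlPath="model/run12.nc"/>
  </dataset>
</catalog>
"""


def list_datasets(catalog_xml, host):
    """Return (name, access URL) pairs for every dataset that has a urlPath."""
    root = ET.fromstring(catalog_xml)
    service = root.find(f"{{{NS}}}service")
    base = service.get("base")
    found = []
    # iter() walks nested datasets, so container datasets are traversed too.
    for ds in root.iter(f"{{{NS}}}dataset"):
        url_path = ds.get("urlPath")
        if url_path is not None:
            found.append((ds.get("name"), host + base + url_path))
    return found


datasets = list_datasets(SAMPLE_CATALOG, "http://example.org")
```

A client that can read such a catalog can discover data without knowing the server's directory layout in advance.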
16THREDDS Interoperability
THREDDS Client Applications
GIS Client Applications
OpenGIS Protocols: WMS, WFS, WCS
OGC or OPeNDAP, ADDE, FTP protocols
OGC or proprietary GIS protocols
Metadata crosswalk
Metadata crosswalk
Open Archives Initiative (OAI) Metadata Harvesting
Digital Library Discovery Systems
17Application
Common Data Model
NetcdfDataset
NetcdfFile
ADDE
OPeNDAP
HDF5
I/O service provider
NetCDF-3
NetCDF-4
GRIB
NIDS
GINI
Nexrad
DMSP
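The layering in this diagram, where one application-facing interface sits on top of interchangeable format readers, can be sketched as a small provider registry. This is an illustrative pattern only, not the actual Common Data Model library API; the class and function names are hypothetical. The file signatures used are the published ones: `CDF` plus a version byte for classic netCDF, `\x89HDF\r\n\x1a\n` for HDF5 (which netCDF-4 files also carry), and `GRIB` for GRIB messages.

```python
class IOServiceProvider:
    """Hypothetical base class: one provider per file format."""

    name = "abstract"

    def is_valid_file(self, header: bytes) -> bool:
        raise NotImplementedError


class NetCDF3Provider(IOServiceProvider):
    name = "NetCDF-3"

    def is_valid_file(self, header):
        # Classic format: "CDF" followed by version byte 1 (or 2 for 64-bit offset).
        return header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02")


class HDF5Provider(IOServiceProvider):
    name = "HDF5"

    def is_valid_file(self, header):
        return header[:8] == b"\x89HDF\r\n\x1a\n"


class GRIBProvider(IOServiceProvider):
    name = "GRIB"

    def is_valid_file(self, header):
        return header[:4] == b"GRIB"


PROVIDERS = [NetCDF3Provider(), HDF5Provider(), GRIBProvider()]


def select_provider(header, providers=PROVIDERS):
    """Return the first provider that recognizes the file header, else None."""
    for provider in providers:
        if provider.is_valid_file(header):
            return provider
    return None


# The application layer never needs to know which format it was handed.
chosen = select_provider(b"CDF\x01\x00\x00\x00\x00")
```

Supporting a further format then means adding one provider to the list, with no change to the application code above it.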
18Remote Visualizations Using the IDV
IDV in IHOP
Thunderstorm Simulation
Sea-level Pressure and Upper-level Jet
NO2 concentration
Mantle Tomography
19Data Glut
Source: Sara Graves, 2002
20Data Mining, Knowledge Discovery and Management
End Users (Decision Makers, Students, Scientists)
- An ideal knowledge management system should:
- make use of current knowledge in the field,
- stimulate the development of new knowledge and
ideas,
- acquire knowledge transparently,
- classify and interrelate knowledge automatically,
- make knowledge globally accessible so that the
right knowledge can be obtained and effectively
utilized by any user who needs it.
[Figure: pyramid with levels Data, Information, Knowledge, Wisdom/Discovery; labels: Volume, Value; input: Satellite Data]
Source: Sara Graves, 2002
21Mining/Detection in LEAD
Data Assimilation System
Forecast Models
NEXRAD, TDWR, FAA, NETRAD Radars
Other Observations
Forecast Model Output
22TeraGrid: A $100M NSF Facility
Capacity: 20 teraflops and 1 petabyte of
disk storage, connected by a 40 Gb/s network
NSF recently funded three more institutions to
connect to the above Grid
23Typical Unidata Systems
- Typical Unidata systems at universities: $5K -
$30K
- The community normally uses commodity systems
based on 64-bit X86 (Athlon, Itanium) processors
running Unix/Linux OS
- Memory, storage, and network interface are usually
more important than processor power for data
access
24Challenges in bridging between the Grid and the
Unidata grid
- Bridging communities with a range of computing
capabilities and needs
- Many technological, cultural and systemic
challenges remain in connecting departmental and
PI computing systems with high-end Grid
environments like the TeraGrid
- With our LEAD-ITR work, we are working to build
bridges between the traditional Unidata community
and high-end computing communities.
25Education Integration
- Combining Data Objects with Learning Objects is key
to enhancing the educational value of data
- Examples: bundles and data-interactive documents
- VGEE example
- Mini-case study examples
- Need to integrate the two types of objects more
comprehensively and with metadata and ontology
services and concept maps
26Metadata, Ontologies and Knowledge Environments
- Integration of metadata (syntactic and semantic)
with ontologies and concept maps is critical in
successfully creating knowledge environments
- Ontologies: an explicit, formal specification of
how to represent the objects, concepts, and other
entities that are assumed to exist in some area
of interest and the relationships that hold among
them.
- e.g., glossaries and data dictionaries, thesauri
and taxonomies, schemas and data models, and inference
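A very small sketch can show how a taxonomy plus one inference rule differs from a flat glossary. The terms and "is-a" links below are illustrative assumptions, not an excerpt from any published Earth science ontology, and real ontology languages express far richer relationships than a single parent link.

```python
# Hypothetical taxonomy: each term points to its broader ("is-a") parent.
IS_A = {
    "hurricane": "tropical cyclone",
    "tropical cyclone": "cyclone",
    "cyclone": "atmospheric phenomenon",
    "flood": "hydrologic phenomenon",
}


def ancestors(term, taxonomy=IS_A):
    """Follow is-a links upward and return every broader term, nearest first."""
    found = []
    while term in taxonomy:
        term = taxonomy[term]
        found.append(term)
    return found


def is_a(term, category, taxonomy=IS_A):
    """Inference by transitivity: a term belongs to each of its ancestors."""
    return category == term or category in ancestors(term, taxonomy)


# A search for "cyclone" data can also match datasets tagged "hurricane".
matches = [t for t in ["hurricane", "flood"] if is_a(t, "cyclone")]
```

The stored facts never say that a hurricane is a cyclone directly; that conclusion is derived, which is what lets a knowledge environment answer broader queries than the metadata literally records.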
27Thank You!
- Questions?
- Contact information: mohan_at_ucar.edu
- http://www.unidata.ucar.edu/