A NOAA/NASA Pilot Project for the Preservation of MODIS Data from the Earth Observing System (EOS)

1
A NOAA/NASA Pilot Project for the Preservation of
MODIS Data from the Earth Observing System (EOS)
  • Robert H. Rank
  • NOAA/NESDIS
  • Kenneth R. McDonald
  • NASA/GSFC

2
Topics
  • Data Life Cycle Background
  • NASA's Earth Science Data
  • NOAA Data Centers Mission
  • Guiding Principles
  • CLASS Overview and Goals
  • Long-Term Archive Challenges
  • MODIS Pilot Project
  • Reference Model
  • OAIS Responsibilities
  • Data Submission Agreement (DSA)
  • Schedule
  • Expected Outcomes
  • Experience Using OAIS
  • Conclusion

3
Data Life Cycle
A simple model showing the four major life cycle
entities within the context of an overall set of
guiding policies.
Some of these functions may be grouped together
in any given mission or project.
4
Long Term Archive Requirements
  • NASA shares the responsibility for stewardship of
    its Earth science data resources with NOAA and
    USGS.
  • NASA holds the responsibility for its data during
    the life of each mission plus four years.
  • NOAA and USGS provide the long term archive for
    ocean and atmosphere data and land processes
    data, respectively.
  • Agreements are in place between NASA and NOAA and
    between NASA and USGS that document these
    responsibilities.

5
NASA's Earth Observing System (EOS) Data and
Information System (EOSDIS)
6
EOSDIS Overview
  • EOSDIS Functions
  • A production capability for standard data
    products from EOS instruments
  • An active archive of Earth science data from
    EOS and other past and present missions
  • A distributed information framework (data
    centers, SIPS, networks, interoperability
    infrastructure)
  • EOSDIS Operations
  • Supporting EOS missions since 1999 and heritage
    data archives since 1994.
  • Operations at 8 DAACs and 13 SIPS.
  • Total archive of over 4 petabytes, growing at 4
    terabytes per day
  • Over 200,000 distinct users obtaining data from
    DAACs
  • Annual distribution of 33 million data products,
    2 TB per day.
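
The volume figures above lend themselves to a quick back-of-the-envelope check. The sketch below is only illustrative: it assumes decimal units (1 PB = 1,000 TB), a constant growth rate, and that the 2 TB per day figure is the distribution rate that goes with the 33 million products per year. None of these assumptions is stated on the slide, and the function names are hypothetical.

```python
TB_PER_PB = 1000      # assumption: decimal units; the slide does not say
MB_PER_TB = 1_000_000  # assumption: decimal units


def projected_archive_tb(current_tb, growth_tb_per_day, days):
    """Archive size after `days`, assuming the growth rate stays constant."""
    return current_tb + growth_tb_per_day * days


def days_to_double(current_tb, growth_tb_per_day):
    """Days until the archive doubles at a constant growth rate."""
    return current_tb / growth_tb_per_day


# Figures quoted on the slide
archive_tb = 4 * TB_PER_PB       # "over 4 petabytes"
growth_tb_per_day = 4            # "4 terabytes per day"
distributed_tb_per_day = 2       # "2 TB per day"
products_per_year = 33_000_000   # "33 million data products"

one_year_tb = projected_archive_tb(archive_tb, growth_tb_per_day, 365)
doubling_days = days_to_double(archive_tb, growth_tb_per_day)
avg_product_mb = distributed_tb_per_day * 365 * MB_PER_TB / products_per_year

print(f"Archive after one year: {one_year_tb} TB")         # 5460 TB
print(f"Days to double: {doubling_days:.0f}")              # 1000
print(f"Average distributed product: {avg_product_mb:.1f} MB")  # roughly 22 MB
```

Under these assumptions the archive grows by more than a third in a year, which is the scale any long-term archive receiving the data would have to plan for.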

7
NOAA's National Data Centers -- Environmental
Data Stewards
  • Scientific Data Stewardship (SDS) is ownership,
    knowledge, utilization, and application of the
    data
  • CLASS is the Information Technology
    infrastructure (hardware and software
    environment, and tools) underpinning SDS
  • Data Rescue preserves and makes available
    historical data sets from obsolete media
8
NOAA's National Data Centers
  • NOAA's National Data Centers are major archive,
    access, and assessment sites maintaining,
    processing, and distributing environmental and
    geospatial data.
  • National Climatic Data Center, Asheville, NC
    WWW.NCDC.NOAA.GOV
  • National Coastal Data Development Center,
    Stennis, MS WWW.NCDDC.NOAA.GOV
  • National Geophysical Data Center, Boulder, CO
    WWW.NGDC.NOAA.GOV
  • National Oceanographic Data Center,
    Silver Spring, MD WWW.NODC.NOAA.GOV

9
NOAA's National Data Centers (Continued)
  • These Centers provide long-term stewardship for
    most of NOAA's environmental and geospatial data,
    and a broad range of user services.
  • They serve as both:
  • Centers of Data -- facilities where extensive
    collections of given environmental parameter(s)
    are maintained because of individual or
    institutional research or operational
    requirements
  • Agency Record Centers -- facilities where data
    is made accessible to a large user community, as
    well as being preserved and protected to certain
    standards

10
Guiding Principles
All NOAA environmental data will:
  • be made accessible to broader data integration
    efforts, such as the Global Earth Observation
    System of Systems (GEOSS)
  • reside in secure archives conforming to National
    Archives and Records Administration (NARA) and
    Continuity Of Operations (COOP) standards
  • be maintained at the highest standards of
    scientific data stewardship
  • be searchable using advanced data discovery
    tools, to facilitate interdisciplinary studies
  • be accessible through a common portal available
    to the scientific community, commercial sector,
    and general public based on advanced access tools

11
Comprehensive Large Array-data Stewardship System
(CLASS)
12
WHY a CLASS?
  • Fulfill NOAA's legal requirement to provide for
    archive and access to its data
  • The source for the vast majority of observational
    environmental data generated by NOAA
  • Provide critical products to Customers
  • Public and Private Research and Development
    efforts
  • Colleges and Universities
  • Federal, State, and Local Climatologists
  • Agriculture Users, Drought Monitors, and Flood
    Management
  • Accident Investigators and Legal Community
  • Coastal Monitoring, Algae Blooms, and Fishing
    Management

13
CLASS Overview
  • CLASS is a web-based data archive and
    distribution system for NOAA's environmental data
  • CLASS is an evolving system which will support
    additional campaigns, a broader user base, and
    new functionality as implementation continues
    for the next 10 years
  • CLASS is the principal IT system supporting
    NOAA's responsibility as environmental data
    stewards
  • CLASS concurrently supports both ongoing
    operations and new requirements implementation

14
CLASS Campaigns
  • NOAA and Department of Defense (DoD)
    Polar-orbiting Operational Environmental
    Satellites (POES) and Defense Meteorological
    Satellite Program (DMSP)
  • NOAA Geostationary Operational Environmental
    Satellites (GOES)
  • EUMETSAT Meteorological Operational Satellite
    (Metop) Program
  • NOAA Next Generation Weather Radar (NEXRAD)
    Program and future dual-polarized and
    phased-array radars
  • National Aeronautics and Space Administration
    (NASA) Earth Observing System (EOS) Moderate
    Resolution Imaging Spectroradiometer (MODIS)
  • The NPOESS Preparatory Project (NPP)
  • National Polar-orbiting Operational Environmental
    Satellite System (NPOESS)
  • National Centers for Environmental Prediction
    Model Datasets, including Reanalysis Products

15
CLASS GOALS
  • Give any potential customer access to all NOAA
    (and possibly non-NOAA) data through a single
    portal
  • Eliminate the need to keep creating stovepipe
    systems for each new type of data while reusing,
    as much as possible, already polished
    portions/modules of existing legacy systems
  • Describe a cost-effective architecture that can
    primarily handle large array data sets but is
    also capable of handling smaller data sets

16
CLASS Summary
  • A NOAA-wide Data Management System (DMS) can
    evolve from CLASS by initially integrating with
    the NOAA National Data Centers and ultimately
    with the NOAA Centers of Data
  • The CLASS backbone will provide the DMS for
    large-array (largely NESDIS) data sets, but also
    provide secure archival services to other NESDIS
    and NOAA users who participate in the NMMR and
    NOAA-Server
  • This approach will leverage the resources of
    CLASS, NVDS, SDS, and the various funding
    vehicles being used by non-NESDIS NOAA
    organizational components
  • This semi-distributed architecture, with central
    data and metadata archives built on international
    standards, will allow future integration of
    NOAA systems into GEOSS
  • CLASS will be the NOAA archive for NPP/NPOESS,
    EOS and GOES-R data
  • CLASS is accessible via the web at
    www.class.noaa.gov

17
LTA Challenges
  • What data are needed for long-term archive?
  • How is long-term preservation achieved?
  • What services do users need to deal with these
    data volumes?
  • What are the people vs. machine issues?
  • How will new technology help?
  • Metrics for assessing how we are doing
  • National Research Council Panel enabled to help
    address this issue

18
challenges
  • The archived information must be useable by
    consumers who are separated in time, distance and
    background from the producers
  • producers no longer available
  • cannot answer questions on ad-hoc basis
  • producers' software not supported - may be
    obsolete
  • knowledge captured by the software becomes
    unavailable
  • documentation is lost over time

19
...challenges
  • The user community will change over time
  • new community will be unfamiliar with the
    background to the information
  • may use different analysis environment
  • may want to combine information from many sources
  • The archive will change over time
  • migration to new technology - hardware/software
  • may require reorganization of information
  • possible changes in implicit relationships
  • migration to different institutions
  • possible changes to management, data structure,
    file format

20
MODIS Pilot Project
  • Purpose is to define system interfaces and
    implement transfer capability.
  • Established MODIS L0 and L1B as initial
    candidates for data transfer - L0 is stable and
    L1B has high user demand.
  • Team includes representatives from ESDIS, GES
    DAAC, MODIS SDST, CLASS/Suitland, NCDC, NGDC,
    Fairmont, WV.
  • Established the collaboration tools and methods.
  • Pilot Plan Actions and Schedules
  • Working Group Charter
  • Following Open Archival Information System
    Standard (OAIS) as an LTA Reference Model.

21
Pilot Schedule Highlights
  • CLASS Operational at NSOF (ingest node): Jan 2006
  • Prototype 1 week MODIS L0 transfer: Feb 2006
  • Prototype 1 month L0 transfer (6x rate): Mar 2006
  • Evaluate L0 continuous feed (40 days): Jun 2006
  • DSA for MODIS L1B: Feb 2006
  • ICD for MODIS L1B: Apr 2006
  • Prototype 1 week L1B transfer: May 2006
  • Prototype 1 month L1B transfer (6x rate): Jun 2006
  • Evaluate continuous feed (20 days): Jul 2006
  • Access and Delivery Capability: Aug 2006
  • Pilot Project Evaluation Report: Oct 2006
  • Project Plan for NOAA MODIS LTA: Dec 2006
  • NOAA/NASA Panel Report: Dec 2006
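
The "6x rate" entries in the schedule can be read as moving data at six times the rate at which it was originally acquired; that reading is an assumption, since the slide does not define the term. Under it, a short sketch shows how long a prototype transfer would take and how long a feed would need to clear a backlog while new data keeps arriving. The function names and the 365-day backlog are hypothetical, chosen only for illustration.

```python
def transfer_days(data_span_days, rate_multiple):
    """Wall-clock days to move `data_span_days` worth of data
    at `rate_multiple` times the real-time acquisition rate."""
    if rate_multiple <= 0:
        raise ValueError("rate_multiple must be positive")
    return data_span_days / rate_multiple


def catch_up_days(backlog_days, rate_multiple):
    """Days to clear a backlog while new data keeps arriving at 1x.

    After T days the feed has moved rate_multiple * T days of data and
    must have covered backlog_days + T, so T = backlog / (rate - 1).
    """
    if rate_multiple <= 1:
        raise ValueError("rate_multiple must exceed 1 to catch up")
    return backlog_days / (rate_multiple - 1)


one_month_at_6x = transfer_days(30, 6)      # 5.0 days
hypothetical_backlog = catch_up_days(365, 6)  # 73.0 days

print(f"1 month of data at 6x: {one_month_at_6x} days")
print(f"365-day backlog at 6x: {hypothetical_backlog} days")
```

The second function is the more relevant one for an archive that joins a mission mid-life: a transfer rate only slightly above real time never meaningfully reduces the backlog.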

22
MODIS Pilot Project Expected Outcomes
  • NASA and NOAA will have a better hands-on
    understanding of the capabilities, conventions
    (e.g., data model), standards, and processes of
    their respective systems.
  • A draft set of interface documentation (DSA, ICD,
    etc.).
  • An interface between EOSDIS and CLASS - defined
    and exercised.
  • An actual demonstration of CLASS support for EOS
    data.
  • The foundation for the development of a sound
    NASA/NOAA LTA plan.

23
Reference Model
  • A Reference Model is needed to provide a common
    framework for discussion and description
  • A major aim is to facilitate a much wider
    understanding of what is required to preserve
    information for the long term
  • Facilitates description and comparison of
    archives
  • Provides a basis for further standardization
  • Helps broaden the market for commercial providers

24
...Reference Model
  • We are particularly concerned with Long-Term
    Preservation of digital information
  • long term is long enough to be concerned about
    changing technologies
  • not just bit preservation
  • starting point for model addressing non-digital
    information

25
......Reference Model
  • But this work is also of use for Short-Term
    archives because
  • technological change is rapid (years, not
    decades)
  • the short-term archive may eventually hand
    information over to another, longer-term, archive

26
Areas for Standards to follow
  • Interfaces between OAIS type archives
  • Submission to OAIS (SIP)
  • Dissemination from OAIS (DIP)
  • Search and retrieve metadata from OAIS
  • Sufficient information should be provided to
    ensure the rendered content may be interpreted
    and understood by its intended users.
  • Information migration
  • Procedures should indicate the file format and
    version to be created and software used to create
    it.
  • Provenance
  • A description of the content history, including
    its origins, changes to the object or its content
    over time, and its chain of custody (if known).
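
The three areas above (interpretability, migration, and provenance) can be pictured as fields that travel with a submission. The following is a minimal sketch, not an OAIS-defined schema: the class names, field names, and the completeness check are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProvenanceEvent:
    """One entry in the content history (origin, change, or custody transfer)."""
    date: str
    description: str
    custodian: Optional[str] = None  # chain of custody, if known


@dataclass
class MigrationRecord:
    """What a migration procedure should state: format, version, software."""
    target_format: str
    target_version: str
    software: str


@dataclass
class SubmissionPackage:
    """Hypothetical stand-in for a submission to an OAIS-type archive."""
    content_id: str
    representation_info: str  # what users need to interpret the content
    provenance: List[ProvenanceEvent] = field(default_factory=list)
    migrations: List[MigrationRecord] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """List which of the areas named on the slide are not yet covered."""
        missing = []
        if not self.representation_info.strip():
            missing.append("representation information")
        if not self.provenance:
            missing.append("provenance")
        for i, m in enumerate(self.migrations):
            if not (m.target_format and m.target_version and m.software):
                missing.append(f"migration record {i}")
        return missing


# Placeholder values only; not real product identifiers or dates.
package = SubmissionPackage(content_id="example-granule", representation_info="")
print(package.missing_items())

package.representation_info = "Format description and data dictionary"
package.provenance.append(
    ProvenanceEvent("2006-01-01", "Received from producer", custodian="Archive")
)
print(package.missing_items())
```

A check like `missing_items` is one way an archive could refuse or flag a submission before ingest, rather than discovering years later that the content can no longer be interpreted.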

27
OAIS Responsibilities
  • Negotiates and accepts Submission Information
    Packages (SIPs)
  • Determines communities which need to be able to
    understand Content Information
  • Ensures information to be preserved is
    understandable to designated communities
  • Assumes sufficient control of information to be
    able to ensure long-term preservation
  • Follows policies and procedures to ensure
    information is preserved
  • Makes the information available to the designated
    communities in appropriate forms

28
Data Submission Agreement (DSA)
  • Include all the information that is necessary for
    the producer to provide data products to the
    archive and for the archive to receive the data
    products from the producer
  • It seems a daunting (or rather an impossible)
    task to collect all of the information listed
    above in a single document in a timely fashion.
  • Need a high-level agreement in place before we
    proceed to specify the details of the
    Producer-Archive interface and to design the
    respective systems.
  • There is no way of compiling operational
    information until near the start of the
    operational phase.

29
DSA Groupings
  • High-level agreement
  • the content is rather static (i.e., temporally
    stable) and provides a framework for both the
    Producer and the Archive to move on to defining
    details.
  • Detailed level interface and some functional
    specification
  • the content is somewhat dynamic (i.e., changing
    with time) and requires the Producer and the
    Archive to do some in-depth studies.
  • Operational information
  • the content is not available until near the time
    when the data flows commence (e.g., IP addresses,
    host directory names, operations contacts)
  • Quasi-static metadata details
  • the definite content is hard to come by,
    especially for planned spacecraft missions.
  • More??

30
DSA Groupings
  • With these considerations in mind, we suggest
    that the Producer-Archive Agreement be:
  • Divided into several separate documents
  • Each being signed at a different
    management/technical level and at a different
    time
  • Memorandum of Agreement (MOA)
  • Interface Control Document (ICD)
  • Operations Agreement (OA)
  • Quasi-Static Metadata Specification (QSMS)
  • Others?

31
DSA Groupings
  • The MOA should be developed early on and signed
    by high-level management of both parties.
  • It should provide a firm ground for detailed
    level technical work to proceed.
  • Any details that will become clearer later or
    simply are unknown will have to be deferred to
    the lower level components of the agreement
    (i.e., ICD, OA, and QSMS).
  • Once the MOA is signed, both parties may start
    developing the ICD and QSMS.
  • Forms the basis for the design of the physical
    systems (for both the Producer and the Archive).
  • The OA can wait until the time of the system IT

32
DSA Groupings
  • Depending on the circumstances, the Producer and
    the Archive may include additional documents.
  • There are certain items that do not belong in the
    MOA and yet are not covered by the ICD.
  • We may call such a document a Supplement to the
    MOA (yet still separate from the MOA).
  • This approach of creating the Producer-Archive
    Agreement in multiple volumes and releasing each
    sequentially in time appears to be far better
    than the current approach of creating a single
    volume agreement.
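
The sequencing described on the DSA Groupings slides (MOA first, then ICD and QSMS, with the OA deferred) can be sketched as a small dependency graph. This is an illustration only: the slides state that the ICD and QSMS follow the MOA and that the OA can wait, but the specific choice to make the OA depend on the ICD is an assumption, as are the names `DEPENDS_ON` and `signing_order`. The sketch uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each document maps to the documents that should be in place before it.
DEPENDS_ON = {
    "MOA": set(),      # signed early by high-level management
    "ICD": {"MOA"},    # may start once the MOA is signed
    "QSMS": {"MOA"},   # may start once the MOA is signed
    "OA": {"ICD"},     # assumption: operational details follow the interface
}


def signing_order(depends_on):
    """Return one valid order in which the documents can be completed."""
    return list(TopologicalSorter(depends_on).static_order())


order = signing_order(DEPENDS_ON)
print(" -> ".join(order))
```

Any order the sorter returns starts with the MOA and places the OA after the ICD; the relative order of the ICD and QSMS is not constrained, which matches the idea that both can be developed in parallel.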

33
Use of OAIS
  • Benefits
  • Good overall framework of terms, functions and
    processes to structure the LTA discussion
  • Identifies a set of documents to capture and
    record requirements and specifications
  • Challenges
  • Timing - starting to use OAIS in the middle of
    the data life cycle of EOS data has been
    difficult
  • Complexity of EOS LTA requirements - numbers of
    products, data volumes, processing S/W
  • Overload on Data Submission Agreement - Interface
    Requirements Document, Interface Control
    Document, Operations Agreement

34
Conclusions
  • Transfer of NASA's Earth science data to NOAA for
    long-term preservation and stewardship is a major
    undertaking
  • NOAA/NASA MODIS Pilot Project - way to get
    started
  • Project provides great case study for use of OAIS
    Reference Model
  • Services to project and source of feedback on RM
  • Still learning how to best use OAIS
  • Expect that as OAIS is more widely used, over
    entire data life cycle, it will be even more
    valuable