E126 A Spatial Subscription Server for Earth Science Data

1
E126 A Spatial Subscription Server for Earth
Science Data
  • Greg Dobbins
  • Senior Software Engineer
  • Raytheon Systems, Landover MD
  • gdobbins_at_eos.east.hitc.com

2
Agenda
  • What this presentation will cover
  • Background information on the ECS project
  • Why a new subscription server was needed
  • Design of the Spatial Subscription Server
  • Introduction to Spatial Query Server (SQS)
  • Performance and tuning
  • Open Issues
  • Future work

3
Technical Overview
  • Products used in the solution
  • Sybase ASE 12.5 with Open Client 12.0
  • Spatial Query Server 3.4.2 (SQS) from Autometric
  • Perl 5.6.1 with DBI and CGI libraries
  • Sybase Transact-SQL
  • Netscape 4.78 browser and web server
  • HTML and JavaScript
  • Sun/Solaris 2.5.1 and SGI/IRIX 6.5

4
Background Information: EOS ECS
  • NASA's Earth Observing System (EOS)
  • A satellite-based monitoring system for
    supporting research on global environment change
  • Three principal satellites carrying multiple
    scientific instruments
  • Landsat 7
  • TERRA
  • AQUA
  • EOSDIS ground system (ECS) developed by Raytheon

5
Background Information: EOS ECS
  • Scientific instruments aboard orbiting satellites
    collect various earth science data
  • Landsat 7 (http://landsat.gsfc.nasa.gov)
  • remotely sensed data of the Earth's land surface
  • TERRA (http://terra.nasa.gov)
  • ASTER: surface temperature, emissivity,
    reflectance, and elevation
  • CERES: clouds and the Earth's radiant energy
  • MISR: sunlight scattering, aerosol particles

6
Background Information: EOS ECS
  • Scientific instruments aboard orbiting satellites
    collect various earth science data
  • TERRA (http://terra.nasa.gov)
  • MODIS: cloud cover, photosynthetic activity of
    plants, areas of snow and ice, ground fires
  • MOPITT: measures pollution in the troposphere
  • AQUA (http://aqua.gsfc.nasa.gov)
  • launched May 4, 2002
  • carries another MODIS instrument

7
Background Information: EOS ECS
  • Distributed Active Archive Center (DAAC)
  • Four DAACs employ ECS to archive and retrieve
    data.
  • Goddard Space Flight Center, Greenbelt MD
  • upper atmosphere, global biosphere, atmospheric
    dynamics, geophysics (MODIS)
  • Systems Monitoring Coordination (SMC)
  • Land Processes DAAC, Sioux Falls SD
  • land processes (Landsat 7, ASTER)
  • operated by U.S. Geological Survey

8
Background Information: EOS ECS
  • Distributed Active Archive Center (DAAC)
  • Four DAACs employ ECS to archive and retrieve
    data.
  • NASA Langley Research Center, Hampton VA
  • radiation, tropospheric chemistry, clouds,
    aerosol
  • CERES, MISR, MOPITT
  • National Snow and Ice Data Center, Boulder CO
  • snow and ice, cryosphere and climate (MODIS)
  • affiliated with University of Colorado

9
Background Information: EOS ECS
  • More information on NASA's Earth Science
    Enterprise can be obtained from the Earth
    Observatory website
  • http://earthobservatory.nasa.gov

10
Subscriptions
  • Subscribing to data
  • Scientists place standing orders for data by
    requesting that the DAAC enter a subscription on
    their behalf.
  • A subscription is for a particular
    collection/event type
  • collection: defined by short name and version
  • event types: insert, delete, update_metadata
  • A subscription can be temporally qualified
  • could simulate spatial qualification this way

11
Background Information: Synergy
  • The Synergy Project
  • Under NASA sponsorship, Raytheon is working with
    several universities and state/local agencies to
    explore non-research uses for EOS data.
  • Focus on making the data more readily available
  • Infomarts
  • Internet-based windows
  • application-specific
  • remotely-sensed data combined with local data to
    provide an enriched information display

12
Background Information: Synergy
  • ECS Data Accessibility Enhancements
  • As the user community grows, there is an
    increased interest in rapid access to data.
  • Data accessibility issues
  • Slow distribution times due to tape access
  • Cumbersome user interface
  • Difficult to qualify subscriptions spatially

13
Background Information: Synergy
  • ECS Data Accessibility Enhancements
  • As the user community grows, there is an
    increased interest in rapid access to the data.
  • Data accessibility enhancements
  • Data Pool of online ECS data
  • frequently-accessed data stored in online storage
    cache (Data Pool)
  • web-based drill-down and retrieve interface
  • Spatial Subscription Server

14
Improvements Achieved by the Spatial Subscription
Server
  • Spatial Subscription Server has several features
    not found in the original subscription server
  • Subscriptions may be qualified spatially as well
    as temporally
  • Numerical parameters qualified by min/max range
  • Web-based GUI for entering subscriptions
  • Email notification with metadata included
  • Option to insert data into the Data Pool
  • Greater throughput with minimal impact on ECS

15
Spatial Subscription Server Components
  • The Spatial Subscription Server consists of the
    following components
  • A web-based GUI for managing subscriptions
  • a Command Line Interface (CLI) is also available
  • A subscription database
  • communicates with a science database
  • Drivers which monitor/process database
    queues/logs
  • match data events with subscriptions
  • carry out the actions of a matched subscription

16
Spatial Subscription Server Component Diagram
1. Science Data Server Database
2. Spatial Subscription Server Database
3. Subscription GUI
4. Subscribed Event Driver
5. Action Driver
6. Recovery Driver
7. Deletion Driver
8. Science Data Server CLI
9. UNIX Mail Server
10. Data Pool Queue

17
Spatial Subscription Server GUI
  • Features of the GUI
  • Web-based
  • For DAAC operations use, not for external users
  • Operator can add, view, update, or delete a
    subscription
  • Implemented in Perl
  • database connection via DBI functions
  • generates HTML code for display and JavaScript
    functions for client-side data validation
  • event handling via CGI methods

18-22
Spatial Subscription Server GUI

23
Spatial Subscription Server GUI
  • Entering data to create a subscription
  • When the user clicks on Apply, the following
    occurs:
  • The user name is validated in an accounting
    database.
  • This table is replicated across all DAAC sites.
  • All collection names beginning with AST are
    retrieved from a table in the SSS database.
  • A revised form is displayed.
  • The retrieved collection names now populate the
    pull-down list.

24
Spatial Subscription Server GUI
25
Spatial Subscription Server GUI
  • Entering data to create a subscription
  • Clicking on Apply again commits the user to a
    particular collection name.
  • The event handler searches the SSS database for a
    list of attributes that can be used to qualify
    the subscription.
  • Only attributes that are meaningful for the
    selected collection type will be displayed.
  • A check is performed to see if there is spatial
    metadata associated with this collection type.

26
Subscription Qualifiers
  • Various possibilities for qualifying a
    subscription
  • Any of the attributes relating to the collection
    type may be used to qualify the subscription.
  • Temporal qualification
  • A date range is specified for the time of
    collection.
  • If the actual collection time interval overlaps
    this range, we have a successful match.
  • Numerical qualification
  • min and max values are specified for the attribute
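
The temporal and numerical tests described above can be sketched as two small predicates. This is only an illustration, not the production code (the drivers are written in Perl); the function names and the choice of inclusive bounds are assumptions, since the slides say only that the intervals must overlap and that min and max values are specified.

```python
from datetime import date


def temporal_match(sub_start, sub_end, granule_start, granule_end):
    """Return True when the granule's collection interval overlaps the
    subscription's date range (closed intervals, an assumption)."""
    # Two closed intervals overlap when each starts no later than the other ends.
    return granule_start <= sub_end and sub_start <= granule_end


def numeric_match(value, min_value, max_value):
    """Return True when the attribute value lies within [min_value, max_value]."""
    # Inclusive bounds are assumed; the slides only mention "min and max values".
    return min_value <= value <= max_value
```

For example, a subscription covering May 2002 would match a granule collected from May 31 to June 1, because the two intervals share May 31.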

27
Subscription Qualifiers
  • Various possibilities for qualifying a
    subscription
  • Any of the attributes relating to the collection
    type may be used to qualify the subscription.
  • String or character qualification
  • value must match exactly
  • Spatial qualification
  • User specifies an area of the Earth's surface
  • If the actual collection area overlaps this, we
    have a successful match.

28-32
Spatial Subscription Server GUI

33
Spatial Subscription Server GUI
  • Spatial qualification
  • The fact that spatial qualification is offered by
    the GUI indicates that this collection has
    associated spatial metadata.
  • The user spatially qualifies the subscription by
    entering coordinates for a latitude/longitude
    box (llbox).
  • A match means the llbox intersects the actual
    region of the Earth's surface where the data was
    collected.
  • We employ a Spatial Query Server (SQS) to
    facilitate the handling of spatial data types.
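
To make the llbox match concrete, here is a minimal sketch of a rectangle intersection test. In the real system this comparison is delegated to SQS; the LLBox class, its field names, and the function below are hypothetical, and the sketch assumes neither box crosses the antimeridian (180 degrees longitude).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLBox:
    """Hypothetical latitude/longitude box in decimal degrees."""
    south: float
    west: float
    north: float
    east: float


def llbox_intersects(a, b):
    """Return True when the two boxes share at least one point.

    Assumes west <= east for both boxes, i.e. no antimeridian crossing.
    """
    lat_overlap = a.south <= b.north and b.south <= a.north
    lon_overlap = a.west <= b.east and b.west <= a.east
    return lat_overlap and lon_overlap
```

A subscription box and a granule footprint match when both the latitude ranges and the longitude ranges overlap; touching edges count as a match in this sketch.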

34
Spatial Query Server (SQS)
  • Spatial Query Server
  • A Spatial Query Server facilitates the handling
    of spatial data types.
  • Developed by Boeing Autometric
  • A multi-threaded database engine supporting
  • definition of spatial data types (e.g. line,
    polygon)
  • spatial operations (e.g. intersect, inside,
    outside)
  • a spatial indexing schema (clustered or non-)
  • All data and indexes are stored within ASE
    database

35
Spatial Query Server (SQS)
  • Spatial Query Server
  • built on the Sybase Open Server framework
  • seamless interaction and communication with ASE
  • creates additional user tables for use by SQS
  • existing applications such as isql work the same
    way
  • queries are in GeoSQL, a superset of Transact-SQL
  • non-spatial queries are passed through to ASE
  • spatial queries require pre- and post-processing

36
Spatial Query Server (SQS)
  • Application Program --GeoSQL--> SQS Open Server
  • SQS Open Server --Transact-SQL--> Sybase Server
  • Sybase Server --> Storage

37
Spatial Query Server (SQS)
  • Spatial Query Server
  • Most collection types have associated spatial
    metadata, although the spatial data types may
    vary.
  • Spatial data types associated with data granules
  • gpolygon (generalized polygon)
  • bounding rectangle (llbox)
  • orbit (broken down into paths and blocks)
  • Regardless of the actual spatial type, llbox is
    always used for subscription qualification.

38
Spatial Subscription Server GUI
39
Actions
  • Actions for subscriptions
  • A subscription always has at least one associated
    action to be performed when it is matched by a
    data event.
  • Actions that can be associated with a
    subscription.
  • Email notification
  • Electronic data distribution
  • includes separate email notification by ECS
  • Insertion of the data granule into the Data Pool

40-43
Spatial Subscription Server GUI

44
Spatial Subscription Server GUI
  • Other features
  • The GUI can also be used to perform other
    operations.
  • To cancel (delete) a subscription
  • To update a subscription
  • change the ownership (user name)
  • toggle between Active and Inactive states
  • extend or shorten the expiration date
  • add, delete, or modify qualifiers

45
Spatial Subscription Server GUI
  • Other features
  • The GUI can also be used to perform other
    operations.
  • However, the collection type cannot be updated.
  • To monitor operations
  • View the action queue
  • Get operations statistics
  • How many events/actions left to dequeue
  • Average/Maximum times to perform certain tasks

46
Data Events
  • Science Data Server
  • The Science Data Server (SDS) is the master
    subsystem for overseeing the archiving and
    distribution of data.
  • The arrival of a new data granule will result in
    an entry in the notifier queue, a table in the
    SDS database.
  • A granule is a quantity of data, typically the
    amount of data collected over a five minute
    interval.
  • One of the columns in this table is the universal
    reference (UR), which uniquely identifies the
    granule.

47
Data Events
  • Notifier Queue
  • The notifier queue in the SDS database is the
    starting point for recognizing new data events.
  • An insert trigger for this table parses the UR
    and compares the information obtained about the
    granule with subscription information in the SSS
    database.
  • If the granule collection type matches that of at
    least one subscription, an entry is inserted into
    the subscribed event queue, a table in the SSS
    database.

48
Subscribed Events
  • Subscribed Event Queue
  • The subscribed event queue contains basic
    information about data events that are potential
    matches for subscriptions.
  • It is a non-destructive queue with front and rear
    pointers stored in separate tables.
  • A stored procedure, called by the notifier queue
    insert trigger, enqueues data.
  • The subscribed event driver dequeues data.
  • A second table logs all queue activity.
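
A non-destructive queue with separately stored front and rear pointers might look like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for Sybase ASE, and every table, column, and function name is hypothetical. Dequeuing only advances the front pointer, so the queue rows remain available for logging and recovery.

```python
import sqlite3


def make_queue(conn):
    """Create the queue table plus one-row tables for the two pointers."""
    conn.executescript("""
        CREATE TABLE event_queue (pos INTEGER PRIMARY KEY, granule_ur TEXT);
        CREATE TABLE queue_front (pos INTEGER);  -- next position to dequeue
        CREATE TABLE queue_rear (pos INTEGER);   -- last position enqueued
        INSERT INTO queue_front VALUES (1);
        INSERT INTO queue_rear VALUES (0);
    """)


def enqueue(conn, granule_ur):
    """Advance the rear pointer and store the event at that position."""
    with conn:  # commits on success, rolls back on error
        conn.execute("UPDATE queue_rear SET pos = pos + 1")
        (rear,) = conn.execute("SELECT pos FROM queue_rear").fetchone()
        conn.execute("INSERT INTO event_queue VALUES (?, ?)", (rear, granule_ur))
    return rear


def dequeue(conn):
    """Return the event at the front pointer, or None if the queue is empty.

    The row itself is not deleted; only the front pointer moves.
    """
    with conn:
        (front,) = conn.execute("SELECT pos FROM queue_front").fetchone()
        (rear,) = conn.execute("SELECT pos FROM queue_rear").fetchone()
        if front > rear:
            return None
        row = conn.execute(
            "SELECT granule_ur FROM event_queue WHERE pos = ?", (front,)
        ).fetchone()
        conn.execute("UPDATE queue_front SET pos = pos + 1")
    return row[0]
```

Because every enqueue and dequeue updates a pointer row, those one-row tables are natural points of contention when several driver instances run concurrently.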

49
Subscribed Event Driver
  • The subscribed event driver identifies
    subscriptions that are matches for events
  • Dequeues the event.
  • Obtains detailed metadata from the SDS database.
  • metadata is a description of the data granule
  • Compares subscriptions having the same collection
    type with the metadata for the event
  • every subscription qualifier must be satisfied by
    the metadata before the subscription is deemed a
    match

50
Subscribed Event Driver
  • The matching algorithm
  • A truth table is employed for computing exact
    matches.
  • The metadata for the event is compared against
    every qualifier in the database for subscriptions
    of that type.
  • One match results in one entry in the truth
    table.
  • A subscription is matched if there is precisely
    one entry in the truth table for each of its
    qualifiers.
  • Unqualified subscriptions are matched trivially;
    inactive or expired subscriptions are ignored.
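
The truth-table idea can be sketched in a few lines. This is an interpretation of the slide, not the actual implementation: the dictionary keys (sub_id, qual_id, attribute, kind, and so on) and both function names are hypothetical, and only range and exact-string qualifiers are modelled.

```python
from datetime import date


def qualifier_satisfied(qualifier, metadata):
    """Check one qualifier against the event metadata."""
    value = metadata.get(qualifier["attribute"])
    if value is None:
        return False
    if qualifier["kind"] == "range":
        return qualifier["min"] <= value <= qualifier["max"]
    return value == qualifier["value"]  # string qualifiers must match exactly


def match_event(subscriptions, qualifiers, metadata, today):
    """Return the ids of subscriptions whose qualifiers are all satisfied."""
    # One entry in the truth table per satisfied qualifier.
    truth_table = {
        (q["sub_id"], q["qual_id"])
        for q in qualifiers
        if qualifier_satisfied(q, metadata)
    }
    matched = []
    for sub in subscriptions:
        if not sub["active"] or sub["expires"] < today:
            continue  # inactive or expired subscriptions are ignored
        own = [q["qual_id"] for q in qualifiers if q["sub_id"] == sub["sub_id"]]
        # An unqualified subscription has no qualifiers, so it matches trivially.
        if all((sub["sub_id"], qual_id) in truth_table for qual_id in own):
            matched.append(sub["sub_id"])
    return matched
```

Evaluating every qualifier once and then checking that each subscription has one truth-table entry per qualifier avoids re-reading the metadata for every subscription.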

51
Actions
  • Action Queue
  • The action queue contains basic information about
    matched pairs of events and subscriptions.
  • It is a non-destructive queue with front and rear
    pointers stored in separate tables.
  • The subscribed event driver calls a stored
    procedure that enqueues data.
  • The action driver calls a stored procedure that
    dequeues data.
  • A second table logs all queue activity.

52
Actions
  • Action Driver
  • The action driver processes the actions
    associated with matched subscriptions.
  • Dequeues from the action queue.
  • Actions associated with a particular subscription
    are identified.
  • email notification
  • data acquisition (electronic distribution)
  • insertion into the Data Pool

53
Action Driver
  • Processes actions for matched subscriptions
  • email notification
  • may include all metadata for the event or just
    the metadata corresponding to subscription
    qualifiers
  • the notification message is composed and
    submitted to the UNIX mail server
  • data acquisition (electronic distribution)
  • FTP push or pull
  • a request is submitted to the SDS via its CLI
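
As an illustration of the email action, the sketch below composes a notification that carries either all event metadata or only the attributes used as subscription qualifiers. It uses Python's standard email.message module and stops short of handing the message to a mail server; the function name, subject line, and attribute names are hypothetical.

```python
from email.message import EmailMessage


def compose_notification(to_addr, granule_ur, metadata, qualifier_attrs=None):
    """Build a notification message for a matched subscription.

    If qualifier_attrs is None, all metadata is included; otherwise only
    the attributes named in qualifier_attrs are listed.
    """
    if qualifier_attrs is None:
        shown = dict(metadata)
    else:
        shown = {k: v for k, v in metadata.items() if k in qualifier_attrs}
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = f"Subscription matched: {granule_ur}"
    # One "name: value" line per metadata attribute, in a stable order.
    msg.set_content("\n".join(f"{k}: {v}" for k, v in sorted(shown.items())))
    return msg
```

The resulting message object could then be passed to whatever mail transport the site uses.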

54
Data Pool Actions
  • Third kind of action is Data Pool insert
  • Insertion into the Data Pool database action
    queue
  • Data Pool inserts are in fact done by the event
    driver
  • All Data Pool requests matching the same data
    event must enter the Data Pool insert action
    queue together
  • This is more easily managed during event
    processing
  • A buffer table stores Data Pool requests
    temporarily
  • An insert trigger on the buffer table performs a
    single insert into the Data Pool database after
    all subscriptions matching the event have been
    checked

55
Recovery Driver
  • Attempts to restart stalled processes
  • Monitors the event and action queue logs, looking
    for suspect behavior
  • An event was dequeued but did not finish.
  • A matched subscription was dequeued, but not all
    of its actions were carried out.
  • Stalled events are re-inserted into the
    subscribed event queue.
  • Stalled subscriptions are re-inserted into the
    action queue, but any successful actions are not
    repeated.
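
The recovery logic above amounts to two checks, sketched here under stated assumptions: a log entry is treated as stalled when it was dequeued more than a configurable interval ago and has no finish time, and a re-queued subscription only retries the actions that did not already succeed. The field names and function names are hypothetical.

```python
from datetime import datetime, timedelta


def find_stalled(log_entries, now, stall_after):
    """Return ids of entries that were dequeued but never finished in time."""
    return [
        entry["id"]
        for entry in log_entries
        if entry["finished_at"] is None
        and now - entry["dequeued_at"] > stall_after
    ]


def actions_to_retry(subscription_actions, completed_actions):
    """Return the actions still to be carried out, skipping successful ones."""
    return [a for a in subscription_actions if a not in completed_actions]
```

A threshold-based check like this cannot tell a genuinely stalled process from a slow one, so the choice of stall_after is a trade-off between late recovery and duplicate work.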

56
Deletion Driver
  • Performs cleanup
  • An action queue entry that has been processed to
    completion can be deleted after a suitable length
    of time.
  • Its log entries must first be deleted.
  • An event queue entry that has been processed to
    completion can be deleted after a suitable length
    of time, whether or not it had matching
    subscriptions.
  • Its action queue data, if any, must first be
    deleted.
  • Its log entries and metadata must first be
    deleted.
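
The ordering constraints above can be expressed as a small planning function. This is a sketch only: the table labels, the event dictionary layout, and the function name are hypothetical, and the real deletion driver works against the Sybase tables directly.

```python
from datetime import datetime, timedelta


def deletion_plan(event, now, min_age):
    """Return (table, id) pairs in the order they should be deleted.

    An empty list means the event is unfinished or not yet old enough.
    """
    completed = event["completed_at"]
    if completed is None or now - completed < min_age:
        return []
    steps = []
    for action_id in event["action_ids"]:
        steps.append(("action_queue_log", action_id))  # log entries first
        steps.append(("action_queue", action_id))      # then the action entry
    steps.append(("event_queue_log", event["id"]))
    steps.append(("event_metadata", event["id"]))
    steps.append(("event_queue", event["id"]))         # the event itself goes last
    return steps
```

Deleting dependent rows before the rows they refer to means an interrupted cleanup never leaves log or action rows pointing at a missing event.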

57
Spatial Subscription Server Database
  • A single Sybase ASE database
  • 38 user tables, plus a version control table
  • llbox qualifier is the only spatial data stored
  • 36 stored procedures (other than threshold
    procedures)
  • 1 trigger (on the buffer table)
  • rest of the triggers are in the SDS database
  • storage requirements
  • 1000 MB data
  • 500 MB log

58
Spatial Subscription Server Database
  • Key tables
  • EcNbEventDefinition (subscribable events)
  • EcNbSubscription (subscriptions)
  • EcNbMatchingExpression (subscription qualifiers)
  • EcNbSpatialMatchingExpression (spatial
    qualifiers)
  • EcNbSubscribedEventQueue and its log table
  • EcNbActionQueue and its log table

59
Event Definition Table
60
Event Definition Synchronization
  • Event definitions in the SSS database correspond
    to those in the SDS database
  • In SSS: EcNbEventDefinition
  • In SDS: DsDeEvent
  • Triggers on DsDeEvent keep the two tables in
    sync
  • An insert in DsDeEvent is propagated to SSS
  • A delete in DsDeEvent causes the corresponding
    event to be marked for deletion in SSS
  • deletion driver will perform the actual cleanup
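
The trigger-based synchronization can be demonstrated with SQLite triggers, used here only as a stand-in for Transact-SQL triggers. The two table names come from the slides, but the column names, the delete_flag column, and the placement of both tables in one in-memory database are assumptions made for the sketch.

```python
import sqlite3


def build_sync_demo():
    """Create both tables and the triggers that keep them in step."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DsDeEvent (
            event_id INTEGER PRIMARY KEY,
            short_name TEXT, version INTEGER, event_type TEXT);
        CREATE TABLE EcNbEventDefinition (
            event_id INTEGER PRIMARY KEY,
            short_name TEXT, version INTEGER, event_type TEXT,
            delete_flag INTEGER DEFAULT 0);

        -- An insert in DsDeEvent is propagated to the SSS table.
        CREATE TRIGGER sync_insert AFTER INSERT ON DsDeEvent
        BEGIN
            INSERT INTO EcNbEventDefinition
                (event_id, short_name, version, event_type)
            VALUES (NEW.event_id, NEW.short_name, NEW.version, NEW.event_type);
        END;

        -- A delete in DsDeEvent only marks the SSS row for later cleanup.
        CREATE TRIGGER sync_delete AFTER DELETE ON DsDeEvent
        BEGIN
            UPDATE EcNbEventDefinition
            SET delete_flag = 1
            WHERE event_id = OLD.event_id;
        END;
    """)
    return conn
```

Marking rather than deleting leaves the actual removal, along with any dependent subscriptions and queue entries, to the deletion driver.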

61
Subscription Table
62
Matching Expression Table
63
Spatial Matching Expression Table
64
Subscribed Event Queue Table
65
Subscribed Event Queue Log Table
66
Action Queue Table
67
Action Queue Log Table
68
Implementation Drivers
  • Drivers are coded in Perl
  • DBI library routines used for database access
  • connections
  • standard ASE connection
  • SQS connection for dealing with spatial data
  • queries
  • stored procedure calls when possible
  • a spatial query must be performed outside a
    stored procedure

69
Configuration
  • Multiple instances of drivers run concurrently
  • Typically, 3 to 6 instances of the event and
    action drivers
  • A single instance of the recovery driver will
    suffice
  • the time before a process is considered stalled
    is configurable
  • the sleep time when there is nothing to do is
    configurable
  • Typically, one or two instances of the deletion
    driver
  • minimum age prior to deletion is configurable
  • The optimal system configuration may vary among
    sites

70
Performance
  • Performance statistics
  • The system can process three new events per
    second and one matched event per second
  • Performance gains were achieved by
  • limiting the number of active DBI statement
    handles
  • using SQS connections only when performing
    spatial queries, normal ASE connections otherwise
  • batching multiple queries when practicable
  • collocating the drivers with the ASE server

71
Performance
  • Performance tuning
  • Monitoring with Sybase Central revealed that the
    greatest lock contention is on the tables storing
    the queue pointers
  • the action queue is especially busy
  • we tried an alternative queue implementation
  • no separate pointers (queue has identity column)
  • dequeue the minimum identity not yet dequeued
  • destructive queue with datarow locking
  • no performance gain was realized
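
For comparison, the alternative queue design described above (no pointer tables, an identity column, destructive dequeue of the minimum identity) might be sketched as follows. SQLite stands in for Sybase ASE, so datarow locking is not modelled, and the table and function names are hypothetical.

```python
import sqlite3


def make_identity_queue(conn):
    """Create a queue whose ordering comes from an auto-incrementing id."""
    conn.execute(
        "CREATE TABLE action_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
    )


def identity_enqueue(conn, payload):
    """Append a row; the database assigns the next identity value."""
    with conn:
        conn.execute("INSERT INTO action_queue (payload) VALUES (?)", (payload,))


def identity_dequeue(conn):
    """Remove and return the payload with the smallest id, or None if empty."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM action_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM action_queue WHERE id = ?", (row[0],))
    return row[1]
```

This version removes the shared pointer rows, but concurrent consumers still compete for the same lowest-id row, which may be one reason the slides report no performance gain.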

72
Performance
  • Performance tuning
  • Most queries follow primary key indexes
  • EcNbSpatialMatchingExpression has a clustered
    index on the spatialConstraint column (llbox)
  • Very few deadlocks observed
  • Number of database connections remains more or
    less constant
  • two connections per Perl driver instance
  • The syslogs table was given its own named cache

73
Open Issues
  • Some problems yet to be resolved
  • No recovery in the case of failed deletions
  • recovery driver should monitor deletion queue
  • Difficult to detect and recover from failed
    dequeue operations
  • did the dequeue ever occur or is the system slow?
  • How to check for undelivered email?
  • Some memory leaks observed in Perl/DBI and in SQS

74
Future Enhancements
  • Work in progress
  • Physical media distribution
  • currently all acquires are via electronic
    distribution
  • user might prefer tape or CD
  • more efficient to allow multiple granules per
    device
  • subscription can belong to a bundling order
  • distribution occurs when the bundle is complete
  • distribution requests handled by an Order Manager

75
Future Enhancements
  • Work in progress
  • Subscriptions with Data Pool actions can be
    associated with a particular theme
  • operator may wish to suspend, resume, or cancel
    subscriptions by theme
  • Make GUI compliant with provisions of Section 508
    of the Rehabilitation Act
  • should be usable by those with certain physical
    handicaps

76
Acknowledgements
  • Several people contributed to a successful
    release of the Spatial Subscription Server
  • Concept, Design: Richard Meyer, Peter MacHarrie
  • Database Design: Peter MacHarrie
  • Detailed Design, Coding, Unit Test of GUI and
    Drivers: Milton Stevens, Siwei Xu, Greg Dobbins
  • Project Manager, Requirements, Integration Test
    Lead: Kathy Carr