Title: Server-side OPeNDAP Analysis - A General Approach Utilizing Legacy Applications through TDS
1Server-side OPeNDAP Analysis - A General Approach
Utilizing Legacy Applications through TDS
- Roland Schweitzer
- Weathertop Consulting, LLC
- Steve Hankin and Ansley Manke NOAA/PMEL
2Highlights
- Server-side analysis
- Motivation
- LAS as OPeNDAP Client and Server
- Evolution of the server implementation
- Community call to action
- Summary
3Server-side Analysis
- In general, server-side analysis is a computation made by an OPeNDAP server at the request of a client.
- The specification of the computation is transmitted to the server via the OPeNDAP URL.
4Motivation
- We are interested in server-side analysis for use with the Live Access Server (LAS).
- Primarily as a way to implement comparisons between data defined on different grids
- We want our implementation to leverage the analysis capabilities of legacy applications like Ferret and GrADS.
- We want to use our experience running legacy applications (like Ferret) from within a Java runtime environment.
5The Live Access Server (LAS)
A highly configurable Web server designed to
provide flexible access to geo-referenced
scientific data
6LAS Architecture
[Architecture diagram. Components: client, LAS Product Server with metadata (XML), SQL Backend Service, DRDS Backend Service, local netCDF data, DRDS server, OPeNDAP server. Labeled flows: product request XML (REST), back-end request (SOAP), metadata, product.]
7Comparing OPeNDAP datasets
[Architecture diagram. Components: user, LAS Product Server with metadata (XML), Ferret Backend Service (running Ferret), SQL Backend Service, DRDS Backend Service, two OPeNDAP servers. Labeled flows: product request XML (REST), back-end request (SOAP), metadata, product.]
Suppose the variables are on different grids?
8LAS as an OPeNDAP Server
- Data on grids which are available via LAS are guaranteed to be geo-referenced and at least COARDS compliant.
- We can often repair (including re-gridding) the data and/or metadata by associating a script of Ferret commands with the data source in the LAS configuration.
- Wouldn't it be nice to make these repaired data available via OPeNDAP?
9The Ferret Data Server
- FDS made this possible.
- FDS provides an OPeNDAP view of the data being served by LAS and makes any transformations specified by the associated script before serving the data.
- FDS also implements server-side analysis (including the ability to pass in external data sources).
10A GDS Digression
- The GrADS Data (DODS) Server is the first implementation of this concept.
- In fact, FDS used the Anagram framework upon which GDS is built.
- Both GDS and FDS use the Java Runtime environment to invoke the associated legacy app (GrADS or Ferret) to do the heavy lifting.
11FDS Capabilities
- FDS took advantage of several characteristics of the underlying engine (Ferret).
- New "virtual" data variables can be defined
- Can build the metadata (netCDF header described by dimensions, coordinate variables and the structure of data variables) without performing any heavy calculations, for both data read from files and virtual data variables
- Only performs calculations when the data are requested
- Only calculates the minimal set needed to fulfill the current request
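The capabilities listed above (metadata without computation, calculation only on request, and only for the requested subset) can be illustrated with a small sketch. This is not Ferret or FDS code; the `LazyVariable` class, its fields, and the sample values are hypothetical and exist only to show the idea of deferred evaluation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LazyVariable:
    """Hypothetical variable: metadata is known up front, values are
    computed only for the index range a client actually requests."""
    name: str
    size: int                          # length of the single axis
    units: str
    compute: Callable[[int], float]    # returns the value at index i
    calls: int = 0                     # counts evaluations, for illustration

    def header(self) -> dict:
        # Metadata only: compute() is never called here.
        return {"name": self.name, "shape": (self.size,), "units": self.units}

    def read(self, start: int, stop: int) -> list:
        # Evaluate only the requested subset.
        out = []
        for i in range(start, stop):
            self.calls += 1
            out.append(self.compute(i))
        return out


# A variable backed by stored values (made-up numbers).
stored = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
temp = LazyVariable("TEMP", len(stored), "deg C", lambda i: stored[i])

# A "virtual" variable defined in terms of another variable.
temp_anom = LazyVariable(
    "TEMP_ANOM", temp.size, "deg C",
    lambda i: temp.read(i, i + 1)[0] - 15.0,
)

header = temp_anom.header()            # no values computed yet
calls_after_header = temp_anom.calls
subset = temp_anom.read(2, 4)          # computes exactly two values
```

Building the header triggers no evaluations, and reading two points of the virtual variable reads only two points of the underlying one.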
12FDS Evolution
- Keep these advantages and evolve the implementation.
- The Java netCDF library allows new data container formats to be plugged in by implementing the I/O Service Provider interface.
- Once plugged in, clients using nj22 have access to the data from this container.
- We implemented a Ferret I/O Service Provider which can read Ferret command scripts and direct Ferret to perform the calculations as needed to satisfy data requests.
13Example
more data/simple4.jnl
use levitus_climatology
let/d=levitus_climatology temp_20 = temp[d=levitus_climatology,z=0:20@sum]
set var/title="surface heat content"/units="deg C" temp_20[d=levitus_climatology]
14Example
dncdump -c http://porter.pmel.noaa.gov:8920/thredds/dodsC/mydata/simple4.jnl
float TEMP(ZAXLEVITR, YAXLEVITR, XAXLEVITR) ;
    TEMP:units = "DEG C" ;
    TEMP:long_name = "TEMPERATURE" ;
    TEMP:_FillValue = -1.e10f ;
    TEMP:missing_value = -1.e10f ;
    TEMP:dataset = "levitus_climatology.cdf" ;
float TEMP_20(YAXLEVITR, XAXLEVITR) ;
    TEMP_20:dataset = "levitus_climatology" ;
    TEMP_20:direction = "IJ" ;
    TEMP_20:units = "deg C" ;
    TEMP_20:long_name = "surface heat content" ;
    TEMP_20:missing_value = "-1.00000E34" ;
    TEMP_20:virtual = "true" ;
15IOSP
The low-level part of the NetCDF-Java version 2.2 architecture
[Diagram: Application, NetcdfFile, and the I/O service provider interface (isValid, open, readData). Providers shown: NetCDF-3, NetCDF-4, HDF5, GRIB, NIDS, GINI, Nexrad, DMSP, plus Ferret and GrADS via the Java Runtime.]
Stolen directly from John Caron with only this
measly acknowledgement.
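The plug-in pattern on this slide can be sketched in a few lines. The following is a Python analogy, not the NetCDF-Java API: the class names, the method signatures, and the file-extension checks are all hypothetical, and only the three method names echo isValid, open and readData from the diagram.

```python
class Provider:
    """Hypothetical base class for a pluggable data-container reader."""

    def is_valid(self, location: str) -> bool:
        raise NotImplementedError

    def open(self, location: str) -> dict:
        raise NotImplementedError

    def read_data(self, handle: dict, variable: str, start: int, stop: int) -> list:
        raise NotImplementedError


class ScriptProvider(Provider):
    """Stands in for a provider that treats a command script as a dataset."""

    def is_valid(self, location):
        return location.endswith(".jnl")

    def open(self, location):
        # A real provider would ask the legacy application for the header.
        return {"location": location, "variables": {"TEMP_20": 4}}

    def read_data(self, handle, variable, start, stop):
        # Placeholder values in place of a call to the legacy application.
        size = handle["variables"][variable]
        return [float(i) for i in range(size)][start:stop]


class PlainFileProvider(Provider):
    """Stands in for a provider that reads an ordinary data file."""

    def is_valid(self, location):
        return location.endswith(".nc")

    def open(self, location):
        return {"location": location, "variables": {"TEMP": 3}}

    def read_data(self, handle, variable, start, stop):
        size = handle["variables"][variable]
        return ([0.0] * size)[start:stop]


# Providers are tried in order; the first one that accepts the location wins.
REGISTRY = [ScriptProvider(), PlainFileProvider()]


def open_dataset(location: str):
    for provider in REGISTRY:
        if provider.is_valid(location):
            return provider, provider.open(location)
    raise ValueError(f"no provider accepts {location!r}")


provider, handle = open_dataset("mydata/simple4.jnl")
values = provider.read_data(handle, "TEMP_20", 1, 3)
```

The calling code never needs to know which provider answered, which is what lets a new container format be added without changing the clients.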
16The THREDDS Data Server
- TDS is an OPeNDAP server.
- nj22 I/O Service Providers can be plugged in to TDS.
- The combination of the Ferret I/O Service Provider and TDS (aka F-TDS) serves, via OPeNDAP, data which are represented by Ferret command scripts (both data read from disk by Ferret and virtual data computed on the fly by Ferret).
17F-TDS and Server-Side Analysis
- A DataSource handler can also be plugged in to TDS, which allows custom handling of OPeNDAP requests based on the contents of the HttpServletRequest object (and by implication the URL).
- We built such a DataSource handler, which recognizes URLs with embedded analysis expressions.
- The three groups of input to the server-side analysis were copied from GDS.
- The three sets are:
- Data sources (e.g. OPeNDAP URLs).
- Analysis commands, which are implementation specific.
- A sub-region.
18An Analysis URL
- http://machine:port/thredds/dodsC/
- _expr_
- {dataset1,dataset2,...}
- {expression1;expression2;...}
- {region}
- .URLsuffix?constraint
Part of the original GDS specification, not
necessary and not often used with FDS
19FDS-Specific Example
http://host.gov:9090/thredds/dodsC/data/coads_expr_{http://host.gov:9090/thredds/dodsC/data/levitus}{DIF=SST[d=1]-TEMP[d=2,g=SST[d=1]]}.asc?DIF
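A sketch of how a handler might split such a URL into its parts follows. It assumes the GDS-style form in which `_expr_` is followed by brace-delimited groups (`{datasets}{expressions}{region}`), with datasets separated by commas and expressions by semicolons; the function name and the returned dictionary keys are hypothetical, and this is not the F-TDS implementation.

```python
import re

EXPR_MARKER = "_expr_"


def parse_analysis_url(url: str) -> dict:
    """Split an analysis URL into base dataset, groups, suffix and constraint."""
    base, marker, rest = url.partition(EXPR_MARKER)
    if not marker:
        raise ValueError("no _expr_ marker in URL")

    # Each {...} group in order: data sources, expressions, region.
    groups = re.findall(r"\{([^{}]*)\}", rest)

    # Whatever follows the last closing brace is the suffix and constraint.
    tail = rest[rest.rfind("}") + 1:] if "}" in rest else rest
    suffix, _, constraint = tail.partition("?")

    datasets = [d for d in groups[0].split(",") if d] if len(groups) > 0 else []
    expressions = [e for e in groups[1].split(";") if e] if len(groups) > 1 else []
    region = groups[2] if len(groups) > 2 else ""

    return {
        "base": base,
        "datasets": datasets,
        "expressions": expressions,
        "region": region,
        "suffix": suffix,
        "constraint": constraint,
    }


parts = parse_analysis_url(
    "http://host.gov:9090/thredds/dodsC/data/coads"
    "_expr_{http://host.gov:9090/thredds/dodsC/data/levitus}"
    "{DIF=SST[d=1]-TEMP[d=2,g=SST[d=1]]}.asc?DIF"
)
```

Here the region group is absent, so `parts["region"]` is an empty string. A real client would probably need to percent-encode the braces and brackets before sending the request.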
20Community Action
- Can we define an implementation-independent syntax for server-side analysis requests?
- Some common operations (averaging, differencing, linear interpolation) with standard name and syntax
- Server-specific (native) operations
- What mechanism?
- An encoded XML string?
- We use this technique in LAS with good success
- A simple command language?
21XML Analysis Expression
<datasets>
  <dataset id="1">"my_local_dset.nc"</dataset>
  <dataset id="2">"http://remote_dset.nc"</dataset>
</datasets>
<operation>
  <name>DIFF</name>
  <arg pos="1" dset="1">sst</arg>
  <arg pos="2" dset="2">sst</arg>
</operation>
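As a rough illustration of the "encoded XML string" idea, the sketch below builds a request of this shape with Python's standard library and percent-encodes it so it could travel inside a URL. The element names follow the slide; the `analysis` wrapper element and the `build_request` helper are assumptions added so that the document has a single root, and no server is claimed to accept this exact format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote


def build_request(datasets, op_name, args) -> str:
    """Build an analysis request as an XML string.

    datasets: list of dataset locations, numbered from 1
    args:     list of (dataset id, variable name) pairs, in position order
    """
    root = ET.Element("analysis")  # hypothetical wrapper element
    ds_el = ET.SubElement(root, "datasets")
    for i, location in enumerate(datasets, start=1):
        ET.SubElement(ds_el, "dataset", id=str(i)).text = location

    op = ET.SubElement(root, "operation")
    ET.SubElement(op, "name").text = op_name
    for pos, (dset, var) in enumerate(args, start=1):
        ET.SubElement(op, "arg", pos=str(pos), dset=str(dset)).text = var

    return ET.tostring(root, encoding="unicode")


xml_text = build_request(
    ["my_local_dset.nc", "http://remote_dset.nc"],
    "DIFF",
    [(1, "sst"), (2, "sst")],
)

# Percent-encode every reserved character so the string is URL-safe.
encoded = quote(xml_text, safe="")

# A server would reverse the encoding and parse the XML.
decoded = ET.fromstring(unquote(encoded))
```

Encoding with `safe=""` escapes the angle brackets, slashes and quotes, at the cost of a longer and less readable URL than a simple command language would give.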
22With native Operation
<datasets>
  <dataset id="1">my_local_dset.nc</dataset>
  <dataset id="2">http://remote_dset.nc</dataset>
</datasets>
<native_operation>
  LET diff = sst[d=2]-sst[d=1]
</native_operation>
23Summary
- Server-side analysis is critical for the future LAS.
- NetCDF Java 2.2 and the THREDDS Data Server are a great platform for implementing this type of analysis with a legacy analysis application (like Ferret).
- A community-developed server-side analysis framework would make it easier to get the advantages of server-side analysis from other servers.
24Acknowledgments
- From COLA
- Jennifer Adams
- Brian Doty
- Joe Wielgosz
- From Unidata
- John Caron
- Ethan Davis
- Former TMAP
- Richard Rogers
- Yonghua Wei