The Centralized Life Sciences Data (CLSD) service

1
The Centralized Life Sciences Data (CLSD) service
Michael Grobe
Scientific Data Services
Research Computing
University Information Technology Services
Indiana University at Indianapolis
(dgrobe_at_iupui.edu)
January 2007
2
Outline
Basic genome science processes and vocabulary
Basic relational algebra
Simple SQL as an expression of the relational algebra
DB2 and the Federated Server
CLSD data sources: relationalized, mirrored, and federated
Accessing CLSD
Directions for possible future work
  Adding data sources
  Integrating more completely with the TeraGrid
  Integrating with other Grids
Questions, suggestions
3
Some chemistry
A polymer is a chemical composed of many similar units, e.g. polyvinyl chloride, starches, etc.
DNA is a (usually double-stranded) polymer composed of nucleotides whose bases are Thymine, Adenine, Cytosine, and Guanine.
DNA carries genetic information. Individual units of genetic information are stored in individual (possibly quite long) segments of DNA.
RNA is a (usually single-stranded) polymer composed of nucleotides whose bases are Uracil, Adenine, Cytosine, and Guanine.
There are many varieties of RNA (mRNA, snRNA, rRNA, snoRNA, etc.), and they serve different functions within a cell. For example, RNA transfers genetic information, catalyses reactions, and otherwise assists or interferes with reactions.
4
Some more chemistry

Polymers are synthesized by catalysts called
polymerases in a process called
polymerization.
Proteins are polymers composed of (over 20
different kinds of) amino acids, such as
Methionine (M), Isoleucine (I), Cysteine(C),
Histidine (H), Alanine(A), Glutamic acid (E),
Leucine (L), etc.
Proteins
  provide structure:
    microfilaments (polymers of actin),
    microtubules (polymers of tubulins),
    channels through the cell membrane, etc.,
  catalyse and co-catalyse reactions, as enzymes,
  bind with DNA to enhance or inhibit transcription and translation,
  are sometimes marked for transport or degradation.
Protein primary, secondary and tertiary structures are important.
Proteins are degraded within proteasomes.

5
Genetic material: 2 meters of DNA packaged into less than 1.4 microns
From Atherly, et al., 1999
6
The central model of molecular genetics
DNA can be reliably replicated during the process of cell division, by DNA-dependent DNA polymerases.
DNA can be transcribed to messenger RNA (mRNA) by DNA-dependent RNA polymerases. Transcription takes place in the nucleus (or equivalent).
mRNA is transported to the cytoplasm where it is used as a template for creating proteins by ribosomes in a process called translation.
The translation process encodes 1 amino acid for each 3 bases in a sequence (a triplet, or codon). The function mapping each of the 64 possible triplets to an amino acid (or to a stop signal) is the genetic code.
Ribosomes are complexes of RNA and protein.
7
The central model within the cell
Diagram from http://www.ncbi.nih.gov/About/primer/images/proteinsynth4.GIF
(Don't forget about degradation and recycling of AAs.)
8
The central model in more detail
(Graphics of DNA and RNA from Atherly, et al.
1999)
9
Mutations and polymorphisms
                Nucleotide sequence    Translated AA sequence
Wildtype        ACTGAACTGATT           Thr-Glu-Leu-Ile
Substitution    ACTGACCTGATT           Thr-Asp-Leu-Ile
Deletion        ACTCTGATT              Thr-Leu-Ile
Insertion       ACTGAACCTGAACTGATT     Thr-Glu-Pro-Glu-Leu-Ile
If mutations like these occur in genetic material within oocytes, they may be transmitted to offspring, and define polymorphic gene variations.
A Single Nucleotide Polymorphism (SNP) is a variation where one base is changed and passed on to offspring (and occurs with sufficient frequency).
A Deletion/Insertion Polymorphism (DIP) is a variation where multiple bases have been removed from or inserted into a sequence.
dbSNP is a database of SNPs and DIPs containing millions of entries, and over 120K unique sequences that are inserted or deleted.
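The codon-by-codon translation behind the table above can be sketched in Java. This is a minimal illustration, not part of CLSD: the class name CodonTranslator and the "???" placeholder are assumptions, and the codon table holds only the six codons that occur in the four example sequences rather than the full 64-entry genetic code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Translates a DNA coding sequence triplet by triplet using a partial genetic code. */
final class CodonTranslator {
    // Only the six codons used in the slide's examples; the full code has 64 entries.
    private static final Map<String, String> CODE = new HashMap<>();
    static {
        CODE.put("ACT", "Thr");
        CODE.put("GAA", "Glu");
        CODE.put("GAC", "Asp");
        CODE.put("CTG", "Leu");
        CODE.put("ATT", "Ile");
        CODE.put("CCT", "Pro");
    }

    /** Returns hyphen-separated amino acids; codons missing from the table become "???". */
    static String translate(String dna) {
        StringJoiner protein = new StringJoiner("-");
        // Step through complete triplets only; trailing bases are ignored.
        for (int i = 0; i + 3 <= dna.length(); i += 3) {
            String codon = dna.substring(i, i + 3).toUpperCase();
            protein.add(CODE.getOrDefault(codon, "???"));
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("ACTGAACTGATT"));       // wildtype
        System.out.println(translate("ACTGACCTGATT"));       // substitution
        System.out.println(translate("ACTCTGATT"));          // deletion
        System.out.println(translate("ACTGAACCTGAACTGATT")); // insertion
    }
}
```

Because the inserted bases CCTGAA form two whole codons, the reading frame of the remaining sequence is preserved; an insertion or deletion whose length is not a multiple of 3 would shift every downstream codon.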
10
Scale of human genome data
Total number of bases: 3.2 Gbp (DNA from one chromatid of each of the 24 distinct chromosomes: the 22 autosomes plus the two sex chromosomes)
Percentage of genome consisting of protein-coding genes: < 2%
Average gene length: 3 Kbp (but up to 2.4 Mbp)
Average exon length: 200 bp
Average protein length: 500-600 AA
Percentage of "junk" DNA: often said to be 50%
Percentage of "junk" DNA now suspected to be transcribed (the "dark matter" of the genome): 50% to 100%
Some of that "junk" is microRNA (miRNA) that negatively regulates translation.
11
Process control: cancer-related reaction pathways
from Hanahan, et al.
12
Basic relational algebra
The relational algebra operates on relations, which are sets of tuples of the same arity, which is to say, collections of lists of the same length.
Here are two 4-tuples: ( 1, 2, 3, 4 ) and ( 8, 7, 9, 4 )
Relations are commonly represented as tables.
There are 5 primitive operations within the relational algebra:
Projection: extract specific columns from a relation
Selection: extract specific rows
Set union: create a new table composed of all the rows of two other tables
Set difference: remove the rows in one relation that appear in another
Cartesian product: pair every row of one table with every row of another to create a third
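To make the first four primitives concrete, here is a small Java sketch that treats a relation as a set of integer tuples. The class and method names (RelationalPrimitives, relation, project, select, union, difference) are illustrative assumptions, not part of any database API, and selection is limited to a single "column equals value" test.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy versions of four relational primitives; a relation is a set of integer tuples. */
final class RelationalPrimitives {

    /** Builds a relation from rows; duplicate rows collapse because a relation is a set. */
    static Set<List<Integer>> relation(int[]... rows) {
        Set<List<Integer>> result = new LinkedHashSet<>();
        for (int[] row : rows) {
            List<Integer> tuple = new ArrayList<>();
            for (int value : row) {
                tuple.add(value);
            }
            result.add(tuple);
        }
        return result;
    }

    /** Projection: keep only the listed (0-based) columns of every tuple. */
    static Set<List<Integer>> project(Set<List<Integer>> r, int... columns) {
        Set<List<Integer>> result = new LinkedHashSet<>();
        for (List<Integer> tuple : r) {
            List<Integer> projected = new ArrayList<>();
            for (int c : columns) {
                projected.add(tuple.get(c));
            }
            result.add(projected);
        }
        return result;
    }

    /** Selection: keep only the tuples whose given column equals the given value. */
    static Set<List<Integer>> select(Set<List<Integer>> r, int column, int value) {
        Set<List<Integer>> result = new LinkedHashSet<>();
        for (List<Integer> tuple : r) {
            if (tuple.get(column) == value) {
                result.add(tuple);
            }
        }
        return result;
    }

    /** Set union: every tuple that appears in either relation. */
    static Set<List<Integer>> union(Set<List<Integer>> a, Set<List<Integer>> b) {
        Set<List<Integer>> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    /** Set difference: the tuples of a that do not appear in b. */
    static Set<List<Integer>> difference(Set<List<Integer>> a, Set<List<Integer>> b) {
        Set<List<Integer>> result = new LinkedHashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<List<Integer>> r = relation(new int[] {1, 2, 3, 4}, new int[] {8, 7, 9, 4});
        System.out.println(project(r, 0, 1));
        System.out.println(project(r, 3));
        System.out.println(select(r, 0, 8));
        System.out.println(difference(r, relation(new int[] {1, 2, 3, 4})));
    }
}
```

Projecting the two example 4-tuples onto their last column leaves a single 1-tuple, since both end in 4 and a relation cannot hold duplicates.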
13
Cartesian product in more detail
Relation 1 (arity 4, length 3)
8 7 9 1
1 2 3 4
7 6 2 3
Relation 2 (arity 3, length 2)
3 4 7
1 9 8
Cartesian product (arity 4 + 3 = 7, length 3 x 2 = 6)
8 7 9 1 3 4 7
8 7 9 1 1 9 8
1 2 3 4 3 4 7
1 2 3 4 1 9 8
7 6 2 3 3 4 7
7 6 2 3 1 9 8
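The product above can be reproduced with two nested loops. The following Java sketch is only an illustration under assumed names (CartesianProduct, RELATION_1, RELATION_2, product); it concatenates every row of the first relation with every row of the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Builds the Cartesian product of two relations stored as lists of int rows. */
final class CartesianProduct {
    // The two example relations from this slide.
    static final List<int[]> RELATION_1 = Arrays.asList(
            new int[] {8, 7, 9, 1}, new int[] {1, 2, 3, 4}, new int[] {7, 6, 2, 3});
    static final List<int[]> RELATION_2 = Arrays.asList(
            new int[] {3, 4, 7}, new int[] {1, 9, 8});

    /** Pairs every left row with every right row; arities add, lengths multiply. */
    static List<int[]> product(List<int[]> left, List<int[]> right) {
        List<int[]> result = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : right) {
                int[] row = new int[l.length + r.length];
                System.arraycopy(l, 0, row, 0, l.length);
                System.arraycopy(r, 0, row, l.length, r.length);
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] row : product(RELATION_1, RELATION_2)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The result size grows as the product of the input lengths, which is why database engines avoid materializing a full Cartesian product whenever a query also supplies a selection condition.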
14
Relational databases and query languages

Database management systems based on the relational algebra were described by Edgar F. Codd, working for IBM, in the early 1970s.
Codd's formulation included
indexes and keys,
decomposition into normal forms, and
integrity constraints.
Multiple languages and interfaces were developed to query and modify collections of relations, among them the Structured English Query Language, SEQUEL, developed by Chamberlin and Boyce.

15
SQL as an implementation of the relational algebra
The most successful such language, SQL, was based on SEQUEL.
SQL requires that each relation has a tablename, and each tuple position has a fieldname:
Players (arity 4, length 3)
Player  Innings  Hits  Teamnumber
8       7        9     1
1       2        3     4
7       6        2     3
Teams (arity 3, length 2)
t_num  games  rank
3      4      7
1      9      8
16
SQL as an implementation of the relational
algebra
SQL commands map to the relational primitives as follows, where * stands for all fields in a table:
Projection: select fieldname_list from tablename
  ex: select t_num, rank from Teams
Selection: select * from tablename where <logical expression>
  ex: select * from Players where Teamnumber = 1
Union: (select fieldname_list from tablename1) union (select fieldname_list from tablename2)
  use UNION ALL to keep duplicates
Set difference: (select * from tablename1) except (select * from tablename2)
Cartesian product: select * from tablename1, tablename2
Note that SQL does not specify how to perform a query, only what the result should be. It is a declarative, rather than procedural, language.
17
The relational join operation
An SQL join is a Cartesian product followed by a selection, as in
select * from Players, Teams where Players.Teamnumber = Teams.t_num
which keeps only the 2 rows of the Cartesian product table marked with * below (shown in red on the original slide):
Player Innings Hits Teamnumber t_num games rank
8 7 9 1 3 4 7
8 7 9 1 1 9 8   *
1 2 3 4 3 4 7
1 2 3 4 1 9 8
7 6 2 3 3 4 7   *
7 6 2 3 1 9 8
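The "product, then selection" reading of a join can be sketched directly in Java. The names below (EquiJoin, PLAYERS, TEAMS, join) are assumptions made for illustration; the method visits every pair of rows, as a Cartesian product would, and keeps only the pairs whose join columns are equal. A real database engine is free to use a faster strategy, since SQL only states what the result should be.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** A naive equi-join: Cartesian product followed by a selection on two columns. */
final class EquiJoin {
    // Columns: Player, Innings, Hits, Teamnumber
    static final List<int[]> PLAYERS = Arrays.asList(
            new int[] {8, 7, 9, 1}, new int[] {1, 2, 3, 4}, new int[] {7, 6, 2, 3});
    // Columns: t_num, games, rank
    static final List<int[]> TEAMS = Arrays.asList(
            new int[] {3, 4, 7}, new int[] {1, 9, 8});

    /** Keeps the concatenated row only when left[leftCol] equals right[rightCol]. */
    static List<int[]> join(List<int[]> left, int leftCol, List<int[]> right, int rightCol) {
        List<int[]> result = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : right) {            // every pair: the Cartesian product
                if (l[leftCol] == r[rightCol]) { // the selection
                    int[] row = Arrays.copyOf(l, l.length + r.length);
                    System.arraycopy(r, 0, row, l.length, r.length);
                    result.add(row);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Players.Teamnumber is column 3; Teams.t_num is column 0.
        for (int[] row : join(PLAYERS, 3, TEAMS, 0)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```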
18
IBM's DB2 and WebSphere Federated Server, née Information Integrator, née DiscoveryLink
DB2 is a fully-featured relational database
system that can house and serve large
databases. Data is usually imported in
relational form, structured as rows composed of
individual data values, possibly identified by
unique IDs (keys). DB2 can also access data in
tables managed by other, usually physically
remote, database management systems, such as
Oracle, MySQL or DB2. This process is known as
data federation. DB2 can also federate some
external resources that are not normally accessed
as relational tables (e.g. Blast). Such
resources are transformed, or relationalized
on-the-fly by wrappers. Once these resources
have been registered with their wrappers they may
be referred to within SQL queries as is any other
resource.
19
WFS diagram from Del Prete
20
Some WFS jargon
Wrapper: a library to access a particular class of data sources or protocols. Each wrapper contains information about data source characteristics. There are BLAST and PubMed wrappers, and now a generic Script wrapper that talks to user scripts.
Server: represents a specific data source (user mappings may be required for authentication).
Nickname: a local table name (alias) for data on a server (mapped to rows and columns).
A nickname looks like a table, but links to a server, which links to a wrapper/data source, where the wrapper knows how to process the data from the source.
21
Using NCBI data within DB2: More than just mirroring

Mirroring usually implies maintaining exact
copies of data sources.
Most data mirrored by CLSD must not only be
copied, but also inserted into the CLSD
relational structure.
This is accomplished by a series of scripts that
Download the data from its external site,
Convert it to a form that can be used to update
CLSD tables,
Insert the data into tables, and
Monitor the overall process to identify and log
errors.
These scripts are run regularly from crontab
entries, and monitoring results are examined
after every run.

22
CLSD relationalized data sources
BIND -- Pathways, Gene interactions
ENZYME -- Enzyme nomenclature
ePCR -- ePCR results of UniSTS vs Homo sapiens
KEGG data sources
  LIGAND -- Pathways, Reactions, Compounds
  PATHWAY -- Pathway map coordinates
NCBI data sources
  LocusLink -- Genetic Loci. (LocusLink has been inactive since July 1, 2005, when it was retired in favor of Entrez Gene.)
  UniGene -- Gene clusters
SGD -- Saccharomyces Genome Database
23
KEGG datasource info
PATHWAY: 42,273 pathways generated from 306 reference pathways
LIGAND: 14,238 compounds, 4,111 drugs, 10,951 glycans, 6,810 reactions, 7,127 reactant pairs
24
CLSD federated data sources
Federated NCBI data sources (subject to hit rate throttling)
  Nucleotide -- Nucleotide sequences
  PubMed -- Journal abstracts
Federated local mirrors of NCBI data sources (not throttled)
  Blast (updated monthly) is mirrored by UITS
  dbSNP (updated at major builds) is mirrored by IUSM
Some KEGG resources are federated via the FS KEGG user-defined functions
25
Examples from the CLSD web site
http://scidata.iu.edu/CLSD/sql-in-db2.shtml

To get a list of genes containing "brain" in their LOCUS_NAME in dbSNP126_shared:
select * from DBSNP126_SHARED.GENEIDTONAME
where locus_name like '%brain%'
To get a list of Bind Genes and their species:
select GeneNameA, Organism from bind.bind_interaction
To get a list of genes mentioning "HUMAN" in their descriptions in KEGG:
select * from KEGG.GENE where description like '%HUMAN%'
To get some info from PubMed:
select PMID, ArticleTitle FROM NCBI.pmarticles
where entrez.contains (ArticleTitle, 'granulation') = 1
AND entrez.contains (PubDate, '1992') = 1

26
BLAST: Both mirrored and federated
NCBI Blast is typically accessed via a web page at NCBI, or some mirrored site. Data is returned in a typical web interface format suitable for users.
Within CLSD, BLAST is accessed via an SQL query and data is returned as a table that can be manipulated as is any other DB2 table.
For example, here is an SQL query that invokes a blastall process running on libra00 from within DB2:
select GB_ACC_NUM, description, e_value from ncbi.BLASTN_NT
where BlastSeq = 'AGTACTAGCTAGCTAGCTACTAGCTGACTGACTGACTGATGCATCGATGATGC'
The local version of blastall conducts the search and returns results encoded within XML (by specifying the -m 7 parameter).
27
The DB2 federation software converts the XML encoded results into something like this:
GB_ACC_NUM (VARCHAR) | DESCRIPTION (VARCHAR) | E_VALUE (DOUBLE)
AE003644 | Drosophila melanogaster chromosome 2L, section 53 of 83 of the complete sequence | 0.00666475
AE003410 | Drosophila melanogaster, chromosome 2L, region 34C4-36A7 (Adh region), section 4 of 10 of the comple | 0.00666475
AC092228 | Drosophila melanogaster, chromosome 2L, region 35X-35X, BAC clone BACR21J17, complete sequence | 0.00666475
AP008207 | Oryza sativa (japonica cultivar-group) genomic DNA, chromosome 1, complete sequence | 0.0263349
AP003197 | Oryza sativa (japonica cultivar-group) genomic DNA, chromosome 1, BAC clone:B1015E06 | 0.0263349
AP003105 | Human DNA sequence from chromosome 1, putative argumentativeness gene GROBE1 | 0.0263349
28
Modifying BLAST search settings via SQL
Parameters sent to blastall can be set by using equality comparisons as assignment statements within SQL conditionals, as in
select Score, E_Value, HSP_Info, HSP_Q_Seq, HSP_H_Seq from ncbi.BLASTN_NT
where BlastSeq = 'gagttgtcaatggcgagg' and gapcost = 8 and E_Value < .0005
which will pass gapcost and e-value settings on to blastall.
29
BLAST data sources available via CLSD
Here is a list showing which search types are supported by the DB2 BLAST wrapper within CLSD.
BLAST search type: Data sources
BLASTN: NT, EST_HUMAN, EST_MOUSE, and EST_OTHER
  A nucleotide sequence is compared with the contents of a nucleotide sequence database.
BLASTP: NR, SP
  An amino acid sequence is compared with the contents of an amino acid database.
BLASTX: NR, SP
  A nucleotide sequence is compared with the contents of an amino acid sequence database. Query is translated in all six reading frames.
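The six reading frames mentioned for BLASTX are the three codon offsets on the query strand plus the three offsets on its reverse complement. The Java sketch below only enumerates the codons of each frame; it is not how blastall is implemented, and the names ReadingFrames, reverseComplement, codons, and sixFrames are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates the codons of the six reading frames of a DNA sequence. */
final class ReadingFrames {

    /** Returns the reverse complement; characters other than A, C, G, T become N. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (Character.toUpperCase(dna.charAt(i))) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N'); break;
            }
        }
        return sb.toString();
    }

    /** Splits a strand into complete codons starting at the given offset (0, 1 or 2). */
    static List<String> codons(String strand, int offset) {
        List<String> result = new ArrayList<>();
        for (int i = offset; i + 3 <= strand.length(); i += 3) {
            result.add(strand.substring(i, i + 3));
        }
        return result;
    }

    /** Frames 0-2 read the given strand; frames 3-5 read its reverse complement. */
    static List<List<String>> sixFrames(String dna) {
        String forward = dna.toUpperCase();
        String reverse = reverseComplement(forward);
        List<List<String>> frames = new ArrayList<>();
        for (int offset = 0; offset < 3; offset++) {
            frames.add(codons(forward, offset));
        }
        for (int offset = 0; offset < 3; offset++) {
            frames.add(codons(reverse, offset));
        }
        return frames;
    }

    public static void main(String[] args) {
        for (List<String> frame : sixFrames("ACTGAACTGATT")) {
            System.out.println(frame);
        }
    }
}
```

Each frame's codons would then be mapped to amino acids with the genetic code before being compared against the protein database.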
30
Examples from IBM

Query 1: Given a search sequence, search nucleotide (NT), and return the hits for only those sequences not associated with a Cloning Vector. For each hit, display the Cluster ID and Title from Unigene, in addition to the Accession Number and E-Value. Only show the top 5 hits, based on the ones with the lowest E-values.
Select nt.GB_ACC_NUM, nt.DESCRIPTION, nt.E_VALUE,
useq.CLUSTER_ID, ugen.TITLE
From ncbi.BLASTN_NT nt, unigene.SEQUENCE useq,
unigene.GENERAL ugen
Where BLASTSEQ = 'GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTC'
And nt.DESCRIPTION not like '%cloning vector%'
And nt.GB_ACC_NUM = useq.ACC
And useq.CLUSTER_ID = ugen.CLUSTER_ID
Order by E_VALUE FETCH FIRST 5 ROWS ONLY

31
User-defined functions (supplied by IBM)

There exist special functions for manipulating
sequence patterns
LSPatternMatch
LSPrositePattern
To get a list of (aspartate aminotransferase) BLAST results filtered by a (pyridoxal phosphate attachment site) pattern specified in PROSITE pattern language:
select gb_acc_num, HSP_H_SEQ from ncbi.blastp_nr
where
blastseq = 'MSQICKRGLLISNRLAPAALRCKSTWFSEVQMGPPDAILGVTE\
AFKKDTNPKKINLGAGAYRDDNTQPFVLPSVREAEKRVVSRSLDKEYATIIGI\
PEFYNKAIELALGKGSKRLAAKHNVTAQSISGTGALRIGAAFLAKFWQGNREI\
YIPSPSWGNHVAIFEHAGLPVNRYRYYDKDT'
and DB2LS.LSPatternMatch(HSP_H_SEQ,
DB2LS.LSPrositePattern(
'[GS]-[LIVMFYTAC]-[GSTA]-K-x(2)-[GSALVN].' ) ) > 0
Note the use of the period (.) to terminate the
PROSITE pattern, and that the LSPatternMatch
function returns the character position of the
left-most substring matching the pattern, or zero
if there is no match.
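The return convention described above (1-based position of the left-most match, or zero) can be imitated in Java by converting a PROSITE pattern to a regular expression. This is a rough sketch and not IBM's implementation of LSPatternMatch or LSPrositePattern: the names PrositeMatch, toRegex, and patternMatch are hypothetical, and only the common PROSITE elements are handled (residue letters, x, [..], {..}, repeat counts, the < and > anchors, and the terminating period).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Approximates PROSITE pattern matching with java.util.regex. */
final class PrositeMatch {

    /** Converts a PROSITE pattern such as "[GS]-K-x(2)-{P}." to a Java regex. */
    static String toRegex(String prosite) {
        StringBuilder regex = new StringBuilder();
        for (char c : prosite.toCharArray()) {
            switch (c) {
                case '-':
                case '.': break;                      // separators and terminating period
                case 'x': regex.append('.'); break;   // any residue
                case '{': regex.append("[^"); break;  // start of an excluded set
                case '}': regex.append(']'); break;
                case '(': regex.append('{'); break;   // repeat count, e.g. x(2) or x(2,4)
                case ')': regex.append('}'); break;
                case '<': regex.append('^'); break;   // N-terminal anchor
                case '>': regex.append('$'); break;   // C-terminal anchor
                default:  regex.append(c);            // residues, [ ], digits, commas
            }
        }
        return regex.toString();
    }

    /** Returns the 1-based position of the left-most match, or 0 if there is none. */
    static int patternMatch(String sequence, String prosite) {
        Matcher m = Pattern.compile(toRegex(prosite)).matcher(sequence);
        return m.find() ? m.start() + 1 : 0;
    }

    public static void main(String[] args) {
        String pattern = "[GS]-[LIVMFYTAC]-[GSTA]-K-x(2)-[GSALVN].";
        System.out.println(toRegex(pattern));
        System.out.println(patternMatch("AAGLSKAAG", pattern));
        System.out.println(patternMatch("MMMM", pattern));
    }
}
```

The test sequence AAGLSKAAG is a made-up string chosen so that the pattern matches starting at its third residue.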

32
Accessing CLSD: getting an account
To access CLSD you must have an account on the
Libra Cluster at IU (aka libra00.uits.iu.edu). If
you don't have an account and are associated
with Indiana University, request an account by
filling out a Research Systems Account
Application at
http://rac.uits.iu.edu/rats/forms/application.php.
In the comments section of
the account request, add that you need a local
and persistent password for use with CLSD. Once
you have a Libra account, send email to SDS at
data _at_ indiana.edu and request instructions for
defining a local and persistent password for use
with CLSD. TeraGrid users should send e-mail
to SDS at data _at_ indiana.edu explaining how CLSD
will be used, and describing their TeraGrid
activities. SDS will then arrange for an
appropriate Libra account and send instructions
for defining a suitable password.
33
Accessing CLSD: options

DB2 can be accessed in a variety of ways:
DB2 Command Line Processor (Unix, Windows)
DB2 Control Center (wherever JRE is running)
DB2 driver for Perl DBI
DB2 drivers for the Java Database Connectivity (JDBC) Application Program Interface (API), especially the JDBC Universal Driver
Demonstration Web page (invokes a Java servlet that uses JDBC)
  http://discover.uits.indiana.edu:8421/access/
Demonstration WebService (invoked as a function call via JAX-RPC)
  http://discover.uits.indiana.edu:8421/axis/CLSDservice.jws?wsdl
Demonstration Web page (invokes a Java servlet that invokes the CLSD WebService)
  http://discover.uits.indiana.edu:8421/access/index-for-service.html
Experimental WSRF Resource (using WSRF within a GT4 container)
Experimental OGSA-DAI service (running within a GT4 container)

34
JDBC access
Connect to the CLSD:
  Class.forName( "com.ibm.db2.jcc.DB2Driver" );
  con = DriverManager.getConnection(
      "jdbc:db2://libra00.uits.iu.edu:50000/clsd2",
      accountName, accountPassword );
Prepare a query, send it to the db, and receive a result:
  statement = con.createStatement();
  resultSet = statement.executeQuery( query );
Get some query meta-data (column labels and column data types):
  ResultSetMetaData rsmd = resultSet.getMetaData();
  result = rsmd.getColumnLabel( colCount );
  result2 = rsmd.getColumnTypeName( colCount );
35
JDBC access (continued)
Get a row of data:
  // JDBC column indexes start at 1.
  for( int colCount = 1; colCount <= numcols; colCount++ ) {
    String returnedString = ""; // Must be predefined.
    returnedString = resultSet.getString( colCount ) + "";
    out.println( "<td>" + returnedString + "</td>\n" );
  }
36
Accessing CLSD thru a WebService (JAX-RPC)
The Java API for XML-based Remote Procedure Calls, or JAX-RPC, is a specification that defines a system for building distributed services (so-called WebServices) within the client-server model.
JAX-RPC makes it possible for a function invocation in a client like
  a_variable = function_name( parameter_list )
to cause the function, function_name, to run on a remote server and return a response containing the value to be assigned to the variable a_variable, and a function invocation in a client like
  returnString = queryCLSD( "select * from syscat.tables", "1", "5", "accountName", "accountPassword", "table" )
will return a (possibly very long) string containing the response to the query (given that various linkages have been prearranged).
37
Outline of the CLSDservice
public class CLSDservice {
  // Full source at
  // http://scidata.iu.edu/CLSD/examples/CLSDservice.jws.txt
  public String queryCLSD( String query, String startingRowToPrint,
      String maxRows, String account, String password, String format ) {
    // Get a query string, etc. from the command line or Web browser.
    // Declare JDBC drivers and connect to DB2.
    // Prepare a JDBC statement containing the SQL query, submit
    // it to DB2, and capture the returned JDBC result set.
    // Query result set metadata for column names and types to
    // return as the first row, and then collect the contents of
    // each data row.
    return theResponse;
  } // end queryCLSD
} // end Class CLSDservice
38
SOAP and WSDL

JAX-RPC uses SOAP and WSDL to establish the
various linkages required to implement remote
procedure calls.
SOAP messages are usually encoded as XML messages
within HTTP requests where
A SOAP request is an HTTP POST request with an
XML body.
A SOAP response is an HTTP response header
followed by an XML body.
Such RPC functions are exposed as operations
when described within web pages using the Web
Services Description Language (WSDL).

39
Java command-line client to access CLSD via
CLSDservice
import org.apache.axis.client.Call;
import org.apache.axis.client.Service;
import javax.xml.namespace.QName;

public class testCLSDClient {
  public static void main(String[] args) {
    try {
      String endpoint =
          "http://discover.uits.indiana.edu:8421/axis/CLSDservice.jws";
      Service service = new Service();
      Call call = (Call) service.createCall();
      call.setTargetEndpointAddress( new java.net.URL( endpoint ) );
      call.setOperationName(
          new QName( "http://soapinterop.org/", "queryCLSD" ) );
      // Arguments: query, first row, max rows, account, password, format.
      String returnString = (String) call.invoke( new Object[] {
          "select * from syscat.tables", "1", "5",
          "accountName", "accountPassword", "table" } );
      System.out.println( returnString );
    } catch (Exception e) {
      System.err.println( e.toString() );
    }
  }
}
40
Perl command-line client to access CLSD via
CLSDservice
#!perl -w
use SOAP::Lite;
# Set up the call to CLSD using SOAP.
$host = 'discover.uits.indiana.edu';
$service = SOAP::Lite -> service(
    "http://$host:8421/axis/CLSDservice.jws?wsdl" );
# Make the call to CLSD.
$result = $service->queryCLSD(
    'select tabschema,tabname from syscat.tables', '1', '5',
    "DB2account", "password", "table" );
print $result;
41
OGSA

The Open Grid Services Architecture (OGSA) is an
architecture for building computational grids.
In particular, OGSA defines a set of core
capabilities and behaviors that address key
concerns in Grid systems. 2 It does not,
however, implement or define how to implement
such core capabilities.
OGSA is NOT layered or object oriented.
However, both will be exploited naturally in some
implementations.
OGSA provides an architecture for building
services such as
Service-Based distributed query processing,
Grid Workflow,
Grid Monitoring Architecture
etc.

42
OGSA-DAI
OGSA-Data Access and Integration (OGSA-DAI) is a
very flexible and powerful data access framework
that can be used within an OGSA grid environment.
It provides various data movement,
virtualization, and manipulation services that
transform the use of data into a higher-level
workflow. The OGSA-DAI client shown in the next
slide uses the OGSA-DAI Client Toolkit to send a
hard-coded query to CLSD (here known as the
DB2Resource). The Toolkit allows clients to use
JDBC by creating a JDBC ResultSet object from an
OGSA-DAI WebRowSet. The response is encoded
using XML and may be retrieved as a single
string, or as individual fields by using
individual JDBC calls as shown below.
43
Java command-line client to access CLSD via
OGSA-DAI
public class queryCLSD {
  public static void main(String[] args) throws Exception {
    // Create an instance of the data service.
    String handle =
        "http://localhost:8080/wsrf/services/ogsadai/DataService";
    String id = "DB2Resource";
    DataService service =
        GenericServiceFetcher.getInstance().getDataService( handle, id );
    // Define a request composed of one activity.
    SQLQuery query = new SQLQuery(
        "select tabschema,tabname from syscat.tables" );
    WebRowSet rowset = new WebRowSet( query.getOutput() );
    ActivityRequest request = new ActivityRequest();
    request.add( query );
    request.add( rowset );
44
Java command-line client to access CLSD via
OGSA-DAI 2
    // Submit the request and retrieve results.
    Response response = service.perform( request );
    ResultSet result = rowset.getResultSet();
    ResultSetMetaData rsmd = result.getMetaData();
    int numCols = rsmd.getColumnCount();
    // Display each column from each row.
    while( result.next() ) {
      for( int colCount = 1; colCount <= numCols; colCount++ ) {
        out.print( result.getString( colCount ) );
      }
      out.println();
    }
  }
}
45

This client displays a small part of the
functionality provided by OGSA-DAI. In addition,
an OGSA-DAI service can be configured to
operate on XML or text data sources, as well as
relational data sources,
perform a series of operations (also known as
activities) as part of a single request,
deliver results to a third party (via FTP,
GridFTP, SMTP, etc.) or to another data service,
deliver results asynchronously, which can be very
useful for long-running requests, and
utilize authentication methods supported by WSRF
to provide grid-based security.
Also, exposing a database via OGSA-DAI makes it
available for OGSA Distributed Query Processing
(OGSA-DQP), so that its use may be further
virtualized within the DQP model.
In some cases, however, OGSA-DAI and DQP may
introduce performance penalties.

46
Current and possible directions
Adding data sources mirrored and federated
Requests for mirroring or federating will be
gladly entertained
DB2 now provides a user-configurable script
wrapper that connects to a remote DB2 daemon that
can start any co-located arbitrary script and
return data encoded in XML (restricted to one
foreign key per table)
Such a script could be built to relay any web
resource that returns XML meeting key
restrictions.
Wrappers could be constructed to relay some
OGSA-DAI resources
Implementing the OGSA-DAI service in production mode.
Integrating with the TeraGrid
CLSD is currently accessible from the TeraGrid,
but authentication is local.
It may be possible to enforce TeraGrid based
X.509 authentication, using either WSRF or
OGSA-DAI interfaces.

47
References

Atherly, Alan G., et al., The Science of Genetics, 1999.
Apache Foundation, AXIS User's Guide, http://ws.apache.org/axis/java/user-guide.html
Codd, Edgar F., A Relational Model of Data for Large Shared Data Banks, http://www.acm.org/classics/nov95/toc.html
(See also http://en.wikipedia.org/wiki/Edgar_F._Codd)
CLSD web page, http://rac.uits.iu.edu/clsd/
Del Prete, Doug, Efficient access to Blast using IBM DB2 Information Integrator, http://www-03.ibm.com/industries/healthcare/doc/content/bin/blast.pdf
Foster, Ian, et al., The Open Grid Services Architecture, Version 1.5.
Sotomayor, Borja and Lisa Childers, Globus Toolkit 4: Programming Java Services
Sundaram, Babu, Understanding WSRF, http://www-128.ibm.com/developerworks/edu/gr-dw-gr-wsrf1-i.html
Questions, comments, suggestions?