1
Storage and Data
  • Grid Middleware 6
  • David Groep, lecture series 2005-2006

2
Outline
  • Data management concepts
  • metadata, logical filename, SURL, TURL, object
    store
  • Protocols
  • GridFTP, SRM
  • RFT/FTS, FPS scheduled transfers with GT4
    (LIGO)
  • End-to-end integrated systems
  • SRB
  • Structured data and databases
  • OGSA-DAI
  • Data curation issues
  • media migration
  • content conversion (emulation or translation?)

3
Grid data management
  • Data in a grid need to be
  • located
  • replicated
  • life-time managed
  • accessed (sequentially and at random)
  • and the user does not know where the data is

4
Types of storage
  • File oriented storage
  • cannot support content-based queries
  • needs annotation metadata to be useful (note
    that a file system and name is a type of
    meta-data)
  • most implementations can handle any-sized
    objects (but MSS tape systems cannot handle very
    small files)
  • Databases
  • structured data representation
  • supports content queries well via indexed
    searches
  • good for small data objects (with BLOBs of
    MBytes, not GBytes)

5
Grid storage structure
  • For file oriented storage

6
File storage layers (file system analogy)
  • Separating the storage concepts
  • helps for both better interoperation and
    scalability
  • Semantic view
  • description of data in words and phrases
  • Meta-data view
  • describe data by attribute-value pairs (filename
    is also an A-V pair)
  • e.g. filesystems like HPFS, EXT2, AppleFS with
    extended attributes
  • Object view
  • refers to a blob of data by a meaningless handle
    (unique ID)
  • e.g. the inode in typical Unix FSs
  • FAT: directory entry + allocation table (mixes
    filename and object view)
  • Physical view
  • block devices: series of blocks on a disk, or a
    specific tape offset

7
Storage layers (grid naming terminology)
  • LFN (Logical File Name) level 2
  • like the filename in the traditional file system
  • may have hierarchical structure
  • is not directly suitable for access, as it is
    site independent
  • GUID (Globally Unique ID) level 3
  • opaque handle to reference a specific data object
  • still independent of the site
  • GUID-LFN mapping is 1-n
  • SURL (Storage URL, or Physical File Name, PFN)
    level 3
  • SE-specific reference to a file
  • understood by the storage management interface
  • GUID-SURL mapping is 1-n
  • TURL (Transfer URL) "griddy" level 4
  • current physical location of a file inside a
    specific SE
  • is transient (i.e. only exists after being
    returned by the SE management interface)
  • has a specific lifetime
  • SURL-TURL mapping is 1-(small number, typically 1)

terminology from EDG, gLite and Globus
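As a concrete picture of these mappings, here is a minimal sketch in Java. The LFN, GUID and SURL values are invented toy data, not a real catalogue, and the SURL→TURL step (here a simple scheme rewrite) stands in for the SE management interface handing out a transient transfer URL.

```java
import java.util.*;

// Toy model of the naming layers: LFN -> GUID -> replica SURLs -> TURL.
public class NamingLayers {
    static final Map<String, String> lfnToGuid = new HashMap<>();        // many LFNs may share one GUID
    static final Map<String, List<String>> guidToSurls = new HashMap<>(); // one GUID, many replicas

    static {
        // Invented example entries.
        lfnToGuid.put("/grid/vo/higgs/run42.dat", "guid:0a1b2c3d");
        guidToSurls.put("guid:0a1b2c3d", List.of(
            "srm://se1.example.org/pnfs/vo/00/run42.dat",
            "srm://se2.example.org/store/vo/data/f8e7.dat"));
    }

    // Resolve a site-independent LFN to all replica SURLs via the GUID.
    public static List<String> resolve(String lfn) {
        String guid = lfnToGuid.get(lfn);
        return guid == null ? List.of() : guidToSurls.getOrDefault(guid, List.of());
    }

    // An SE would return a transient TURL for a SURL; here we only rewrite the scheme.
    public static String toTurl(String surl) {
        return surl.replaceFirst("^srm://", "gsiftp://");
    }

    public static void main(String[] args) {
        for (String surl : resolve("/grid/vo/higgs/run42.dat"))
            System.out.println(surl + " -> " + toTurl(surl));
    }
}
```

Note that the replica list lives at the GUID level: the two SURLs above are the same logical file stored under different local names, exactly the 1-n GUID-SURL mapping described on the slide.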
8
Data Management Services Overview
9
Storage concepts
  • using the OSG-EDG-gLite terminology
  • Storage Element
  • management interface
  • transfer interface(s)
  • Catalogues
  • File Catalogue (meta-data catalogues)
  • Replica Catalogue (location services indices)
  • Transfer Service
  • File Placement
  • Data Scheduler

10
Grid Storage Concepts: Storage Element
  • Storage Element
  • responsible for manipulating files, on anything
    from disk to tape-backed mass storage
  • contains services up to the filename level
  • the filename is typically an opaque handle for
    files,
  • as a higher-level file catalogue serves the
    meta-data, and
  • the same physical file will be replicated to
    several SEs with different local file names
  • SE is a site function (not a VO function)
  • Capabilities
  • Storage space for files
  • Storage Management interface (staging, pinning)
  • Space management (reservation)
  • Access (read/write, e.g. via gridFTP, HTTP(s),
    Posix (like))
  • File Transfer Service (controlling influx of data
    from other SEs)

11
Storage Element: grid transfer services
  • Possibilities
  • GridFTP
  • de-facto standard protocol
  • supports GSI security
  • features: striping, parallel transfers,
    third-party transfers (TPTs, like regular FTP)
    are part of the protocol
  • issue: firewalls don't like the open port ranges
    needed by FTP (neither active nor passive mode
    avoids them)
  • HTTPS
  • single port, so more firewall-friendly
  • implementation of GSI and delegation required
    (mod_gridsite)
  • TPTs not part of protocol

12
GridFTP
  • secure, robust, fast, efficient, standards
    based, widely accepted data transfer protocol
  • Protocol based
  • Multiple independent implementations can
    interoperate
  • Globus Toolkit supplies reference implementation
  • Server, Client tools (globus-url-copy),
    Development Libraries

13
GridFTP: The Protocol
  • FTP protocol is defined by several IETF RFCs
  • Start with most commonly used subset
  • Standard FTP get/put etc., 3rd-party transfer
  • Implement standard but often unused features
  • GSS binding, extended directory listing, simple
    restart
  • Extend in various ways, while preserving
    interoperability with existing servers
  • Striped/parallel data channels, partial file
    transfers, automatic/manual TCP buffer setting,
    progress monitoring, extended restart

source Bill Allcock, ANL, Overview of GT4 Data
Services, 2004
14
GridFTP: The Protocol (cont)
  • Existing standards
  • RFC 959 File Transfer Protocol
  • RFC 2228 FTP Security Extensions
  • RFC 2389 Feature Negotiation for the File
    Transfer Protocol
  • Draft FTP Extensions
  • GridFTP Protocol Extensions to FTP for the Grid
  • Grid Forum Recommendation
  • GFD.20
http://www.ggf.org/documents/GWD-R/GFD-R.020.pdf

source Bill Allcock, ANL, Overview of GT4 Data
Services, 2004
15
Striped Server Mode
  • Multiple nodes work together on a single file
    and act as a single GridFTP server
  • An underlying parallel file system allows all
    nodes to see the same file system and must
    deliver good performance (usually the limiting
    factor in transfer speed)
  • I.e., NFS does not cut it
  • Each node then moves (reads or writes) only the
    pieces of the file that it is responsible for.
  • This allows multiple levels of parallelism, CPU,
    bus, NIC, disk, etc.
  • Critical if you want to achieve better than 1 Gb/s
    without breaking the bank

source Bill Allcock, ANL, Overview of GT4 Data
Services, 2004
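The striping idea described above can be sketched as a round-robin division of a file's blocks over the participating nodes; each node then moves only the block indices it owns, in parallel. This toy illustrates only the block-assignment arithmetic, not the GridFTP wire protocol.

```java
import java.util.*;

// Toy round-robin striping: a file of `blocks` fixed-size blocks is divided
// over `nodes` servers; node i owns blocks i, i+nodes, i+2*nodes, ...
public class Striping {
    // Block indices owned by one node under round-robin striping.
    public static List<Integer> blocksFor(int node, int nodes, int blocks) {
        List<Integer> owned = new ArrayList<>();
        for (int b = node; b < blocks; b += nodes) owned.add(b);
        return owned;
    }

    public static void main(String[] args) {
        int nodes = 4, blocks = 10;
        for (int n = 0; n < nodes; n++)
            System.out.println("node " + n + " moves blocks " + blocksFor(n, nodes, blocks));
    }
}
```

Because the assignments are disjoint and cover every block, each node can read or write its share independently, which is what lets striping exploit parallelism across CPUs, buses, NICs and disks at once.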
16
source Bill Allcock, ANL, Overview of GT4 Data
Services, 2004
17
Disk to Disk Striping Performance
source Bill Allcock, ANL, Overview of GT4 Data
Services, 2004
18
GridFTP Caveats
  • Protocol requires that the sending side do the
    TCP connect (possible Firewall issues)
  • Working on V2 of the protocol
  • Add explicit negotiation of streams to relax the
    directionality requirement above (*)
  • Optionally adds block checksums and resends
  • Add a unique command ID to allow pipelining of
    commands
  • Client / Server
  • Currently, no server library, therefore Peer to
    Peer type apps VERY difficult
  • Generally needs a pre-installed server
  • Looking at a dynamically installable server

(*) DG: like a kind of application-level BEEP
protocol
source Bill Allcock, ANL, Overview of GT4 Data
Services, 2004
19
SE transfers: random access
  • wide-area R/A for files is new
  • typically addressed by adding GSI to existing
    cluster protocols
  • dcap → GSI-dcap
  • rfio → GSI-RFIO
  • xrootd → ??
  • One (new) OGSA-style service
  • WS-ByteIO
  • Bulk interface
  • RandomIO interface
  • posix-like
  • needs negotiation of actual transfer protocol
    attachments, DIME, …

20
SE transfer: local back-end access
  • backend of a grid store is not always just a disk
  • distributed storage systems without native posix
  • even if posix emulation is provided, that is
    always slower!
  • for grid use, need to also provide GridFTP
  • and a management interface SRM
  • local access might be through the native protocol
  • but the application may not know
  • and it is usually not secure enough to run over
    WAN
  • so no use for non-LAN use by others in the grid

21
Storage Management (SRM)
  • common management interface on top of many
    backend storage solutions
  • a GGF draft standard (from the GSM-WG)

22
Standards for Storage Resource Management
  • Main concepts
  • Allocate spaces
  • Get/put files from/into spaces
  • Pin files for a lifetime
  • Release files and spaces
  • Get files into spaces from remote sites
  • Manage directory structures in spaces
  • SRMs communicate with other SRMs peer-to-peer
  • Negotiate transfer protocols
  • No logical name space management (can come from
    GGF- GFS)

source A. Sim, CRD, LBNL 2005
23
SRM Functional Concepts
  • Manage Spaces dynamically
  • Reservation, allocation, lifetime
  • Release, compact
  • Negotiation
  • Manage files in spaces
  • Request to put files in spaces
  • Request to get files from spaces
  • Lifetime, pinning of files, release of files
  • No logical name space management (rely on GFS)
  • Access remote sites for files
  • Bring files from other sites and SRMs as
    requested
  • Use existing transport services (GridFTP, http,
    https, ftp, bbftp, …)
  • Transfer protocol negotiation
  • Manage multi-file requests
  • Manage request queues
  • Manage caches, pre-caching (staging) when
    possible
  • Manage garbage collection
  • Directory Management
  • Manage directory structure in spaces

source A. Sim, CRD, LBNL 2005
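The asynchronous request pattern behind these concepts (prepare a request, poll its status, act when the file is staged) can be sketched with an in-memory stand-in. The method names follow the srm* calls of the draft standard, but the signatures, return values and the toy "backend" here are assumptions for illustration, not the SRM specification.

```java
import java.util.*;

// Hedged sketch of the asynchronous SRM call pattern: srmPrepareToGet returns
// a request token, the client polls srmStatusOfGetRequest, and the request
// becomes READY once the (invented) backend has staged the file.
public class ToySrm {
    private final Map<String, String> requests = new HashMap<>(); // token -> SURL
    private final Set<String> staged = new HashSet<>();           // SURLs already on disk
    private int next = 0;

    public String srmPrepareToGet(String surl) {
        String token = "req-" + (next++);
        requests.put(token, surl);
        return token;
    }

    // Poll: PENDING until the backend has staged the file, then READY.
    public String srmStatusOfGetRequest(String token) {
        return staged.contains(requests.get(token)) ? "READY" : "PENDING";
    }

    // Stand-in for the tape/disk backend finishing a stage operation.
    public void backendStages(String surl) { staged.add(surl); }

    public static void main(String[] args) {
        ToySrm srm = new ToySrm();
        String t = srm.srmPrepareToGet("srm://se1.example.org/vo/run42.dat");
        System.out.println(srm.srmStatusOfGetRequest(t)); // PENDING
        srm.backendStages("srm://se1.example.org/vo/run42.dat");
        System.out.println(srm.srmStatusOfGetRequest(t)); // READY
    }
}
```

The same token-and-poll shape underlies the put, copy and directory request families listed below.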
24
SRM Methods, by feature
Space management: srmCompactSpace,
srmGetSpaceMetaData, srmGetSpaceToken,
srmReleaseFilesFromSpace, srmReleaseSpace,
srmReserveSpace, srmUpdateSpace
Authorization functions: srmCheckPermission,
srmGetStatusOfReassignment, srmReassignToUser,
srmSetPermission
Request administration: srmAbortRequestedFiles,
srmRemoveRequestedFiles, srmResumeRequest,
srmSuspendRequest
Core (basic): srmChangeFileStorageType,
srmExtendFileLifetime, srmGetFeatures,
srmGetRequestSummary, srmGetRequestToken,
srmGetSRMStorageInfo, srmGetSURLMetaData,
srmGetTransferProtocols, srmPrepareToGet,
srmPrepareToPut, srmPutFileDone, srmPutRequestDone,
srmReleaseFiles, srmStatusOfGetRequest,
srmStatusOfPutRequest, srmTerminateRequest
Copy functions: srmCopy, srmStatusOfCopyRequest
Directory functions: srmCp, srmLs, srmMkdir, srmMv,
srmRm, srmRmdir, srmStatusOfCpRequest,
srmStatusOfLsRequest
source A. Sim, CRD, LBNL 2005
25
SRM interactions
26
SRM Interactions
27
SRM Interactions
28
SRM Interactions
29
SRM Interactions
30
SRM Interactions
31
Storage infra example with SRM
graphic Mark van de Sanden, SARA
32
SRM Summary
  • SRM is a functional definition
  • Adaptable to different frameworks for operation
    (WS, WSRF, …)
  • Multiple implementations interoperate
  • Permit special purpose implementations for unique
    products
  • Permits interchanging one SRM product by another
  • SRM implementations exist and some in production
    use
  • Particle Physics Data Grid
  • Earth System Grid
  • More coming
  • Cumulative experiences
  • SRM v3.0 specifications to complete

source A. Sim, CRD, LBNL 2005
33
Replicating Data
  • Data on the grid may, will and should exist in
    multiple copies
  • Replicas may be temporary
  • for the duration of the job
  • opportunistically stored on cheap but unreliable
    storage
  • contain output cached near a compute site for
    later scheduled replication
  • Replicas may also provide redundancy
  • application level instead of site-local RAID or
    backup

34
Replication issues
  • Replicas are difficult to manage
  • if the data is modifiable
  • and consistency is required
  • Grid DM today does not address modifiable data
    sets
  • as soon as more than one copy of the data exists
  • otherwise, the result would be either
    inconsistency
  • or require close coordination between storage
    locations (slow)
  • or almost guarantee a deadlock
  • Some wide-area distributed file systems do this
    (AFS,DFS)
  • but are not scalable
  • or require a highly available network

35
Grid Storage Concepts: Catalogues
  • Catalogues
  • index of files that link to a single object
    (referenced by GUID)
  • Catalogues logically a VO function, with local
    instances per site
  • Capabilities
  • expose mappings, not actual data
  • File or Meta-data Catalogue: names, metadata →
    GUID
  • Replica Catalogue and Index: GUID → SURLs for
    all SEs containing the file

36
File Catalogues
37
graphic Peter Kunszt, EGEE DJRA1.4 gLite
Architecture
38
Alternatives to the File Catalogue
  • Store SURLs with data in application DB schema
  • better adapted to the application needs
  • easier integration in existing frameworks

39
Grid Storage Concepts: Transfer Service
  • Transfer service
  • responsible for moving (replicating) data between
    SEs
  • transfers are scheduled, as data movement
    capacity is scarce (not because of WAN network
    bandwidth, but because of CPU capacity and
    disk/tape bandwidth in the data movement nodes!)
  • logically a per VO function, hosted at the site
  • builds on top of the SE abstraction and a data
    movement protocol, and is co-ordinated with a
    specific SE
  • Capabilities
  • transfer SURL at SE1 to new SURL at SE2
  • using SE mechanisms such as SRM-COPY, or directly
    GridFTP
  • either push or pull
  • subject to a set of policies, e.g.
  • max. number of simultaneous transfers between SE1
    and SE2
  • with specific timeout or retries
  • asynchronous
  • states like SUBMITTED, PENDING, ACTIVE,
    CANCELLING, CANCELLED, DONE_INCOMPLETE,
    DONE_COMPLETE
  • update replica catalogues (GUID→SURL mappings)
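The transfer states listed above suggest a small state machine. The set of allowed transitions below is an illustrative assumption (submitted requests queue, active transfers finish or are cancelled), not the gLite FTS specification.

```java
import java.util.*;

// Sketch of a transfer-state lifecycle using the states named on the slide.
// The transition table is an assumption for illustration.
public class TransferState {
    public enum State { SUBMITTED, PENDING, ACTIVE, CANCELLING, CANCELLED,
                        DONE_INCOMPLETE, DONE_COMPLETE }

    static final Map<State, Set<State>> allowed = Map.of(
        State.SUBMITTED,       Set.of(State.PENDING, State.CANCELLING),
        State.PENDING,         Set.of(State.ACTIVE, State.CANCELLING),
        State.ACTIVE,          Set.of(State.DONE_COMPLETE, State.DONE_INCOMPLETE,
                                      State.CANCELLING),
        State.CANCELLING,      Set.of(State.CANCELLED),
        State.CANCELLED,       Set.of(),   // terminal
        State.DONE_INCOMPLETE, Set.of(),   // terminal
        State.DONE_COMPLETE,   Set.of());  // terminal

    public static boolean canMove(State from, State to) {
        return allowed.getOrDefault(from, Set.of()).contains(to);
    }

    public static void main(String[] args) {
        System.out.println(canMove(State.SUBMITTED, State.PENDING));    // legal
        System.out.println(canMove(State.DONE_COMPLETE, State.ACTIVE)); // illegal
    }
}
```

A transfer agent enforcing such a table can reject out-of-order updates (e.g. reactivating a completed transfer) before touching the replica catalogues.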

40
File Transfer Service
graphic gLite Architecture v1.0 (EGEE-I DJRA1.1)
41
FTS Channels
  • Scheduled number of transfers from one site to a
    (set of) other sites
  • below CERNCI to sites on the OPN (next slide)

42
FTS channels
  • for scaling reasons
  • one transfer agent for each channel, i.e. each
    SRC↔TGT pair
  • agents can be spread over multiple boxes

43
LHC OPN
44
in network terms
  • Cricket graph 2006: CERN→SARA via OPN
  • link speed is 10 Gb/s

45
FTS: complex services
  • Protocol translation
  • although many will, not all SEs support GridFTP
  • FTS in that case needs protocol translation
  • translation through memory excludes third-party
    transfers
  • Other Issues
  • credential handling
  • files on the source and target SE are readable
    for specific users and specific VO (groups)
  • SEs are site services, and sites want access by
    the end-user credential for traceability (not a
    generic VO account)
  • continued access to the user credential needed
    (like in any compute broker)

46
Grid Storage Concepts: File Placement
  • Placement Service
  • manage transfers for which the host site is the
    destination
  • coordinate updates to the VO file catalogue and
    the actual transfers (via the FTS, a site-managed
    service)
  • Capabilities
  • transfer GUID or LFN from A to B (note: the FTS
    could only operate on SURLs)
  • needs access to the VO catalogues, and thus
    needs sufficient privileges to do the job (i.e.
    update the catalogues)
  • API can be the same as for the FTS

47
Data Scheduler
  • Like the placement service, but can direct
    requests to different sites

48
DM: Putting it all together
graphic gLite Architecture v1.0 (EGEE-I DJRA1.1)
49
GT4 view on the same issues
  • Similar functionality, but more closely linked to
    the VO than the site
  • based on soft-state registrations (like the
    information system)
  • treats files as the basic resource abstraction

next two slides Ann Chervenak, ISI/USC Overview
of GT4 Data Management Services, 2004
50
RLS Framework
  • Local Replica Catalogs (LRCs) contain consistent
    information about logical-to-target mappings
  • [diagram: two Replica Location Index (RLI) nodes
    aggregating the state of five Local Replica
    Catalogs (LRCs)]
  • Replica Location Index (RLI) nodes aggregate
    information about one or more LRCs
  • LRCs use soft-state update mechanisms to inform
    RLIs about their state: relaxed consistency of
    the index
  • Optional compression of state updates reduces
    communication, CPU and storage overheads
  • Membership service registers participating LRCs
    and RLIs and deals with changes in membership
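The soft-state idea (periodic timestamped updates, with entries that silently age out unless refreshed) can be sketched as follows. The lifetime value and the string LRC identifiers are invented; times are passed in explicitly so the expiry logic is easy to follow.

```java
import java.util.*;

// Sketch of soft-state registration as used between LRCs and RLIs: each
// update carries a timestamp, and the index drops any LRC that has not
// refreshed its registration within the configured lifetime.
public class SoftStateIndex {
    private final long lifetimeMillis;
    private final Map<String, Long> lastSeen = new HashMap<>(); // LRC id -> last update

    public SoftStateIndex(long lifetimeMillis) { this.lifetimeMillis = lifetimeMillis; }

    // An LRC (re-)registers itself; no explicit deregistration is ever needed.
    public void update(String lrc, long now) { lastSeen.put(lrc, now); }

    // Only LRCs heard from within the lifetime are considered live.
    public Set<String> live(long now) {
        Set<String> out = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet())
            if (now - e.getValue() <= lifetimeMillis) out.add(e.getKey());
        return out;
    }

    public static void main(String[] args) {
        SoftStateIndex rli = new SoftStateIndex(10_000);
        rli.update("lrc-a", 0);
        rli.update("lrc-b", 8_000);
        System.out.println(rli.live(12_000)); // lrc-a has aged out; lrc-b is live
    }
}
```

Because stale entries simply expire, a crashed or partitioned LRC never needs to be cleaned up by hand, which is what makes the relaxed consistency of the index acceptable.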

51
Replica Location Service In Context
  • The Replica Location Service is one component in
    a layered data management architecture
  • Provides a simple, distributed registry of
    mappings
  • Consistency management provided by higher-level
    services

52
Access Control Lists
  • Catalogue level
  • protects access to meta-data
  • is only advisory for actual file access, unless
    the storage system only accepts connections from
    a trusted agent that itself does a catalogue
    lookup
  • SE level
  • either natively (i.e. supported by both the SRM
    and transfer services) or via an agent-system
    like gLiteIO
  • SRM/transfer level
  • SRM and GridFTP servers need to look up access
    rights in a local ACL store for each transfer
  • all files need to be owned by the SRM unless the
    underlying FS supports ACLs
  • OS level
  • native POSIX-ACL support in OS needed
  • only available for limited number of systems
    (mainly disk based)
  • not (yet) in popular HSM solutions

53
Grid ACL considerations
  • Semantics
  • Posix semantics require that you traverse up the
    tree to find all constraints
  • this behaviour is both costly and possibly
    undefined in a distributed context
  • VMS and NTFS container semantics are
    self-contained
  • taken as a basis for the ACL semantics in many
    grid services
  • ACL syntax: local semantics are typically
    Posix-style

54
Catalogue ACL method in GT4 with WS-RF
graphic Ann Chervenak, ISI/USC, from
presentation to the Design Team, Argonne, 2005
55
Stand-alone solutions: SRB
  • the SDSC Storage Request Broker

56
SRB Data Management Objectives
  • Automate all aspects of data management
  • Discovery (without knowing the file name)
  • Access (without knowing its location)
  • Retrieval (using your preferred API)
  • Control (without having a personal account at the
    remote storage system)
  • Performance (use latency management mechanisms to
    minimize impact of wide-area-networks)

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
57
Federated SRB server model
[diagram: peer-to-peer brokering. An application
sends a logical name or attribute condition to an
SRB server; the MCAT performs (1) logical-to-physical
mapping, (2) identification of replicas, and (3)
access/audit control; SRB agents are spawned on the
server(s), and data on resources R1 and R2 flows
back to the application via parallel data access]
source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
58
Features
  • Authentication
  • encrypted password
  • GSI, certificate based
  • Metadata has it all
  • storage in a (definable) flat file system
  • Data put into Collections (unix directories),
    access and control operation possible
  • parallel transport of files
  • Physical Resources combine to Logical Resource
  • Encrypted data and/or encrypted metadata
  • Free-ish (educational) commercial version of an
    old SRB at http://www.nirvanastorage.com

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
59
SDSC Storage Resource Broker Meta-data Catalog
[diagram: architecture layers. Applications use
access APIs (Linux I/O, OAI, WSDL, DLL/Python,
Java, NT browsers, GridFTP) to reach a prime server
providing consistency management and
authorization/authentication, a logical name space,
latency management, data and metadata transport,
and storage/catalog abstractions; underneath sit
databases (DB2, Oracle, Postgres, SQLServer,
Informix), HRM and ORB servers]
source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
60
Production Data Grid
  • SDSC Storage Resource Broker
  • Federated client-server system, managing
  • Over 70 TBs of data at SDSC
  • Over 10 million files
  • Manages data collections stored in
  • Archives (HPSS, UniTree, ADSM, DMF)
  • Hierarchical Resource Managers
  • Tapes, tape robots
  • File systems (Unix, Linux, Mac OS X, Windows)
  • FTP sites
  • Databases (Oracle, DB2, Postgres, SQLserver,
    Sybase, Informix)
  • Virtual Object Ring Buffers

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
61
Mappings on Name Space
  • Define logical resource name
  • List of physical resources
  • Replication
  • Write to logical resource completes when all
    physical resources have a copy
  • Load balancing
  • Write to a logical resource completes when a copy
    exists on the next physical resource in the list
  • Fault tolerance
  • Write to a logical resource completes when copies
    exist on k of n physical resources

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
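The three write-completion rules above can be expressed as predicates on the number of physical copies successfully written. This is a sketch of the policy logic only, not SRB code; the method names are invented.

```java
import java.util.function.IntPredicate;

// Write-completion policies for a logical resource backed by n physical
// resources: replication waits for all copies, load balancing for the first,
// fault tolerance for k of n.
public class WritePolicy {
    // Complete only when every physical resource holds a copy.
    public static IntPredicate replication(int n)    { return done -> done == n; }
    // Complete as soon as the first copy is written.
    public static IntPredicate loadBalancing()       { return done -> done >= 1; }
    // Complete when at least k of the n copies exist.
    public static IntPredicate faultTolerance(int k) { return done -> done >= k; }

    public static void main(String[] args) {
        int n = 5;
        System.out.println(replication(n).test(5));    // all copies written
        System.out.println(faultTolerance(3).test(2)); // not yet: need 3 of 5
        System.out.println(loadBalancing().test(1));   // first copy suffices
    }
}
```

The policies trade durability against write latency: replication is the safest and slowest, load balancing the fastest, and k-of-n a tunable middle ground.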
62
SRB Development
  • Now at version 3.4 (as of November 2005)
  • Peer-to-peer federation of ZONES
  • Support multiple independent MCAT catalogs
  • Replicate metadata
  • mySQL/BerkeleyDB port
  • OGSA/OGSI compliant interface
  • GridFTP interfaces

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
63
User Interfaces
  • Unix Command line tools S-commands (e.g. Sls,
    Spwd, Sget, Sput)
  • Windows SRB browser InQ
  • Web Interface mySRB
  • java and C API.
  • java admin tools
  • DEMO

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
64
Administrative Interface
  • java based admin tool
  • Also available as Unix command

source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
65
Unix Command-line Tool S
source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
66
Windows Browser InQ
source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
67
Web Interface
source Maurice Bouwhuis, SARA, based on data by
Reagan Moore, SDSC
68
Nice and Not so Nice
  • It works and is being used in production
  • metadata based
  • it knows GSI and will know GridFTP
  • for S-commands the password is stored in plain
    text in a file (should not be necessary)
  • InQ does not know GSI
  • Not all interfaces have same capabilities

source Maurice Bouwhuis, SARA
69
Structured Data: OGSA-DAI
70
Access to structured data
  • Several layers
  • access layer
  • do not virtualise schema and semantics, just get
    there
  • OGSA-DAI, Spitfire (deprecated)
  • semantic layer
  • interpret and attempt to merge schemas using
    ontology discovery
  • a research topic today, with some interesting
    results
  • see e.g. the April VL-e workshop for some nice
    examples

71
OGSA-DAI
  • An extensible framework for data access and
    integration.
  • Expose heterogeneous data resources to a grid
    through web services.
  • Interact with data resources
  • Queries and updates.
  • Data transformation / compression
  • Data delivery.
  • Customise for your project using
  • Additional Activities
  • Client Toolkit APIs
  • Data Resource handlers
  • A base for higher-level services
  • federation, mining, visualisation,
  • http://www.ogsadai.org.uk/

source Amy Krause, EPCC Edinburgh OGSA-DAI
Overview, GGF17, Tokyo, 2006
72
Considerations
  • Efficient client-server communication
  • One request specifies multiple operations
  • No unnecessary data movement
  • Move computation to the data
  • Utilise third-party delivery
  • Apply transforms (e.g., compression)
  • Build on existing standards
  • Fill in gaps where necessary: specifications from
    the DAIS WG
  • Do not hide underlying data model
  • Users must know where to target queries; data
    virtualisation is hard
  • Extensible architecture
  • Extensible activity framework
  • Cannot anticipate all desired functionality
  • Allow users to plug-in their own

based on Amy Krause, EPCC Edinburgh OGSA-DAI
Overview, GGF17, Tokyo, 2006
73
OGSA-DAI services
  • OGSA-DAI uses data services to represent and
    provide access to a number of data resources

[diagram: a Data Service represents, and provides
access to, one or more Data Resources]
based on Amy Krause, EPCC Edinburgh OGSA-DAI
Overview, GGF17, Tokyo, 2006
74
Services
  • Services co-located with the data as much as
    possible

based on Amy Krause, EPCC Edinburgh OGSA-DAI
Overview, GGF17, Tokyo, 2006
75
Supported data sources
Relational: MySQL, DB2, Oracle 10, SQLServer,
PostgreSQL
XML: eXist, Xindice
Files: text files, binary files, CSV, SwissProt,
OMIM
based on Amy Krause, EPCC Edinburgh OGSA-DAI
Overview, GGF17, Tokyo, 2006
76
Service interaction
[diagram: the client sends an XML perform document
(<perform> … </perform>) to the data service and
receives an XML response (<response> … </response>);
the resulting data stream is delivered to a data
sink]
based on Amy Krause, EPCC Edinburgh OGSA-DAI
Overview, GGF17, Tokyo, 2006
77
Data Service internals
from Alexander Wöhrer, AustrianGrid OGSA-DAI
tutorial, GGF13 Seoul, 2005
78
Request/response
<perform xmlns="…" xmlns:xsi="…"
         xsi:schemaLocation="…">
  <sqlQueryStatement name="statement">
    <expression>
      select * from littleblackbook where id=10
    </expression>
    <resultSetStream name="output"/>
  </sqlQueryStatement>
  <deliverToURL name="deliverOutput">
    <fromLocal from="output"/>
    <toURL>ftp://anon:frog@ftp.example.com/home</toURL>
  </deliverToURL>
</perform>

<gridDataServiceResponse xmlns="…">
  <result name="deliverOutput" status="COMPLETED"/>
  <result name="statement" status="COMPLETED"/>
</gridDataServiceResponse>
from Alexander Wöhrer, AustrianGrid OGSA-DAI
tutorial, GGF13 Seoul, 2005
79
Client library interaction
you have to know the backend structure of the
data source
  • SQLQuery
    SQLQuery query = new SQLQuery(
        "select * from littleblackbook where id='3475'");
  • XPathQuery
    XPathQuery query = new XPathQuery("/entry[@id<10]");
  • XSLTransform
    XSLTransform transform = new XSLTransform();
  • DeliverToGFTP
    DeliverToGFTP deliver = new DeliverToGFTP(
        "ogsadai.org.uk", 8080, "myresults.txt");

from Alexander Wöhrer, AustrianGrid OGSA-DAI
tutorial, GGF13 Seoul, 2005
80
Simple requests
  • Simple requests consist of only one activity
  • Send the activity directly to the perform method
    SQLQuery query = new SQLQuery(
        "select * from littleblackbook where id='3475'");
    Response response = service.perform(query);

from Alexander Wöhrer, AustrianGrid OGSA-DAI
tutorial, GGF13 Seoul, 2005
81
Closing Remarks
82
Miscellaneous tidbits
  • Data Curation: the need to preserve data over
    time
  • migrating media (preserving readability) is only
    one aspect
  • need also
  • format conversion or
  • emulation of the programs operating on the data
  • Data Provenance: the need to know how this data
    has come into being
  • association of meta-data and workflow
  • recording of workflows and w/f instances is
    essential
  • this is (today) application specific, but maybe,
    one day, …