e-moreorlessanything: The Killer Application Grids P2P and Web Services: The Killer Technologies - PowerPoint PPT Presentation

Slides: 234
Provided by: gridsUcs
Transcript and Presenter's Notes


1
e-moreorlessanything: The Killer Application
Grids P2P and Web Services: The Killer Technologies
  • University of Southern California
  • 7-9pm March 29 2006
  • Geoffrey Fox
  • Computer Science, Informatics, Physics
  • Pervasive Technology Laboratories
  • Indiana University Bloomington IN 47401
  • gcf_at_indiana.edu
  • http://www.infomall.org

2
Web services
  • Web Services build loosely-coupled, distributed
    applications (wrapping existing codes and
    databases) based on SOA (service oriented
    architecture) principles.
  • Web Services interact by exchanging messages in
    SOAP format.
  • The contracts for the message exchanges that
    implement those interactions are described via
    WSDL interfaces.

3
A typical Web Service
  • In principle, services can be in any language
    (Fortran .. Java .. Perl .. Python) and the
    interfaces can be method calls, Java RMI
    messages, CGI Web invocations, or totally compiled
    away (inlining)
  • The simplest implementations involve XML messages
    (SOAP) and programs written in net friendly
    languages like Java and Python

[Figure: Payment (Credit Card) and Warehouse Shipping control Web Services, each exposed through WSDL interfaces]
4
Philosophy of Web Service Grids
  • Much of Distributed Computing was built by
    natural extensions of computing models developed
    for sequential machines
  • This leads to the distributed object (DO) model
    represented by Java and CORBA
  • RPC (Remote Procedure Call) or RMI (Remote Method
    Invocation) for Java
  • Key people think this is not a good idea as it
    scales badly and ties distributed entities
    together too tightly
  • Distributed Objects Replaced by Services
  • Note CORBA was considered too complicated in both
    organization and proposed infrastructure
  • and Java was considered too tightly coupled to
    Sun
  • So there were other reasons to discard distributed objects
  • Thus replace distributed objects by services
    connected by one-way messages and not by
    request-response messages
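
The contrast between one-way messages and request-response can be sketched in a few lines. This is only an illustration under stated assumptions: the `run_service` function, the message fields (`id`, `value`, `in_reply_to`) and the use of in-process queues in place of a real network transport are all hypothetical and not from the talk.

```python
import queue
import threading


def run_service(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical service: consume one-way messages until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        # The result is itself a one-way message; the sender never blocked on it.
        outbox.put({"in_reply_to": message["id"], "result": message["value"] * 2})


inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=run_service, args=(inbox, outbox))
worker.start()

# The sender fires its messages and moves on without waiting for any reply.
for i in range(3):
    inbox.put({"id": i, "value": i + 1})
inbox.put(None)  # sentinel: no more messages
worker.join()

# Results are collected later, correlated by message id rather than by call stack.
results = []
while not outbox.empty():
    results.append(outbox.get())
```

The sender and the service share only the message format, which is the loose coupling the slide argues for; a request-response call would instead hold the sender until the service returned.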

5
Typical Grid Architecture
Each Blob is a Computer Program!
[Figure labels: User Services; Core Grid]
6
Classic Grid Architecture
[Figure labels: Resources; Content Access; Composition; Middle Tier (Brokers, Service Providers); Netsolve; Security; Collaboration; Computing; Clients (Users and Devices)]
Middle Tier becomes Web Services
7
Peer to Peer Grid
[Figure labels: Peers with Service Facing Web Service Interfaces; Peers with User Facing Web Service Interfaces]
A democratic organization
8
The Grid and Web Service Institutional Hierarchy
4 Application or Community of Interest (CoI) Specific Services such as Map Services, Run BLAST or Simulate a Missile (XBML XTCE VOTABLE CML CellML)
3 Generally Useful Services and Features (OGSA and other GGF, W3C) such as Collaborate, Access a Database or Submit a Job (OGSA GS- and some WS-, GGF/W3C/.)
2 System Services and Features (WS- from OASIS/W3C/Industry): Handlers like WS-RM, Security, UDDI Registry
1 Container and Run Time (Hosting) Environment (Apache Axis, .NET etc.)
Must set standards to get interoperability
9
Sources of Grid Technology
  • Grids support distributed collaboratories or
    virtual organizations integrating concepts from
  • The Web
  • Agents
  • Distributed Objects (CORBA Java/Jini COM)
  • Globus, Legion, Condor, NetSolve, Ninf and other
    High Performance Computing activities
  • Peer-to-peer Networks
  • With perhaps the Web and P2P networks being the
    most important for Information Grids and Globus
    for Compute/File Grids

10
The Essence of Grid Technology?
  • We will start from the Web view and assert that
    the basic paradigm is
  • Meta-data rich Web Services communicating via
    messages
  • These have some basic support from some runtime
    such as .NET, Jini (pure Java), Apache
    Tomcat+Axis (Web Service toolkit), Enterprise
    JavaBeans, WebSphere (IBM) or GT3/4 (Globus
    Toolkit 3/4)
  • These are the distributed equivalent of operating
    system functions as in UNIX Shell
  • Called Hosting Environment or platform
  • W3C standard WSDL defines IDL (Interface
    standard) for Web Services

11
What is Happening?
  • Grid ideas are being developed in (at least) four
    communities
  • Web Services: W3C, OASIS, (DMTF)
  • Global Grid Forum (High Performance Computing,
    e-Science)
  • Enterprise Grid Alliance (Commercial Grid Forum
    with a near term focus)
  • Service Standards are being debated
  • Grid Operational Infrastructure is being deployed
  • Grid Architecture and core software being
    developed
  • Apache has several important projects, as do
    academia and large and small companies
  • Particular System Services are being developed
    centrally: OGSA is the framework for this in GGF,
    WS- for OASIS/W3C/Microsoft-IBM
  • Lots of fields are setting domain specific
    standards and building domain specific services
  • USA started but now Europe is probably in the
    lead and Asia will soon catch USA if momentum
    (roughly zero for USA) continues

12
Technical Activities of Note
  • Look at different styles of Grids such as
    Autonomic (Robust, Reliable, Resilient)
  • New Grid architectures are hard due to the investment
    required
  • Program the Grid: Workflow
  • Access the Grid: Portals, Grid Computing
    Environments
  • Critical Services such as
  • Security: build message-based, not connection-based
  • Notification: event services
  • Metadata: use Semantic Web, provenance
  • Fabric and Service Management
  • Databases and repositories, instruments, sensors
  • Computing: submit job, scheduling, distributed
    file systems
  • Visualization, Computational Steering
  • Network performance

[Figure labels: Low Level WS-; High Level e.g. OGSA]
13
What do Web Services Prescribe?
  • They specify interfaces for system services (and
    generally useful services like database)
  • They specify an interface language (WSDL) for all
    services
  • They develop containers and frameworks to use to
    host services
  • They specify a message format (SOAP) for ALL
    messages that defines both application and system
    actions precisely
  • They imply a process be started to define domain
    specific services
  • There are multiple competing activities from
    Microsoft and IBM to Apache, IU and Anabas (for
    example) developing system and application
    services
  • Unlike for RTI and CORBA, services from different
    vendors should interoperate

14
What do Grids Add?
  • Grids use all of the Web Services
  • They address management and deployment of large
    distributed systems of services
  • Internet Scale Distributed Services
  • I will use Grid more simply as a composable
    coordinated collection of services
  • They address security and management issues of
    virtual organizations crossing multiple
    administrative domains
  • GGF is developing specific services of relevance
    including job management, many aspects of data
    and scheduling
  • Not much on sensors, real-time, P2P
  • GGF has a good process for developing new higher
    level specifications

15
Plethora of Standards
  • Java is very powerful partly due to its many
    frameworks that generalize libraries e.g.
  • Java Media Framework
  • Java Database Connectivity JDBC
  • Web Services have a corresponding collection
    of specifications that represent critical
    features of the distributed operating systems for
    Grids of Simple Services
  • About 60 WS- specifications introduced in last
    2-3 years
  • These are low level with higher level standards
    such as access database (OGSA-DAI) or Submit a
    job built on top of these
  • Many battles both between standards bodies and
    between companies, as each tries to set the standards
    it considers best; thus there are multiple
    standards for many of the key Web Service
    functionalities
  • Microsoft is a key player and stands to benefit as
    Web Services open up enterprise software space to
    all participants
  • e.g. MQSeries (IBM) and Tibco have to change
    their messaging systems to support new open
    standards

16
The Ten areas covered by the 60 core WS-
Specifications
WS- Specification Area: Examples
1 Core Service Model: XML, WSDL, SOAP
2 Service Internet: WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MTOM
3 Notification: WS-Notification, WS-Eventing (Publish-Subscribe)
4 Workflow and Transactions: BPEL, WS-Choreography, WS-Coordination
5 Security: WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
6 Service Discovery: UDDI, WS-Discovery
7 System Metadata and State: WSRF, WS-MetadataExchange, WS-Context
8 Management: WSDM, WS-Management, WS-Transfer
9 Policy and Agreements: WS-Policy, WS-Agreement
10 Portals and User Interfaces: WSRP (Remote Portlets)
17
Activities in Global Grid Forum Working Groups
GGF Area: GS- and OGSA Standards Activities
1 Architecture: High Level Resource/Service Naming (level 2 of slide 6), Integrated Grid Architecture
2 Applications: Software Interfaces to Grid, Grid Remote Procedure Call, Checkpointing and Recovery, Interoperability to Job Submittal services, Information Retrieval
3 Compute: Job Submission, Basic Execution Services, Service Level Agreements for Resource use and reservation, Distributed Scheduling
4 Data: Database and File Grid access, Grid FTP, Storage Management, Data replication, Binary data specification and interface, High-level publish/subscribe, Transaction management
5 Infrastructure: Network measurements, Role of IPv6 and high performance networking, Data transport
6 Management: Resource/Service configuration, deployment and lifetime, Usage records and access, Grid economy model
7 Security: Authorization, P2P and Firewall Issues, Trusted Computing
18
The Global Information Grid Core Enterprise
Services
Core Enterprise Services: Service Functionality
CES1 Enterprise Services Management (ESM): including life-cycle management
CES2 Information Assurance (IA)/Security: Supports confidentiality, integrity and availability. Implies reliability and autonomic features
CES3 Messaging: Synchronous or asynchronous cases
CES4 Discovery: Searching data and services
CES5 Mediation: Includes translation, aggregation, integration, correlation, fusion, brokering, publication, and other transformations for services and data. Possibly agents
CES6 Collaboration: Provision and control of sharing with emphasis on synchronous real-time services
CES7 User Assistance: Includes automated and manual methods of optimizing the user GiG experience (user agent)
CES8 Storage: Retention, organization and disposition of all forms of data
CES9 Application: Provisioning, operations and maintenance of applications.
19
The Core Service Areas I
Service or Feature WS- GS- NCES (DoD) Comments
A Broad Principles
FS1 Use SOA Service Oriented Arch. WS1 WS1 WS1 Core Service Model, Build Grids on Web Services. Industry best practice
FS2 Grid of Grids Strategy for legacy subsystems and modular architecture
B Core Services
FS3 Service Internet, Messaging WS2 NCES3 Streams/Sensors
FS4 Notification WS3 NCES8 JMS, MQSeries
FS5 Workflow WS4 NCES5 Grid Programming
FS6 Security WS5 GS7 NCES2 Grid-Shib, Permis Liberty Alliance ...
FS7 Discovery WS6 NCES4
FS8 System Metadata State WS7 Globus MDS Semantic Grid
FS9 Management WS8 GS6 NCES1 CIM
FS10 Policy WS9 ECS
20
The Core Service Areas II
Service or Feature WS- GS- NCES Comments
B Core Services (Continued)
FS11 Portals and User assistance WS10 NCES7 Portlets JSR168, NCES Capability Interfaces
FS12 Computing GS3
FS13 Data and Storage GS4 NCES8 NCOW Data Strategy
FS14 Information GS4 JBI for DoD, WFS for OGC
FS15 Applications and User Services GS2 NCES9 Standalone Services Proxies for jobs
FS16 Resources and Infrastructure GS5 Ad-hoc networks
FS17 Collaboration and Virtual Organizations GS7 NCES6 XGSP, Shared Web Service ports
FS18 Scheduling and matching of Services and Resources GS3
21
A List of Web Services 1
  • 1) Core Service Architecture
  • XSD XML Schema (W3C Recommendation) V1.0 February
    1998, V1.1 February 2004
  • WSDL 1.1 Web Services Description Language
    Version 1.1, (W3C note) March 2001
  • WSDL 2.0 Web Services Description Language
    Version 2.0, (W3C under development) March 2004
  • SOAP 1.1 (W3C Note) V1.1 Note May 2000
  • SOAP 1.2 (W3C Recommendation) June 24 2003

22
A List of Web Services 2
  • 2) Service Internet including messaging
  • WS-Addressing Web Services Addressing (BEA, IBM,
    Microsoft, SAP, Sun) in W3C consideration August
    2004
  • WS-MessageDelivery Web Services Message Delivery
    (W3C Submission by Oracle, Sun ..) April 2004
  • WS-Reliability Web Services Reliable Messaging
    (OASIS Web Services Reliable Messaging TC) March
    2004
  • WS-RM Web Services Reliable Messaging (BEA, IBM,
    Microsoft, Tibco) v0.992 February 2005 linked to
    WS-Reliability in OASIS as Web Services Reliable
    Exchange (WS-RX)
  • WS-RM Policy Web Services Reliable Messaging
    Policy Assertion (BEA, IBM, Microsoft, Tibco)
    March 2006
  • WS-RX Web Services Reliable Exchange (Many
    members) integrating previous reliability
    specifications
  • SOAP MTOM SOAP Message Transmission Optimization
    Mechanism (W3C) June 2004
  • SOAP-over-UDP Binding of SOAP to UDP (Microsoft,
    BEA ) September 2004
  • Many obsolete specifications like WS-Routing and
    Referral SOAP Routing Protocol (Microsoft)
    October 2001

23
Layered Architecture for Web Services and Grids
[Figure: layer stack, top to bottom]
Application Specific Grids
Generally Useful Services and Grids
Workflow: WSFL/BPEL
Service Management (Context etc.)
Service Discovery (UDDI) / Information
Service Internet Transport ? Protocol
Service Interfaces: WSDL
Base Hosting Environment
Protocol: HTTP, FTP, DNS; Presentation: XDR; Session: SSH; Transport: TCP, UDP; Network: IP; Data Link / Physical
[Side labels: Higher Level Services; Service Context; Service Internet; Bit level Internet (OSI Stack)]
24
WS- implies the Service Internet
  • We have the classic (Cisco, Juniper ...) Internet
    routing the flood of ordinary packets in the OSI
    stack architecture
  • Web Services build the Service Internet or IOI
    (Internet on Internet) with
  • Routing via WS-Addressing not IP header
  • Fault Tolerance (WS-RM not TCP)
  • Security (WS-Security/SecureConversation not
    IPSec/SSL)
  • Data Transmission by WS-Transfer not HTTP
  • Information Services (UDDI/WS-Context not
    DNS/Configuration files)
  • At message/web service level and not packet/IP
    address level
  • Software-based Service Internet is possible as
    computers are fast
  • Familiar from Peer-to-peer networks and built as
    a software overlay network defining Grid (analogy
    is VPN)
  • SOAP Header contains all information needed for
    the Service Internet (Grid Operating System)
    with SOAP Body containing information for Grid
    application service
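
The header/body split can be illustrated with a small envelope builder. This is a sketch, not a complete WS-Addressing implementation: the `build_envelope` helper, the `urn:example:jobs` payload namespace, the `SubmitJob` and `Executable` elements and the endpoint URL are hypothetical, and the choice of the SOAP 1.2 and WS-Addressing 1.0 namespaces is an assumption rather than something the talk specifies.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
WSA_NS = "http://www.w3.org/2005/08/addressing"      # WS-Addressing 1.0 namespace


def build_envelope(to: str, action: str, payload: ET.Element) -> ET.Element:
    """Put routing information in the Header and the application payload in the Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    # "Service Internet" information: where the message goes and what it means.
    ET.SubElement(header, f"{{{WSA_NS}}}To").text = to
    ET.SubElement(header, f"{{{WSA_NS}}}Action").text = action
    # Application information: opaque to the messaging infrastructure.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical application payload for a job-submission service.
payload = ET.Element("{urn:example:jobs}SubmitJob")
ET.SubElement(payload, "{urn:example:jobs}Executable").text = "blast"

envelope = build_envelope(
    to="http://example.org/grid/jobs",
    action="urn:example:jobs:SubmitJob",
    payload=payload,
)
xml_text = ET.tostring(envelope, encoding="unicode")
```

An intermediary only needs to read the header children to route or secure the message; the body is handed unchanged to the application service.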

25
A List of Web Services 3
  • 3) Notification and high-level publish/subscribe
    information dissemination
  • WS-Eventing Web Services Eventing (BEA,
    Microsoft, TIBCO) August 2004
  • WS-EventNotification (HP, IBM, Intel, Microsoft)
    March 2006 uses resources to manage subscriptions
  • WS-Notification Framework for Web Services
    Notification with WS-Topics, WS-BaseNotification,
    and WS-BrokeredNotification (OASIS) OASIS Web
    Services Notification TC Set up March 2004
  • JMS Java Message Service V1.1 March 2002
  • Different from using publish-subscribe to
    robustly support messaging between Web services
  • Bind SOAP to JMS or MQSeries
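
The publish/subscribe pattern shared by these notification specifications can be sketched generically. The code below does not follow the WS-Notification, WS-Eventing or JMS interfaces; the `TopicBroker` class, its method names and the topic strings are hypothetical and only show the topic-based decoupling of producers from consumers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]


class TopicBroker:
    """Hypothetical in-memory broker: publishers and subscribers only share topic names."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Event], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to every subscriber of the topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


broker = TopicBroker()
received: List[Event] = []
broker.subscribe("sensor/alerts", received.append)

delivered = broker.publish("sensor/alerts", {"level": "high"})
ignored = broker.publish("jobs/done", {"id": 7})  # no subscriber for this topic
```

A brokered design like this is what lets a publisher stay unaware of how many consumers exist, which is the property the brokered notification specifications standardize at the message level.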

26
A List of Web Services 4
  • 4) Coordination and Workflow, Transactions and
    Contextualization
  • BPEL Business Process Execution Language for Web
    Services (OASIS) V1.1 May 2003, with V2.0
    under development
  • WS-CDL Web Services Choreography Language (W3C)
    V1.0 Working Draft 17 December 2004
  • WSCI (W3C) Web Service Choreography Interface
    V1.0 (W3C Note from BEA, Intalio, SAP, Sun,
    Yahoo)
  • WSCL Web Services Conversation Language (W3C
    Note) HP March 2002
  • Workflow is the general linkage between services;
    transactions are a critical special case
  • Concept of workflow generalizes traditional
    workflow processes in business

27
A List of Web Services 4-Continued
  • 4) Transactions, Business Processes and
    Contextualization
  • WS-CAF Web Services Composite Application
    Framework including WS-CTX, WS-CF and WS-TXM
    below (OASIS Web Services Composite Application
    Framework TC)
  • WS-CTX Web Services Context (OASIS Web Services
    Composite Application Framework TC) V0.9.2 July
    2005
  • WS-CF Web Services Coordination Framework (OASIS
    Web Services Composite Application Framework TC)
    V0.1 April 2005
  • WS-TXM Web Services Transaction Management (OASIS
    Web Services Composite Application Framework TC)
    including WS-ACID (V0.1 May 2005), WS-BP
    (Business Process V0.1 May 2005), WS-LRA (Long
    running action V0.1 May 2005)
  • WS-Coordination Web Services Coordination (BEA,
    IBM, Microsoft) November 2004
  • WS-AtomicTransaction Web Services Atomic
    Transaction (BEA, IBM, Microsoft) November 2004
  • WS-BusinessActivity Web Services Business
    Activity Framework (BEA, IBM, Microsoft) November
    2004
  • BTP Business Transaction Protocol (OASIS) May
    2002 with V1.1 November 2004
  • ebXML BPSS Business Process (OASIS) with V2.0.1
    pre-Committee Draft review 17 July 2005

28
A List of Web Services 5
  • 5) Security Frameworks and Core Specifications
  • WS-Security 2004 Web Services Security SOAP
    Message Security (OASIS) Standard March 2004.
  • WS-I Basic Security Profile V1.0 Web Services
    Interoperability Organization Working Group Draft
    May 15 2005
  • WS-Security Username Token Profile Web Services
    Security Username Token Profile V1.0 OASIS
    Standard, March 2004
  • WS-Security X.509 Certificate Token Profile Web
    Services Security X.509 Certificate Token Profile
    OASIS Standard, March 2004
  • WS-Security REL Profile Web Services Security
    Rights Expression Language (REL) Token Profile
    OASIS Standard 19 December 2004
  • WS-I REL Token Profile V1.0 Web Services
    Interoperability Organization Working Group Draft
    13 May 2005
  • WS-Security Kerberos Web Services Security
    Kerberos Binding (Microsoft) December 2003
  • Web-SSO Web Single Sign-On Metadata Exchange
    Protocol (Microsoft, Sun) April 2005
  • Web-SSO-Mex Web Single Sign-On Interoperability
    Profile (Microsoft, Sun) April 2005
  • WS-SecurityPolicy Web Services Security Policy
    Language (IBM, Microsoft, RSA, Verisign) V1.1
    July 2005

29
A List of Web Services 5 - Contd
  • 5) Security Capabilities
  • WS-Trust Web Services Trust Language (BEA, IBM,
    Microsoft, RSA, Verisign ) February 2005
  • WS-SecureConversation Web Services Secure
    Conversation Language (BEA, IBM, Microsoft, RSA,
    Verisign ) February 2005
  • WS-Federation Web Services Federation Language
    (BEA, IBM, Microsoft, RSA, Verisign) July 2003
  • WS-Federation Active Requestor Profile Web
    Services Federation Language Active Requestor
    Profile V 1.0 (BEA, IBM, Microsoft, RSA,
    Verisign) July 8, 2003
  • WS-Federation Passive Requestor Profile Web
    Services Federation Language Passive Requestor
    Profile V 1.0 (BEA, IBM, Microsoft, RSA,
    Verisign) July 8, 2003
  • WS-Authorization is being developed by IBM and
    Microsoft and will build on WS-Trust to describe
    how access to particular web services is
    specified and managed.
  • WS-Privacy is being developed by IBM and
    Microsoft and will build on WS-Policy to describe
    the binding of privacy policies to Web services
    and their exchanged data.

30
A List of Web Services 5 - Contd
  • 5) Security Languages
  • SAML Assertions and Protocols for the OASIS
    Security Assertion Markup Language (SAML) V2.0
    OASIS Standard, 15 March 2005
  • WS-Security SAML Token Profile Web Services
    Security SAML Token Profile OASIS Standard, 1
    December 2004
  • WS-I SAML Token Profile V1.0 Web Services
    Interoperability Organization Working Group Draft
    13 May 2005
  • XACML eXtensible Access Control Markup Language
    (OASIS) V2.0 1 February 2005

31
A List of Web Services 6
  • 6) Service Discovery
  • UDDI (Broadly Supported OASIS Standard) V3 August
    2003
  • WS-Discovery Web services Dynamic Discovery
    (Microsoft, BEA, Intel ) February 2004
  • WS-IL Web Services Inspection Language, (IBM,
    Microsoft) November 2001
  • Note WS-Context as a metadata catalog and
    WS-Management Catalog are examples of related
    services
  • There are many UDDI extensions

32
A List of Web Services 7
  • 7) Metadata and State
  • RDF Resource Description Framework (W3C) Set of
    recommendations expanded from original February
    1999 standard
  • DAML+OIL combining DAML (Darpa Agent Markup
    Language) and OIL (Ontology Inference Layer)
    (W3C) Note December 2001
  • OWL Web Ontology Language (W3C) Recommendation
    February 2004
  • WS-MetadataExchange 1.1 Web Services Metadata
    Exchange (HP, IBM, Intel, Microsoft) March 2006
  • ASAP Asynchronous Service Access Protocol (OASIS)
    with V1.0 working draft 2B December 11 2004
  • WS-GAF Web Service Grid Application Framework
    (Arjuna, Newcastle University) August 2003
  • WBEM Web-Based Enterprise Management including
    CIM (Common Information Model) from DMTF
    (Distributed Management Task Force) 2004-2005

33
A List of Web Services 7
  • 7) Metadata and State Resource Framework
  • WS-RF Web Services Resource Framework (OASIS)
    including
  • WS-Resource Framework Web Services Resource 1.2
    (OASIS) Public Review Draft 01, 10 June 2005
  • WS-ResourceProperties Web Services Resource
    Properties V1.2 Public Review Draft 01, 10 June
    2005
  • WS-ResourceLifetime Web Services Resource
    Lifetime V1.2 Public Review Draft 01, 13 June
    2005
  • WS-ServiceGroup Web Services Service Group V1.2
    Public Review Draft 01, 10 June 2005
  • WS-BaseFaults Web Services Base Faults V1.2
    Public Review Draft 01, June 13, 2005

34
Metadata and Service Context
  • Consider a collection of services working
    together
  • Workflow tells you how to specify service
    interaction but more basically there is shared
    information or context specifying/controlling
    collection
  • WS-RF and WS-GAF have different approaches to
    contextualization supplying a common context
    which at its simplest is a token to represent
    state
  • More generally core shared information includes
    dynamic service metadata and the equivalent of
    configuration information.
  • One can support such a common context either as a
    pool of messages or as message-based access to a
    database (Context Service)
  • Two services linked by a stream are perhaps
    simplest example of a collection of services
    needing context
  • Note that there is a tension between storing
    metadata in messages and services.
  • This is the shared versus distributed memory debate
    in parallel computing
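
The "token to represent state" idea can be sketched as a context service that a stateless handler consults on every message. This is an illustration only: the `ContextService` class, its methods, the `handle_message` function and the `context` message field are hypothetical and do not reproduce WS-Context, WS-RF or WS-GAF.

```python
import uuid
from typing import Any, Dict


class ContextService:
    """Hypothetical context service: shared state lives here, messages carry only a token."""

    def __init__(self) -> None:
        self._contexts: Dict[str, Dict[str, Any]] = {}

    def create(self, **initial: Any) -> str:
        token = uuid.uuid4().hex
        self._contexts[token] = dict(initial)
        return token

    def update(self, token: str, **changes: Any) -> None:
        self._contexts[token].update(changes)

    def get(self, token: str) -> Dict[str, Any]:
        return dict(self._contexts[token])  # copy, so callers cannot mutate stored state


def handle_message(service: ContextService, message: Dict[str, str]) -> int:
    """Stateless handler: everything it knows about the session comes via the token."""
    state = service.get(message["context"])
    count = state.get("count", 0) + 1
    service.update(message["context"], count=count)
    return count


ctx = ContextService()
token = ctx.create(user="alice")
first = handle_message(ctx, {"context": token})
second = handle_message(ctx, {"context": token})
```

Keeping state in the service corresponds to the "message-based access to a database" option on the slide; the alternative is to carry the full state inside each message instead of a token.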

35
Stateful Interactions
  • There are (at least) four approaches to
    specifying state
  • OGSI uses factories to generate separate services
    for each session in standard distributed object
    fashion
  • Globus GT-4 and WSRF use metadata of a resource
    to identify state associated with particular
    session
  • WS-GAF uses WS-Context to provide abstract
    context defining state. Has strength and weakness
    that reveals less about nature of session
  • WS-I Pure Web Service leaves state
    specification to the application, e.g. put a
    context in the SOAP body
  • I think we should smile and write a great
    metadata service hiding all these different
    models for state and metadata

36
A List of Web Services 8
  • 8) Management: original OASIS
  • WS-DistributedManagement Web Services Distributed
    Management Framework with MUWS and MOWS below
    (OASIS)
  • WSDM-MUWS Web Services Distributed Management
    Management Using Web Services (OASIS) OASIS
    Standard March 9 2005
  • WSDM-MOWS Web Services Distributed Management
    Management of Web Services (OASIS) OASIS Standard
    March 9 2005

37
A List of Web Services 8- Contd
  • 8) Management: Microsoft Converged Stack
  • WS-Management Web Services for Management
    (Microsoft, Intel, Sun ) August 2005
  • WS-Management Catalog The WS-Management Catalog
    (Microsoft, Intel, Sun ) August 2005
  • WS-ResourceTransfer Web Service Resource Transfer
    (HP, IBM, Intel, Microsoft) March 2006
  • WS-Transfer Web Service Transfer (Microsoft, BEA,
    Sonic Software etc.) September 2004
  • WS-TransferAddendum Extensions to Web Service
    Transfer (HP, IBM, Intel, Microsoft) March 2006
  • WS-Enumeration Web Service Enumeration
    (Microsoft, BEA, Sonic Software etc.) September
    2004

38
A List of Web Services 9
  • 9) General Service Characteristics
  • WS-PolicyFramework Web Services Policy Framework
    (BEA, IBM, Microsoft, SAP ) September 2004
  • WS-PolicyAttachment Web Services Policy
    Attachment (BEA, IBM, Microsoft, SAP ) September
    2004
  • WS-PolicyAssertions Web Services Policy
    Assertions Language (BEA, IBM, Microsoft, SAP) 18
    December 2002 (Superseded by WS-PolicyFramework)
  • WS-Agreement Web Services Agreement Specification
    (GGF under development) 9 August 2004

39
A List of Web Services 10
  • 10) User Interfaces
  • WSRP Web Services for Remote Portlets (OASIS)
    OASIS Standard August 2003
  • JSR168 JSR-000168 Portlet Specification for Java
    binding (Java Community Process) October 2003
  • WSRP specifies the client-service protocol while
    JSR168 specifies how portlets are implemented for
    each supported service user-facing Web service
    ports inside aggregating portals like JetSpeed,
    GridSphere or uPortal

40
WS-I Interoperability
  • Critical underpinning of Grids and Web Services
    is the gradually growing set of specifications in
    the Web Service Interoperability Profiles
  • Web Services Interoperability (WS-I)
    Interoperability Profile 1.0a
    (http://www.ws-i.org) gives us XSD, WSDL1.1,
    SOAP1.1, UDDI in the basic profile and parts of
    WS-Security in their first security profile.
  • We imagine the 60 Specifications being checked
    out and evolved in the cauldron of the real world
    and occasionally best practice identifies a new
    specification to be added to WS-I which gradually
    increases in scope
  • Note only 4.5 out of 60 specifications have made
    it in this definition

41
Some ideas to Remember
  • Grids are managed Web Services exchanging
    Messages
  • P2P Networks are differently managed and
    architected services exchanging messages
  • Any computer operation involves messages; not all
    these messages can be isolated
  • With services all messages are explicit and can
    be examined
  • Grid Services extend WS- Web Service
    Specifications
  • Web Service container replaces computer
  • Service replaces process
  • A stream is an ordered set of messages
  • Service Internet replaces Internet; messages
    replace packets
  • (Sub)Grids replace Libraries

42
Internet Scale Distributed Services
  • Grids use Internet technology and are
    distinguished by managing or organizing sets of
    network connected resources
  • Classic Web allows independent one-to-one access
    to individual resources
  • Grids integrate together and manage multiple
    Internet-connected resources: people, sensors,
    computers, data systems
  • Organization can be explicit as in
  • TeraGrid which federates many supercomputers
  • Information Retrieval Grid which federates
    multiple data resources
  • CrisisGrid which federates first responders,
    commanders, sensors, GIS, (Tsunami) simulations,
    science/public data
  • Organization can be implicit as in Internet
    resources such as curated databases and
    simulation resources that harmonize a community

43
Different Visions of the Grid
  • e-Science or Cyberinfrastructure are virtual
    organization Grids supporting global distributed
    engineering and science research (note sensors,
    instruments and people are all distributed)
  • Utility Computing or X-on-demand (X=data,
    computer ..) is a major computer industry
    interest in Grids and this is a key part of
    enterprise or campus Grids
  • Skype (Kazaa) VOIP system is a Peer-to-peer Grid
    (and VRVS/GlobalMMCS like Internet A/V
    conferencing are Collaboration Grids)
  • DoD's vision of Network Centric Computing can be
    considered a Grid (linking sensors, warfighters,
    commanders, backend resources) and they are
    building the GIG (Global Information Grid)
  • Commercial 3G Cell-phones and DoD ad-hoc network
    initiative are forming mobile Grids
  • Grids support universal Globalization in life,
    fun, research, business

44
Why use SOAs
  • Globalization of applications: Life, Fun,
    Research, Business, Defense as an international
    collaborative activity
  • Globalization of Software Production: software
    components, including open-source, made everywhere
  • Interoperability in interfaces and protocols
    (messages) requires Web Services as the only broadly
    supported SOA
  • Anti-Performance: if Moore's law gives you a
    factor X, then use √X for performance and √X for
    improved lifecycle (re-use)
  • Software Engineering: software paradigms are ways
    of packaging modules/components/objects/methods/
    subroutines. Services have minimal coupling and
    best re-use (lowest performance). 1962 Fortran has
    easier re-use than 2006 Java
  • Multicore chips require pervasive concurrency
    without side effects. Even Microsoft must be able
    to use 32-128 way parallelism on a chip over next
    5 years
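
The "Anti-Performance" bullet can be made concrete with a small calculation. This assumes the garbled "vX" in the transcript stands for the square root of X, so that a hardware gain of X is split evenly between raw speed and overhead spent on re-use; the `split_gain` function and the factor of 100 are illustrative choices, not figures from the talk.

```python
import math
from typing import Tuple


def split_gain(total_gain: float) -> Tuple[float, float]:
    """Split a hardware gain X into two equal multiplicative shares of sqrt(X) each."""
    share = math.sqrt(total_gain)
    return share, share  # (share kept as speed-up, share spent on lifecycle/re-use overhead)


# Assumed example: hardware becomes 100 times faster over some period.
speedup, overhead_budget = split_gain(100.0)
```

Under that reading, a 100-fold hardware improvement leaves a 10-fold speed-up for the user while tolerating software that is 10 times less efficient in exchange for looser coupling.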

45
Intel Fall 2005 Multicore Roadmap
March 2006: Sun T1000 8-core server at <$6,000
46
Performance Per Transistor
Peter Kogge 1997
[Figure: Normalized SPECINTS and Normalized SPECFLTS, each plotted against Millions of Transistors (CPU)]
  • Performance data from uP vendors
  • Transistor count excludes on-chip caches
  • Performance normalized by clock rate
  • Conclusion: Simplest is best! (250K Transistor
    CPU)

47
1962 Licklider's Vision
  • "Lick had this concept: all of the stuff
    linked together throughout the world, that you
    can use a remote computer, get data from a remote
    computer, or use lots of computers in your job."
  • Larry Roberts, Principal Architect of the ARPANET

48
Physics and the Web
  • Tim Berners-Lee developed the Web at CERN as a
    tool for exchanging information between the
    partners in physics collaborations
  • The first Web Site in the USA was a link to the
    SLAC library catalogue
  • It was the international particle physics
    community who first embraced the Web
  • Killer application for the Internet
  • Transformed the modern world: academia, business and
    leisure

49
What is e-Science?
  • e-Science is about global collaboration in
    key areas of science, and the next generation of
    infrastructure that will enable it.
  • John Taylor
  • Director General of Research Councils
  • UK, Office of Science and Technology
  • e-Science is about developing tools and
    technologies that allow scientists to do faster,
    better or different research

50
Example e-Science Projects
  • Particle Physics
  • global sharing of data and computation
  • Astronomy
  • Virtual Observatory for multi-wavelength
    astrophysics
  • Chemistry
  • remote control of equipment and electronic
    logbook
  • Bioinformatics
  • data integration, knowledge discovery and
    workflow
  • Healthcare
  • sharing normalized mammograms
  • Environment
  • Ocean, weather, climate modelling, sensor networks

51
e-moreorlessanything and the Grid
  • e-Business captures an emerging view of
    corporations as dynamic virtual organizations
    linking employees, customers and stakeholders
    across the world.
  • The growing use of outsourcing is one example
  • e-Science is the similar vision for scientific
    research with international participation in
    large accelerators, satellites or distributed
    gene analyses.
  • The Grid integrates the best of the Web,
    traditional enterprise software, high performance
    computing and Peer-to-peer systems to provide the
    information technology e-infrastructure for
    e-moreorlessanything.
  • A deluge of data of unprecedented and inevitable
    size must be managed and understood.
  • People, computers, data and instruments must be
    linked.
  • On demand assignment of experts, computers,
    networks and storage resources must be supported

52
Science is a Team Sport
Life Sciences
53
Technology Today is More than Computers
  • Today's computer is a coordinated set of
    hardware, software, and services providing an
    end-to-end resource.
  • Cyberinfrastructure captures how the S&E
    community has redefined "computer"

The computer as an integrated set of resources
54
Integrated Cyberinfrastructure
Cyberinfrastructure = resources (computers,
data storage, networks, scientific instruments,
experts, etc.) + glue (integrating software,
systems, and organizations).
NSF's Atkins Report provided a compelling
vision for integrated Cyberinfrastructure
55
How does Cyberinfrastructure Work?
Cyberinfrastructure-enabled Neurosurgery
  • PROBLEM: Neurosurgeons seek to remove as much
    tumor tissue as possible while minimizing removal
    of healthy brain tissue
  • Brain deforms during surgery
  • Surgeons must align preoperative brain image with
    intra-operative images to provide surgeons the
    best opportunity for intra-surgical navigation

56
Cyberinfrastructure and Computation -- Parallelism
  • Two ways of making computers solve problems
    faster
  • Make CPUs faster
  • Divide the problem into parts; use more than one
    CPU interconnected by a network to run each of
    the parts simultaneously (parallelism)
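
The second approach can be sketched in a few lines of Python. This is a minimal illustration, not something from the talk: the sum-of-squares workload and the names split, partial_sum and parallel_sum_of_squares are invented for the example, and a thread pool stands in for the networked CPUs (in CPython a process pool or separate machines would be needed for a real speed-up).

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one worker on its part of the problem."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Divide data into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, workers=4):
    """Run each part on its own worker, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split(data, workers)))


data = list(range(1000))
total = parallel_sum_of_squares(data)
```

The combine step (the outer sum) is the only point where the parts communicate, which is why this kind of problem tolerates slow links between workers.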

57
Cyberinfrastructure and Computation -- Grid
Computing
  • Grid Computing takes the parallel computer out
    of the box
  • Allow the CPUs to be in different geographical
    locations
  • Connect many different kinds of components

NVO analysis can involve connecting the
telescope, data archive, and computer through
grid computing
Internet
58
National-scale Grid Projects
Open Science Grid: Physics-driven Grid
infrastructure
NEES: Earthquake Engineering Grid
59
Community Tools
  • e-mail and list-serves are oldest and best used
  • Kazaa, Instant Messengers, Skype, Napster,
    BitTorrent for P2P Collaboration: text,
    audio-video conferencing, files
  • del.icio.us, Connotea, Citeulike manage shared
    bookmarks
  • hotornot.com or similar sites allow you to create
    community resources and share them
  • Writely, Wikis and Blogs are powerful specialized
    shared document systems
  • ConferenceXP and WebEx share general applications
  • Google Scholar tells you who has cited your
    papers while publisher sites tell you about
    co-authors
  • Note: sharing resources creates (implicit)
    communities
  • Social network tools study graphs to both define
    communities and extract their properties

60
Entertainment Cyberinfrastructure
Role Playing Games support distributed players
in a shared scenario
Meanwhile games like chess are (apart from issues
like cheating) probably equally good on the
Internet as face-to-face. Grandmasters can give
lessons using Skype, text chats and shared chess
games as in internetchess.com. They can be paid
by paypal.com
61
Raw Data → Data → Information →
Knowledge → Wisdom
Diagram labels: Sensor Service (SS), Filter Service (FS), Other Service (OS), MetaData (MD), Portal, SOAP Messages, Decisions, Another Service, Another Grid
62
Semantic Grid and Services
  • Implications of SOA (Service Oriented
    Architectures) for SG (Semantic Grid)
  • Build services to implement SG
  • Implications of SG for SOA
  • Build metadata rich systems of services using SG
  • Services receive data in SOAP messages,
    manipulate it and produce transformed data as
    further messages
  • Meta-data is carried in SOAP messages
  • Meta-data controls processing and transport of
    SOAP Messages
  • Knowledge is created from data by services
  • The Grid enhances Web services with semantically
    rich system and application specific management
  • One must exploit and work around the different
    approaches to meta-data and their manipulation in
    Web Services

63
Structure of SOAP Messages
  • SOAP Messages have System information in the
    header including WS-Policy based meta-data
    defining processing options
  • Processed by Handlers
  • Application data and meta-data is the body
    (controversies here!)
  • Processed by the Service itself
  • Some meta-data like WS-RF is logically only in
    messages
  • Other like that in WS-Context or the SRB are
    stored in logical equivalent of XML databases
  • We only need to preserve semantic structure
    (XML/SOAP Infoset) so transport in fast XML and
    store in efficient relational databases
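
The header/body split can be sketched with Python's standard XML library. This is a hedged illustration only: the urn:example:grid namespace, the Priority header element and the handler and service functions are assumptions made up for the example, not WS-Policy or any real Web Service stack. Only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EX_NS = "urn:example:grid"  # hypothetical namespace used only in this sketch


def build_message(priority, values):
    """Build an envelope: system meta-data in the header, application data in the body."""
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(env, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{EX_NS}}}Priority").text = priority
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    data = ET.SubElement(body, f"{{{EX_NS}}}Data")
    for v in values:
        ET.SubElement(data, f"{{{EX_NS}}}Value").text = str(v)
    return ET.tostring(env, encoding="unicode")


def handler(envelope):
    """Handler: looks only at header meta-data."""
    node = envelope.find(f"{{{SOAP_NS}}}Header/{{{EX_NS}}}Priority")
    return node.text if node is not None else "normal"


def service(envelope):
    """Service: looks only at the body and produces derived data."""
    path = f"{{{SOAP_NS}}}Body/{{{EX_NS}}}Data/{{{EX_NS}}}Value"
    return sum(float(v.text) for v in envelope.findall(path))


def process(xml_text):
    envelope = ET.fromstring(xml_text)
    return handler(envelope), service(envelope)


message = build_message("high", [1.5, 2.5, 4.0])
priority, result = process(message)
```

Because handler and service touch disjoint parts of the envelope, the transport encoding can change (fast XML, a relational store) as long as the same header and body structure can be recovered.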

64
What Type of Services are there?
  • There are a horde of support services supplying
    security, collaboration, database access, user
    interfaces
  • The support services are either associated with
    system or application
  • We studied the WS-* and GS-* which implicitly or
    explicitly define many support services
  • There are generalized filter services which are
    applications that accept messages and produce new
    messages with some data derived from that in
    input
  • Simulations (including PDEs and reactive
    systems)
  • Data-mining
  • Transformations
  • Agents
  • Reasoning are all termed filters
    here
  • There are services like author ontology, parse
    RDF or attach provenance that directly support
    Semantic Grid
  • But all services and their interactions are
    bathed in a sea of meta-data and so implicitly
    need and support the Semantic Grid

65
It's a Composite Hierarchical World
  • Filters can be a workflow which means they are
    just collections of other simpler services
  • One needs meta-data to control the workflow
  • Services are programs that accept messages and
    produce messages
  • Grids are a distributed collection of services
    supporting managed shared resources
  • Management requires meta-data
  • Grids are distributed systems that accept
    distributed messages and produce distributed
    result messages
  • Can always talk about Grids and view a service
    or a workflow as a special case of a Grid
  • It just requires meta-data to send a message to a
    Grid and have it routed to the correct computer
    holding the requested service
  • Meta-data allows mapping of virtual to real
    addresses
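
A minimal sketch of this composite view, assuming messages are plain dictionaries. Every name here (to_upper, add_provenance, workflow, Grid, the virtual name and the host address) is hypothetical and chosen only for illustration.

```python
def to_upper(message):
    """A simple filter service: message in, transformed message out."""
    return {**message, "data": message["data"].upper()}


def add_provenance(message):
    """Another filter service: attaches meta-data to the message."""
    trail = message.get("provenance", []) + ["add_provenance"]
    return {**message, "provenance": trail}


def workflow(*services):
    """A workflow is itself a service built from simpler services."""
    def composite(message):
        for svc in services:
            message = svc(message)
        return message
    return composite


class Grid:
    """A collection of services addressed by virtual names."""

    def __init__(self):
        self.registry = {}  # meta-data: virtual name -> (real address, service)

    def register(self, virtual_name, real_address, service):
        self.registry[virtual_name] = (real_address, service)

    def send(self, virtual_name, message):
        # The registry maps the virtual address to the real one.
        real_address, service = self.registry[virtual_name]
        reply = service(message)
        return {**reply, "handled_at": real_address}


grid = Grid()
grid.register("filter/annotate", "host-a.example.org:8080",
              workflow(to_upper, add_provenance))
reply = grid.send("filter/annotate", {"data": "raw reading"})
```

The caller sees the same message-in, message-out shape whether the target is one service, a workflow or a whole Grid.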

66
Semantically Rich Services with a Semantically
Rich Distributed Operating Environment
Diagram labels: Sensor Service (SS), Filter Service (FS), Other Service (OS), MetaData (MD), Portal
67
Consequences of Rule of the Millisecond
  • Useful to remember critical time scales
  • 1) 0.000001 ms: CPU does a calculation
  • 2a) 0.001 to 0.01 ms: Parallel Computing MPI
    latency
  • 2b) 0.001 to 0.01 ms: Overhead of a Method Call
  • 3) 1 ms: wake-up a thread or process
  • 4) 10 to 1000 ms: Internet delay
  • 2a), 4) implies geographically distributed
    metacomputing can't in general compete with
    parallel systems
  • 3) << 4) implies a software overlay network is
    possible without significant overhead
  • We need to explain why it adds value of course!
  • 2b) versus 3) and 4) describes regions where
    method and message based programming paradigms
    are important
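
The comparisons above are simple ratios, which the following sketch works out from the slide's own numbers. The dictionary keys and the prefer_message rule (use messages unless sub-millisecond interaction is required) are illustrative assumptions, not part of the talk.

```python
# Time scales from the slide, in milliseconds, as (low, high) ranges.
TIME_SCALES_MS = {
    "cpu_calculation": (1e-6, 1e-6),
    "mpi_latency": (1e-3, 1e-2),
    "method_call": (1e-3, 1e-2),
    "thread_wakeup": (1.0, 1.0),
    "internet_delay": (10.0, 1000.0),
}


def ratio_range(slow, fast):
    """Return (smallest, largest) ratio of the slow scale to the fast scale."""
    slow_lo, slow_hi = TIME_SCALES_MS[slow]
    fast_lo, fast_hi = TIME_SCALES_MS[fast]
    return slow_lo / fast_hi, slow_hi / fast_lo


def prefer_message(required_latency_ms):
    """Illustrative rule: messages are fine unless the interaction must
    complete faster than the thread wake-up time of item 3)."""
    return required_latency_ms >= TIME_SCALES_MS["thread_wakeup"][0]


# 2a) versus 4): how much slower the Internet is than MPI latency.
grid_vs_mpi = ratio_range("internet_delay", "mpi_latency")

# 3) versus 4): overlay software cost as a fraction of network delay.
overlay_vs_net = ratio_range("thread_wakeup", "internet_delay")
```

With these numbers the Internet is roughly a thousand to a million times slower than MPI latency, while the overlay's thread wake-up adds between about 0.1% and 10% to the network delay.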

68
Linking Modules
  • From method based to RPC to message based to
    event-based publish-subscribe Message Oriented
    Middleware

Listener: Subscribe to Events
Publisher: Post Events
Message Queue in the Sky
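
The listener/publisher pattern can be sketched as an in-process broker. This is a toy under stated assumptions: the Broker class, the topic names and the event contents are invented for the example, and real Message Oriented Middleware would add queuing, persistence and network transport.

```python
from collections import defaultdict


class Broker:
    """Stands in for the message queue: decouples publishers from listeners."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Listener: register interest in events on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Publisher: post an event; return how many listeners received it."""
        listeners = list(self._subscribers[topic])
        for callback in listeners:
            callback(event)
        return len(listeners)


broker = Broker()
received = []
broker.subscribe("sensor/temperature", received.append)

delivered = broker.publish("sensor/temperature", {"celsius": 21.5})
ignored = broker.publish("sensor/pressure", {"kPa": 101.3})  # no listeners
```

The publisher never names its listeners; adding or removing a listener changes nothing on the publishing side, which is the step beyond RPC that the progression above describes.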
69
What is a Simple Service?
  • Take any system; it has multiple functionalities
  • We can implement each functionality as an
    independent distributed service
  • Or we can bundle multiple functionalities in a
    single service
  • Whether functionality is an independent service
    or one of many method calls into a glob of
    software, we can always make them as Web
    services by converting interface to WSDL
  • Simple services are obtained by taking
    functionalities and making them as small as
    possible subject to the rule of the millisecond
  • Distributed services incur messaging overhead of
    one (local) to 100s (far apart) of milliseconds
    to use message rather than method call
  • Use scripting or compiled integration of
    functionalities ONLY when you require <1
    millisecond interaction latency
  • Apache web site has many (pre Web Service)
    projects that are multiple functionalities
    presented as (Java) globs and NOT (Java) Simple
    Services
  • Makes it hard to integrate sharing common
    security, user profile, file access .. services

70
Grids of Grids of Simple Services
  • Link via methods → messages → streams
  • Services and Grids are linked by messages
  • Internally to service, functionalities are linked
    by methods
  • A simple service is the smallest Grid
  • We are familiar with method-linked
    hierarchy: Lines of Code → Methods → Objects →
    Programs → Packages

71
Component Grids?
  • So we build collections of Web Services which we
    package as component Grids
  • Visualization Grid
  • Sensor Grid
  • Utility Computing Grid
  • Collaboration Grid
  • Earthquake Simulation Grid
  • Control Room Grid
  • Crisis Management Grid
  • Drug Discovery Grid
  • Bioinformatics Sequence Analysis Grid
  • Intelligence Data-mining Grid
  • We build bigger Grids by composing component
    Grids using the Service Internet

72
Using the Grid of Grids and Core Services to
build multiple application grids re-using common
components.
BioInformatics Grid
Chemical Informatics Grid


15 Application Services Sequencing
Tools Biocomplexity Simulations
Domain Specific Grids/Services
15 Application Services Screening Tools Quantum
Calculations
14 Information
Instrument/Sensor
11 Portals
Services
13 Data Access/Storage
12 Computing
17 Collaboration
9 Management 18 Scheduling
10 Policy
4 Notification
8 Metadata
7 Discovery
Core Low Level Grid Services
5 Workflow
6 Security
3 Messaging
9 Management
Physical Network (monitored by FS16)

73
Critical Infrastructure (CI) Grids built as Grids
of Grids
74
Mediation and Transformation in a Grid of Grids
and Simple Services
75
Why can we build better software?
  • In 1962 I was punching holes in cards and paper
    tape to persuade tiny slow computers to
    manipulate words in memory to string together
    instructions like a = b + c
  • Now computers are much faster and languages are
    better but not a lot better
  • I suspect I would only be a factor of 2 or so
    faster programming the same program today
  • However A, B, C can now be resources (Bank records,
    Drugs, Games, Supernova) and can be a service
  • Objects were wrong as they distributed ordinary
    programs; services express distributed
    independent entities (communication time is very
    different between and within computers)
  • Services are essential for reliable modular
    programming

76
What's wrong with old programs
  • They were made of instructions, methods,
    subroutines and libraries thereof
  • Languages (Java, C) encouraged spaghetti
    programming that linked parts of programs
    together
  • This leads to efficient but unmaintainable software
  • However now computers and networks are several
    orders of magnitude faster
  • Optimize for modularity and maintainability and
    rarely if ever optimize for performance
  • Old programs have the wrong optimization and by
    construction are hard to optimize

77
Old and New Software Regime
  • Web Services, Grids and P2P systems are built
    with
  • The new software model: independent entities
    connected by explicit messages
  • All computer entities are actually connected by
    some form of message (traveling on bus or from
    memory to register) but often implicit
  • And they support the distributed services and
    resources needed for global science, fun and
    business
  • Google, Amazon, Yahoo and perhaps Microsoft and
    Electronic Arts can use
  • Old programs have the old architecture and cannot
    be modified
  • At best can wrap partial functionalities as
    services and use as a black box
  • IBM, Oracle and the old Enterprise software
    companies have this noose around their necks

78
Large and Small Grids
  • N resources in a community (N is billions for the
    world and 1000-10000 for many scientific fields)
  • Communities are arranged hierarchically with real
    work being done in groups of M resources M
    could be 10-100 in e-Science
  • Metcalfe's law: value of network grows like
    square of number of nodes M; we call Grids where
    this is true Metcalfe or M² Grids
  • Nature of Interaction depends on size of M or N
  • Shared Information O(N) Complexity Grids for
    largish N
  • Complexity M² Metcalfe Grids for smaller M < N
  • Grids must merge with peer-to-peer networks to
    support both Complexity O(N) and M² Systems
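
A short calculation shows why the two regimes separate. The function names are invented for this sketch, and the sample sizes are taken from the ranges quoted on this slide (M of 10-100, N of 1000-10000 for a field, billions for the world).

```python
def metcalfe_value(m):
    """Metcalfe scaling: value grows like the square of the number of nodes."""
    return m * m


def pairwise_links(m):
    """Number of distinct pairs among m nodes, the source of the M^2 growth."""
    return m * (m - 1) // 2


def shared_information_cost(n):
    """O(N) scaling: each of n resources deposits into a shared store once."""
    return n


for m in (10, 100):
    print(f"group of M={m}: M^2 = {metcalfe_value(m)}, "
          f"pairs = {pairwise_links(m)}")

for n in (1_000, 10_000, 1_000_000_000):
    print(f"community of N={n}: O(N) = {shared_information_cost(n)}, "
          f"N^2 would be {metcalfe_value(n)}")
```

Full pairwise interaction is affordable for a working group of 100 (10,000 interactions) but not for a community of a billion, where only the O(N) shared-information style scales.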

79
Community Resources
  • Grid Community databases have analogy to
    Television and the News Web that allow
    individuals to communicate instantly with each
    other via Web Pages and Headline News acting as
    proxies
  • N resources deposit information and N can view:
    Complexity O(N)

80
M² Interactions
  • Superimpose M² Grids on the sea (heatbath) of
    O(N) ordinary interactions

81
Architecture of (Web Service) Grids
  • Grids built from Web Services communicating
    through an overlay network built in SOFTWARE on
    the ordinary internet at the application level
  • Grids provide the special quality of service
    (security, performance, fault-tolerance) and
    customized services needed for distributed
    complex enterprises
  • We need to work with Web Service community as
    they debate the 60 or so proposed Web Service
    specifications
  • Use Web Service Interoperability WS-I as best
    practice
  • Must add further specifications to support high
    performance
  • Database Grid Services for O(N) Community case
  • Streaming support for M² case
  • We add to WS-*, Grid services for managed shared
    resources

82
e-Defense and e-Crisis
  • Grids support Command and Control and provide
    Global Situational Awareness
  • Link commanders and frontline troops to
    themselves and to archival and real-time data;
    link to what-if simulations
  • Dynamic heterogeneous wired and wireless networks
  • Security and fault tolerance essential
  • System of Systems; Grid of Grids
  • The command and information infrastructure of
    each ship is a Grid; each fleet is linked
    together by a Grid; the President is informed by
    and informs the national defense Grid
  • Grids must be heterogeneous and federated
  • Crisis Management and Response enabled by a Grid
    linking sensors, disaster managers, and first
    responders with decision support

83
DAME: Grid based tools and Infer-structure for
Aero-Engine Diagnosis and Prognosis
XTO
Companies: Rolls-Royce, DSS, Cybula
Universities: York, Leeds, Sheffield, Oxford
Engine Model
Case Based Reasoning
Signal Data Explorer
84
DAME Operational Scenario
Engine flight data
5000 engines
Gigabyte per aircraft per Engine per
transatlantic flight
London Airport
New York Airport
Grid
Airline office
Diagnostics Centre
Maintenance Centre
American data center
European data centre
Rolls Royce and UK e-Science Program: Distributed
Aircraft Maintenance Environment
85
DAME Signal Data Explorer Service
86
NASA Aerospace Engineering Grid
87
Some Important Styles of Grids
  • Computational Grids were origin of concepts and
    link computers across the globe; high latency
    stops this from being used as parallel machine
  • Typically Compute/File Grids where information
    (messages) exchanged by writing and reading files
  • Knowledge and Information Grids link sensors and
    information repositories as in Virtual
    Observatories or BioInformatics
  • Education Grids link teachers, learners, parents
    as a VO with learning tools, distant lectures
    etc.
  • e-Science Grids link multidisciplinary
    researchers across laboratories and universities
  • Community Grids focus on Grids involving large
    numbers of peers rather than focusing on linking
    major resources; links Grid and Peer-to-peer
    network concepts
  • Semantic Grid links Grid, and AI community with
    Semantic web (ontology/meta-data enriched
    resources) and Agent concepts
  • Collaboration Grids support the linkage of
    multiple people and electronic resources (often
    peer-to-peer architecture)

88
Types of Computing Grids
  • Running Pleasingly Parallel Jobs as in United
    Devices, Entropia (Desktop Grid) cycle stealing
    systems
  • Can be managed (inside the enterprise as in
    Condor) or more informal (as in SETI@Home)
  • Computing-on-demand in Industry where jobs
    spawned are perhaps very large (SAP, Oracle )
  • Support distributed file systems as in Legion
    (Avaki), Globus with (web-enhanced) UNIX
    programming paradigm
  • Particle Physics will run some 30,000
    simultaneous jobs
  • Distributed Simulation HLA style Grids (some
    work)
  • Linking Supercomputers as in TeraGrid
  • Pipelined applications linking data/instruments,
    compute, visualization
  • Seamless Access where Grid portals allow one to
    choose one of multiple resources with a common
    interface
  • Parallel Computing typically NOT suited for a
    Grid (latency)

89
Analysis and Visualization
Large Disks
Old Style Metacomputing Grid
Large Scale Parallel Computers
Spread a single large Problem over multiple
supercomputers
90
Utility and Service Computing
  • An important business application of Grids is
    believed to be utility computing
  • Namely support a pool of computers to be assigned
    as needed to take-up extra demand
  • Pool shared between multiple applications
  • Natural architecture is not a cluster of
    computers connected to each other but rather a
    Farm of Grid Services connected to Internet and
    supporting services such as
  • Web Servers
  • Financial Modeling
  • Run SAP
  • Data-mining
  • Simulation response to crisis like forest fire or
    earthquake
  • Media Servers for Video-over-IP
  • Note: classic Supercomputer use is to allow full
    access to do anything via ssh etc.
  • In service model, one pre-configures services for
    all programs and you access portal to run job
    with fewer security issues

91
UK National Grid Service
Web Services based National Grid Infrastructure
92
Towards an International Grid Infrastructure
UK NGS
Leeds
Manchester
Starlight (Chicago)
US TeraGrid
Netherlight (Amsterdam)
Oxford
RAL
SDSC
NCSA
PSC
UCL
UKLight
SC05
Local laptops in Seattle and UK
All sites connected by production network (not
all shown)
Computation
Steering clients
Network PoP
Service Registry
93
Cyberinfrastructure At Home
  • BOINC (Berkeley Open Infrastructure for Network
    Computing) (http://boinc.berkeley.edu)
  • Climateprediction.net: study climate change
  • Einstein@home: search for gravitational signals
    emitted by pulsars
  • LHC@home: improve the design of the CERN LHC
    particle accelerator
  • Predictor@home: investigate protein-related
    diseases
  • Rosetta@home: help researchers develop cures for
    human diseases
  • SETI@home: look for radio evidence of
    extraterrestrial life
  • Etc.

Arecibo telescope
SETI@Home averages 138 TFLOPS on 100,000s of
computers in 100s of countries
94
climateprediction.net
Since September 2003 95,000 registered
participants in 150 countries Donated 8,000 years
of computer time Completed 100,000 simulations of
over 4M model years
95
Results so Far: the first steps towards a fully
probability-based forecast
96
Information/Knowledge Grids
  • Distributed (10s to 1000s) of data sources
    (instruments, file systems, curated databases )
  • Data Deluge: 1 (now) to 100s of petabytes/year
    (2012)
  • Moore's law for Sensors
  • Possible filters assigned dynamically (on-demand)
  • Run image processing algorithm on telescope image
  • Run Gene sequencing algorithm on compiled data
  • Needs decision support front end with what-if
    simulations
  • Metadata (provenance) critical to annotate data
  • Integrate across experiments as in
    multi-wavelength astronomy

Data Deluge comes from pixels/year available
97
Data Deluged Science
  • In the past, we worried about data in the form of
    parallel I/O or MPI-IO, but we didn't consider
    it as an enabler of new algorithms and new ways
    of computing
  • Data assimilation was not central to HPCC
  • DoE ASCI set up because didn't want test data!
  • Now particle physics will get 100 petabytes from
    CERN
  • Nuclear physics (Jefferson Lab) in same situation
  • Use around 30,000 CPUs simultaneously 24X7
  • Weather, climate, solid earth (EarthScope)
  • Bioinformatics curated databases (Biocomplexity
    only 1000s of data points at present)
  • Virtual Observatory and SkyServer in Astronomy
  • Environmental Sensor nets

98
The Data Deluge
  • In next 5 years e-Science projects will produce
    more scientific data than has been collected in
    the whole of human history
  • Some normalizations
  • The Bible: 5 Megabytes
  • Annual refereed papers: 1 Terabyte
  • Library of Congress: 20 Terabytes
  • Internet Archive (1996-2002): 100 Terabytes
  • In many fields new high throughput devices,
    sensors and surveys will be producing Petabytes
    of scientific data
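
The normalizations become easier to compare once converted to a common unit. The sketch below assumes decimal (SI) prefixes, which the slide does not specify; the dictionary keys and helper names are illustrative only.

```python
# Decimal (SI) prefixes are an assumption; the slide does not say which it uses.
UNIT_BYTES = {"MB": 10**6, "TB": 10**12, "PB": 10**15}

# Sizes as quoted on the slide.
SIZES = {
    "bible": (5, "MB"),
    "annual_refereed_papers": (1, "TB"),
    "library_of_congress": (20, "TB"),
    "internet_archive": (100, "TB"),
}


def in_bytes(name):
    amount, unit = SIZES[name]
    return amount * UNIT_BYTES[unit]


def how_many(small, large):
    """How many copies of `small` fit in `large` (integer division)."""
    return in_bytes(large) // in_bytes(small)


bibles_per_library = how_many("bible", "library_of_congress")
libraries_per_petabyte = UNIT_BYTES["PB"] // in_bytes("library_of_congress")
archives_per_petabyte = UNIT_BYTES["PB"] // in_bytes("internet_archive")
```

On these figures a single petabyte is 50 Libraries of Congress or 10 Internet Archives, which gives a sense of scale for projects producing petabytes.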

99
Tracking the Heavens
Hubble Telescope
Palomar Telescope
Sloan Telescope
100
Virtual Observatory Astronomy Grid: Integrate
Experiments
Radio
Far-Infrared
Visible
Dust Map
Visible X-ray
Galaxy Density Map
101
The Virtual Observatory
  • Premise: most observatory data is (or could be)
    online
  • So, the Internet is the world's best telescope
  • It has data on every part of the sky
  • In every measured spectral band: optical, x-ray,
    radio..
  • It's as deep as the best instruments
  • It is up when you are up
  • The seeing is always great
  • It's a smart telescope: links objects and
    data to literature on them
  • Software has become a major expense
  • Share, standardize, reuse..

Slide modified from Alex Szalay, NVO
102
Do