Establishing an inter-organisational OGSA Grid: Lessons Learned (PowerPoint transcript)
1
Establishing an inter-organisational OGSA Grid
Lessons Learned
  • Wolfgang Emmerich
  • London Software Systems, Dept. of Computer
    Science
  • University College London
  • Gower St, London WC1E 6BT, U.K.
  • http://www.sse.ucl.ac.uk/UK-OGSA

2
An Experimental UK OGSA Testbed
  • Established 12/03-12/04
  • Four nodes
  • UCL (coordinator)
  • NeSC
  • NEReSC
  • LeSC
  • Deployed Globus Toolkit 3.2 throughout, onto
    heterogeneous HW/OS
  • Linux
  • Solaris
  • Windows XP

3
Experience with GT3.2 Installation
  • Different levels of experience within team
  • Heterogeneity
  • HW (Intel/SPARC)
  • Operating system (Windows/Solaris/Linux)
  • Servlet container (Tomcat/GT3 container)
  • Interaction with previous GT versions
  • Departure from web service standards prevented
    standard tool use
  • JMeter
  • Development environments (Eclipse)
  • Exception management tools (Amberpoint)
  • Interaction with system administration
  • Platform dependencies

4
Performance and Scalability
  • Developed GTMark
  • Server-side load model: SciMark 2.0
    (http://math.nist.gov/SciMark)
  • Client-side load model, configuration and metrics
    collection based on J2EE benchmark StockOnline
  • Configurable benchmark
  • Static vs. dynamic discovery of nodes
  • Load for a fixed period of time or until a
    steady state is reached
  • Constant or varying number of concurrent
    requests
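The client-side load model described above can be sketched as a small driver. This is a minimal illustration, not GTMark itself: the function and parameter names are invented, and the service invocation is a stand-in callable.

```python
import threading
import time


def run_load(invoke, clients=4, duration_s=2.0):
    """Drive `invoke` from `clients` concurrent threads for a fixed
    period and return the total number of completed invocations
    (a throughput proxy). Illustrative stand-in for a client-side
    load model like GTMark's; all names here are hypothetical."""
    stop = threading.Event()
    counts = [0] * clients          # per-client invocation counts

    def worker(i):
        # keep issuing requests until the fixed period expires
        while not stop.is_set():
            invoke()                # one service call
            counts[i] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(clients)]
    for t in threads:
        t.start()
    time.sleep(duration_s)          # fixed load period
    stop.set()
    for t in threads:
        t.join()
    return sum(counts)
```

Varying `clients` between runs gives the constant-vs-varying concurrency axis; a steady-state variant would instead stop once throughput over successive windows stabilises.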

5
Performance Results
6
Scalability Results
7
Performance Results
  • Performance and scalability of GT3.2 with
    Tomcat/Axis surprisingly good
  • Performance overhead of security is negligible
  • Good scalability - reached 96% of theoretical
    maximum
  • Tomcat performs better than GT3.2 container on
    slow machines
  • Surprising results on raw CPU performance

8
Reliability
  • Tomcat more reliable than GT3.2 container
  • Tomcat container sustained 100% reliability
    under load
  • GT3.2 container failed once every 300 invocations
    (99.67% reliability)
  • Denial-of-service attack possible by
  • Concurrently invoking operations on the same
    service instance (they are not thread-safe!)
  • Fully exhausting resources
  • Problem of hosting more than one service in one
    container
  • Trade-off between reliability and reuse of
    containers across multiple users/services.
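The thread-safety hazard behind this denial-of-service vector is the classic unsynchronised read-modify-write on shared service state. A toy sketch (the service class and methods are invented for illustration; GT3.2's actual service code is not shown here):

```python
import threading


class CounterService:
    """Toy stand-in for a grid service instance holding shared state.
    GT3.2 service instances were not thread-safe, so concurrent
    invocations of the unsafe pattern below can corrupt state."""

    def __init__(self):
        self.total = 0
        self.lock = threading.Lock()

    def add_unsafe(self, n):
        # read-modify-write without synchronisation: two concurrent
        # callers may read the same old value and lose an update
        current = self.total
        self.total = current + n

    def add_safe(self, n):
        # serialise access to the shared state
        with self.lock:
            self.total += n


def hammer(method, calls=5000, threads=4):
    """Invoke `method(1)` from several threads concurrently."""
    ts = [threading.Thread(target=lambda: [method(1) for _ in range(calls)])
          for _ in range(threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
```

With the lock, `hammer(svc.add_safe)` always leaves `svc.total` at exactly `calls * threads`; the unsafe variant carries no such guarantee under concurrency, which is the crash/corruption mode exploited above.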

9
Security
  • Interesting effect of firewalls on testing and
    debugging
  • Accountability and audit trails demand users be
    given individual accounts on each node
  • Overhead of node and user certificates (they
    always expire at the wrong time)
  • Current security model does not scale
  • Assuming a cost of £18/admin hour
  • 10 users per node (site)
  • It will cost approx. £300,000 to set up a 100
    node grid with 1,000 users
  • It will be prohibitively expensive to scale up to
    1,000 nodes (with admin costs in excess of £6M)
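The back-of-the-envelope arithmetic behind these figures can be reconstructed as a small cost model. The per-account setup time below is an assumption, chosen so the model reproduces the ~£300,000 estimate for 100 nodes and 1,000 users; the slide does not state it.

```python
def admin_cost(nodes, users_per_node,
               minutes_per_account=10, rate_per_hour=18):
    """Account-administration cost for a grid in which every user
    needs an individual account on every node (as accountability
    and audit trails demand). minutes_per_account is an assumed
    figure, not taken from the slides."""
    users = nodes * users_per_node
    accounts = users * nodes          # each user, on each node
    # convert account-minutes to hours, then to cost
    return accounts * minutes_per_account * rate_per_hour / 60


# 100 nodes x 10 users/node -> 1,000 users, 100,000 accounts
print(admin_cost(100, 10))  # 300000.0
```

Under the same assumptions, 1,000 nodes at 10 users each yields a figure well in excess of £6M, consistent with the slide's claim that the per-node, per-user account model does not scale.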

10
Deployment
  • How do admins get grid middleware deployed
    systematically onto grid nodes?
  • How can users get the services onto remote hosts?
  • We tried out SmartFrog (http://www.smartfrog.org)
  • Worked very well inside a node.
  • Impossible across organisations
  • The SmartFrog daemon would need to execute
    actions with root privileges, which some site
    admins simply did not agree to
  • Security paramount (SmartFrog would be the
    perfect virus distribution engine)
  • SmartFrog's security infrastructure incompatible
    with the GT3.2 infrastructure

11
Looking Ahead
  • Installation efforts need to be reduced
    significantly
  • Binary distributions
  • For a few selected HW/OS platforms
  • Standards compliance
  • Track standards by all means
  • Otherwise no economies of scale
  • Management console
  • Add / remove grid hosts
  • Need to be able to monitor status of grid
    resources
  • Across organisational boundaries
  • More lightweight security model needed
  • Role-based Access Control
  • Trust-delegation
  • Deployment is a first-class citizen
  • Avoid adding as an afterthought
  • Needs to be built into middleware stack
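The lightweight role-based model called for above can be sketched in a few lines. The roles, permissions, and delegation rule below are invented for illustration; they are not a Grid standard or any specific middleware's API.

```python
# Minimal role-based access control with trust delegation.
# All role and permission names are illustrative.
ROLE_PERMISSIONS = {
    "grid-admin": {"deploy-service", "submit-job", "read-results"},
    "grid-user": {"submit-job", "read-results"},
    "guest": {"read-results"},
}

USER_ROLES = {"alice": {"grid-admin"}, "bob": {"grid-user"}}


def allowed(user, permission):
    """A user may perform an action if any of their roles grants it."""
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, set()))


def delegate(from_user, to_user, role):
    """Trust delegation: a user may hand a role on only if they
    hold that role themselves."""
    if role in USER_ROLES.get(from_user, set()):
        USER_ROLES.setdefault(to_user, set()).add(role)
        return True
    return False
```

Compared with per-node accounts and certificates, access decisions here are a table lookup, and delegation is an edge in the trust graph rather than a new certificate issued by an administrator.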

12
Conclusions
  • Very interesting experience
  • Building a distributed system across
    organisational boundaries is different from
    building a system over a LAN
  • Insights that might prove useful for
  • OMII
  • Globus
  • ETF
  • There is a lot more work to do before we realize
    the vision of the Grid!

13
Acknowledgements
  • A large number of people have helped with this
    project, including
  • Dave Berry (NeSC)
  • Paul Brebner (UCL, now CSIRO)
  • Tom Jones (UCL, now Symantec)
  • Oliver Malham (NeSC)
  • David McBride (LeSC)
  • Savas Parastatidis (NEReSC)
  • Steven Newhouse (OMII)
  • Jake Wu (NEReSC)
  • For further details (including IGR) check out
    http://sse.cs.ucl.ac.uk/UK-OGSA