The GridPort Toolkit: a System for Building Grid Portals

1
The GridPort Toolkit: a System for Building Grid
Portals
  • M. Thomas, S. Mock, M. Dahan, K. Mueller, D.
    Sutton
  • San Diego Supercomputer Center, UCSD
  • And
  • J. Boisseau
  • Texas Advanced Computing Center, Univ. of Texas at
    Austin
  • Presented at the
  • 10th IEEE International Symposium on
  • High Performance Distributed Computing
  • 7-9 August 2001, San Francisco, CA

2
Outline
  • Intro/Background/Motivation
  • The GridPort Toolkit
  • GridPort-based Application Portals
  • Web Services Experiments
  • Future Work/Conclusions
  • For Ed's iPAQ (wherever it is)
  • https://hotpage.npaci.edu/pda

3
Motivation
  • Computational science environment is complex
  • Users now have access to a variety of distributed
    resources
  • Interfaces to these resources vary and change
    often
  • Policies, accounts, etc. differ across sites/orgs
  • Computational scientists are not computer
    scientists
  • Provide universal, easy access to resources
  • Focus on keeping the GCE simple, easy to use,
    easily accessible.
  • GridPort-based portals require no software
    downloads or configuration changes on the client
    side, and run in common web browsers.
  • Driving philosophy
  • Focus on Grid users and developers that will
    benefit from simple portals and portal
    technologies
  • Reduce workload on Grid users and Grid
    application developers.
  • Support users and smaller projects

4
A Few Grid Resources

5
Evolution of GridPort/HotPage
  • 1997-1998 (the intern years)
  • NPACI HotPage project started: informational
    services
  • 1999
  • Informational HotPage installed at other sites
  • Globus Toolkit interactive services (beta,
    GRAM/GSI)
  • Formed GridPort Toolkit project: technology
    transfer
  • 2000
  • Developed GridPort v1.0 to support application
    portals (LAPK, GAMESS)
  • User portal collaboration -> GSI across NPACI,
    Alliance, NASA/IPG
  • PDA version
  • 2001
  • Released GridPort Toolkit v2.0 for public use
  • Session state, single login SRB integration
    coupled to GSI
  • Web services experiments
  • HotPage: updated interface supporting PACI/NPACI
    personalization
  • Creating Perl Package version of GridPort
  • Created Globus Perl CoG Module

6
GridPort Design Requirements
  • Universal access
  • Portals will be web-based; must support old
    browsers
  • Portals must run anywhere, anytime, leave no data
  • Require no downloads, plug-ins or applications
  • Technology transfer
  • GridPort is a Grid jump-start kit
  • Leverage infrastructure provided by World Wide
    Web
  • Use common Grid technologies and standards
  • Minimize impact on already burdened resource
    administrators
  • GridPort Toolkit should not require that
    additional services be run on the HPC Systems
  • Provide a scalable and flexible infrastructure
  • Facilitate adding/removing Grid resources,
    services, jobs, and users

7
GridPort Design Requirements (cont.)
  • Security
  • Support HTTPS/SSL encryption at all layers, and
    provide access control. Based on GSI.
  • Single login
  • Required for easy access/navigation across Grid
    resources.
  • Client applications and portal services should be
    able to run on separate webservers
  • Enable scientists to build their own application
    portals and use existing portals for common
    infrastructure services
  • Any site should be able to host a portal
  • Any user should be able to create their own
    portal if they have accounts and a certificate
  • Adopt Global Grid Forum standards
  • Actively collaborate and promote Global Grid
    Forum activities

8
GridPort Toolkit Architecture

9
GridPort Layers
  • Clients
  • Web browsers, including PDA versions
  • Plan to expand to other wireless devices
  • Application Portals
  • Currently, they exist on same physical machine
    and share domain (cookies)
  • Served to clients by separate virtual webservers
  • hotpage.npaci.edu or gridport.npaci.edu
  • All use the same instance of the GridPort
    libraries.
  • Share data, libraries, filespace, and other
    services on the webserver machine.
  • Single-login environment

10
GridPort Layers (cont.)
  • Portal Services
  • For portals and users
  • Managing session state, portal accounts, file
    collections
  • Monitoring the information services
  • Services that are portal specific
  • Not typically addressed by Grid or web
    developers.
  • Grid Services
  • Standard middle and backend tiers of the Grid
  • Globus, Legion, SRB, NWS, AppLeS and (someday)
    metaschedulers
  • Resources
  • Compute
  • Archival

11
GridPort Layers
  • NEED A SCHEMATIC OF THIS

12
Commercial Technologies Employed
  • Server
  • Netscape or Apache servers
  • HTTPS, SSL, HTML/JavaScript, SSH
  • Perl 5.0/CGI
  • Database: flat text configuration files
  • Migrating to SQL/Oracle in limited cases
    (reliability)
  • Will use DB to generate text files
  • OS: Unix/Solaris, Linux
  • Client
  • Netscape Communicator, IE (4.0 or greater)
  • PC, Mac, Sun/Solaris, SGI
  • HTTPS, SSL, HTML/JavaScript (limited use)

13
Grid Technologies
  • Globus/GRAM gatekeeper
  • used to run interactive jobs and tasks, and to
    submit batch jobs on remote resources
  • Grid Security Infrastructure (GSI)/MyProxy
  • used for security and authentication
  • Grid Information Systems/Grid Resource
    Information System (GIS/GRIS)
  • used for information services where available
  • SDSC Storage Resource Broker (SRB)
  • used for distributed file collection and
    management
  • Key problem
  • not all partners install and maintain all
    software

14
Services Supported
  • Portal user accounts
  • On-line account/certificate creation -> unique
    portal ID
  • Associate portal ID with DN
  • Associate DN with user IDs in mapfiles ->
    authenticate
  • Track sessions, user preferences, distributed
    filespace
  • Portal users must have a valid PKI/GSI certificate.
  • Accepted certs: NPACI, Alliance, NASA/IPG,
    Cactus, Globus
  • This is a complex process, so it does not scale
    yet
  • Authentication (2 ways)
  • Authentication against certificate data stored in
    the SDSC certificate repository.
  • MyProxy server
  • We save the proxy file for the duration of the session
  • Sessions expire after a timeout period or when the
    user logs out.
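
The portal ID -> DN -> local user ID chain above can be sketched as a lookup against a Globus-style grid-mapfile, where each line holds a quoted certificate DN followed by a local account name. This is only an illustration in Python (GridPort itself was written in Perl/CGI); the account table, the DNs, and the function names are hypothetical and do not come from the slides.

```python
import shlex

def parse_gridmap(text):
    """Parse grid-mapfile style text into {DN: [local user IDs]}."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # handles the quoted DN
        if len(parts) != 2:
            continue  # ignore malformed lines in this sketch
        dn, users = parts
        mapping[dn] = users.split(",")
    return mapping

def local_user_for(portal_id, accounts, gridmap):
    """Follow portal ID -> DN -> first mapped local user, or None."""
    dn = accounts.get(portal_id)
    if dn is None:
        return None
    users = gridmap.get(dn)
    return users[0] if users else None

# Hypothetical data: one portal account and a two-line mapfile.
PORTAL_ACCOUNTS = {"portal-0001": "/O=Example Grid/OU=example.org/CN=Jane Doe"}
SAMPLE_MAPFILE = '''
# "certificate DN" local_account
"/O=Example Grid/OU=example.org/CN=Jane Doe" jdoe
"/O=Example Grid/OU=example.org/CN=John Roe" jroe
'''

if __name__ == "__main__":
    gridmap = parse_gridmap(SAMPLE_MAPFILE)
    print(local_user_for("portal-0001", PORTAL_ACCOUNTS, gridmap))
```

Because every resource keeps its own mapfile, a new portal user has to be added at each site, which is one way to read the "does not scale yet" remark above.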

15
Services Supported
  • Jobs
  • Executed via the Globus/GRAM gatekeeper.
  • Simple Unix-type commands
  • mkdir, ls, rmdir, cd, and pwd (part of API)
  • Compiling and running programs
  • Job and batch script submission and deletion, and
    viewing of job status and history.
  • Files
  • Access to compute, archival, portal file space
  • File transfer
  • Between the local workstation and the HPC
    resources
  • Between any 2 resources (via SRB)
  • Perform common file management operations on
    remote files
  • tar/untar, gzip/gunzip, and movement to archival
    storage.
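
Jobs sent to a Globus GRAM gatekeeper are normally described in the Globus Resource Specification Language (RSL). The slides do not show how GridPort formatted its requests, so the following Python sketch is an assumption about what a simple command such as ls could look like as an RSL string; build_rsl and rsl_quote are hypothetical helper names, not GridPort functions.

```python
def rsl_quote(value):
    """Quote a value as an RSL string literal (inner quotes are doubled)."""
    text = str(value)
    return '"' + text.replace('"', '""') + '"'

def build_rsl(executable, arguments=(), **attributes):
    """Build a single-job RSL string: &(executable=...)(arguments=...)..."""
    clauses = ["(executable=%s)" % rsl_quote(executable)]
    if arguments:
        joined = " ".join(rsl_quote(arg) for arg in arguments)
        clauses.append("(arguments=%s)" % joined)
    # Extra attributes (for example count, directory, queue) in stable order.
    for name in sorted(attributes):
        clauses.append("(%s=%s)" % (name, rsl_quote(attributes[name])))
    return "&" + "".join(clauses)

if __name__ == "__main__":
    # A simple Unix-type command, as in the list above.
    print(build_rsl("/bin/ls", ["-l", "/tmp"], count=1))
```

A portal would hand a string like this to a GRAM client together with the user's proxy; that step is omitted here because it depends on the installed Globus version.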

16
GridPort Interactive Services Diagram

17
GridPort File Management

18
Resources Supported
  • Compute
  • IBM (Blue Horizon, SP)
  • Compaq (TCS1)
  • CRAY (T3E, T90)
  • Sun (E10K)
  • SGI (O2K)
  • Hewlett Packard (V2500)
  • Workstations and clusters.
  • Others
  • Archival
  • HPSS, DMF, MASS
  • Any system running Globus can be added
  • Multiple sites, centers, and orgs
  • PACI Grid: NPACI, Alliance, PSC, hopefully DTF
  • NASA/IPG
  • Multiple sites/locations
  • SDSC
  • NCSA
  • Pittsburgh Supercomputing Center
  • Universities: UT Austin, Univ. of Kentucky,
    Boston Univ.

19
Security Implementation
  • Security between the client -> web server ->
    grid
  • SSL/RC4-40 128 bit key / SSL RSA X.509 certificate
  • GSI authentication used for all portal services
  • Transparent access to the grid via GSI
    infrastructure
  • Authentication tracked with cookies
  • Coupled to server DB/session tracking, maintain
    session state
  • Assigned a random value by the webserver at login
  • Random value in the cookie corresponds to a
    session file
  • Session file contains a timestamp
  • Single login environment
  • Provides access to all NPACI resources where GSI
    is available
  • With full account access privileges for specific
    host
  • Within same domain because of cookies
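
The cookie scheme above (a random value issued at login that names a server-side session file holding a timestamp) can be sketched as follows. This is a Python illustration under stated assumptions, not GridPort's Perl code: the 30-minute timeout, the file layout, and the function names are all hypothetical.

```python
import os
import secrets
import time

SESSION_TIMEOUT = 30 * 60  # seconds; hypothetical value
HEX_DIGITS = set("0123456789abcdef")

def create_session(session_dir, now=None):
    """Create a session file named by a random token; return the token.

    The token is what the webserver would place in the login cookie.
    """
    token = secrets.token_hex(16)
    now = time.time() if now is None else now
    with open(os.path.join(session_dir, token), "w") as fh:
        fh.write(str(now))  # the session file holds only a timestamp
    return token

def session_is_valid(session_dir, token, now=None, timeout=SESSION_TIMEOUT):
    """Check a cookie value against its session file and the timeout."""
    # Accept only well-formed tokens so a cookie cannot name other files.
    if len(token) != 32 or not set(token) <= HEX_DIGITS:
        return False
    path = os.path.join(session_dir, token)
    try:
        with open(path) as fh:
            started = float(fh.read().strip())
    except (OSError, ValueError):
        return False  # no such session, or a damaged file
    now = time.time() if now is None else now
    if now - started > timeout:
        os.remove(path)  # expired sessions are deleted
        return False
    return True

def logout(session_dir, token):
    """End a session by removing its file, if it exists."""
    if session_is_valid(session_dir, token):
        os.remove(os.path.join(session_dir, token))
```

Since the cookie carries only the random value, credentials stay on the server; the single login still holds only within one domain, because browsers return a cookie only to the domain that set it.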

20
Security Implementation (cont.)
  • User authentication via valid proxy files
  • Proxy generated from key/cert pair or retrieved
    from MyProxy
  • Sensitive data (proxies) stored in restricted
    access portal repository
  • Repository located outside webserver filespace
  • Has user and group permissions control
  • Portal acts as proxy
  • Executing requests on behalf of the user
  • Only what user is authorized to access
  • Based on credentials presented when portal
    account created
  • Users have same level of access to resource as if
    logged on
  • Globus used for client requests on resources
  • GSI used at all layers -> forward session proxy
    file
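
A restricted-access repository for proxy files, as described above, mostly comes down to where the files live and which permissions they carry. The Python sketch below assumes a POSIX system and a repository directory outside the webserver's document tree; store_proxy and the ".proxy" suffix are hypothetical names chosen for illustration.

```python
import os

def store_proxy(repo_dir, session_id, proxy_bytes):
    """Write a proxy credential readable and writable by its owner only."""
    path = os.path.join(repo_dir, session_id + ".proxy")
    # Create the file with mode 0600 so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(proxy_bytes)
    # Re-apply the mode in case the file already existed with looser bits.
    os.chmod(path, 0o600)
    return path

def remove_proxy(repo_dir, session_id):
    """Delete the stored proxy when the session ends; ignore if missing."""
    try:
        os.remove(os.path.join(repo_dir, session_id + ".proxy"))
    except FileNotFoundError:
        pass
```

Owner-only permissions matter because GSI tools generally refuse a proxy file that other users can read.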

21
Applications Running on GridPort
  • 2 approaches for portals
  • Those developed by the NPACI Team
  • Those developed by the application team/developer
    (blue)
  • Application portals in production
  • PACI HotPage, https://hotpage.npaci.edu
  • NPACI HotPage, https://hotpage.paci.org
  • Pharmacokinetic Modeling,
    https://gridport.npaci.edu/LAPK
  • General Atomic and Molecular Electronic Structure
    System, https://gridport.npaci.edu/GAMESS
  • Portals developed by project application
    developers
  • Bays to Estuaries Project (BBE),
    http://bbe.npaci.edu
  • Protein Database/CE Portal,
    https://gridport.npaci.edu/CE
  • Telescience (9/30/01),
    https://gridport.npaci.edu/Telescience

22
Using GridPort
  • Install Perl libraries and GridPort code on
    webserver
  • Application portal developer incorporates
    GridPort libraries directly into code
  • Can modify or add subroutines
  • General pattern (for our dev. team)
  • Uses between 3 and 6 lines of Perl code to access
    functions
  • Jobs, files, auth, etc.
  • Each of the CGI scripts for application portals
    developed with GridPort follows this pattern
  • An example: HotPage batch job submission
  • Contains three lines of code that reference
    GridPort.
  • Other lines of code (750) are specific to HotPage
23
HotPage View Job Submission

24
Laboratory for Applied Pharmacokinetics (LAPK)
  • Community Model Portal
  • Users are doctors, so they need an extremely
    simple interface
  • Must be portable: run from many countries/labs
  • Need to hide details such as
  • Resource, files, batch scripts, compilation, UNIX
    env.
  • Uses gridport.npaci.edu portal
    services/capabilities
  • File upload/download between local
    host/portal/HPC systems
  • Jobs
  • Job submit (builds batch script, moves files,
    submits jobs)
  • Job tracking: moves results to user filespace
    when complete
  • Job cancel/delete
  • Job history: maintains relevant job information
  • Impact
  • LAPK users can now run multiple jobs at one time
    using portal

25
LAPK Job Submit and Job History

26
GCE Web Services
  • New architecture for GCEs is emerging
  • Workshop held at SDSC (May 01) to discuss this.
  • Grid Portals Markup Language/XML
  • Constructing GCE Testbed
  • Based on web services model that is currently
    evolving in commercial world
  • Sun JXTA, IBM WebSphere, Microsoft .NET
  • XML/SOAP/UDDI/WSDL
  • CCA (see Gannon's talk)
  • In this experiment, our port is a URL
  • Key advantage: client may be a web page/portal,
    another application or Grid service
  • Allows separation of the function of hosting
    client from the service or application being used

27
A Web Services Experiment: GridPort Client Toolkit
  • Focus on medium/small applications and
    researchers
  • Choose simple protocol (HTTP/CGI/Perl)
  • Client/application can be located on any server
    or system.
  • Connection to portal services is through the GCT
  • https://portals.npaci.edu/client/tools/FUNCTIONS
  • Inherits all existing portal services running on
    portal
  • Including authentication/single login
  • It's easy
  • Took 1 week to develop GCT
  • Key project goal
  • Allow scientist to write local portals/apps/etc.
    and use services

28
Web Services Experiment: GridPort Client Toolkit
  • Ease of use
  • Do not have to install complex code to get
    started
  • webservers, no Globus, no SSH, no SSL, no PKI,
    etc.
  • Do not have to write complex interface scripts to
    access these services (we've done that already)
  • Do not have to fund advanced web development
    teams
  • Client has local control over project, including
    filespace, etc.
  • Integration to existing portals has been done
  • Bays to Estuaries project

29
Services Implemented in GCT
  • Authentication
  • Login
  • Logout
  • Check authentication state
  • Jobs
  • Submit jobs to queues
  • Cancel jobs
  • Execute commands (command-line interface)
  • Files
  • Upload from local host
  • Download to local host
  • FTP move file
  • View portal filespace (?)
  • Commands
  • Pwd
  • Cd
  • Whoami
  • Etc.
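
Since each GCT service is reached through a URL under https://portals.npaci.edu/client/tools/, a client mainly has to build the right request. The slides give only the FUNCTIONS placeholder, so the function names and query parameters in this Python sketch are hypothetical; it builds the URL and sends nothing.

```python
import re
from urllib.parse import urlencode

# Base URL taken from the slides; everything appended to it is assumed.
GCT_BASE = "https://portals.npaci.edu/client/tools"

def gct_url(function, **params):
    """Build the request URL for one GCT function call.

    'function' stands in for the FUNCTIONS placeholder; the real function
    names and parameters are not given in the slides.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", function):
        raise ValueError("unexpected function name: %r" % function)
    url = "%s/%s" % (GCT_BASE, function)
    query = urlencode(sorted(params.items()))  # stable parameter order
    return url + "?" + query if query else url

if __name__ == "__main__":
    print(gct_url("pwd"))
    print(gct_url("job_submit", queue="normal", script="run.sh"))
```

A client portal would fetch such a URL over HTTPS and pass along the session cookie from login, which is how it inherits the portal's single login.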

30
GridPort Client Toolkit DemoApp

31
Basin, Bays to Estuaries (BBE) Portal
  • Community model: scientific portal for conducting
    multi-model Earth System Science (ESS)
  • Simulations are run to forecast the transport of
    sediments within the San Diego Bay area during a
    storm.
  • Technology developed for the BBE project
  • Website located on BBE webserver/machine
  • http://bbe.npaci.edu
  • Uses SRB for file management (GSI)
  • Perl/CGI based portal
  • Minimal effort required to modify code
  • Use GCT for all interactive functions
  • Hardest part was installing Perl/LWP module on
    local system
  • Roughly 14 tests needed to integrate GCT into
    portal
  • 4 new Perl scripts required 

32
Basin, Bays to Estuaries (BBE) Portal

33
Conclusions
  • Remember your client
  • Developer != User
  • Robust portals can be built with simple
    technologies
  • GridPort is a good jump-start Toolkit
  • Promotes rapid deployment
  • We need
  • Grid accounts so we don't have to update 10
    billion mapfiles
  • Common/shared security
  • Grid metaschedulers so our users can run on best
    available system
  • Grid aware compilers
  • Grid information services that are fast
  • Grid Web services

34
Future Work
  • GridPort V3.0
  • In process of evaluating new technologies
  • Java technologies to support CCA efforts
  • Considering move away from Perl (XML
    incompatibilities one reason)
  • Data portal technologies: SRB and GSI-FTP
  • Support personalization at account level
  • Web services used by production application
    portals
  • HotPage v3.0
  • Expand to DTF, and non NSF PACI systems
  • Expand personalization -> MyHotPage
  • Implement use of NPACI Machines database
  • Update to accommodate Virt. Org. concepts
  • Automatic SRB collections for all users

35
New Directions
  • Continue Web services architecture research
  • Collaboration with GGF/GCE Research Area and
    working groups
  • GCE Testbed plan underway
  • USA: PACI, Alliance, NASA, Jefferson Lab, PNNL,
    others
  • Europe: Daresbury, Cactus, others?
  • Collaboration with Sun CAL(IT)2 project

36
GridPort Team
  • SDSC Staff
  • Mary Thomas
  • Steve Mock
  • Kurt Mueller
  • Maytal Dahan
  • Cathie Mills
  • Student interns: Ray Regno, Akhil Seth
  • A Collective Effort supported by SDSC services
  • Server systems (Josh Polterock)
  • HPC Systems (Victor Hazelwood)
  • Databases (Dave Archibal)
  • Distr. Computing (Keith Thompson, Bill Link)

37
Acknowledgements
  • San Diego Supercomputer Center and the NSF funded
    PACI programs for their support (both with
    resources and staff)
  • Grants
  • NPACI, NSF-ACI-975249
  • NPACI NSF-NASA IPG Project Supplement
  • Pharmacokinetic Modeling: NCRR Grant No. RR11526
  • NBCR, NIH/NCRR P41 RR08605-07
  • Collaborators
  • PACI Partners
  • The User Portal Collaboration members: NASA/IPG,
    LBL, PNNL
  • Globus team for providing valuable input and
    ideas on this project: Gregor von Laszewski,
    Carl Kesselman and others.
  • The Global Grid Forum GCE working group

38
References
  • GridPort Toolkit Website
  • https://gridport.npaci.edu
  • NPACI HotPage User Portal
  • HotPage: https://hotpage.npaci.edu
  • Accounts: http://hotpage.npaci.edu/accounts
  • Downloads
  • http://gridport.npaci.edu/downloads
  • GridPort Toolkit
  • NPACI HotPage
  • GCT Portal (frames based)
  • Contact
  • Mary Thomas (mthomas_at_sdsc.edu)