Title: Computational Science Portals: The SDSC Grid Portal Toolkit (GridPort)
1. Computational Science Portals: The SDSC Grid Portal Toolkit (GridPort)
- Mary Thomas (mthomas_at_sdsc.edu)
- Presented at the NPACI Parallel Computing Institute 2000, San Diego, CA
2. What is a Portal?
3. Why Build Computational Science Portals?
- Computational science environment is complex
- Users have access to a variety of distributed resources (compute, storage, etc.)
- Interfaces to these resources vary and change often
- Policies at sites sometimes differ
- Using multiple resources can be cumbersome
- Portals can provide simple interfaces
- Portals are web based, and that has advantages
- Users know and understand the web
- Can serve as a layer in the middle-tier infrastructure of the Grid
- Users can be isolated from resource-specific details
- Single web interface isolates system changes/differences
- Not an end-all solution
- several issues/challenges here
4. Desirable Science Portal Characteristics
- Provide Access To
- Other Portals
- Local and Remote Computer Resources
- Data Management and Analysis Tools
- Collaborative and Visualization Infrastructure
- Secure
- Personalizable
- Built on Commercial Components
- Fast/interactive
- Use well-connected servers
- Multiple resources available to support these services
- Push updates/display information
- Data must be fresh
- Preserve client state
5. Types of Scientific Web Portals
- Differ from Commercial Portals (Yahoo, Amazon)
- Types of Science Portals
- User Portals
- simplify users' ability to interact with and utilize a complex, often distributed environment
- direct access to resources (compute, data, archival, instruments, and information)
- Application Interfaces
- Enable scientists to conduct simulations on multiple resources
- EOT Portals
- Educate the public (future scientists?) about science using software simulations, visualizations, etc.
- Learning tools
- Individual Portals
- Users can roll out their own portals by writing web pages using standard HTML or Perl/CGI
6. Examples of User Portals
- User Portals
- Generalized/Low level
- NPACI HotPage
- NCSA MyGrid
- European UNICORE
- WebSubmit
- Gateway
- Education Portals
- ChemViz
- Others
- Application Portals
- Project Specific
- Require application development teams
- Biology WorkBench
- PDB
- MEME
- VisBench
- CHEME
- Individual Portals
- User built
- Hosted on any webserver
7. The GridPort Toolkit
- Based on the architecture developed for the NPACI HotPage
- Focus on computational scientists and application developers
- Composed of a set of simple, modular services and tools
- Support application-level, customized science portal development
- Facilitate seamless web-based access to distributed compute resources and grid services
- Built with commodity technologies
- Sits on top of the middle tier of the Grid
- An interface to these services for the web
8. GridPort Toolkit Design Concepts
- Key design idea
- Any site should be able to host a user portal
- Any user should be able to create their own user portal if they have accounts and a certificate
- Key Requirements
- Base software design on infrastructure provided by the World Wide Web
- use commodity technologies wherever possible
- avoid shell programs/applications/applets
- GridPort Toolkit should not require that additional services be run on the HPC systems
- reduce complexity: there are enough of these already
- so, leverage existing grid research and development
- GSI certificate (considering Kerberos, secure ID)
9. GridPort Designed for Ease of Use
- WWW interface is common, well understood, and pervasive
- User Portals can be accessed by anyone who has access to a web browser, regardless of location
- Users can construct customized application web pages
- only basic knowledge of HTML and Perl/CGI is needed
- Application programmers can extend the set of basic functions provided by the Toolkit
- Portal services hosts can modify support services by adding/removing/modifying broker or grid interface codes
10. GridPort Based on Commodity Web Technologies
- Use of commodity web technologies -> Portability
- contribute to a plug-n-play grid
- Requirements
- Communicator and IE (4.0 or greater)
- HTTP, HTTPS, SSL, HTML/JavaScript, Perl/CGI, SSH, FTP
- Netscape or Apache servers
- Based on simple technology, this software is easily ported to, and used by, other sites
- Needs to also be easy to modify and adapt to local site policies and requirements
- Goal is to design a toolkit that is simple to implement, support, and develop
11. GridPort Architecture
12. GridPort Architecture: How it Works
- Client downloads application pages from either
- application webserver
- responsible for processing/maintaining application data
- passes client requests on to grid portal services webserver
- grid portal services webserver
- processes requests from client for portal services and passes them to broker
- designed to be ported to general sites, so anyone can host
- Grid Services Broker chooses correct grid service based on
- user/access/preference/etc.
- type of resource requested and services available on system
- broker software is modular: adding new services is easy
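The broker behaviour described on this slide (pick a grid service from the type of resource requested and the services available on the target system, with new services easy to plug in) can be sketched roughly as follows. This is a minimal illustration in Python, not GridPort code: `Request`, `Broker`, and the service names `"globus"` and `"gsi-ssh"` are hypothetical, and the rule "first registered service that is available wins" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Request:
    """A client request as the broker might see it (hypothetical fields)."""
    user: str
    resource_type: str  # e.g. "job", "file", "command"
    system: str         # name of the target HPC system


Handler = Callable[[Request], str]


class Broker:
    """Chooses a grid service by resource type and per-system availability."""

    def __init__(self) -> None:
        # resource type -> ordered list of (service name, handler)
        self._services: Dict[str, List[Tuple[str, Handler]]] = {}
        # system name -> names of services currently usable on that system
        self._available: Dict[str, Set[str]] = {}

    def register(self, resource_type: str, name: str, handler: Handler) -> None:
        """Adding a new service is one call; earlier registrations are preferred."""
        self._services.setdefault(resource_type, []).append((name, handler))

    def set_available(self, system: str, names: List[str]) -> None:
        self._available[system] = set(names)

    def dispatch(self, request: Request) -> str:
        usable = self._available.get(request.system, set())
        for name, handler in self._services.get(request.resource_type, []):
            if name in usable:
                return handler(request)
        raise LookupError(
            f"no service for {request.resource_type!r} on {request.system!r}"
        )
```

With two services registered for `"job"` requests, the broker falls through to the second one when the first is not marked available on the requested system.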
13. GridPort Architecture: Layers
- Has 3 layers, with APIs for each
- WWW layer for clients/applications
- Parses HTTP requests, sends to broker
- Converts data returned from broker and returns to client
- Either builds HTML, returns raw data, archives, or publishes to info-services
- Grid Services Broker layer
- Processes client requests
- Selects middle tier/grid services Portal services layer based on client, request type, system selected, etc.
- Runs on webserver system
- Grid Services Interface layer
- Library of subroutines, called by Broker, that interface to middle-tier grid services
- Example: Perl module that interfaces to Globus services
- Flexible design allows any of the layers to be replaced or modified to adapt to local site requirements
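As a rough illustration of the WWW layer's role (parse the request, hand it to the broker, then return either HTML or raw data), here is a small Python sketch. The function names, the `format` parameter, and the broker being a plain callable are all assumptions made for the example; the toolkit itself is described as Perl/CGI.

```python
from html import escape
from typing import Callable, Dict
from urllib.parse import parse_qs


def parse_request(query_string: str) -> Dict[str, str]:
    """Flatten a CGI-style query string; the first value of each key is kept."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}


def render(result: str, fmt: str = "html") -> str:
    """Return the broker's result as raw data or wrapped in escaped HTML."""
    if fmt == "raw":
        return result
    return "<pre>" + escape(result) + "</pre>"


def handle(query_string: str, broker: Callable[[Dict[str, str]], str]) -> str:
    """WWW layer: parse, pass to the broker, convert the reply for the client."""
    params = parse_request(query_string)
    fmt = params.pop("format", "html")  # hypothetical output selector
    return render(broker(params), fmt)
```

Because the broker is passed in as a callable, any layer can be swapped without touching the others, which is the flexibility the slide emphasizes.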
14. GridPort Architecture: APIs
- GridPort Toolkit will consist of two modules with APIs
- Portal services module
- runs on any commercial web server
- written in Perl/CGI
- manages authenticated connectivity to the grid through GSI
- accessed by web server that is processing client requests
- Broker layer chooses grid service for client
- Application module provides WWW interface
- facilitates customized portal development
- users need no in-depth knowledge of underlying portal infrastructure
- HTML or Perl/CGI
- Key feature of the architecture
- client application and grid portal services can run on separate web servers, leading to distributed application support
15. GSI Provides Grid Security at all Layers
- GSI authentication for all portal services
- transparent access to the grid via GSI infrastructure
- Security between the client -> web server -> grid
- SSL/RC4-40 128 bit key / SSL RSA X509 certificate
- authentication tracked with cookies coupled to server database/session tracking
- Single login to portal services provides access to all NPACI resources where GSI is available
- with full account access privileges for specific host
- use cookies to track state; exploring other mechanisms
- Globus used for client requests on resources
- Use GSI-enabled SSH/FTP as a backup
- Use when need to avoid overhead of Globus gatekeeper
- useful for limited portal services if Globus is down/unavailable
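The idea of tracking authentication with a cookie coupled to server-side session data can be sketched as below. This is only an illustration in Python: the `SessionStore` class, the in-memory dictionary standing in for the server database, the one-hour lifetime, and the cookie name `gridport_session` are all assumptions, and the sketch does not perform any GSI authentication itself.

```python
import secrets
import time
from typing import Dict, Optional


class SessionStore:
    """Server-side session table keyed by an opaque cookie token."""

    def __init__(self, lifetime_seconds: float = 3600.0) -> None:
        self._lifetime = lifetime_seconds
        self._sessions: Dict[str, dict] = {}  # stands in for a real database

    def login(self, user: str, now: Optional[float] = None) -> str:
        """Call after the user has authenticated; returns the cookie value."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)  # unguessable random token
        self._sessions[token] = {"user": user, "expires": now + self._lifetime}
        return token

    def lookup(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the user for a valid token, or None if unknown or expired."""
        now = time.time() if now is None else now
        session = self._sessions.get(token)
        if session is None:
            return None
        if session["expires"] <= now:
            del self._sessions[token]
            return None
        return session["user"]

    def logout(self, token: str) -> None:
        self._sessions.pop(token, None)


def cookie_header(token: str) -> str:
    """Header line a CGI script could emit; Secure restricts it to HTTPS."""
    return f"Set-Cookie: gridport_session={token}; Secure; HttpOnly; Path=/"
```

The cookie carries only the random token; the user identity and expiry stay on the server, so a stolen or forged cookie value is useless once the session is removed.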
16. Web Server to HPC Resource Architecture
17. What can you do with GridPort?
- Current features (always adding more)
- login/logout to grid services
- jobs
- web-based batch script builders
- submit jobs to queues
- monitor jobs and track them
- files
- dir listing, file transfer/archival
- file upload/download
- command execution
- any UNIX commands
- accounts
- webnewu
- unix commands (reslist)
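A web-based batch script builder essentially turns a handful of form fields into a script for the target scheduler. The sketch below is a Python illustration under stated assumptions: the slides do not say which batch systems are supported, so PBS-style `#PBS` directives are used purely as an example, and the function name and parameters are hypothetical.

```python
def build_batch_script(job_name: str, queue: str, nodes: int,
                       walltime: str, command: str) -> str:
    """Build a PBS-style batch script from form values (PBS is an assumption)."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",           # job name
        f"#PBS -q {queue}",              # destination queue
        f"#PBS -l nodes={nodes}",        # number of nodes requested
        f"#PBS -l walltime={walltime}",  # wall-clock limit, HH:MM:SS
        "cd $PBS_O_WORKDIR",             # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


script = build_batch_script("demo", "normal", 4, "00:30:00", "./a.out")
```

A portal that targets several machines would keep one such template per scheduler and select it from the system the user picked.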
18. Applications running on GridPort
- Current applications in production
- NPACI HotPage (live demo of HotPage)
- https://hotpage.npaci.edu
- LAPK Portal: Pharmacokinetic Modeling (live demo of Pharmacokinetic Modeling Portal)
- https://gridport.npaci.edu/LAPK
- Application portals under development (Fall deployment)
- GAMESS (General Atomic and Molecular Electronic Structure System)
- an ab initio molecular program originally developed through the National Resource for Computational Chemistry
- https://gridport.npaci.edu/GAMESS
- QMView: computational 3-D molecular modeling/visualization
- National Biomedical Computation Resource: Cardiac Physiology modeling project
19. NPACI HotPage Services
- Vertical portal to NPACI Resources and Services
- News/events within NPACI
- Documentation, training, news, consulting
- Simple tools
- application search, systems information
- generation of batch scripts for all compute resources
- Network Weather Service
- Provides dynamic information
- real-time information for each machine (or summaries) such as
- Status Bar: live updates/operational status/utilization
- Machine Usage: summary of machine status, load, queues
- Queue Summaries: display currently executing and queued jobs
- Node Maps: graphical map of running applications mapped to nodes
- Network Weather Service: connectivity information between a user's local host and grid resources
20. NPACI HotPage Interactive Services
- Users have direct access to accounts on resources
- single entry point to all NPACI resources on which a user has accounts/allocations
- Requires portal account and authentication
- secure access to compute and storage resources (GSI)
- Standard menus for each machine
- allows user to perform common Unix tasks
- create, submit, monitor, cancel or delete jobs
- view output
- compile and execute code
- manipulate and view files, navigate through file systems
- use system commands: chmod, mv, ls, cat, mkdir, cp, rm
- perform file transfer
- upload/download/archive files
- archiving and retrieving data between local host and HPC system
- managing accounts and allocations (via Webnewu)
21. Interactive HotPage Views: T3E Home Directory Listing, Current Job Status
22. Status of GridPort Toolkit
- Version 1.0 is in beta
- applications are running using the software
- currently, all applications must be hosted on the npaci.edu domain (security/cookies)
- complete
- file transfer tool
- generalized dynamic batch script
- Version 2.0 in planning stages
- solve challenge of tracking user states across multiple webserver domains
- enhancing security and authentication
- customization of HotPage
- scheduling jobs based on best available system
23. Future Work
- Continue development of the GridPort Toolkit
- Incorporate into the Globus Perl CoG Toolkit
- Participate in the Grid Portal Collaboration with NPACI, NCSA, and NASA/IPG projects
- Can we share security/resources/data?
- Add new capabilities
- Perl/CGI RSL generator (similar to NASA Applet)
- Continue to add new tools/features
- Job Tracker
- File Transfer Tool
- Incorporate grid schedulers, accounting tools
- Add user preferences
- database of user information, configurable web pages
- Continue to enhance security infrastructure
- Evaluate/incorporate
- XML, Java servers, signed Java applets, object-based technologies, and whatever new thing comes along
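The RSL generator listed under new capabilities would turn form input into a Globus Resource Specification Language string. The following Python sketch shows the general shape of such a generator; it is not the planned Perl/CGI tool. The function names are hypothetical, the attribute names are whatever the caller passes in, and values containing a double quote are rejected rather than escaped to keep the example simple.

```python
from typing import Mapping, Sequence, Union

Value = Union[str, int, Sequence[Union[str, int]]]


def _quote(value: Union[str, int]) -> str:
    """Wrap a single value in double quotes; embedded quotes are refused."""
    text = str(value)
    if '"' in text:
        raise ValueError("double quotes are not supported in this sketch")
    return f'"{text}"'


def to_rsl(attributes: Mapping[str, Value]) -> str:
    """Render attributes as a conjunction of (name=value) relations."""
    parts = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            # Multiple values become a space-separated sequence.
            rendered = " ".join(_quote(v) for v in value)
        else:
            rendered = _quote(value)
        parts.append(f"({name}={rendered})")
    return "&" + "".join(parts)


rsl = to_rsl({"executable": "/bin/ls", "arguments": ["-l", "/tmp"], "count": 2})
```

Keeping the generator a pure function of a dictionary makes it easy to drive from a web form and to test without a grid connection.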
24. References
- GridPort Toolkit Website
- https://gridport.npaci.edu
- Contact: Mary Thomas (mthomas_at_sdsc.edu)
- NPACI HotPage
- https://hotpage.npaci.edu
- Contact: Steve Mock (mock_at_sdsc.edu)
- Grid Forum Organization
- http://www.gridforum.org
- Computing Portals Organization
- http://www.computingportals.org