GPIR: GridPort Information Repository


1
GPIR: GridPort Information Repository
  • Tomislav Urban
  • Texas Advanced Computing Center

2
Origins
  • HotPage Informational Data
  • Load, MOTD, Node Map, etc.
  • Obtained from customized data gathering scripts
  • MDS 2.0 where available
  • Static VO configuration data
  • Identified interest in recording historical grid
    data in support of
  • Workflow/ Decision-making
  • Job schedulers/Brokers
  • Histograms
  • Sought to move towards a web services model using
  • XML schema
  • Removes the need to write customized
    implementations for each new resource

3
GridPort Information Repository (GPIR)
  • Implementation of web service enabled information
    service
  • Evolved from various HotPage, GridPort, TACC and
    GCE-RG information and web services projects
    (IAWS)
  • Concept demonstrated at SC 02 for TeraGrid, PACI
    (NPACI/Alliance) resources
  • Called Information Archival Web Service (IAWS)
  • Based on XML documents stored on a file server
  • Thin clients (Java / Perl) pushed data into
    repository
  • Contained XML documents for current grid status
    as well as archived historical data (HotPage
    information and other)
  • The IAWS was conceptualized in collaboration with
    SDSC and NCSA

4
Design Philosophy
  • Aggressive Practicality
  • Works today with what's available today
  • Comprehensive Portal-centric data set
  • Intended to support the GridPort GCE framework
    and its data requirements.
  • As web service, can be repurposed to any grid
    data needs.
  • Follow Standards
  • OGSI (Grid Services)
  • Emerging Data Schema (GLUE?)
  • Scalable
  • Relational Database back-end
  • Extensible
  • Easy to add new XML Queries, format as needed

5
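Architecture
  • [Diagram; only its labels survive in this transcript]
  • Section labels: Information Providers, Resources, Clients, dB, Portals
  • Components: MDS, Other Middleware, OGSA (Future), Web Scraping, Other,
    Perl Client, Java Client, edu.tacc.GPIR, Ingester WS, Query WS,
    GPIR (MySQL, PostgreSQL), Portlets
  • Protocols: SOAP-XML, HTTP, JDBC

6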
Architecture
  • A single GPIR instance may support multiple portals serving various
    VOs
  • [Diagram: four VO Portals and a single GPIR]

7
Current Data Sources
  • Thin Clients
  • Java
  • Perl
  • MDS
  • GMS
  • http://www.tacc.utexas.edu/grid/gms
  • NWS
  • Web Scraping
  • Cron jobs run periodically on HPC resources
    compiling text files that are then accessed via
    HTTP
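
The web-scraping path above could look roughly like the following on the collecting side. This is a minimal sketch, not GPIR's actual code: the slides do not specify the text file format, so the `key = value` layout, the field names, the sample data, and the `parse_status_text` helper are all assumptions made for illustration.

```python
from typing import Dict


def parse_status_text(text: str) -> Dict[str, str]:
    """Parse a hypothetical key = value status file into a dict.

    Blank lines and lines starting with '#' are skipped; lines without
    an '=' are ignored rather than treated as errors.
    """
    record: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        record[key.strip()] = value.strip()
    return record


# Invented example of what a cron job might write on an HPC resource.
SAMPLE = """\
# written by cron
hostname = example-hpc
load = 0.73
motd = Maintenance on Friday
"""
```

A collector would fetch the file over HTTP (for example with `urllib.request.urlopen`) and pass the decoded body to `parse_status_text`; the fetch is left out here so the sketch runs without a network.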

8
Data
  • Load - aggregated CPU
  • Jobs - individual and aggregated queue
  • MOTD
  • Nodes - job usage for each machine node
  • NWS - based on VO and Click model
  • Grid Monitoring (GMS)
  • Based on NCSA
  • Machine Status
  • Static Resource data (query only)
  • Extensible through the addition of XML data from
    any recognized source
  • Need schema
  • Need query
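
The "need schema, need query" requirement can be pictured as a small registry in which a data type becomes usable once both pieces are supplied. The sketch below is illustrative only: the `DataType` class, the `registry` dict, and the "motd" fields are hypothetical, and a set of required field names stands in for a real XML schema.

```python
from typing import Callable, Dict, List, Set

Record = Dict[str, str]


class DataType:
    """A data type is usable once it has both a schema and a query."""

    def __init__(self, required: Set[str], query: Callable[[List[Record]], str]):
        self.required = required  # stand-in for an XML schema
        self.query = query        # stand-in for an XML query
        self.records: List[Record] = []

    def ingest(self, record: Record) -> None:
        """Store a record, rejecting it if required fields are missing."""
        missing = self.required - set(record)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        self.records.append(record)


registry: Dict[str, DataType] = {}

# Registering a new (hypothetical) "motd" type needs only a schema and a query.
registry["motd"] = DataType(
    required={"resource", "message"},
    query=lambda recs: recs[-1]["message"] if recs else "",
)
```

Adding another source then means adding another registry entry rather than writing a new ingestion path.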

9
Web Services
  • Ingester WS
  • Accepts XML documents containing updates to Grid
    status
  • Query WS
  • Provides XML containing query specific information
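
The two services above can be sketched as a pair of functions that exchange XML. This is a rough illustration under stated assumptions, not the GPIR interface: the element names (`gridStatus`, `resource`, `load`, `queryResult`), the function names, and the in-memory `_store` are invented here, and the SOAP transport is omitted entirely.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# In-memory stand-in for the repository: resource name -> latest load value.
_store: Dict[str, float] = {}


def build_load_update(resource: str, load: float) -> str:
    """Client side: build a hypothetical <gridStatus> update document."""
    root = ET.Element("gridStatus")
    res = ET.SubElement(root, "resource", {"name": resource})
    ET.SubElement(res, "load").text = str(load)
    return ET.tostring(root, encoding="unicode")


def ingest(xml_text: str) -> int:
    """Ingester side: record every <resource> load; return the count stored."""
    root = ET.fromstring(xml_text)
    count = 0
    for res in root.findall("resource"):
        load = res.findtext("load")
        if load is None:
            continue
        _store[res.get("name", "")] = float(load)
        count += 1
    return count


def query_load(resource: str) -> str:
    """Query side: return the stored load for one resource as XML."""
    root = ET.Element("queryResult")
    if resource in _store:
        res = ET.SubElement(root, "resource", {"name": resource})
        ET.SubElement(res, "load").text = str(_store[resource])
    return ET.tostring(root, encoding="unicode")
```

A thin client would call `build_load_update` and send the result to the ingester; a portal would call the query side and render the returned XML.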

10
Current Work
  • Migration to PostgreSQL
  • Full feature set
  • Transactionality
  • Etc.
  • Better future J2EE support
  • CMP
  • CMR
  • Administration Client
  • Allowing web-based administration of static
    data for all supported VOs would be a huge
    productivity boost

11
Supported VOs
  • Current
  • The PACI (NPACI, Alliance)
  • TACC/University of Texas
  • TIGRE / State of Texas
  • University of Texas, University of Houston
  • Texas A&M, Texas Tech, Rice
  • Baylor College of Medicine
  • IPG
  • Planned
  • ETF

12
Deployment
  • Code available at http://www.tacc.utexas.edu/grid/gpir
  • Consists of
  • Web Service
  • Example Clients
  • JavaDocs
  • DDL Script for MySQL
  • XML Schema Documents (XSDs)
  • XML Document Examples

13
Future Directions
  • Integration into GridPort 3.0
  • J2EE Implementation
  • Treat GPIR Entities as real objects rather than
    table rows
  • Significant expansion to the data being gathered
  • Administration Client
  • Reporting and decision making based on historical
    data
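
As an illustration of decision making based on historical data, the sketch below picks the resource with the lowest average load over a recent window. It is an assumption-laden example, not part of GPIR: the `load_sample` table, its columns, and both function names are hypothetical, and `sqlite3` is used only as a self-contained stand-in for the MySQL or PostgreSQL back-end.

```python
import sqlite3
from typing import List, Optional, Tuple


def make_repository(samples: List[Tuple[str, int, float]]) -> sqlite3.Connection:
    """Create an in-memory table of (resource, unix_time, load) samples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE load_sample (resource TEXT, sampled_at INTEGER, load REAL)"
    )
    conn.executemany("INSERT INTO load_sample VALUES (?, ?, ?)", samples)
    conn.commit()
    return conn


def least_loaded(conn: sqlite3.Connection, since: int) -> Optional[str]:
    """Return the resource with the lowest average load at or after `since`.

    Ties are broken alphabetically; None is returned if no samples match.
    """
    row = conn.execute(
        "SELECT resource FROM load_sample WHERE sampled_at >= ? "
        "GROUP BY resource ORDER BY AVG(load) ASC, resource ASC LIMIT 1",
        (since,),
    ).fetchone()
    return row[0] if row else None
```

A job broker could use such a query to prefer resources that have been lightly loaded recently, rather than relying on a single current reading.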

14
Grid Services
  • Intend to implement GPIR as a grid service
  • Inherit OGSI Security model
  • GT 3.0 GSI
  • OGSI Compliance
  • OGSA Compliance
  • Will support W3C and GGF standards
  • Web Services
  • Grid Services

15
Outstanding Issues
  • Inflexibility
  • Relational Database Changes
  • XML Schema Changes
  • Support for Dynamic Queries (Waiting for
    standards)
  • Inefficiency of dynamic data storage
  • Sampling vs. Events
  • Example: The Job Table
  • Data Format Standards
  • MDS/GLUE Schema
  • INCA?
  • Security
  • GSI based authentication
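
One possible reading of the "Sampling vs. Events" issue for the job table: periodic sampling stores a row per job per interval even when nothing changed, while event storage keeps a row only when a job's state changes. The slides do not spell this out, so the sketch below is an interpretation; the `Sample` tuple layout and the `samples_to_events` helper are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, str, str]  # (unix_time, job_id, state)


def samples_to_events(samples: List[Sample]) -> List[Sample]:
    """Keep only the samples where a job's state differs from its previous one."""
    last_state: Dict[str, str] = {}
    events: List[Sample] = []
    # Sorting orders rows by time first, so each job is seen chronologically.
    for ts, job_id, state in sorted(samples):
        if last_state.get(job_id) != state:
            events.append((ts, job_id, state))
            last_state[job_id] = state
    return events
```

With one-minute sampling, a job that stays in the same state for an hour produces sixty sampled rows but a single event row, which is the storage inefficiency the slide appears to point at.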