GRAM: Software Provider Forum

1
GRAM: Software Provider Forum
  • Stuart Martin
  • Computational Institute, University of Chicago
    Argonne National Lab

TeraGrid 2007, Madison, WI
2
GRAM - Basic Job Submission and Control Service
  • A uniform service interface for remote job
    submission and control
  • Includes file staging and I/O management
  • Includes reliability features
  • Supports basic Grid security mechanisms
  • Asynchronous monitoring
  • Interfaces with local resource managers,
    simplifies the job of metaschedulers/brokers
  • GRAM is not a scheduler
      • No scheduling
      • No metascheduling/brokering

3
GRAM Versions in GT4
  • GRAM2 (Pre-WS GRAM)
      • Proprietary protocol-based implementation
      • Gatekeeper and Job Manager
  • GRAM4 (WS GRAM)
      • Web Services-based implementation
      • Managed Job Factory Service (MJFS)
      • Managed Executable Job Service (MEJS)

4
Performance Comparisons
5
Concurrent Jobs (as in paper)
Average seconds per 1000 jobs, Condor-G to GRAM to Condor LRM

Stage In    Stage Out   File Clean Up   Unique Job Dir   GRAM2   GRAM4
None        None        No              No               2552    2100
1 x 10 KB   1 x 10 KB   No              No               2608    3779
1 x 10 KB   1 x 10 KB   Yes             Yes              2698    5695
6
Concurrent Jobs (as will be in GT 4.0.5)
Average seconds per 1000 jobs, Condor-G to GRAM to Condor LRM

Stage In    Stage Out   File Clean Up   Unique Job Dir   GRAM2   GRAM4
None        None        No              No               2552    2176
1 x 10 KB   1 x 10 KB   No              No               2608    2147
1 x 10 KB   1 x 10 KB   Yes             Yes              2698    2254
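
The change between the two concurrent-job tables can be made explicit with a little arithmetic. The sketch below only restates the GRAM4 numbers from the slides; the scenario labels and the `percent_change` helper are names chosen here for illustration.

```python
# GRAM4 timings in seconds per 1000 jobs, copied from the two
# concurrent-job tables (as in paper vs. as will be in GT 4.0.5).
# The scenario labels are illustrative, not from the slides.
GRAM4_PAPER = {
    "no staging": 2100,
    "stage in/out": 3779,
    "stage in/out + cleanup + unique dir": 5695,
}
GRAM4_GT405 = {
    "no staging": 2176,
    "stage in/out": 2147,
    "stage in/out + cleanup + unique dir": 2254,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for scenario, before in GRAM4_PAPER.items():
        after = GRAM4_GT405[scenario]
        print(f"{scenario}: {before} -> {after} "
              f"({percent_change(before, after):+.1f}%)")
```

A negative percentage means the GT 4.0.5 figure is lower (faster) than the figure reported in the paper.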
7
Improving performance for staging jobs
  • Adding local method call mechanism for general
    use in Java WS Core (4.0.5)
  • GRAM is doing this with RFT
  • Any service which calls another in-process
    service could make similar modifications for
    local calls and likely benefit from improved
    performance
  • Adding caching of the GridFTP server connections
    in RFT (4.0.6)
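
The connection-caching idea can be sketched generically as follows. This is not RFT code (RFT is part of the Java-based GT4 stack); `ConnectionCache`, `FakeConnection`, and the host name are hypothetical stand-ins that only illustrate reusing one open connection per server instead of reconnecting for every transfer.

```python
class FakeConnection:
    """Stand-in for an open GridFTP control connection (hypothetical)."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.closed = False

    def close(self):
        self.closed = True


class ConnectionCache:
    """Reuse one open connection per (host, port) pair."""

    def __init__(self, connect):
        self._connect = connect  # callable that opens a new connection
        self._open = {}          # (host, port) -> cached connection

    def get(self, host, port):
        key = (host, port)
        if key not in self._open:
            # Pay the connection setup cost only on first use.
            self._open[key] = self._connect(host, port)
        return self._open[key]

    def close_all(self):
        for conn in self._open.values():
            conn.close()
        self._open.clear()


if __name__ == "__main__":
    cache = ConnectionCache(FakeConnection)
    first = cache.get("gridftp.example.org", 2811)
    second = cache.get("gridftp.example.org", 2811)
    print(first is second)  # the second request reuses the first connection
    cache.close_all()
```

A production cache would also need eviction, liveness checks, and per-credential separation, none of which are shown here.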

8
Sequential Jobs
Average seconds per job (Fork)

Delegation   Stage In    Stage Out   GRAM2   GRAM4
None         None        None        N/A     1.70
Per Job      None        None        1.07    3.53
Per Job      1 x 10 KB   None        1.78    5.57
Shared       1 x 10 KB   None        N/A     5.41
Per Job      1 x 10 KB   1 x 10 KB   2.44    9.08
Shared       1 x 10 KB   1 x 10 KB   N/A     7.91
9
Sequential Jobs
Average seconds per job (Fork)

Delegation   Stage In    Stage Out   GRAM2   GRAM4
None         None        None        N/A     1.46
Per Job      None        None        1.07    3.42
Per Job      1 x 10 KB   None        1.78    3.46
Shared       1 x 10 KB   None        N/A     3.51
Per Job      1 x 10 KB   1 x 10 KB   2.44    5.25
Shared       1 x 10 KB   1 x 10 KB   N/A     3.67
10
GRAM Auditing
11
TG Gateways
  • Lower the barrier for scientists and their
    applications to use TeraGrid resources
  • Provide an application or domain-specific
    interface that a scientist can easily understand
  • Each gateway may have 100s or 1000s of users
    accessing TG resources
  • Must be efficient and scale

12
Use Cases
  • Group Access
      • For efficiency, a community credential is used
        to multiplex many users over a single ID
  • Query Job Accounting
      • Gateways need a remote interface to obtain the TG
        units charged for their users' jobs
  • Auditing
      • Grid services provide access to resources
      • TG Resource Providers need a record of actions
        performed by services

13
Requirements From Use Cases
  • Grid Job Identifier
  • Remote client interface to auditing and
    accounting information
  • Creation of service audit and accounting
    information
  • Access to remote LRM accounting information from
    the audit service
  • Scalability in storing information/records
  • Secure access (authentication and authorization)
    to audit and accounting information

14
Grid Job Identifier
  • Uniquely identifies a job
  • Shared between the client (Gateway) and service
    (TG RP)
  • Obtained in the normal service interaction/protocol
  • In GRAM4 it is the EPR, converted
  • In GRAM2 it is the job contact (as is)
  • GRAM4 example >>>

15
  • GRAM4 EPR

    <ns1:managedJobEndpoint xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">
      <ns2:Address xmlns:ns2="http://schemas.xmlsoap.org/ws/2004/03/addressing">
        https://127.0.0.1:8443/wsrf/services/ManagedExecutableJobService
      </ns2:Address>
      <ns3:ReferenceProperties xmlns:ns3="http://schemas.xmlsoap.org/ws/2004/03/addressing">
        <ns1:ResourceID>cca8169a-c65f-11da-a61c-000d61215ff0</ns1:ResourceID>
      </ns3:ReferenceProperties>
      <ns4:ReferenceParameters xmlns:ns4="http://schemas.xmlsoap.org/ws/2004/03/addressing"/>
    </ns1:managedJobEndpoint>

  • Grid Job ID

    https://127.0.0.1:8443/wsrf/services/ManagedExecutableJobService?QQDzjbFVYImtVg8
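
A gateway converting an EPR locally first has to pull the service address and resource ID out of the XML. The sketch below does that with Python's standard `xml.etree.ElementTree`. The slides do not say how GRAM4 derives the key after the `?`, so `to_grid_job_id` uses a made-up encoding (a URL-safe base64 SHA-1 digest) purely as a placeholder; it will not reproduce the ID shown on the slide.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

GRAM_NS = "http://www.globus.org/namespaces/2004/10/gram/job"
WSA_NS = "http://schemas.xmlsoap.org/ws/2004/03/addressing"

# The EPR from the slide, with its markup restored.
EPR_XML = """
<ns1:managedJobEndpoint xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">
  <ns2:Address xmlns:ns2="http://schemas.xmlsoap.org/ws/2004/03/addressing">
    https://127.0.0.1:8443/wsrf/services/ManagedExecutableJobService
  </ns2:Address>
  <ns3:ReferenceProperties xmlns:ns3="http://schemas.xmlsoap.org/ws/2004/03/addressing">
    <ns1:ResourceID>cca8169a-c65f-11da-a61c-000d61215ff0</ns1:ResourceID>
  </ns3:ReferenceProperties>
  <ns4:ReferenceParameters xmlns:ns4="http://schemas.xmlsoap.org/ws/2004/03/addressing"/>
</ns1:managedJobEndpoint>
"""


def parse_epr(epr_xml):
    """Return (service address, resource ID) from a GRAM4 job EPR."""
    root = ET.fromstring(epr_xml.strip())
    address = root.findtext(f"{{{WSA_NS}}}Address").strip()
    resource_id = root.findtext(f".//{{{GRAM_NS}}}ResourceID").strip()
    return address, resource_id


def to_grid_job_id(address, resource_id):
    """Build an illustrative job ID of the form <address>?<key>.

    ASSUMPTION: the key encoding below is hypothetical and is NOT the
    conversion GRAM4 actually performs.
    """
    digest = hashlib.sha1(resource_id.encode("utf-8")).digest()
    key = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{address}?{key}"


if __name__ == "__main__":
    address, resource_id = parse_epr(EPR_XML)
    print(address)
    print(resource_id)
    print(to_grid_job_id(address, resource_id))
```

The useful property to preserve in any real conversion is that client and service derive the same ID from the same EPR without an extra round trip.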

16
Remote Client Interface
  • Flexible query interface to retrieve audit and
    accounting records
  • Define an operation getChargeForJob to return
    the units consumed by a Grid Job ID
  • Keep audit service interface separate from GRAM
    service to allow flexible deployment scenarios
  • Allow a single audit service for multiple GRAM
    services
  • Same client interface could be used for other
    services, for example, charging for data storage
    or transfers
  • OGSA-DAI satisfies these requirements

17
Creation of Service Auditing Information
  • Added GRAM audit record creation upon job
    termination
  • Record fields: job_grid_id, local_job_id,
    submission_job_id, subject_name, username,
    creation_time, queued_time, stage_in_gid,
    stage_out_gid, clean_up_gid, gt_version, rm_type,
    job_description, success_flag
  • Gerson Galang (APAC) contribution for GRAM4 audit
    record creation at beginning of job, update after
    LRM submission, and final update upon termination
  • Records are needed soon after job termination
  • Accounting information is created by the local
    resource managers
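
The three-step record lifecycle described above (create at job start, update after LRM submission, final update on termination) might look roughly like this. It is a sketch using an in-memory SQLite database; the table name `gram_audit`, the subset of columns, and the function names are assumptions made here, not the actual GRAM schema or code.

```python
import sqlite3


def open_audit_db():
    """Create an in-memory audit table with a subset of the slide's fields."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE gram_audit (
               job_grid_id   TEXT PRIMARY KEY,
               local_job_id  TEXT,
               subject_name  TEXT,
               username      TEXT,
               creation_time TEXT,
               queued_time   TEXT,
               rm_type       TEXT,
               success_flag  INTEGER
           )"""
    )
    return db


def record_created(db, job_grid_id, subject_name, username, rm_type, creation_time):
    """Step 1: insert the record when the job is created."""
    db.execute(
        "INSERT INTO gram_audit "
        "(job_grid_id, subject_name, username, rm_type, creation_time) "
        "VALUES (?, ?, ?, ?, ?)",
        (job_grid_id, subject_name, username, rm_type, creation_time),
    )
    db.commit()


def record_submitted(db, job_grid_id, local_job_id, queued_time):
    """Step 2: add the LRM job ID once the job has been submitted."""
    db.execute(
        "UPDATE gram_audit SET local_job_id = ?, queued_time = ? "
        "WHERE job_grid_id = ?",
        (local_job_id, queued_time, job_grid_id),
    )
    db.commit()


def record_terminated(db, job_grid_id, success):
    """Step 3: store the outcome when the job terminates."""
    db.execute(
        "UPDATE gram_audit SET success_flag = ? WHERE job_grid_id = ?",
        (1 if success else 0, job_grid_id),
    )
    db.commit()
```

Writing the row early means a record exists even if the job or the service fails before termination, which is one plausible motivation for the create-then-update approach.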

18
Access to LRM Accounting Information
  • TeraGrid uploads all LRM accounting information
    from each TG site to a central DB (TGCDB)
  • The OGSA-DAI service can be configured to access
    the remote TGCDB

19
Scalability in Storing Information/Records
  • Estimated that system should handle 100,000
    records
  • GRAM service inserts records directly into audit
    DB
  • Audit DB must be local to GRAM service to assure
    reliability
  • Implemented to use either PostgreSQL or MySQL

20
Secure access
  • Standard authentication and authorization methods
    should be used to limit access to the audit and
    accounting information
  • Clients must present a valid X.509 certificate
  • Access can be controlled based on a range of
    policies
  • Current policy is to allow access iff the DN of
    the requestor matches the DN in the audit record
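
The current policy amounts to a per-record filter on the requestor's distinguished name. A minimal sketch follows, assuming the DN is stored in the record's `subject_name` field and that a plain string comparison is adequate; real deployments may need DN normalization, and the function names are illustrative.

```python
def may_read_record(requestor_dn, audit_record):
    """Allow access only if the requestor's DN matches the record's DN.

    ASSUMPTION: the record keeps the submitter's DN in `subject_name`.
    """
    return requestor_dn == audit_record["subject_name"]


def visible_records(requestor_dn, records):
    """Return only the audit records the requestor is allowed to see."""
    return [r for r in records if may_read_record(requestor_dn, r)]


if __name__ == "__main__":
    records = [
        {"job_grid_id": "job-a", "subject_name": "/O=Grid/CN=Gateway One"},
        {"job_grid_id": "job-b", "subject_name": "/O=Grid/CN=Gateway Two"},
    ]
    for record in visible_records("/O=Grid/CN=Gateway One", records):
        print(record["job_grid_id"])
```

With a community credential, every job submitted by a gateway carries the same DN, so this policy lets the gateway read all of its own records and nothing else.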

21
[Diagram. Labeled components: LEAD Gateway, Resource Provider Site, GT4 Java Container, WS GRAM, Delegation, RFT, GRAM Audit Table, RFT Audit Table, OGSA-DAI, Compute Cluster, Resource Manager, RM Accounting, AMIE, TG Central Accounting DB. Arrows are numbered 1 through 9.]
22
Sequence Description
  1. Gateway submits job and gets an EPR on the reply
  2. Gateway controls and monitors job with EPR
  3. GRAM submits and monitors job in RM
  4. GRAM inserts audit record at end of job
  5. RM writes job accounting record
  6. AMIE uploads RM accounting records to TGCDB. The
    RM accounting record is converted to TG
    accounting units.
  7. Gateway locally converts EPR to GJID (Grid Job ID)
  8. Gateway calls OGSA-DAI getChargeForJob with GJID
    and gets the job usage on the reply
  9. OGSA-DAI processes remote join between GRAM audit
    and TGCDB
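
Steps 8 and 9 boil down to a join between the GRAM audit record and the accounting record on the local job ID. The sketch below imitates that with two tables in one in-memory SQLite database. In the described deployment the join is remote and goes through OGSA-DAI; the table names, column names, and sample values here are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create stand-ins for the GRAM audit table and the TGCDB accounting table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE gram_audit (
            job_grid_id  TEXT PRIMARY KEY,
            local_job_id TEXT,
            rm_type      TEXT
        );
        CREATE TABLE tg_accounting (
            local_job_id TEXT,
            charge_units REAL
        );
        INSERT INTO gram_audit VALUES ('gjid-demo-1', '4242', 'PBS');
        INSERT INTO tg_accounting VALUES ('4242', 12.5);
        """
    )
    return db


def get_charge_for_job(db, job_grid_id):
    """Return the units charged for a Grid Job ID, or None if unknown."""
    row = db.execute(
        """SELECT a.charge_units
             FROM gram_audit AS g
             JOIN tg_accounting AS a ON a.local_job_id = g.local_job_id
            WHERE g.job_grid_id = ?""",
        (job_grid_id,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_demo_db()
    print(get_charge_for_job(db, "gjid-demo-1"))
    print(get_charge_for_job(db, "no-such-job"))
```

A real join would likely also need to match on the resource or site, since local job IDs are only unique within one resource manager.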