SAMGrid and JIM - PowerPoint PPT Presentation

1
SAMGrid and JIM
  • Igor Terekhov, FNAL/CD/CCF for JIM team

2
History
  • D0 Grid project started in 2001-2002 to handle
    D0's expanded needs for globally distributed
    computing
  • Funded by PPDG (our team here), GridPP (Rod
    Walker in the UK)
  • JIM: the payload part thereof
  • Complements the data handling system (SAM) with
    jobs and info management
  • CDF joined later in 2002

3
(No Transcript)
4
Prototype Release, Oct. 10, 2002
  • Principal features:
  • Remote submission of SAM and vanilla jobs to a
    grid site
  • Resource brokering (selection) for the jobs,
    based on the amount of data cached.
  • Web based monitoring of site stations and grid
    jobs, with navigation into the data handling
    status
  • Deployment and Testbed:
  • Initially had 3 D0 sites (ICL-UK, UTA-TX, FNAL),
    using devel sam stations
  • CDF joined and we grew to 11 sites by (and for)
    SC2002. Production stations, CAF.
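
The brokering rule above (prefer the site that already caches the most of a job's data) can be illustrated with a minimal sketch. This is not the JIM implementation: the `Station` class, its fields, and the use of a file count (rather than bytes) as the measure of "amount of data cached" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Station:
    """A SAM station at a grid site (hypothetical, illustrative fields)."""
    name: str
    cached_files: set  # names of files currently in the station cache


def rank_stations(job_files, stations):
    """Order stations by how many of the job's input files they cache."""
    wanted = set(job_files)
    scored = [(len(wanted & s.cached_files), s.name) for s in stations]
    # Highest cached count first; ties broken by name for determinism.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def select_station(job_files, stations):
    """Return the name of the best-ranked station, or None if there are none."""
    ranking = rank_stations(job_files, stations)
    return ranking[0][1] if ranking else None
```

A real broker would presumably weigh other factors as well (site load, policy); the sketch only shows the cache-based ranking step.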

5
Prototype Release: SC2002 demo
  • CDF CAF (and D0 non-CAF) jobs were submitted via
    the JIM interface (samg submit)
  • Station was chosen based on cached files
  • Job status could be seen on the Web
  • Output results were uploaded to a Web area
  • Need to work on doing this via a DH service

6
Towards V1: March/April 2003
  • Major deployability improvements
  • Internal solidification and bug fixes
  • Success for the end user depends on coordination of
    efforts with:
  • experiments' environments (CDF CAF, D0
    MCRunJob)
  • Core SAM

7
Directions for V1
  • Multi-tier job submission architecture
  • Separate client machine from queuing system
  • Deployability: no requirement for daemons running
    as root on the submission system
  • Better grid monitoring architecture
  • Was: submitted job -> SAM project, submission
    site -> SAM station
  • Will be generic (D0 MC and beyond): global job ->
    SAM project(s); submission site -> generic execution
    site -> site-specific monitoring -> stations etc.
  • Want to integrate SAM and Enstore monitoring
    systems
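
The generic monitoring chain described above can be pictured as a small data model. The sketch below is only an illustration of the relationships named on this slide; the class names, fields, and the `monitoring_path` helper are hypothetical and not taken from JIM.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionSite:
    """A generic execution site (hypothetical model)."""
    name: str
    stations: list = field(default_factory=list)  # SAM station names at the site


@dataclass
class GlobalJob:
    """One global job, which may map to several SAM projects."""
    job_id: str
    submission_site: str
    execution_site: ExecutionSite
    sam_projects: list = field(default_factory=list)


def monitoring_path(job):
    """Return the chain a monitoring view could follow for one job."""
    return [
        ("submission site", job.submission_site),
        ("execution site", job.execution_site.name),
        ("stations", list(job.execution_site.stations)),
        ("SAM projects", list(job.sam_projects)),
    ]
```

The point of the model is that the job no longer maps one-to-one onto a SAM project, and the submission site no longer maps one-to-one onto a SAM station.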

8
Directions for V1, cont'd
  • Grid site configuration for deployability,
    maintenance etc.
  • Wrote a paper on site configuration framework
  • Goal: configure a site in an XML database, derive
    SAM and JIM service instantiation from there
  • For JIM V1, embed JIM per se into the framework
  • Later, we aim to dramatically improve SAM station
    installation
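
The goal stated above, deriving service instantiation from an XML site description, might look roughly like the sketch below. The XML layout, element names, attribute names, and service types are invented for illustration; the slides do not show the actual schema of the site configuration framework.

```python
import xml.etree.ElementTree as ET

# Hypothetical site description; element and attribute names are invented.
SITE_XML = """
<site name="example-site">
  <cluster name="farm1">
    <service type="sam_station" name="station-a"/>
    <service type="jim_monitoring" port="8080"/>
  </cluster>
  <cluster name="farm2">
    <service type="jim_submission"/>
  </cluster>
</site>
"""


def derive_services(xml_text):
    """List the service instances implied by a site description."""
    root = ET.fromstring(xml_text)
    services = []
    for cluster in root.findall("cluster"):
        for svc in cluster.findall("service"):
            services.append({
                "site": root.get("name"),
                "cluster": cluster.get("name"),
                "type": svc.get("type"),
                # Remaining attributes are passed through as settings.
                "settings": {k: v for k, v in svc.attrib.items() if k != "type"},
            })
    return services
```

Calling `derive_services(SITE_XML)` yields one dictionary per service element, which an installer could then turn into concrete SAM and JIM service startups.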

9
V1 Underlying Technologies
  • New major release of Condor
  • Endless problems
  • Will deploy a lightweight XML database for a
    flexible persistency mechanism
  • Will not touch Globus: GTK-2 might be a dead end

10
V1 Task list and our resources
  • Full up-to-date version at:
  • http://www-d0.fnal.gov/computing/grid/SAMGridTaskList_v1.doc
  • Collapsed version:
  • Enable 3-tier arch for job submission
  • Proper integration with local environments
  • Site description and configuration
  • Miscellaneous solidification
  • We have 2 FTEs and 2 grad students
  • Helped by Frank et al. on CDF, Rod Walker (ICL-UK)
    and Tom Rockwell (MSU)

11
V1 Challenges and concerns
  • Condor and PPDG: the main problem
  • Need a CAF proper person to interface with JIM
    monitoring and job submission. Need installable,
    maintainable DCAF
  • Need a working sam submit as a service from the
    SAM team
  • Need a working D0 MCRunJob as a true component

12
(Diagram. Labels: Meta-Schema, Schema, Grid Designer, Site Admin, Main Site/cluster Config, Resource Advertisement, Monitoring Schema, Data Handling, Hosting Environment)
13
(Diagram with numbered steps 1-7. Labels: User Interface, Submission Client, Match Making Service, Broker, Queuing System, Information Collector, Data Handling System, Execution Site 1 ... Execution Site n, Computing Element, Storage Element, Grid Sensors)