MyGrid: A User-Centric Approach for Grid Computing

1
MyGrid: A User-Centric Approach for Grid Computing
  • Walfredo Cirne
  • Universidade Federal da Paraíba

2
High-Performance Computing
  • High-Performance Computing means running faster
    than the typical machine du jour
  • Unbeatable price/performance of microprocessors
    has killed specialized high-performance machines
  • Therefore, parallelism is currently the way to do
    High-Performance Computing
  • Parallel supercomputers

3
Solving a Real Problem
  • I had hundreds of thousands of independent
    simulations to run
  • Parallel supercomputers are typically:
      • hard to get access to
      • slow (too much time in the queue)
  • Since my simulations were independent, I had the
    perfect application for the Computational Grid

4
Grid Computing
  • Grid Computing aims to enable the execution of
    parallel applications over processors that are:
      • Geographically distributed
      • Under multiple administrative domains
      • Not dedicated
  • The potential for resource gathering is enormous
  • Let's run over the Internet

5
Grid Applications
  • Not all applications can benefit from the Grid
  • Loosely coupled applications match the Grid
    characteristics much better than tightly coupled
    applications

6
State of the Art in Grid Computing
  • Most services are provided by the Grid
    Infrastructure
      • Naming, remote execution/task control, security,
        etc.
  • Scheduling is done at the application level
  • Globus
  • Virtual Organizations

7
Back to the Real Problem
  • I had hundreds of thousands of independent
    simulations to run
  • I was working in a top research lab in Grid
    Computing
  • I could not manage to use the Grid
  • It is hard to get the Grid Infrastructure
    Software installed everywhere

8
The Motivation for MyGrid
  • Users of loosely coupled applications could
    benefit from the Grid now
  • However, they don't run on the Grid today because
    the Grid Infrastructure is not widely deployed
  • What if we build a solution at the user level?
    That is, a solution that does not depend upon
    installed infrastructure?

9
MyGrid
  • MyGrid is a framework to build
    infrastructure-independent grid applications
  • The user provides:
      • A description of her Grid
      • A way to do remote execution and file transfer
      • The application
  • MyGrid provides:
      • Grid abstractions
      • Scheduling

10
MyGrid Goals
  • Open: does not require a particular infrastructure
  • Self-installable: does not require manual
    installation on a given machine
  • Extensible: simple to add refinements
  • Complete: covers the whole production cycle

11
MyGrid Concepts
  • Job: a set of independent tasks
  • Tasks have three pieces: init, remote, and final
  • Home machine vs. Grid machine
  • Grid abstractions:
      • remote execution
      • file transfer
      • playpen
      • mirroring
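
The job and task concepts above can be pictured as a small data structure. This is only a sketch, not MyGrid's actual classes: the names Task, Job, and add_task are hypothetical, and the comments about where each piece runs are an assumption read off the names "init", "remote", and "final".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """One independent unit of work, split into the three pieces named on the slide."""
    init: str    # assumed to run on the home machine, e.g. to stage input files
    remote: str  # assumed to run on a grid machine
    final: str   # assumed to run on the home machine, e.g. to collect results


@dataclass
class Job:
    """A job is a set of independent tasks."""
    tasks: List[Task] = field(default_factory=list)

    def add_task(self, init: str, remote: str, final: str) -> Task:
        # Tasks are independent, so adding one never touches the others.
        task = Task(init, remote, final)
        self.tasks.append(task)
        return task


job = Job()
job.add_task(init="init", remote="remote1", final="collect")
job.add_task(init="init", remote="remote2", final="collect")
print(len(job.tasks))
```

Because tasks share no state, a scheduler is free to run them in any order and on any machine.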

12
Defining My Personal Grid
  • bagre.dsc.ufpb.br
      • dsc, linux
      • ssh machine command
      • scp localdir/file machine:remotedir
      • scp machine:remotedir/file localdir
  • traira.dsc.ufpb.br
      • dsc, linux
      • ssh machine command
      • scp localdir/file machine:remotedir
      • scp machine:remotedir/file localdir
  • quidam.ucsd.edu
      • cse, linux
      • ssh machine command
      • scp localdir/file machine:remotedir
      • scp machine:remotedir/file localdir
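
One way to read the grid description above is as a set of per-machine command templates that get filled in when a task runs. The sketch below illustrates that reading only; the GridMachine class, its method names, and the {placeholder} syntax are hypothetical and are not taken from MyGrid. It only builds command strings and does not execute them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GridMachine:
    """A machine entry: name, attributes, and user-supplied command templates."""
    name: str
    attributes: Tuple[str, ...]
    # Hypothetical template syntax; scp's host:path form is standard.
    exec_template: str = "ssh {machine} {command}"
    put_template: str = "scp {localdir}/{file} {machine}:{remotedir}"
    get_template: str = "scp {machine}:{remotedir}/{file} {localdir}"

    def exec_command(self, command: str) -> str:
        return self.exec_template.format(machine=self.name, command=command)

    def put_command(self, localdir: str, file: str, remotedir: str) -> str:
        return self.put_template.format(
            localdir=localdir, file=file, machine=self.name, remotedir=remotedir)

    def get_command(self, remotedir: str, file: str, localdir: str) -> str:
        return self.get_template.format(
            machine=self.name, remotedir=remotedir, file=file, localdir=localdir)


bagre = GridMachine("bagre.dsc.ufpb.br", ("dsc", "linux"))
print(bagre.exec_command("hostname"))
print(bagre.put_command(".", "Fat.class", "/tmp"))
```

Keeping the templates per machine means a site that needs something other than ssh and scp can be described without changing anything else.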

13
Factoring with MyGrid
  • Fatora n generates tasks, init, remote_i, and
    collect
  • User runs mygrid.ui.AddTask < tasks
  • tasks
      • task
          • init init
          • remote remote1
          • final collect
          • processor linux
          • playpensize 0
          • cost 1
      • task
          • init init
          • remote remote2

14
Factoring with MyGrid
  • init: java mygrid.ui.MyGridUI p PROC ./Fat.class PLAYPEN
  • remote1: java Fat 3 18655 34789789798 output-TASK
  • remote2: java Fat 18655 37307 34789789798 output-TASK
  • collect: java mygrid.ui.MyGridUI g PROC "" PLAYPEN saida-TASK .
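
The remote1 and remote2 commands suggest that each task searches one slice of the candidate divisors of 34789789798. The sketch below shows one way such slices could be produced. It is not the Fatora generator: split_search_range and remote_command are hypothetical names, the choice of 10 tasks is an assumption picked because it reproduces the two ranges shown, and reading Fat's arguments as (start, end, n) is inferred from the slide.

```python
import math
from typing import List, Tuple


def split_search_range(n: int, num_tasks: int, first: int = 3) -> List[Tuple[int, int]]:
    """Split the candidate divisors of n, up to sqrt(n), into consecutive ranges."""
    limit = math.isqrt(n) + 1
    step = max(1, math.isqrt(n) // num_tasks)
    ranges = [(first + i * step, first + (i + 1) * step) for i in range(num_tasks)]
    # Stretch the last range if integer division left a gap below sqrt(n).
    last_start, last_end = ranges[-1]
    ranges[-1] = (last_start, max(last_end, limit))
    return ranges


def remote_command(n: int, start: int, end: int) -> str:
    """Build the remote piece of one task (argument order assumed)."""
    return f"java Fat {start} {end} {n}"


n = 34789789798
ranges = split_search_range(n, num_tasks=10)  # 10 tasks is an assumption
for start, end in ranges[:2]:
    print(remote_command(n, start, end))
```

Each range can be searched without knowing anything about the others, which is what makes the problem a good fit for a job made of independent tasks.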

15
Running a MyGrid Task

16
User Agent
  • User Agent provides the grid abstractions
  • User Agent Daemon runs on grid machines
  • User Agent Server runs on home machines
  • The Daemon and the Server rely upon public-key
    cryptography to authenticate each other

17
Self-Installation
  • We are working on having MyGrid install and
    start up User Agents everywhere
  • The user provides a way to do remote execution
    and file transfer to make that possible

18
Scheduling in MyGrid
  • Grid scheduling is application dependent and
    effort intensive
  • Most people don't want to spend months writing
    good schedulers for their applications
  • MyGrid provides a sensible default scheduler
  • The user can of course replace the default
    scheduler

19
Default Scheduler
  • How to provide good performance with no knowledge
    about the application or the current state of the
    Grid?
  • The key is to avoid having the job wait for a
    task that runs on a slow or loaded machine
  • Task replication is our answer to this problem
  • Task replication is only done when the job has
    no other tasks left to start
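
The effect of replication can be illustrated with a toy simulation. This is a sketch under stated assumptions, not MyGrid's scheduler: the simulate function is hypothetical, the rule for picking which running task to copy (fewest copies so far) is an assumption, and machine speeds are used only to compute finish times, never to make scheduling decisions.

```python
import heapq
from typing import Dict, List, Set, Tuple


def simulate(task_sizes: List[float], speeds: List[float], replicate: bool) -> float:
    """Return the time at which the last task of the job completes."""
    pending = list(range(len(task_sizes)))        # tasks not started yet
    running: Dict[int, List[int]] = {}            # task -> machines running a copy
    events: List[Tuple[float, int, int]] = []     # (finish time, machine, task)
    idle = list(range(len(speeds)))
    done: Set[int] = set()
    now = 0.0

    while len(done) < len(task_sizes):
        still_idle = []
        for machine in idle:
            if pending:
                task = pending.pop(0)
            elif replicate and running:
                # Replicate only when no unstarted task is left.
                task = min(running, key=lambda t: len(running[t]))
            else:
                still_idle.append(machine)
                continue
            running.setdefault(task, []).append(machine)
            finish = now + task_sizes[task] / speeds[machine]
            heapq.heappush(events, (finish, machine, task))
        idle = still_idle

        finish, machine, task = heapq.heappop(events)
        if task in done:
            continue  # stale event left behind by a cancelled copy
        now = finish
        done.add(task)
        # The first copy to finish wins; every machine running a copy is freed.
        idle.extend(running.pop(task))
    return now


sizes = [10.0, 10.0, 10.0, 10.0]
speeds = [1.0, 1.0, 0.125]  # the third machine is eight times slower
print(simulate(sizes, speeds, replicate=False))
print(simulate(sizes, speeds, replicate=True))
```

In this toy setup the job without replication waits for the task stuck on the slow machine, while with replication a fast machine finishes a copy of it first. The cost is the extra work spent on copies that get cancelled.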

20
Preliminary Results
  • During a 40-day period, we ran 600,000
    simulations using 178 processors located in 6
    different administrative domains spread widely
    across the USA
  • MyGrid took 16.7 days to run the simulations
  • My desktop machine would have taken 5.3 years to
    do so
  • Speed-up is 115.8 for 178 processors
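
The speed-up figure follows directly from the two running times. The short check below assumes a 365-day year, which the slide does not state. The efficiency value it prints (speed-up divided by processor count) is derived here and does not appear in the original.

```python
DAYS_PER_YEAR = 365  # assumption: the slide does not say which year length was used

desktop_days = 5.3 * DAYS_PER_YEAR  # estimated time on a single desktop machine
grid_days = 16.7                    # time MyGrid took
processors = 178

speedup = desktop_days / grid_days
efficiency = speedup / processors   # fraction of ideal linear speed-up

print(f"speed-up   = {speedup:.1f}")
print(f"efficiency = {efficiency:.2f}")
```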

21
Conclusions
  • Running Grid Applications at the user level is a
    viable strategy
  • Bag-of-tasks parallel applications can currently
    benefit from the Grid
  • Is upperware the way to go for new middleware
    development?

22
Future Work
  • Turn MyGrid into production-quality software
  • Investigate the impact of task replication in
    resource consumption
  • Develop a default scheduler for data-intensive
    applications
      • Such a scheduler should try to minimize data
        movement