Title: Achieving Application Performance on the Grid: Experience with AppLeS
1Achieving Application Performance on the Grid
Experience with AppLeS
- Francine Berman
- U. C., San Diego
2Distributed Computers
- clusters of workstations
- benefits of distributed systems outweigh the costs
of MPPs
- computational grids
- coupling of resources allows for solution of
resource-intensive problems
3Parallel Distributed Programs
- Distributed parallel programs now:
- robust MPP-type programs
- coupled applications
- proudly parallel apps
- The Future: grid-aware poly-applications
- able to adapt to deliverable resource
performance
- The Challenge: programming to achieve
performance on shared distributed platforms
4Programming the Beast
- When other users share distributed resources,
performance is hard to achieve
- load and availability of resources vary
- application behavior hard to predict
- performance dependent on time, load
- Careful scheduling required to achieve
application performance potential
- staging of data, computation
- coordination of target resource usage, etc.
5Application Scheduling
- On distributed platforms, application schedulers
are needed to prioritize performance of the
application over other components
- resource schedulers focus on utilization,
fairness
- high-throughput schedulers maximize collective
job performance
- hand-scheduling, staging require static info
- Problem: How to develop adaptive application
schedulers for shared distributed environments?
6The AppLeS Approach
- Develop application schedulers based on the
Application-Level Scheduling Paradigm
- Everything in the system is evaluated in
terms of its impact on the application
- performance of each component considered as
measurable quantity
- program schedule developed by forecasting
relevant measurable quantities
7AppLeS
- Joint project with Rich Wolski
- AppLeS: Application-Level Scheduler
- Each application has its own AppLeS
- Schedule achieved through
- selection of potentially efficient resource sets
- performance estimation of dynamic system
parameters and application performance for
execution time frame
- adaptation to perceived dynamic conditions
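The three steps listed above (select candidate resource sets, estimate performance for the execution time frame, adapt) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the AppLeS implementation: the function names, the host names, the forecast dictionary, and the simple work/speed cost model are all hypothetical.

```python
from itertools import combinations

def candidate_sets(resources, max_size):
    """Enumerate resource sets of up to max_size members (hypothetical selection step)."""
    for k in range(1, max_size + 1):
        for combo in combinations(resources, k):
            yield combo

def estimate_time(resource_set, work_units, forecast):
    """Assumed cost model: work is divided in proportion to forecast speed,
    so estimated time = total work / total forecast speed of the set."""
    total_speed = sum(forecast[r] for r in resource_set)
    return work_units / total_speed

def choose_schedule(resources, work_units, forecast, max_size=2):
    """Pick the candidate set with the lowest estimated execution time."""
    return min(candidate_sets(resources, max_size),
               key=lambda s: estimate_time(s, work_units, forecast))

# Hypothetical forecasts of deliverable speed (work units per second).
forecast = {"hostA": 10.0, "hostB": 4.0, "hostC": 7.0}
best = choose_schedule(list(forecast), work_units=340.0, forecast=forecast)

# Adaptation: when the forecast for hostA drops, the choice is recomputed.
degraded = dict(forecast, hostA=1.0)
best_after = choose_schedule(list(degraded), work_units=340.0, forecast=degraded)
```

In this sketch, adaptation is simply a second call to `choose_schedule` with fresh forecasts; a real scheduler would also weigh the cost of moving work that is already placed.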
8AppLeS Architecture
- AppLeS incorporates
- application-specific information
- dynamic information
- user preferences
- Schedule developed to optimize user's performance
measure
- minimal execution time
- turnaround time = staging/waiting time +
execution time
- other measures: precision, resolution, speedup,
etc.
9SARA: An AppLeS-in-Progress
- SARA: Synthetic Aperture Radar Atlas
- Goal: Assemble/process files for user's desired
image
- thumbnail image shown to user
- user selects desired bounding box within image
for more detailed viewing
- SARA provides detailed image in variety of
formats
- Simple SARA focuses on obtaining remote data
quickly
- code developed by Alan Su
10Focusing in with SARA
[Figure: thumbnail image with bounding box]
11Simple SARA
[Diagram: one compute server and three data servers,
connected by a network shared by a variable number of users]
- Computation servers and data servers are logical
entities, not necessarily different nodes
- Computation assumed to be done at compute servers
12Simple SARA AppLeS
- Focus on resource selection problem: Which site
can deliver data the fastest?
- Data for image accessed over shared networks
- Network Weather Service provides forecasts of
network load and availability
- Servers used for experiments
- lolland.cc.gatech.edu
- sitar.cs.uiuc
- perigee.chpc.utah.edu
- mead2.uwashington.edu
- spin.cacr.caltech.edu
- servers reached via vBNS or via general Internet
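The resource selection question on this slide (which site can deliver the data fastest?) can be illustrated with a small Python sketch. It assumes a simple model in which predicted transfer time is latency plus file size divided by forecast bandwidth. The server names and forecast numbers are invented placeholders, not measurements, and the code does not use the actual Network Weather Service interface.

```python
def predicted_transfer_time(size_mbytes, bandwidth_mbits, latency_s):
    """Assumed model: latency plus size (converted to megabits) over forecast bandwidth."""
    return latency_s + (size_mbytes * 8.0) / bandwidth_mbits

def fastest_server(size_mbytes, forecasts):
    """Return (server, predicted seconds) for the smallest predicted transfer time."""
    times = {
        name: predicted_transfer_time(size_mbytes, bw, lat)
        for name, (bw, lat) in forecasts.items()
    }
    best = min(times, key=times.get)
    return best, times[best]

# Hypothetical forecasts: (bandwidth in Mbit/s, latency in seconds).
forecasts = {
    "server-a": (8.0, 0.05),
    "server-b": (2.0, 0.02),
    "server-c": (4.0, 0.10),
}
server, seconds = fastest_server(3.0, forecasts)
```

Because the forecasts change with network load, the same call made a few minutes later may return a different server.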
13Simple SARA Experiments
- Ran back-to-back experiments from remote sites to
UCSD/PCL
- Data sets: 1.4 to 3 megabytes, representative of
SARA file sizes
- Simulates user selecting bounding box from
thumbnail image
- Experiments run during normal business hours,
mid-week
14Preliminary Results
- Experiment with smaller data set (1.4 Mbytes)
- NWS chooses the best resource
15More Preliminary Results
- Experiment with larger data set (3 Mbytes)
- NWS trying to track trends -- seems to
eventually figure out what's going on
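One way a forecaster can "eventually figure out" a change in conditions is to keep several simple predictors and use whichever has had the lowest error on the measurements seen so far. The Python sketch below illustrates that idea only. It is not NWS code: the predictor set, the error measure, and the sample values are assumptions chosen for the example.

```python
from statistics import mean, median

# Candidate predictors: each maps a measurement history to a forecast of the next value.
PREDICTORS = {
    "last_value": lambda h: h[-1],
    "running_mean": lambda h: mean(h),
    "window_median": lambda h: median(h[-5:]),
}

def forecast_next(history):
    """Replay the history, score each predictor by cumulative absolute error,
    and return (name, forecast) from the predictor with the lowest error."""
    errors = {name: 0.0 for name in PREDICTORS}
    for i in range(1, len(history)):
        past, actual = history[:i], history[i]
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(past) - actual)
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](history)

# Made-up bandwidth samples (Mbit/s) with a step change half-way through.
samples = [8.0, 8.0, 8.0, 8.0, 2.0, 2.0, 2.0, 2.0]
name, value = forecast_next(samples)
```

With these samples the last-value predictor recovers from the step change fastest, so it is the one selected; on noisier data an averaging predictor would tend to win instead.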
16Distributed Data Applications
- SARA representative of larger class of
distributed data applications
- Simple SARA template being extended to
accommodate
- replicated data sources
- multiple files per image
- parallel data acquisition
- intermediate compute sites
- web interface, etc.
17SARA AppLeS -- Phase 2
Client and servers are logical nodes. Which
servers should the client use?
18A Bushel of AppLeS (almost)
- During the first phase of the project, we've
focused on getting experience building AppLeS
- Jacobi2D, DOT, SRB, Simple SARA, Genetic
Algorithm, Tomography, INS2D, ...
- Using this experience, we are beginning to build
AppLeS templates/tools for
- master/slave applications
- parameter sweep applications
- distributed data applications
- proudly parallel applications, etc.
- What have we learned ...
19Lessons Learned from AppLeS
- Dynamic information is critical
20Lessons Learned from AppLeS
- Program execution and parameters may exhibit a
range of performance
21Lessons Learned from AppLeS
- Knowing something about performance predictions
can improve scheduling
22Lessons Learned from AppLeS
- Performance of scheduling policy sensitive to
application, data, and system characteristics
23Show Stoppers
- Queue prediction time
- How long will the program wait in a batch queue?
- How accurate is the prediction?
- Experimental Verification
- How do we verify the performance of schedulers in
production environments?
- How do we achieve reproducible and relevant
results?
- What are the right measures of success?
- Uncertainty
- How do we capture time-dependent information?
- What do we do if the range of information is
large?
24Current AppLeS Projects
- AppLeS and more AppLeS
- AppLeS applications
- AppLeS templates/tools
- Globus AppLeS, Legion AppLeS, IPG AppLeS
- Plans for integration of AppLeS and NWS with
NetSolve, Condor, Ninf
- Performance Prediction Engineering
- structural modeling with stochastic predictions
- development of quality of information measures
- accuracy
- lifetime
- overhead
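To make "stochastic predictions" concrete, the Python sketch below represents a predicted quantity as a mean and a standard deviation and combines two components of a simple structural model (transfer time plus compute time). It assumes the components are independent and normally distributed, so means add and variances add. The class name, the field names, and the numbers are hypothetical and are not taken from the AppLeS project.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass(frozen=True)
class Stochastic:
    """A predicted quantity as mean and standard deviation (assumed normal)."""
    mean: float
    std: float

    def __add__(self, other):
        # Sum of independent normal values: means add, variances add.
        return Stochastic(self.mean + other.mean,
                          sqrt(self.std ** 2 + other.std ** 2))

    def interval(self, k=2.0):
        """Range of mean +/- k standard deviations."""
        return (self.mean - k * self.std, self.mean + k * self.std)

# Made-up component predictions, in seconds.
transfer = Stochastic(12.0, 3.0)
compute = Stochastic(30.0, 4.0)
total = transfer + compute
low, high = total.interval()
```

A scheduler could then compare candidate schedules by the upper end of the interval rather than the mean when the spread is large, which is one possible response to wide ranges of information.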
25New Directions
- Contingency Scheduling
- scheduling during execution
- Scheduling with
- partial information, poor information,
dynamically changing information
- Multischeduling
- resource economies
- scheduling social structure
26 The Brave New World
- Grid-aware Programming
- development of adaptive poly-applications
- integration of schedulers, PSEs and other tools
27AppLeS in Context
[Roadmap diagram; timeline: Short-term, Medium-term,
Long-term; marker: "You are here"]
- Integration of multiple grid constituencies:
architectural models which support high-performance,
high-portability, collaborative and other users;
automation of program execution
- Performance: grid-aware programming languages,
tools, PSEs, performance assessment and
prediction
- Usability, Integration: development of basic
infrastructure
- Integration of schedulers and other tools,
performance interfaces
- Application scheduling, resource scheduling,
throughput scheduling, multi-scheduling,
resource economy
28Project Information
- AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html
- AppLeS Corps
- Francine Berman
- Rich Wolski
- Walfredo Cirne
- Marcio Faerman
- Jamie Frey
- Jim Hayes
- Graziano Obertelli
- Jenny Schopf
- Gary Shao
- Neil Spring
- Shava Smallen
- Alan Su
- Dmitrii Zagorodnov
- Thanks to NSF, NPACI, DARPA, DoD, NASA