On Economics and the User-Scheduler Relationship in HPC and Grid Systems

1
On Economics and the User-Scheduler Relationship
in HPC and Grid Systems
  • Cynthia Bailey Lee
  • Advisor: Allan E. Snavely
  • Department of Computer Science and Engineering
  • San Diego Supercomputer Center
  • University of California, San Diego
  • November 14, 2007

2
Introduction
Introduction | Runtime Inaccuracy | Utility Functions | Utility Model | Scheduler
  • The job submission routine:
      • Edit job script, including resources needed and
        amount of time requested
      • Submit job; typically, many questions remain:
          • Did I request enough time?
          • How long will the job wait in the queue?
      • Eventually, job runs; more questions:
          • I submitted to a high-priority queue; was my
            wait time actually shorter than if I hadn't?
          • By how much?
          • Was it worth it?
  • Is this a satisfying relationship for either
    party?

3
Contributions of This Work
  1. Falsified the Padding Hypothesis as the sole
    explanation for users' inaccurate runtime
    requests
  2. Quantified users' valuation of turnaround by
    collecting actual users' utility curves
  3. Proposed a model for synthetically generating
    utility functions that draws on patterns seen in
    the actual user curves
  4. Developed a genetic algorithm-based scheduler that
    uses aggregate utility as an explicit objective
    function

4
The Padding Hypothesis
  • The inaccuracy of users' requested runtimes,
    relative to the actual runtime of jobs, is
    explained by users explicitly padding otherwise
    accurate runtime estimates to avoid the
    possibility of the job being killed by the scheduler.

SDSC users were asked to provide a no-kill/no-pressure estimate, with prizes for being accurate.
Users are able to self-identify as more or less accurate.
[Chart not reproduced; surviving labels: "72", "Decrease"]

5
What is a Utility Function?
[Figure: utility u(t) plotted against time; tick labels: 8 am, 12, 1 pm, 5 pm, 8 am, 9 am]
Other factors: coordination with other grid sites
or sensors, paper deadlines, weather and
hurricane prediction, ...

6
Real Users' Functions
  • Randomly selected users of SDSC systems provided
    these data points for jobs they were submitting
  • Utility is in terms of the SDSC charge unit
    (SU)
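
A curve collected this way is a handful of (turnaround time, value) points. The sketch below shows one way to turn such points into a function u(t). Linear interpolation between points, the flat extension beyond the last point, and the sample numbers are all illustrative assumptions, not the survey's method or data.

```python
from bisect import bisect_right

def make_utility(points):
    """Build u(t) from (hours_after_submission, value_in_SU) pairs.

    Assumption: linear interpolation between neighbouring points, and a
    constant value before the first point and after the last one.
    """
    pts = sorted(points)
    times = [t for t, _ in pts]
    values = [v for _, v in pts]

    def u(t):
        if t <= times[0]:
            return values[0]
        if t >= times[-1]:
            return values[-1]
        i = bisect_right(times, t)          # times[i-1] <= t < times[i]
        t0, t1 = times[i - 1], times[i]
        v0, v1 = values[i - 1], values[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

    return u

# Hypothetical user: full value for 4 hours, worthless after a day.
u = make_utility([(0, 100.0), (4, 100.0), (12, 40.0), (24, 0.0)])
```

Calling `u(8)` then returns the interpolated value halfway between the 4-hour and 12-hour points.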

7
More Real Users' Functions
8
Model for Generating Synthetic Utility Functions
  • Expected Linear
  • Expected Exponential
  • Step
  • "Expected" refers to the fact that each point is
    chosen randomly (i.e., most won't follow the
    pattern as cleanly as shown here)
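
The three shapes can be sketched as a small generator. The slides do not give the model's parameters, so everything here is an assumption for illustration: the function name `synthetic_utility`, the uniform multiplicative noise, the decay constant, the random step cutoff, and the rule that a curve never increases.

```python
import math
import random

def synthetic_utility(shape, v0, horizon, n_points=5, noise=0.2, rng=None):
    """Return n_points (time, value) pairs for one synthetic utility curve.

    shape   -- "linear", "exponential", or "step"
    v0      -- value at time 0
    horizon -- time of the last point
    All distribution choices below are hypothetical.
    """
    rng = rng or random.Random()
    times = [horizon * k / (n_points - 1) for k in range(n_points)]

    if shape == "step":
        cutoff = rng.uniform(0, horizon)        # random deadline (assumed)
        return [(t, v0 if t <= cutoff else 0.0) for t in times]
    if shape == "linear":
        expected = [v0 * (1 - t / horizon) for t in times]
    elif shape == "exponential":
        rate = 3.0 / horizon                    # decay constant (assumed)
        expected = [v0 * math.exp(-rate * t) for t in times]
    else:
        raise ValueError(f"unknown shape: {shape}")

    points, prev = [], v0
    for t, e in zip(times, expected):
        v = e * rng.uniform(1 - noise, 1 + noise)   # scatter around the trend
        v = max(0.0, min(v, prev))                  # non-negative, non-increasing
        points.append((t, v))
        prev = v
    return points

curve = synthetic_utility("linear", v0=100.0, horizon=24.0, rng=random.Random(1))
```

Each point is drawn around the expected trend rather than placed on it, which is one reading of "expected" in the bullet above.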

9
Genetic Algorithm Scheduler
  • Individuals: permutations of the job queue ordering
  • Mutation: swap two randomly selected jobs
  • Reproduction: zipper-like merging of parents (skip duplicates)
  • Fitness: global utility of resulting schedule (approx.)
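
The four ingredients above can be sketched in a few lines. This is a toy under stated assumptions, not the scheduler from the talk: jobs run back to back on a single machine, selection keeps the fitter half of the population, and the job table, function names, and parameters are hypothetical.

```python
import random

def zipper(a, b):
    """Reproduction: alternate jobs from both parents, skipping duplicates."""
    child, seen = [], set()
    for x, y in zip(a, b):
        for job in (x, y):
            if job not in seen:
                seen.add(job)
                child.append(job)
    return child

def mutate(order, rng):
    """Mutation: swap two randomly selected jobs (returns a new list)."""
    i, j = rng.sample(range(len(order)), 2)
    order = order[:]
    order[i], order[j] = order[j], order[i]
    return order

def fitness(order, jobs):
    """Approximate aggregate utility of a queue ordering.

    Assumption: one machine, jobs run sequentially in queue order;
    each job earns u(completion time).
    """
    clock, total = 0.0, 0.0
    for job in order:
        runtime, utility = jobs[job]
        clock += runtime
        total += utility(clock)
    return total

def ga_schedule(jobs, pop_size=20, generations=50, mutation_rate=0.3, seed=0):
    rng = random.Random(seed)
    ids = list(jobs)
    # Individuals: random permutations of the job queue.
    population = [rng.sample(ids, len(ids)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda o: fitness(o, jobs), reverse=True)
        parents = population[: pop_size // 2]       # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = zipper(a, b)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
    return max(population, key=lambda o: fitness(o, jobs))

# Hypothetical jobs: name -> (runtime in hours, utility function of completion time)
jobs = {
    "A": (4.0, lambda t: max(0.0, 100 - 10 * t)),
    "B": (1.0, lambda t: max(0.0, 50 - 20 * t)),
    "C": (2.0, lambda t: 30.0 if t <= 3 else 0.0),
}
best = ga_schedule(jobs)
```

Because both parents are permutations of the same job set, the zipper merge always yields a valid permutation, so no repair step is needed after reproduction.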

10
GA scheduler and Inaccurate Runtimes
  • Schedulers compared:
      • CONS: Conservative Backfilling
      • EASY: Aggressive Backfilling
      • PRIO: Priority FIFO (typical supercomputer
        priority scheduler)
      • GA: genetic algorithm
  • Workload is SDSC-BLUE from the Parallel Workloads
    Archive (Dror Feitelson)
  • Load modified by scaling inter-arrival times
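
Scaling inter-arrival times can be sketched as follows. The function name `scale_load` and the sample arrival times are hypothetical, and the slides do not state which scaling factors were used.

```python
def scale_load(arrivals, factor):
    """Multiply every gap between consecutive arrivals by `factor`.

    Assumes `arrivals` is sorted in increasing order. A factor below 1
    packs the same jobs into less time, which raises the offered load.
    """
    if not arrivals:
        return []
    scaled = [arrivals[0]]
    for prev, cur in zip(arrivals, arrivals[1:]):
        scaled.append(scaled[-1] + (cur - prev) * factor)
    return scaled

# Hypothetical submit times in seconds; halving the gaps doubles the arrival rate.
heavy = scale_load([0, 10, 30, 60], 0.5)
```

Job runtimes and sizes are left untouched, so only the arrival pattern changes between the regular and heavy variants of a trace.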

[Figure panels: Regular Load, Heavy Load]
11
Related Work
Very abbreviated; please see my publications for much more complete references
  • Runtime request inaccuracy: Mualem and Feitelson
    2001; Tsafrir 2005; Cirne and Berman 2001; Ward,
    Mahood and West 2002; (workaround via queue
    prediction) Brevik, Nurmi, Wolski 2006; (mitigate
    via grace period) Chiang, Arpaci-Dusseau and
    Vernon 2002
  • Utility functions: Chun and Culler 2004; Irwin,
    Grit and Chase 2004; Chen and Muhlethaler 1996
  • Market approach to scheduling: Wolski, Plank,
    Brevik and Bryan 2001; Stoica, Abdel-Wahab and
    Pothen 1995; Buyya; Spawn system (Waldspurger,
    Hogg et al. 1992); Tycoon system (Lai et al.
    2004); Popovici and Wilkes at SC05; Singh,
    Kesselman, Deelman 2007; many, many more
  • Genetic algorithm in parallel job scheduling:
    Franke 2006 (many, many more in other scheduling
    domains)

12
For more information
  • Inaccurate runtime requests survey:
      • Lee, C., Y. Schwartzman, J. Hardy, A. Snavely.
        "Are user runtime estimates inherently
        inaccurate?" Workshop on Job Scheduling
        Strategies for Parallel Processing, with
        SIGMETRICS, June 2004.
  • Survey collecting SDSC users' utility curves:
      • Lee, C. and A. Snavely. "On the User-Scheduler
        Dialogue: Studies of User-Provided Runtime
        Estimates and Utility Functions." International
        Journal of High Performance Computing
        Applications, vol. 20, 2006.
  • Genetic algorithm scheduler and model for
    generating synthetic utility curves:
      • Lee, C. and A. Snavely. "Precise and Realistic
        Utility Functions for User-Centric Performance
        Analysis of Schedulers." HPDC-16, June 2007.
  • Contact: Cynthia Lee, CL@SDSC.EDU