Performance-Driven Processor Allocation

1
Performance-Driven Processor Allocation
  • Julita Corbalan, Xavier Martorell, Jesus Labarta
  • {juli,xavim,jesus}@ac.upc.es
  • DAC-UPC

2
Objective
  • Scheduling parallel applications in shared-memory
    multiprogrammed systems
  • Allocate processors to applications that
    can take advantage of them
  • Implemented on an SGI Origin2000 with 64
    processors

3
Outline
  • Introduction & Related Work
  • NANOS Execution Environment
  • Performance-Driven Processor Allocation (PDPA)
  • Evaluation
  • Conclusions & Future Work

4
Introduction
  • Scheduling problem: allocate processors to
    applications
    • Space-Sharing / Time-Sharing
    • Number of processes vs. number of processors
      • Process Control [Tucker89]
  • Space-sharing approaches
    • P fixed at submission time
      • FCFS, SJF, SCDF [Majumdar88, ...]
    • P defined at execution time (Adaptive / Dynamic)
      • Equal allocation of the resources:
        Equipartition [McCann93]
      • Processor allocation proportional to the
        application performance
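
As a rough illustration of the Equipartition idea on this slide, the sketch below splits the processors evenly among the running applications and never gives an application more than it requested. The function name and the way leftover processors are redistributed are assumptions made for this example; they are not taken from [McCann93] or from the presentation.

```python
def equipartition(total_procs, requests):
    """Split total_procs as evenly as possible among applications.

    requests maps an application name to the number of processors it asked
    for. No application receives more than it requested; processors left
    over by small requests are shared among the remaining applications.
    (Hypothetical helper, for illustration only.)
    """
    alloc = {name: 0 for name in requests}
    pending = set(requests)
    free = total_procs
    while free > 0 and pending:
        # Even share of what is still free, at least one processor.
        share = max(free // len(pending), 1)
        for name in sorted(pending):
            if free == 0:
                break
            give = min(share, requests[name] - alloc[name], free)
            alloc[name] += give
            free -= give
            if alloc[name] == requests[name]:
                pending.discard(name)  # request fully satisfied
    return alloc
```

With four applications each requesting 32 processors on a 64-processor machine, every application ends up with 16 processors, regardless of how well it scales. That is the limitation a performance-proportional allocation tries to address.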

5
Introduction (2)
  • Processor allocation proportional to application
    performance
  • Drawback: application performance is not known
    before its execution
  • Solution: calculate it a priori
    • Executing several times with different P and
      input data
    • Extrapolating the values from a few samples
  • These approaches may not be valid
    • Application performance depends on run-time
      parameters: initial data placement, process
      migrations, distance between processors and
      memory, ...
    • It can be impracticable, e.g. infinite input data
      sets

6
Related Work
  • Dynamic performance analysis
    • Self-Tuning [Nguyen96]: efficiency calculated at
      run-time as a function of idleness, system and
      communication overhead
  • Adaptive/Dynamic processor allocation policies
    • Equal_efficiency [Nguyen96]: tries to achieve the
      same efficiency on all processors
    • Dynamic Allocation [McCann93]: based on
      idleness
    • Allocation at the knee of the
      efficiency/execution-time curve [Eager89]

7
Our proposal
  • We propose
    • Dynamic performance analysis
      • Real speedup
      • Calculated at run-time
    • Allocate processors to applications that can
      take advantage of them
      • Dynamic partitioning
      • Cost-conscious re-allocations (memory locality)
      • Really efficient use of processors
    • Dynamic multiprogramming level
      • Coordination between the medium- and long-term
        schedulers

8
Outline
  • Introduction & Related Work
  • NANOS Execution Environment
  • Performance-Driven Processor Allocation (PDPA)
  • Evaluation
  • Conclusions & Future Work

9
NANOS Execution Environment
  • Queueing System (FCFS, holds the queued applications)
    • Controls the application arrival
    • Coordinated with the CPU Manager
  • CPU Manager
    • Implements the scheduling policy
    • Informs the applications about its decisions
    • Enforces the processor allocation
  • OpenMP parallel applications (malleable), with the
    SelfAnalyzer
    • Request processors
    • Inform about their performance
  • Interactions shown in the diagram
    • CPU Manager and Queueing System: new application? /
      start new application
    • Applications to CPU Manager: processor request, speedup
    • CPU Manager to applications: processors allocated
    • CPU Manager to Operating System: resume, bind, ...
  • Operating System on a shared-memory multiprocessor

10
Outline
  • Introduction & Related Work
  • NANOS Execution Environment
  • Performance-Driven Processor Allocation (PDPA)
    • Dynamic Performance Analysis: SelfAnalyzer
    • Performance-Driven Processor Allocation policy
    • Dynamic Multiprogramming Level
  • Evaluation
  • Conclusions & Future Work

11
Dynamic Performance Analysis: SelfAnalyzer
  • Tool to estimate the application speedup and
    execution time
  • Based on iterative parallel applications
  • Source code available
    • SelfAnalyzer calls inserted by the user or the
      compiler
  • Source code not available
    • Dynamic Periodicity Detection
    • SelfAnalyzer dynamically loaded

12
Dynamic Performance Analysis: SelfAnalyzer (2)
  • Speedup calculated as the ratio of T(1) to T(P):
    S(P) = T(1) / T(P)

Serialization!!
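
The speedup and efficiency defined on these slides can be sketched as follows. The function names and the use of per-iteration times are assumptions for illustration; the slides do not show the SelfAnalyzer interface itself.

```python
def speedup(t_one, t_p):
    """S(P) = T(1) / T(P), with both times measured for the same iteration."""
    if t_one <= 0 or t_p <= 0:
        raise ValueError("execution times must be positive")
    return t_one / t_p


def efficiency(t_one, t_p, procs):
    """Eff(P) = S(P) / P, the quantity the allocation policy reasons about."""
    if procs < 1:
        raise ValueError("at least one processor is required")
    return speedup(t_one, t_p) / procs


# Hypothetical measurements: one iteration takes 12.0 s on 1 processor
# and 1.5 s on 16 processors.
s = speedup(12.0, 1.5)
e = efficiency(12.0, 1.5, 16)
```

In this made-up example the speedup is 8 on 16 processors, so the efficiency is 0.5.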
13
Performance-Driven Processor Allocation
  • Space-Sharing
  • Allocation for acceptable efficiency (S(p)/p)
    • In the range [low_eff, high_eff], 50-70%
  • Run-To-Completion
    • Minimum allocation of one processor
  • Dynamic partitioning, re-allocations when
    • Applications inform about their speedups
    • Application arrival / application end
  • Remembers the application state
    • Allocation, performance

14
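Performance-Driven Processor Allocation (2)
  • Policy parameters: step, low_eff and high_eff
  • New application: P = min(free processors,
    processors requested)
  • Application states: NO_REF, DEC, STABLE, INC
  • Transition conditions shown in the state diagram:
    Eff(p) < high_eff, Eff(p) > low_eff
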
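
The state diagram did not survive in this transcript, so the sketch below is only one plausible reading of it. It keeps the names from the slide (NO_REF, INC, DEC, STABLE, step, low_eff, high_eff) and the minimum allocation of one processor, but the transition rules, the default step of 4 and the helper names are assumptions made for illustration, not the authors' algorithm.

```python
from dataclasses import dataclass

NO_REF, INC, DEC, STABLE = "NO_REF", "INC", "DEC", "STABLE"


@dataclass
class AppState:
    alloc: int            # processors currently allocated
    state: str = NO_REF   # no performance reference yet


def initial_allocation(free_procs, requested):
    """New application: P = min(free processors, processors requested)."""
    return min(free_procs, requested)


def pdpa_step(app, eff, free_procs, step=4, low_eff=0.5, high_eff=0.7):
    """Re-evaluate one application that reported efficiency eff.

    ASSUMED rules: grow by `step` while efficiency is above high_eff,
    shrink by `step` when it is below low_eff, otherwise stay put.
    Returns the updated number of free processors.
    """
    if eff > high_eff and app.state in (NO_REF, INC) and free_procs > 0:
        grant = min(step, free_procs)
        app.alloc += grant
        app.state = INC
        return free_procs - grant
    if eff < low_eff and app.alloc > 1:
        release = min(step, app.alloc - 1)  # never drop below one processor
        app.alloc -= release
        app.state = DEC
        return free_procs + release
    app.state = STABLE
    return free_procs
```

For example, an application that starts with `initial_allocation(32, 8)` processors and reports an efficiency of 0.9 would be granted `step` more processors under these assumed rules, and would settle in STABLE once its efficiency falls inside the [low_eff, high_eff] range.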
15
Dynamic Multiprogramming Level
  • Multiprogramming level (ML)
    • Number of applications running concurrently
  • Static/Dynamic ML
  • Coordination between the medium- and long-term
    schedulers
    • if (new_appl_fits()) start_new_appl()
  • new_appl_fits(): defined by the scheduling policy
    • Free processors during several quanta
  • start_new_appl(): implemented by the queueing
    system
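
The coordination described on this slide can be sketched as below. The names new_appl_fits() and start_new_appl() come from the slide; the history list, the threshold of three quanta and the maybe_start() wrapper are assumptions, since the slide only says "several quanta".

```python
from collections import deque


def new_appl_fits(free_history, quanta=3):
    """Policy side: true if processors stayed free during the last `quanta`
    scheduling quanta. free_history lists free processors per quantum,
    oldest first. (quanta=3 is an assumed value.)"""
    recent = free_history[-quanta:]
    return len(recent) == quanta and all(free > 0 for free in recent)


def maybe_start(queue, free_history, start_new_appl, quanta=3):
    """Raise the multiprogramming level by one if the policy allows it.

    queue is the FCFS queue of waiting applications; start_new_appl is the
    callback provided by the queueing system.
    """
    if queue and new_appl_fits(free_history, quanta):
        start_new_appl(queue.popleft())
        return True
    return False


started = []
waiting = deque(["swim", "bt"])
first = maybe_start(waiting, [0, 2, 3, 4], started.append)
second = maybe_start(waiting, [4, 0, 1], started.append)
```

Here the first call starts "swim" because processors were free in each of the last three quanta, while the second call leaves "bt" queued because one of the recent quanta had no free processors.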

16
Outline
  • Introduction & Related Work
  • NANOS Execution Environment
  • Performance-Driven Processor Allocation (PDPA)
  • Evaluation
    • Processor Allocation Policies
    • Applications & Workloads
    • Execution Time & Processor Allocation
  • Conclusions & Future Work

17
Processor Allocation Policies
  • Equip: equal number of CPUs for each running application
  • PDPA + DML: our proposal
  • Equal_eff: equal efficiency on all the processors
  • SGI-MP: native IRIX scheduler
    • MP_BLOCKTIME=200000
    • OMP_DYNAMIC=TRUE

18
Applications Workloads
  • Architecture & System
    • SGI Origin2000 with 64 processors, IRIX 6.5.8
  • Applications: OpenMP
    • Swim(44.2), Bt(20.85), Hydro2d(6.3), apsi(1)
  • Workloads
    • Multiprogramming level set to 4
    • Each application requests 32 processors
19
Execution Time & Processor Allocation
  • Limited processor allocation
  • Total execution time reduced
  • Application execution time slightly increased

20
Execution Time & Processor Allocation
  • Performance affected by the multiprogrammed
    execution
  • Total execution time improved
  • Allocation proportional to the performance

21
SGI vs. PDPA
  • 4476 vs. 4 process migrations
  • Processor affinity, process control

22
PDPA behavior (zoom)
  • Tuning algorithm

23
Outline
  • Introduction & Related Work
  • NANOS Execution Environment
  • Performance-Driven Processor Allocation (PDPA)
  • Evaluation
  • Conclusions & Future Work

24
Conclusions
  • It is important to provide accurate performance
    information
    • SelfAnalyzer: dynamic, accurate, easy to use
  • PDPA allocates processors to applications that
    can take advantage of them
  • The Dynamic Multiprogramming Level improves the
    system performance
    • Coordinating the medium- and long-term schedulers

25
Future Work
  • Dynamic performance analysis
    • Non-iterative applications
  • PDPA
    • Space-Sharing + Time-Sharing
    • Evaluation in an open environment
    • Step, low_eff and high_eff need further research
    • Number of reallocations limited
  • Coordination of the medium- and long-term schedulers
    • New policies

26
More contact info...
  • http://www.ac.upc.es/NANOS
  • http://www.ac.upc.es/homes/juli
  • juli@ac.upc.es

27
Related Work
  • Dynamic performance analysis
    • Self-Tuning [Nguyen96]: efficiency calculated at
      run-time as a function of idleness, system and
      communication overhead
  • Dynamic processor allocation policies
    • Equal_efficiency [Nguyen96]: tries to achieve the
      same efficiency on all processors
    • Dynamic Allocation [McCann93]: based on
      idleness
    • Allocation at the knee of the
      efficiency/execution-time curve [Eager89]

  • It does not calculate the real speedup
  • It does not ensure an efficient use of processors
  • Excessive number of reallocations
  • Uses a priori information

28
Performance-Driven Processor Allocation (3)
  • Advantages
    • PDPA works with run-time information
    • Ensures that processors are always efficiently
      used
  • Drawbacks
    • The tuning algorithm can introduce overhead
      inside the application
      • Dynamic step
    • Some processors can remain unallocated
      • Dynamic Multiprogramming Level