Pipeline and Batch Sharing in Grid Workloads

1
Pipeline and Batch Sharing in Grid Workloads
  • Douglas Thain, John Bent, Andrea C.
    Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and
    Miron Livny
  • 2003
  • Reviewed by Zhicheng Qiu

2
Overview
  • What do they want to do?
  • Characterize workloads composed of pipelines of
    sequential processes
  • How do they do it?
  • Measurements of the memory, CPU, and I/O
    requirements of individual components
  • Analyses of I/O sharing within complete batches
  • The study is not concerned with scheduling

3
Pipeline and Batch Sharing
How do pipeline and batch sharing work?
4
Workload I/O categories
  • Endpoint I/O, which represents the input and
    final output
  • Pipeline-shared I/O, which is shared in a
    write-then-read fashion within a single pipeline
  • Batch-shared I/O, which consists of input I/O
    shared across pipelines
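
The three categories can be read as a rule over who reads and who writes each file. The following is only a rough sketch of that reading, not the method used in the paper: the trace format `(pipeline_id, filename, op)` and the function name `classify_files` are hypothetical, and the ordering of writes and reads is ignored for brevity.

```python
from collections import defaultdict

def classify_files(accesses):
    """Classify files from hypothetical (pipeline_id, filename, op) records.

    op is "r" or "w". Assumed rules (an interpretation of the slide):
      - written and read by the same single pipeline -> pipeline-shared
      - only read, by more than one pipeline         -> batch-shared
      - anything else                                -> endpoint
    """
    readers = defaultdict(set)
    writers = defaultdict(set)
    for pipeline_id, filename, op in accesses:
        (writers if op == "w" else readers)[filename].add(pipeline_id)

    categories = {}
    for filename in set(readers) | set(writers):
        r, w = readers[filename], writers[filename]
        if len(w) == 1 and r == w:
            categories[filename] = "pipeline-shared"
        elif not w and len(r) > 1:
            categories[filename] = "batch-shared"
        else:
            categories[filename] = "endpoint"
    return categories

# Illustrative trace for a batch of two pipelines (made-up file names).
trace = [
    (1, "db.dat", "r"), (2, "db.dat", "r"),   # same input read by both
    (1, "tmp1", "w"), (1, "tmp1", "r"),       # intermediate file in pipeline 1
    (1, "in1", "r"), (1, "out1", "w"),        # private input, final output
]
result = classify_files(trace)
for name in sorted(result):
    print(name, result[name])
```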

5
Application pipelines
[Figure: application pipelines by field: biology, earth systems, physics, physics, chemistry, astrophysics]
6
I/O amounts for each type of I/O (traffic in MB)
             Endpoint I/O       Pipeline I/O       Batch I/O
Appl.        Files   Traffic    Files   Traffic    Files   Traffic
SETI             2      0.34       12     75.43        0      0
BLASTP           2      0.12        0      0           9    329.99
IBIS            20    179.92       99    148.27       17      7.89
CMS              6     63.56        2     12.99        9   3729.67
HF               3      1.96        7   4654.34        1      0
NAUTILUS       124     14.06      369    785.37        8      3.24
AMANDA           6      5.22       11    264.31       29    508.52
How much time is spent on I/O is not mentioned.
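
To see how much of each application's traffic is shared, the traffic columns of the table above can be combined into a single fraction. This is a small sketch; the name `shared_fraction` is an illustrative choice, and the numbers are copied from the table as transcribed.

```python
# Traffic in MB per application: (endpoint, pipeline, batch),
# copied from the table as transcribed in these slides.
TRAFFIC_MB = {
    "SETI":     (0.34, 75.43, 0.0),
    "BLASTP":   (0.12, 0.0, 329.99),
    "IBIS":     (179.92, 148.27, 7.89),
    "CMS":      (63.56, 12.99, 3729.67),
    "HF":       (1.96, 4654.34, 0.0),
    "NAUTILUS": (14.06, 785.37, 3.24),
    "AMANDA":   (5.22, 264.31, 508.52),
}

def shared_fraction(endpoint, pipeline, batch):
    """Fraction of total traffic that is pipeline-shared or batch-shared."""
    total = endpoint + pipeline + batch
    return (pipeline + batch) / total if total else 0.0

for app, row in TRAFFIC_MB.items():
    print(f"{app:9s} {100 * shared_fraction(*row):6.2f}% shared")
```

With these numbers, IBIS is the only application whose endpoint traffic exceeds its shared traffic; for the other six, shared I/O makes up the bulk of the total.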
7
Amdahl's ratios
App        CPU/IO (MIPS/MBPS)   MEM/CPU (MB/MIPS)   CPU/IO (instr/op)
SETI                    45888                0.15              8737 K
BLASTP                     37               26.77               144 K
IBIS                    34530                0.20            109823 K
CMS                       190                2.09               396 K
HF                         74                0.16               353 K
NAUTILUS                 2287                1.20              8238 K
AMANDA                    785                3.77               551 K
I/O traffic is heavy for some applications: BLASTP (37) and HF (74)
have the lowest CPU/IO ratios.
8
Scalability is limited by the I/O bandwidth
Bandwidth milestones shown: storage center, commodity disk
Why are the bandwidth milestones storage devices rather than
network bandwidth or RAM access bandwidth?
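
A back-of-the-envelope way to see the bandwidth limit is to turn a CPU/IO ratio into a per-node I/O rate and divide it into the bandwidth of a shared server. The sketch below assumes that all I/O goes through one server; the 1000 MIPS node speed and the two server bandwidths are hypothetical values chosen for illustration, not figures from the slides.

```python
def max_nodes(cpu_io_ratio, cpu_mips, server_mbps):
    """Nodes one shared server can keep busy if every byte goes through it.

    cpu_io_ratio: MIPS of CPU work per MB/s of I/O (the first ratio column).
    cpu_mips:     speed of one compute node in MIPS (assumed value).
    server_mbps:  bandwidth of the shared server in MB/s (assumed value).
    """
    per_node_mbps = cpu_mips / cpu_io_ratio  # I/O rate one node generates
    return server_mbps / per_node_mbps

# CPU/IO ratios (MIPS/MBPS) copied from the ratio table.
CPU_IO = {"SETI": 45888, "BLASTP": 37, "IBIS": 34530, "CMS": 190,
          "HF": 74, "NAUTILUS": 2287, "AMANDA": 785}

# Hypothetical hardware: 1000 MIPS nodes, a slow and a fast shared server.
for app, ratio in CPU_IO.items():
    slow = max_nodes(ratio, cpu_mips=1000, server_mbps=15)
    fast = max_nodes(ratio, cpu_mips=1000, server_mbps=1500)
    print(f"{app:9s} slow server: {slow:10.1f} nodes   fast server: {fast:12.1f} nodes")
```

Under these assumptions, applications with a low CPU/IO ratio saturate the server with only a handful of nodes, which is one way to read the slide title.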
9
Conclusions
  • Shared I/O is the dominant component of all I/O
    traffic, and it causes a serious scalability
    problem for applications.
  • Efforts have to be made to eliminate shared I/O.
  • New file systems are required to minimize I/O.

10
How can the I/O traffic be optimized?
  • Minimize the I/O traffic
  • Application design (Sequential I/O vs. random
    I/O, Replication)
  • Speed up endpoint I/O
  • File system (Caching, Database, Ramdisk)

11
Another question
  • Why are the CPU and memory not the bottleneck
    for system scalability?