Yijian Wang - PowerPoint PPT Presentation

Northeastern University, {yiwang, kaeli}@ece.neu.edu. BARC, 1/30/2003.


1
Profile-Guided I/O Partitioning
  • Yijian Wang
  • David Kaeli
  • Electrical and Computer Engineering Department
  • Northeastern University
  • {yiwang, kaeli}@ece.neu.edu

2
Outline
  • Introduction
  • Related work
  • Profile-guided I/O partitioning
  • Benchmarks
  • Experimental results
  • Conclusions and future work

3
Introduction
  • The I/O bottleneck
      • The growing gap between the speed of processors
        and I/O devices
      • Some applications access disks very frequently
  • I/O-intensive applications
      • Multimedia applications
      • Database applications
      • Parallel scientific applications

4
Related work
  • Fast disks
      • FC-connected SCSI disks
      • Smart caching I/O controller (EMC, IO Integrity)
  • Parallel I/O
      • Parallel disks (e.g., RAID)
      • Parallel file systems (NFS, PIOF, HPS, etc.)
      • Runtime parallel systems (MPI-IO, ROMIO, ADIO)
  • Compiler technology
      • Loop tiling, compiler-directed collective I/O
  • To achieve high performance, I/O should be
    parallelized at multiple levels (application,
    file system, disks)

5
I/O Partitioning
  • Our target applications are parallel scientific
    codes running on Beowulf clusters
  • I/O is parallelized at both the application level
    (using MPI and MPI-IO) and the disk level (using
    file partitioning)
  • Ideally, every process will only access files on
    local disk (though this is typically not possible
    due to data sharing)
  • How do we recognize the access patterns?
      • Dynamically (profiling)
      • Statically (compiler)

6
Profile generation
  • Run the application
  • Capture I/O traces
  • Apply our partitioning algorithm
  • Rerun the tuned application

7
I/O traces and partitioning
  • For every contiguous file access by every process,
    we capture the following I/O profile information:
      • Process ID
      • File ID
      • Address
      • Chunk size
      • I/O operation (read/write)
      • Timestamp
  • Generate a partition for every process
  • Partitioning is NP-complete
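A minimal sketch of how one such trace record might be represented. The slides do not say how traces are stored, so the whitespace-separated text format, the names `IORecord` and `parse_trace_line`, and the timestamp unit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IORecord:
    """One contiguous file access, holding the six profiled fields."""
    process_id: int
    file_id: int
    address: int      # byte offset of the access within the file
    chunk_size: int   # number of bytes accessed
    operation: str    # "read" or "write"
    timestamp: float  # time of the access (seconds is an assumption)


def parse_trace_line(line: str) -> IORecord:
    """Parse one line of a hypothetical trace format:
    '<pid> <file_id> <address> <size> <read|write> <timestamp>'.
    """
    pid, fid, addr, size, op, ts = line.split()
    if op not in ("read", "write"):
        raise ValueError(f"unknown I/O operation: {op!r}")
    return IORecord(int(pid), int(fid), int(addr), int(size), op, float(ts))


# Example: process 3 reads 2040 bytes at offset 4080 of file 0.
record = parse_trace_line("3 0 4080 2040 read 12.5")
```

Keeping the record immutable makes it safe to use the same trace for several partitioning passes.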

8
Our Greedy Algorithm
For each MPI-IO process
    create a file partition
For each contiguous data chunk
    identify the process that most frequently accesses this chunk
    assign the chunk to the associated partition
For each partition
    reorder data in the partition based on first access to each chunk
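The greedy steps above can be sketched in Python as follows. This is an illustrative reading of the slide, not the authors' implementation: the function name `greedy_partition`, the simplified `(process_id, chunk_id, timestamp)` trace tuples, the tie-breaking rule (lowest process ID wins), and the choice to order chunks by the owning process's first access are all assumptions.

```python
from collections import Counter, defaultdict


def greedy_partition(trace, num_processes):
    """Assign each chunk to the process that accesses it most often.

    trace: iterable of (process_id, chunk_id, timestamp) tuples.
    Returns {process_id: [chunk_id, ...]}, each list ordered by the
    owning process's first access to the chunk.
    """
    access_counts = defaultdict(Counter)  # chunk -> per-process access count
    first_access = {}                     # (process, chunk) -> earliest timestamp
    for pid, chunk, ts in trace:
        access_counts[chunk][pid] += 1
        key = (pid, chunk)
        if key not in first_access or ts < first_access[key]:
            first_access[key] = ts

    # Step 1: create one (initially empty) partition per process.
    partitions = {pid: [] for pid in range(num_processes)}

    # Step 2: give each chunk to its most frequent accessor.
    # Ties go to the lowest process ID (an assumption).
    for chunk, counts in access_counts.items():
        owner = min(counts, key=lambda p: (-counts[p], p))
        partitions[owner].append(chunk)

    # Step 3: reorder each partition by first access time.
    for pid, chunks in partitions.items():
        chunks.sort(key=lambda c: first_access[(pid, c)])
    return partitions


trace = [
    (0, "A", 1.0), (1, "A", 2.0), (0, "A", 3.0),  # A: process 0 twice, process 1 once
    (1, "B", 1.5), (1, "C", 0.5),                 # B and C: only process 1
    (0, "D", 0.2),                                # D: only process 0
]
partitions = greedy_partition(trace, num_processes=2)
```

In this toy trace, chunk A goes to process 0 because it accesses A twice, and each partition ends up ordered by when its owner first touched each chunk.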
9
Benchmarks
  • NAS Parallel Benchmarks (NPB 2.4)/BT
      • Computational fluid dynamics
      • Generates a file (1.6 GB) dynamically and then
        reads it
      • Writes/reads sequentially in chunks of 2040
        bytes
  • SPEChpc96/seismic
      • Seismic processing
      • Generates a file (1.5 GB) dynamically and then
        reads it back
      • Writes sequential chunks of 96 KB and reads
        sequential chunks of 2 KB
  • mpi-tile-io
      • Parallel Benchmarking Consortium
      • Tile access to a two-dimensional matrix (1 GB)
        with overlap
      • Writes/reads sequential chunks of 32 KB, with
        2 KB of overlap
  • All applications use MPI and MPI-IO for
    computation, communication and I/O

16
Conclusions and future work
  • We obtain scalable speedup by
      • creating parallel I/O channels
      • reducing disk seek time
      • reducing communication overhead
  • I/O access patterns are generally independent of
    data values, for the applications studied
  • Future work: investigating static (compile-time)
    approaches to I/O partitioning

17
Northeastern University Computer Architecture Research Group
http://www.ece.neu.edu/groups/nucar
  • This project is supported by the NSF-funded
    Center for Subsurface Sensing and Imaging Systems
    (CenSSIS)