Memory Architectures for Protein Folding: MD on million PIM processors

University of Illinois at Urbana-Champaign

Learn more at: http://charm.cs.uiuc.edu

1
Memory Architectures for Protein Folding: MD on
million PIM processors
  • Fort Lauderdale, May 03,

2
Overview
  • EIA-0081307 ITR Intelligent Memory
    Architectures and Algorithms to Crack the Protein
    Folding Problem
  • PIs:
  • Josep Torrellas and Laxmikant Kale (University of
    Illinois)
  • Mark Tuckerman (New York University)
  • Michael Klein (University of Pennsylvania)
  • Also associated: Glenn Martyna (IBM)
  • Period: 8/00 - 7/03

3
Project Description
  • Multidisciplinary project in computer
    architecture and software, and computational
    biology
  • Goals:
  • Design improved algorithms to help solve the
    protein folding problem
  • Design the architecture and software of
    general-purpose parallel machines that speed up
    the solution of the problem

4
Some Recent Progress: Ideas
  • Developed REPSWA
    (Reference Potential Spatial Warping Algorithm)
  • Novel algorithm for accelerating conformational
    sampling in molecular dynamics, a key element in
    protein folding
  • Based on a "spatial warping" variable
    transformation
  • This transformation is designed to shrink barrier
    regions on the energy landscape and grow
    attractive basins without altering the
    equilibrium properties of the system
  • Result: large gains in sampling efficiency
  • "Using novel variable transformations to enhance
    conformational sampling in molecular dynamics," Z.
    Zhu, M. E. Tuckerman, S. O. Samuelson and G. J.
    Martyna, Phys. Rev. Lett. 88, 100201 (2002).

5
Some Recent Progress: Tools
  • Developed LeanMD, a parallel molecular dynamics
    program that targets very large scale parallel
    machines
  • Research-quality program based on the Charm++
    parallel object-oriented language
  • Descendant of NAMD (another parallel molecular
    dynamics application), which achieved unprecedented
    speedup on thousands of processors
  • LeanMD is intended to run on next-generation
    parallel machines with tens of thousands or even
    millions of processors, such as Blue Gene/L or
    Blue Gene/C
  • Requires a new parallelization strategy that
    breaks up the simulation problem in a more
    fine-grained manner to generate enough parallelism
    to distribute work effectively across a million
    processors

6
Some Recent Progress: Tools
  • Developed a high-performance communication
    library
  • For collective communication operations:
  • AlltoAll personalized communication, AlltoAll
    multicast, and AllReduce
  • These operations can be complex and time
    consuming on large parallel machines
  • Especially costly for applications that involve
    all-to-all patterns,
  • such as 3-D FFT and sorting
  • Library optimizes collective communication
    operations
  • by combining messages over an imposed
    virtual topology
  • The overhead of AlltoAll communication for
    76-byte message exchanges between 2058 processors
    is in the low tens of milliseconds
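
The message-combining idea above can be illustrated with a small counting model. The slides do not say which virtual topology the library imposes, so this sketch assumes a square 2-D mesh: each processor first sends one combined message to every processor in its row, and each of those forwards one combined message down its column. The function names are hypothetical and are not the library's API.

```python
import math
from collections import defaultdict

def direct_all_to_all(p):
    """Message count when every processor sends directly to every other."""
    return p * (p - 1)

def mesh_all_to_all(p):
    """All-to-all personalized exchange over an ASSUMED sqrt(p) x sqrt(p) mesh.

    Returns (delivered, messages): delivered[dst] is the set of sources whose
    data reached dst, and messages is the number of combined messages sent.
    """
    side = math.isqrt(p)
    assert side * side == p, "this sketch assumes p is a perfect square"

    messages = 0
    held = defaultdict(list)  # intermediate rank -> list of (src, dst) items

    # Phase 1 (rows): combine items by destination column, then send one
    # message to the row-mate that sits in that column.
    for src in range(p):
        by_col = defaultdict(list)
        for dst in range(p):
            if dst != src:
                by_col[dst % side].append((src, dst))
        for col, items in by_col.items():
            inter = (src // side) * side + col
            if inter != src:          # items kept locally cost no message
                messages += 1
            held[inter].extend(items)

    # Phase 2 (columns): combine items by final destination and send one
    # message per destination in the intermediate's column.
    delivered = defaultdict(set)
    for inter, items in held.items():
        by_dst = defaultdict(list)
        for src, dst in items:
            by_dst[dst].append(src)
        for dst, srcs in by_dst.items():
            if dst != inter:
                messages += 1
            delivered[dst].update(srcs)
    return delivered, messages

delivered, combined = mesh_all_to_all(16)
print(direct_all_to_all(16), combined)  # 240 direct vs 96 combined messages
```

In this model each processor sends 2(sqrt(p) - 1) larger messages instead of p - 1 small ones, which is where the saving comes from when per-message overhead dominates, as it would for 76-byte payloads.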

7
Some Recent Progress: People
  • The following graduate student researchers have
    been supported
  • Sameer Kumar (University of Illinois)
  • Gengbin Zheng (University of Illinois)
  • Jun Nakano (University of Illinois)
  • Zhongwei Zhu (New York University)

8
Overview
  • Rest of the talk
  • Objective: Develop a molecular dynamics program
    that will run effectively on a million processors
  • Each with a low memory-to-processor ratio
  • Method:
  • Use parallel objects methodology
  • Develop an emulator/simulator that allows one to
    run full-fledged programs on a simulated
    architecture
  • Presenting today:
  • Simulator details
  • LeanMD Simulation on BG/L and BG/C

9
Performance Prediction on Large Machines
  • Problem:
  • How to predict performance of applications on
    future machines?
  • How to do performance tuning without continuous
    access to a large machine?
  • Solution:
  • Leverage virtualization
  • Develop a machine emulator
  • Simulator: accurate time modeling
  • Run a program on 100,000 processors using only
    hundreds of processors

10
Blue Gene Emulator: functional view
  [Figure; recoverable labels: affinity message queues, Converse scheduler, Converse Q]

11
Emulator to Simulator
  • Emulator:
  • Study programming model and application
    development
  • Simulator:
  • Adds performance prediction capability
  • Models communication latency based on a network
    model
  • Does not model on-chip memory access or network
    contention
  • Parallel performance is hard to model:
  • Communication subsystem
  • Out-of-order messages
  • Communication/computation overlap
  • Event dependencies
  • Parallel Discrete Event Simulation:
  • Emulation program executes in parallel, with event
    time stamp correction
  • Exploits the inherent determinacy of the application

12
How to simulate?
  • Time stamping events
  • Per thread timer (sharing one physical timer)
  • Time stamp messages
  • Calculate communication latency based on network
    model
  • Parallel event simulation
  • When a message is sent out, calculate the
    predicted arrival time at the destination
    Blue Gene processor
  • When a message is received, update the current time
    as
  • currTime = max(currTime, recvTime)
  • Time stamp correction
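
The time-stamping steps above can be sketched as follows. This is a minimal illustration, not the emulator's code: the class and function names are hypothetical, times are in microseconds, and the linear latency model (fixed cost plus per-byte cost) is an assumption standing in for whatever network model the simulator uses.

```python
from dataclasses import dataclass

@dataclass
class Message:
    payload: str
    recv_time: float  # predicted arrival time at the destination (microseconds)

def network_latency(nbytes, alpha_us=5.0, per_byte_us=0.001):
    """ASSUMED latency model: fixed cost plus a per-byte cost."""
    return alpha_us + per_byte_us * nbytes

class SimProcessor:
    """One simulated processor with its own virtual timer."""

    def __init__(self, name):
        self.name = name
        self.curr_time = 0.0

    def compute(self, duration_us):
        # Local work advances only this processor's virtual clock.
        self.curr_time += duration_us

    def send(self, payload, nbytes):
        # Stamp the message with its predicted arrival time.
        return Message(payload, self.curr_time + network_latency(nbytes))

    def receive(self, msg):
        # currTime = max(currTime, recvTime)
        self.curr_time = max(self.curr_time, msg.recv_time)

sender, idle, busy = SimProcessor("a"), SimProcessor("b"), SimProcessor("c")
sender.compute(10.0)
msg = sender.send("coords", nbytes=1000)
idle.receive(msg)    # idle processor jumps forward to the arrival time
busy.compute(20.0)
busy.receive(msg)    # busy processor was already past it, so its clock is unchanged
print(msg.recv_time, idle.curr_time, busy.curr_time)
```

Taking the maximum means a message can never be processed before it arrives, while a processor that is already busy past the arrival time simply handles the message late.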

13
Parallel correction algorithm
  • Sort message execution by receive time
  • Adjust time stamps when needed
  • Use correction messages to announce a change in an
    event's startTime
  • Send correction messages along the path the
    original message was sent
  • Events already in the timeline may have to
    move
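
The correction steps above can be sketched with a simplified, sequential fixed-point loop. This is only an illustration under stated assumptions: the talk describes a parallel algorithm, whereas this sketch processes one simulated processor at a time; the `Event` fields and the `correct_timestamps` function are hypothetical names; and each send is reduced to a single delay measured from the sending event's start to the message's arrival.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Event:
    name: str
    proc: int
    recv_time: float   # arrival time of the message that triggers this event
    duration: float
    start_time: float = 0.0
    # (target event name, delay from this event's start to arrival at target)
    sends: list = field(default_factory=list)

def correct_timestamps(events):
    """Re-time events until nothing changes; returns the number of corrections."""
    by_name = {e.name: e for e in events}
    dirty = deque(sorted({e.proc for e in events}))
    corrections = 0
    while dirty:
        p = dirty.popleft()
        # Sort this processor's events by receive time.
        timeline = sorted((e for e in events if e.proc == p),
                          key=lambda e: e.recv_time)
        clock = 0.0
        for ev in timeline:
            ev.start_time = max(clock, ev.recv_time)
            clock = ev.start_time + ev.duration
            # Follow the path of each message this event sent: if the arrival
            # time changes, "send a correction" and revisit the target processor.
            for target_name, delay in ev.sends:
                target = by_name[target_name]
                new_recv = ev.start_time + delay
                if new_recv != target.recv_time:
                    target.recv_time = new_recv
                    corrections += 1
                    if target.proc not in dirty:
                        dirty.append(target.proc)
    return corrections

# Emulation ran B before A on processor 0, so C and D carry stale receive times.
events = [
    Event("A", proc=0, recv_time=0.0, duration=5.0, sends=[("C", 6.0)]),
    Event("B", proc=0, recv_time=1.0, duration=3.0, sends=[("D", 4.0)]),
    Event("C", proc=1, recv_time=10.0, duration=2.0),
    Event("D", proc=1, recv_time=5.0, duration=1.0),
]
n = correct_timestamps(events)
print(n, {e.name: e.start_time for e in events})
```

In this example, sorting processor 0's timeline by receive time moves A ahead of B, which changes when both send their messages; the two resulting corrections then reorder C and D on processor 1. The loop terminates as long as the message dependencies form no cycle, which matches the determinacy the approach relies on.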

14-17
Timestamps Correction
  [Figures; no transcript text]

18
Predicted time vs latency factor
Validation

19
LeanMD
  • LeanMD is a molecular dynamics simulation
    application written in Charm++
  • Next generation of NAMD,
  • a Gordon Bell Award winner at SC2002
  • Requires a new parallelization strategy:
  • break up the problem in a more fine-grained
    manner to distribute work effectively across an
    extremely large number of processors
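
To give a feel for what a finer-grained decomposition buys, the sketch below counts work units under two schemes. The slides do not describe LeanMD's actual decomposition, so both schemes are assumptions made for illustration: one object per spatial cell, versus one object per pair of neighboring cells (including each cell paired with itself) on a periodic 3-D grid. The function name is hypothetical.

```python
from itertools import product

def count_work_units(nx, ny, nz):
    """Return (cells, cell_pairs) for a periodic nx x ny x nz grid of cells.

    ASSUMPTION: a cell-pair object exists for every unordered pair of cells
    that touch (face, edge, or corner) plus one self-pair per cell.
    """
    cells = list(product(range(nx), range(ny), range(nz)))
    pairs = set()
    for (x, y, z) in cells:
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbor = ((x + dx) % nx, (y + dy) % ny, (z + dz) % nz)
            # Sorting makes (a, b) and (b, a) the same unordered pair.
            pairs.add(tuple(sorted(((x, y, z), neighbor))))
    return len(cells), len(pairs)

cells, pairs = count_work_units(4, 4, 4)
print(cells, pairs)  # 64 cells, 896 cell pairs
```

Under these assumptions the pair-based scheme yields 14 times as many independent work units as the cell-based one for the same system, which is the kind of extra parallelism needed before work can be spread over a very large processor count.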

20
LeanMD Performance Analysis
Need readable graphs: 1 to a page is fine, but
with larger fonts, thicker lines

21
(No Transcript)