Biomolecular Path Sampling Enabled by Processing in Network Storage



1
Biomolecular Path Sampling Enabled by Processing
in Network Storage
P. Brenner, J. Wozniak, D. Thain, A. Striegel,
J. Peng, and J. Izaguirre
  • Department of Computer Science and Engineering
  • Department of Chemistry and Biochemistry
  • University of Notre Dame

2
Research Motivation
  • Scientific discovery and optimal design through
    shared heterogeneous grid resources
  • Collaborative computing promotes max utilization
  • Shared capital expenditures
  • Computational biochemistry
  • Conformation Sampling
  • Committer Probabilities
  • Rate Calculations
  • Docking

3
Our Grid Environment
  • Heterogeneous
  • Operating Systems, Architecture, Software
  • Autonomous
  • Frequent random evictions by owner/user
  • Both computation and storage
  • Incremental checkpointing (not dynamic)
  • Testbed snapshot (www.nd.edu/ccl)
  • Computation: 468 CPUs
  • Storage: 31.3 TB
  • Bandwidth: 10 Mb/s, 100 Mb/s, 1000 Mb/s

4
The PINS Framework
  • Condor matchmaking
  • Locality aware storage (file generation,
    replication, access)
  • Minimal client side bandwidth consumption
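The locality-aware idea above can be illustrated with a small placement helper: among candidate hosts, prefer those that already hold a replica of the data, so the job runs where the files are. This is only a sketch under assumptions. The function name `rank_hosts` and the host names are hypothetical, and this is not the Condor matchmaking interface or the PINS implementation.

```python
def rank_hosts(hosts, replica_hosts):
    """Order candidate hosts so that replica holders come first.

    hosts         -- candidate execution hosts, in their original preference order
    replica_hosts -- hosts known to store a copy of the input data (hypothetical input)
    """
    holders = set(replica_hosts)
    # sorted() is stable: within each group the original order is preserved.
    # The key is False (sorts first) for hosts that hold a replica.
    return sorted(hosts, key=lambda h: h not in holders)


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(rank_hosts(["node-a", "node-b", "node-c"], ["node-c"]))
```

Running a job on a replica holder keeps the data traffic inside the storage network, which is consistent with the goal of minimal client-side bandwidth consumption.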

5
The GEMS Framework
  • Shared storage fabric built upon Chirp
  • Abstracted locality with hybrid database
    interface

6
  • Shared

7
Robust
  • Server upgrade: all jobs expired
  • Re-ran submission script: automatic continuation
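The recovery described above (re-run the submission script and let it continue automatically) can be sketched as an idempotent submission step that skips work already completed. This is a minimal sketch, not the authors' script: the `<job_id>.done` marker-file convention and the `pending_jobs` helper are assumptions made for illustration.

```python
from pathlib import Path


def pending_jobs(job_ids, output_dir):
    """Return the job ids that still need to be (re)submitted.

    Assumption: every finished job leaves a marker file named
    '<job_id>.done' in output_dir. This convention is hypothetical.
    """
    out = Path(output_dir)
    return [job for job in job_ids if not (out / f"{job}.done").exists()]
```

Because the function only reads the current state of the output directory, running it again after a mass job expiry resubmits only the unfinished jobs.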

8
High resolution data generation
  • In-network bandwidth utilization
  • Data generated and stored in place
  • Prioritized replication
  • Collocation weighted post processing
  • Locality aware remote access capability
  • Biomolecular Example
  • 100 trial trajectory paths to capture committer
    probabilities for side chain (dihedral) motion
  • Each high resolution trajectory is 3GB
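A rough calculation shows why generating and storing the data in place matters. With 100 trajectories at 3 GB each, the full data set is 300 GB. The sketch below estimates how long it would take to pull that much data to a client over the testbed's bandwidth tiers. It assumes decimal gigabytes, that the bandwidth figures are megabits per second, and an ideal link with no protocol overhead, so real transfers would be slower.

```python
def transfer_hours(size_gb, link_mbps):
    """Ideal time in hours to move size_gb gigabytes over a link_mbps link.

    Assumptions: 1 GB = 1e9 bytes, 1 Mb/s = 1e6 bits per second,
    and the link runs at full rate with no overhead.
    """
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


total_gb = 100 * 3  # 100 trajectories, 3 GB each
for mbps in (10, 100, 1000):
    print(f"{mbps:>4} Mb/s: {transfer_hours(total_gb, mbps):.1f} h")
```

Under these assumptions the transfer takes roughly 67 hours at 10 Mb/s, just under 7 hours at 100 Mb/s, and about 40 minutes at 1000 Mb/s.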

9
Preliminary WWd Paths
  • Explicitly solvated NVT trajectory
  • The Order Parameter
  • ARG12 backbone dihedral
  • Investigating correlations of side chain motion
    to NMR
  • Other possible contributions
  • Transition Path Sampling
  • Chandler, Dellago, Bolhuis
  • Committers/Transmission
  • Snow, Rhee, and Pande (2006)

10
Path Points
Starting Dihedral Value
  • 50 trajectories from 2 trial points
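For a trial point, the committer probability is commonly estimated as the fraction of trajectories launched from that point that reach the product basin before the reactant basin. The sketch below shows that estimate with a binomial standard error. The basin tests `in_a` and `in_b` are placeholders supplied by the caller, and the function names are hypothetical; the presentation does not give the basin definitions for the dihedral order parameter.

```python
import math


def first_basin(dihedrals, in_a, in_b):
    """Return 'A' or 'B' for the first basin a trajectory visits, else None.

    dihedrals  -- order-parameter values along one trajectory
    in_a, in_b -- caller-supplied predicates defining the basins (assumed)
    """
    for phi in dihedrals:
        if in_a(phi):
            return "A"
        if in_b(phi):
            return "B"
    return None


def committer_estimate(outcomes):
    """Fraction of committed trajectories that reached B first.

    Returns (p_b, standard_error). Trajectories that reached neither
    basin (None) are excluded from the estimate.
    """
    committed = [o for o in outcomes if o in ("A", "B")]
    n = len(committed)
    if n == 0:
        raise ValueError("no committed trajectories")
    p_b = committed.count("B") / n
    std_err = math.sqrt(p_b * (1.0 - p_b) / n)
    return p_b, std_err
```

If a trial point is sampled with 50 trajectories and the estimate is near 0.5, the standard error is about 0.07, which indicates the resolution that sample size can provide.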

11
Tradeoffs and Challenges
  • Efficiency
  • Total computation time can be much less than the
    observed completion time
  • Large variation in CPUs (utilization vs speed)
  • Large variation in resource evictions
  • Checkpointing frequency
  • Computation efficiency is proportional
  • PINS/GEMS overhead inversely proportional
  • Additional scripting for fault tolerance
  • Automatic continuation is not currently an
    application generic framework feature
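The checkpointing tradeoff can be made concrete with a toy model. It is not taken from the presentation: it assumes a job runs in uninterrupted windows between evictions, writes a checkpoint after every `interval` hours of work at a fixed `cost` in hours, and loses any work done since the last completed checkpoint when it is evicted. All names and numbers are illustrative.

```python
def checkpoint_efficiency(windows, interval, cost):
    """Fraction of available time that ends up as checkpointed work.

    windows  -- lengths (hours) of uninterrupted run time between evictions
    interval -- hours of computation between checkpoints
    cost     -- hours spent writing one checkpoint
    """
    if interval <= 0 or cost < 0:
        raise ValueError("interval must be positive and cost non-negative")
    cycle = interval + cost  # one unit of work plus its checkpoint
    # Only complete cycles survive an eviction; partial work is lost.
    saved = sum(int(w // cycle) * interval for w in windows)
    total = sum(windows)
    return saved / total if total else 0.0


if __name__ == "__main__":
    windows = [10.0, 4.0, 7.0]  # hypothetical eviction pattern
    for interval in (0.25, 1.0, 5.0):
        eff = checkpoint_efficiency(windows, interval, cost=0.25)
        print(f"interval {interval} h: efficiency {eff:.2f}")
```

In this model, very short intervals spend most of the time on checkpoint overhead, while very long intervals lose more work at each eviction, so efficiency peaks at an intermediate checkpointing frequency.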

12
Acknowledgements
  • Advisor
  • Dr. Jesús Izaguirre
  • Funding Agency
  • NSF Grant DBI-0450067
  • Collaborators
  • Distributed Systems: Dr. Striegel, Dr. Thain, Mr. Wozniak
  • Biochemistry: Dr. Peng
  • Website
  • http://gipse.cse.nd.edu/GEMS/

13
  • Questions