1
Computer Infrastructures for Reliable,
Large-Scale Simulations
  • Michela Taufer
    Global Computing Lab, University of Texas at El Paso
2
What Infrastructures and Simulations?
  • Computer infrastructures
  • Dedicated high-performance systems: clusters and
    SMP machines on your campus
  • Grid computing: the NSF initiative TeraGrid
  • Desktop grid computing and volunteer computing:
    BOINC projects
  • Computer simulations
  • Different methods, e.g., Monte Carlo, Molecular
    Dynamics
  • Heterogeneous workflow, i.e., different phases
    (codes) to accomplish a complete simulation

Given a computer simulation, each infrastructure
has its strengths and weaknesses. Choosing the
proper infrastructure is vital for fast, reliable
simulation results.
3
Outline
  • TeraGrid initiative
  • Volunteer computing and BOINC
  • Two applications on volunteer computing
  • Protein structure prediction
  • Protein-ligand docking
  • Research challenges
  • Research opportunities

4
Grid Computing and TeraGrid
  • Short overview of the TeraGrid project

5
Volunteer Computing and BOINC Projects
  • Computing resources (e.g., desktops, notebooks)
    owned by volunteers and connected through the
    Internet
  • Normally used to address fundamental problems in
    science
  • BOINC (Berkeley Open Infrastructure for Network
    Computing) is a well-known representative of VC
  • The computing power of BOINC is currently about
    420 TeraFLOPS (based on credit granted across all
    projects)
  • The total free disk space on computers running
    SETI@home is 12 Petabytes

[Diagram: a BOINC project. Scientists and developers/administrators operate a
master server, which distributes work units over the Internet to workers
(volunteers' home PCs) and collects their results.]
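The diagram above is the standard master/worker pattern of volunteer computing. A minimal Python sketch of that pattern follows, in which the master issues each work unit redundantly and accepts a result only when a quorum of replies agree; the work-unit contents, the quorum of 2, and the simulated unreliable workers are hypothetical illustrations, not the BOINC API.

```python
# Minimal sketch (not the BOINC API) of the master/worker pattern in the
# diagram above: the master sends each work unit to several volunteer
# workers and accepts a result only when a quorum of replies agree.
import random
from collections import Counter

QUORUM = 2          # matching replies needed before a result is accepted
REPLICATION = 3     # copies of each work unit issued to different workers

def worker(work_unit):
    """Simulated volunteer PC: usually returns the right answer, sometimes junk."""
    if random.random() < 0.1:                 # unreliable or misconfigured host
        return None
    return sum(x * x for x in work_unit)      # stand-in for the real computation

def master(work_units):
    """Issue each work unit REPLICATION times and keep quorum-validated results."""
    accepted = {}
    for wu_id, wu in enumerate(work_units):
        replies = Counter(worker(wu) for _ in range(REPLICATION))
        result, votes = replies.most_common(1)[0]
        if result is not None and votes >= QUORUM:
            accepted[wu_id] = result          # validated by agreement
    return accepted

if __name__ == "__main__":
    units = [list(range(n, n + 5)) for n in range(10)]
    print(master(units))
```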
6
Predictor@home Project Goals
  • CASP: Critical Assessment of Techniques for
    Protein Structure Prediction
  • Biennial competition that aims to advance
    research in structure prediction methods
  • Between June 2004 and August 2004, 64 targets
    (sequences of amino acids whose protein structure
    was unknown) were ultimately solved
    experimentally for comparison with the
    participants' predictions
  • Our previous experiences in CASP4 / CASP5
  • Focus on development of algorithms for structure
    prediction
  • Our objective in CASP6
  • Improve predictions over previous methods by
    augmenting conformational sampling by orders of
    magnitude
  • Our approach
  • Deploy a structure prediction supercomputer
    based on the volunteer computing paradigm:
    Predictor@Home
  • Our final goal
  • Test the hypothesis that the significantly
    increased sampling affordable with Predictor@Home
    indeed improves the quality of structure prediction

7
Predictor@home Heterogeneous Workflow
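This slide illustrates the heterogeneous workflow mentioned earlier: different codes (phases) chained to accomplish one complete simulation. Below is a minimal sketch of that idea, assuming a hypothetical cheap sampling phase whose candidates feed a more expensive refinement phase; the functions, scores, and sequence are placeholders, not the actual Predictor@home programs.

```python
# Sketch of a heterogeneous workflow: two different codes (phases) chained to
# accomplish one complete simulation. Both phase functions are hypothetical
# placeholders, not the actual Predictor@home programs.
import random

def sampling_phase(sequence, n_conformations=1000):
    """Phase 1: cheap, massively parallel conformational sampling."""
    # Each candidate conformation gets a crude score; lower is better.
    return [(random.uniform(0.0, 100.0), f"conf_{i}") for i in range(n_conformations)]

def refinement_phase(candidates, keep=10):
    """Phase 2: a more expensive code refines only the most promising candidates."""
    best = sorted(candidates)[:keep]
    return [(score * 0.9, name) for score, name in best]   # pretend refinement lowers the score

def run_workflow(sequence):
    candidates = sampling_phase(sequence)   # output of phase 1 is the input of phase 2
    return refinement_phase(candidates)

if __name__ == "__main__":
    for score, name in run_workflow("MKTAYIAKQR")[:3]:
        print(f"{name}: {score:.2f}")
```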
8
Predictor@home Significant Results
  • Over three months, from June 2004 to August 2004
  • 6786 users joined the project, providing a total
    compute time of about 12 billion seconds (380
    years; see the quick check after this list)
  • P@H has identified four types of targets
  • Easy targets, based on good templates, do not
    benefit from extensive sampling: P@H and a
    dedicated cluster provide similar results over
    the same interval of time
  • Medium-difficulty targets, based on loose templates
    of unrelated proteins, benefit from high P@H
    sampling: P@H provides better results than a
    dedicated cluster over the same interval of time
  • Hard targets, without a template and up to 300
    amino acids long, still benefit from high P@H
    sampling, as for medium-difficulty targets
  • Very hard targets, without a template, longer than
    300 amino acids, and with multiple domains, show
    severe limitations of the model in capturing the
    multi-domain structure: P@H and a dedicated cluster
    provide poor results over the same interval of time
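A quick check of the compute-time figure quoted above, assuming a 365-day year:

```python
# Quick check: 12 billion CPU-seconds expressed in years (365-day year assumed).
total_seconds = 12e9
seconds_per_year = 365 * 24 * 3600       # 31,536,000 s
print(total_seconds / seconds_per_year)  # ~380.5, matching the "380 years" on the slide
```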

9
Predictor@home Prediction Samples
[Figure: experimental structures shown next to P@H predictions]
  • Comparative Modeling (easy): target t0277, 119
    residues, GDT 80.34, RMSD 1.88
  • Fold Recognition (medium): target t0274, 159
    residues, GDT 71.63, RMSD 3.40
  • New Fold (hard): target t0201, 94 residues,
    GDT 43.88, RMSD 5.80
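The quality numbers above are two standard metrics: RMSD (root-mean-square deviation between predicted and experimental coordinates) and GDT_TS (average percentage of residues within 1, 2, 4, and 8 Å of the experiment). A minimal sketch of both, assuming predicted and experimental C-alpha coordinates that are already superimposed; the random coordinates below are made-up test data, not CASP targets.

```python
# Minimal sketch of the two metrics quoted above, assuming the predicted and
# experimental C-alpha coordinates are already superimposed.
# RMSD: root-mean-square deviation over atoms (in Angstrom).
# GDT_TS: average percentage of residues within 1, 2, 4, and 8 A of the experiment.
import numpy as np

def rmsd(pred, ref):
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

def gdt_ts(pred, ref, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    dist = np.linalg.norm(pred - ref, axis=1)
    return 100.0 * float(np.mean([np.mean(dist <= c) for c in cutoffs]))

if __name__ == "__main__":
    ref = np.random.rand(119, 3) * 10.0                    # made-up 119-residue target
    pred = ref + np.random.normal(0.0, 1.0, ref.shape)     # made-up prediction
    print(f"RMSD {rmsd(pred, ref):.2f}  GDT_TS {gdt_ts(pred, ref):.2f}")
```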
10
Docking@home Objectives and Research Fields
  • Objectives
  • to explore the multi-scale nature of algorithmic
    adaptations in protein-ligand docking (sketched
    after this list)
  • protein-ligand representation: spanning a scale
    from rigid to flexible representations of
    protein-ligand interactions
  • solvent representation: spanning a scale from less
    accurate to more accurate treatment of water
  • sampling strategy: spanning a scale from fixed to
    adaptive sampling of the protein-ligand docking
    space
  • to develop cyber infrastructures based on
    volunteer computing that efficiently accommodate
    these adaptations
  • Interdisciplinary research fields
  • docking methods (Drs. Charles L. Brooks III at
    TSRI and Michela Taufer at UTEP)
  • decision theory (Dr. Martine Ceberio at UTEP)
  • modeling for dynamic adaptation (Drs. Patricia J.
    Teller and Michela Taufer at UTEP)
  • volunteer computing (Drs. David P. Anderson at UC
    Berkeley and Michela Taufer at UTEP)
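A small sketch of the three adaptation axes named under the objectives above, written as a run configuration. The enum labels (including the implicit/explicit water readings of "less accurate to more accurate") and the coarse/accurate example passes are hypothetical illustrations of the scales named on the slide, not Docking@home code.

```python
# Hypothetical run configuration capturing the three adaptation axes named in
# the objectives above; the labels and example passes are illustrative only.
from dataclasses import dataclass
from enum import Enum

class Representation(Enum):
    RIGID = "rigid protein-ligand representation"
    FLEXIBLE = "flexible protein-ligand representation"

class Solvent(Enum):
    IMPLICIT = "less accurate water treatment (e.g., implicit solvent)"
    EXPLICIT = "more accurate water treatment (e.g., explicit solvent)"

class Sampling(Enum):
    FIXED = "fixed sampling of the docking space"
    ADAPTIVE = "adaptive sampling of the docking space"

@dataclass
class DockingRun:
    representation: Representation
    solvent: Solvent
    sampling: Sampling

# Example: a cheap screening pass followed by a more accurate follow-up pass.
coarse = DockingRun(Representation.RIGID, Solvent.IMPLICIT, Sampling.FIXED)
accurate = DockingRun(Representation.FLEXIBLE, Solvent.EXPLICIT, Sampling.ADAPTIVE)
print(coarse)
print(accurate)
```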

11
Docking@home Portal
http://docking.utep.edu