Title: ProteinShop: A Tool for Protein Structure Prediction and Modeling
1 ProteinShop: A Tool for Protein Structure Prediction and Modeling
Silvia Crivelli
Computational Research Division
Lawrence Berkeley National Laboratory
2 The Protein Structure Prediction Problem
- To determine how proteins, the building blocks of living cells, fold themselves into three-dimensional shapes that define the role they play in life.
3 Importance of Protein Structure Prediction
- The shape of a protein determines its function.
- Knowledge of structure is used in many ways:
- Drug design
- Design of synthetic proteins
- Re-engineering defective proteins
- Genome projects are providing sequences for many proteins whose structure will need to be determined.
5 Protein Structures
Proteins consist of a long chain of amino acids, the primary structure (residues shown: Pro, Gly, Leu, Ser).
The constituent amino acids may encourage hydrogen bonding that forms regular structures, called secondary structures: the α-helix and the β-sheet.
The secondary structures fold together to form a compact three-dimensional shape, called the tertiary structure.
6 Ab Initio Approach
Our goal: to provide an approach that relies more
on physical principles than on information from
known proteins
The problem can be formulated as a global
minimization problem, as it is assumed that the
tertiary structure occurs at the global minimum
of the free energy function of the primary
sequence
7 Ab Initio Method
Tertiary structure is believed to minimize potential energy: $\min_x V_{MM}(x)$, where $x$ denotes the atom coordinates.
Difficulties
8 The Search Algorithm
Given the amino acid sequence of a protein, find the global minimum of the free energy function.
Phase 1: Generate starting configurations
Phase 2: Global optimization
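The two-phase idea can be illustrated with a toy sketch: generate several starting points, refine each with a local minimizer, and keep the lowest-energy result. Everything here is an assumption made for illustration: the 1-D function `toy_energy` stands in for a real energy function, the evenly spaced starts stand in for Phase 1 configurations, and plain gradient descent stands in for the local minimizer. It is not the search algorithm used by the authors.

```python
import math


def toy_energy(x):
    """Hypothetical 1-D stand-in for an energy function with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def toy_gradient(x):
    """Analytic derivative of toy_energy."""
    return 0.2 * x + 3.0 * math.cos(3.0 * x)


def local_minimize(x, step=0.01, iterations=5000):
    """Plain gradient descent; settles into the local minimum of the start's basin."""
    for _ in range(iterations):
        x -= step * toy_gradient(x)
    return x


def two_phase_search(num_starts=40, low=-6.0, high=6.0):
    # Phase 1 (toy version): generate starting configurations on an even grid.
    starts = [low + (high - low) * i / (num_starts - 1) for i in range(num_starts)]
    # Phase 2 (toy version): refine every start, keep the lowest-energy result.
    minima = [local_minimize(x) for x in starts]
    return min(minima, key=toy_energy)


best = two_phase_search()
print(best, toy_energy(best))
```

A single local minimization started from an arbitrary point would usually end in a nearby local minimum, which is why the quality and diversity of the starting configurations matters.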
9 Secondary Structure Predictions in Phase 1
Servers predict secondary structure likely to be in a target protein based on a large database of known proteins.
Sequence  SKIGIDGFGRIGRLVLRAALSCGAQ
Type      CBBBBBCCCAAAAAAACCCBBBBBC
Weight    1135522356789992888566733
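A per-residue prediction like the one above is easier to work with once it is collapsed into segments. The sketch below does that with the slide's own strings, with the extraction gaps in the type string removed. The legend is an assumption, since the slide does not state it: A = helix, B = strand, C = coil, with the weight read as a per-residue confidence digit. The function names are hypothetical.

```python
from itertools import groupby

SEQUENCE = "SKIGIDGFGRIGRLVLRAALSCGAQ"
# Assumed legend (not stated in the source): A = helix, B = strand, C = coil.
TYPES = "CBBBBBCCCAAAAAAACCCBBBBBC"
# Assumed to be a per-residue confidence digit.
WEIGHTS = "1135522356789992888566733"


def segments(types):
    """Collapse a per-residue type string into (type, start, end) runs, end exclusive."""
    runs = []
    position = 0
    for kind, group in groupby(types):
        length = len(list(group))
        runs.append((kind, position, position + length))
        position += length
    return runs


def strands(types, label="B"):
    """Return (start, end) index pairs of the runs carrying the given label."""
    return [(start, end) for kind, start, end in segments(types) if kind == label]


# The three rows must line up residue by residue.
assert len(SEQUENCE) == len(TYPES) == len(WEIGHTS)

for start, end in strands(TYPES):
    print(start, end, SEQUENCE[start:end])
```

With this legend the example contains two predicted strands, which are the pieces that would have to be paired when building a β-sheet.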
10 Matching the predicted strands is a combinatorial problem
Which strands are paired?
Which orientation? Anti-parallel or parallel
Which residues are paired? Even or odd
11 There are $n!\,2^{n-2}$ possible n-stranded motifs:
96 motifs for n = 4, 960 motifs for n = 5
It takes weeks to create some of
these configurations using constrained
local minimizations!
Ruczinski, Kooperberg, Bonneau, and Baker, "Distribution of Beta Sheets in Proteins with Applications to Structure Prediction," Proteins 48, 2002
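The motif count grows quickly with the number of strands. A minimal sketch of the formula quoted on this slide, n! · 2^(n-2), reproduces the two values given for n = 4 and n = 5; the function name `motif_count` is hypothetical.

```python
from math import factorial


def motif_count(n):
    """Number of n-stranded motifs per the slide's formula: n! * 2**(n - 2)."""
    if n < 2:
        raise ValueError("the formula is only meaningful for n >= 2")
    return factorial(n) * 2 ** (n - 2)


for n in (4, 5, 6, 7):
    print(n, motif_count(n))
```

Each extra strand multiplies the count by 2n, so exhaustively building every motif with constrained local minimizations soon becomes impractical.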
12 CASP4 Competition
- Fourth community-wide experiment on the Critical Assessment of Techniques for Protein Structure Prediction (2000)
- Our group predicted 8 proteins
- Largest protein had 240 aa
- Most complex fold had 2 β-strands
13 ProteinShop
- Interactive tool for protein manipulation
- Designed to quickly create initial configurations
- It takes weeks to create a number of configurations using constrained minimizations
- It takes a few hours to create the same configurations with ProteinShop
14 Phase 1 with ProteinShop
[Diagram labels: Amino Acid Sequence, Structure Sequence, Pre-configuration, Initial Configurations, Final Configuration]
ProteinShop takes minutes
20 CASP4 Competition (before ProteinShop)
- Our group predicted 8 proteins
- Largest protein had 240 aa
- Most complex fold had 2 β-strands
CASP5 Competition (with ProteinShop)
- Our group predicted 20 proteins
- Largest protein had 417 aa
- Most complex fold had 13 β-strands
21 Phase 2
[Diagram labels: Amino Acid Sequence, Initial Configurations, Final Configuration]
Takes months to converge using hundreds of processors on Seaborg!
22 Phase 2 with ProteinShop
[Diagram labels: Amino Acid Sequence, Initial Configurations, Steering System, Final Configuration]
Will reduce computation time
24 Monitoring System
- Monitor progress of overall optimization/each optimization process
- Alert user to important events during optimization:
- A sudden drop in internal energy
- A group of processes getting stuck
- Test new heuristics for expanding nodes of the tree
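The two alert conditions listed above can be sketched as simple checks over each process's energy history. This is an illustrative sketch only: the data layout (a dict of process id to a list of energies), the function names, and the thresholds are all assumptions, not part of ProteinShop.

```python
def sudden_drops(energies, threshold):
    """Indices i where the energy fell by more than `threshold` since step i - 1."""
    return [
        i for i in range(1, len(energies))
        if energies[i - 1] - energies[i] > threshold
    ]


def is_stuck(energies, window, tolerance):
    """True if the last `window` energies vary by no more than `tolerance`."""
    if len(energies) < window:
        return False
    recent = energies[-window:]
    return max(recent) - min(recent) <= tolerance


def scan(histories, drop_threshold=50.0, window=4, tolerance=0.5):
    """Return (process_id, event) alerts; thresholds are hypothetical."""
    alerts = []
    for process_id, energies in sorted(histories.items()):
        if sudden_drops(energies, drop_threshold):
            alerts.append((process_id, "drop"))
        if is_stuck(energies, window, tolerance):
            alerts.append((process_id, "stuck"))
    return alerts


# Made-up energy histories for three optimization processes.
histories = {
    0: [-100.0, -104.0, -180.0, -183.0],
    1: [-90.0, -90.1, -90.2, -90.2],
    2: [-95.0, -99.0, -103.0, -108.0],
}
print(scan(histories))
```

In practice the thresholds would need tuning to the energy scale of the function being minimized.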
25 Steering System
- Change configurations during optimization to account for developments not anticipated during Phase 1
- Manipulate proteins that don't seem to be realistic or that are stuck in a local minimum
- Allow pruning of the optimization tree
- Assign multiple processes to a configuration that just had a drop in internal energy
- Assign stuck processes to other configurations
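The last two steering actions amount to changing which process works on which configuration. A minimal sketch, assuming a plain dict from process id to configuration id and a round-robin policy (both hypothetical choices, not the tool's actual mechanism):

```python
def reassign(assignment, stuck, promising):
    """Move each stuck process to a promising configuration, round-robin.

    assignment: dict mapping process id -> configuration id
    stuck: iterable of process ids judged to be stuck
    promising: list of configuration ids that just had an energy drop
    """
    updated = dict(assignment)  # leave the caller's mapping untouched
    if not promising:
        return updated
    for i, process_id in enumerate(sorted(stuck)):
        updated[process_id] = promising[i % len(promising)]
    return updated


# Made-up example: processes 1 and 3 are stuck, cfgA just had an energy drop.
assignment = {0: "cfgA", 1: "cfgB", 2: "cfgC", 3: "cfgD"}
print(reassign(assignment, stuck=[1, 3], promising=["cfgA"]))
```

After the call, several processes share the promising configuration, while processes that were not flagged keep their original work.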
26 Plans for the Future
- Use of the monitoring and steering features to develop and test a new method for protein structure prediction
- Compete in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction)
- Expand and enhance ProteinShop
27 ProteinShop
O. Kreylos, N. Max, B. Hamann, S. Crivelli, and W. Bethel. "Interactive Protein Manipulation." Winner of the Best Application Award, IEEE Visualization 2003, Seattle.
Available to academic and non-profit organizations: proteinshop.lbl.gov