Protein Structure Prediction - PowerPoint PPT Presentation

Provided by: Jon2158

Transcript and Presenter's Notes

Title: Protein Structure Prediction

1
Protein Structure Prediction
2
Historical Perspective

Protein Folding: From the Levinthal Paradox to
Structure Prediction, Barry Honig, 1999
A personal perspective on advances and
developments in protein folding over the last 40
years

3
Levinthal Paradox

Cyrus Levinthal, Columbia University, 1968
Observed that there is insufficient time to
randomly search the entire conformational space
of a protein
Resolution: proteins have to fold through some
directed process
Goal is to understand the dynamics of this process

4
Old vs. New Views

Old
Hierarchical view of protein folding
Secondary structures form, then interact to form
tertiary structures
General order of events
New
Statistical ensembles of states
Potential energy landscape
Folding Funnel
Not all that different: the most important ideas
were theorized many years ago

5
Secondary Structures

Consensus view is that secondary structure
formation is the earliest part of the folding
process
Numerous studies indicate that local sequence
codes for local structures
Helical sequences in a folded protein tend to be
helical in isolation
Current SSE prediction algorithms are about 70%
correct (1993). Failures indicate that tertiary
interactions play some role in stabilizing SSEs

6
However

Not clear what sequence elements code for overall
topology
One factor is the existence of hydrophobic faces
on the surface of SSEs
Still challenges in predicting topology of SSEs,
even when protein class is known
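
One common way to quantify a hydrophobic face is the hydrophobic moment of a segment. The sketch below is a minimal illustration and not a method taken from the slides: it assumes the Kyte-Doolittle hydropathy scale and an ideal alpha-helix rotation of 100 degrees per residue, and the two test sequences are made up.

```python
import math

# Kyte-Doolittle hydropathy values (choice of scale is an assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, angle_deg=100.0):
    """Length-normalized hydrophobic moment of a segment.

    angle_deg is the rotation per residue around the helix axis
    (about 100 degrees for an ideal alpha-helix).
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(KYTE_DOOLITTLE[aa] * math.sin(i * delta)
                  for i, aa in enumerate(sequence))
    cos_sum = sum(KYTE_DOOLITTLE[aa] * math.cos(i * delta)
                  for i, aa in enumerate(sequence))
    return math.hypot(sin_sum, cos_sum) / len(sequence)


# Hypothetical 18-residue segments (five full helical turns).
# Leucine is placed wherever the residue points to one side of the helix.
amphipathic = "".join(
    "L" if math.cos(math.radians(100 * i)) > 0 else "K" for i in range(18)
)
uniform = "L" * 18

print(amphipathic, round(hydrophobic_moment(amphipathic), 2))
print(uniform, round(hydrophobic_moment(uniform), 2))
```

A large moment means the hydrophobic residues cluster on one side of the helix; a uniformly hydrophobic segment has no preferred face, so its moment is close to zero.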

7
Atomic level calculations

Molecular calculations have made great impact in
our understanding of protein folding
Harold Scheraga, 1968
Shneior Lifson, 1969
Martin Karplus's laboratory, 1979
Early calculations had trouble dealing with
solvent effects

8
Secondary Structure

Many of the essential elements of protein
energetics can be derived from looking at SSE
formation
Early experimental work: Ingwall et al., 1968
Baldwin et al., 1989, worked on stabilizing
shorter helices
Dyson, Wright, 1991, demonstrated that even short
peptides in solution can be partially structured

9
Results

Yang and Honig, 1995
Alpha-helices are stabilized by hydrophobic
interactions and close packing; hydrogen bonding
has little effect
Beta-sheets stabilized by non-polar interactions
between residues on adjacent strands
Work supports idea that SSEs coded for locally in
the sequence

10
Folding Pathways

SSEs can change conformation in the presence of a
relatively small number of tertiary interactions
Free-energy difference between alpha-helix,
beta-sheet, and coil is not great
Individual helices can be changed into
beta-sheets by changing just a few amino acids
This suggests that proteins have a structural
plasticity which allows for changes in
conformation
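
To see why a small free-energy difference matters, the Boltzmann relation gives the equilibrium population ratio of two states. The sketch below is illustrative only; the free-energy gaps and the temperature are assumed example values, not measurements from the slides.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def population_ratio(delta_g_kcal_per_mol, temperature_k=298.15):
    """Equilibrium ratio p_B / p_A for two states with G_B - G_A = delta_g."""
    return math.exp(-delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Hypothetical free-energy gaps between two conformations, in kcal/mol.
for delta_g in (0.5, 1.0, 3.0):
    print(delta_g, round(population_ratio(delta_g), 3))
```

When the gap is on the order of 1 kcal/mol, the less stable conformation is still appreciably populated, so a handful of substitutions or tertiary contacts can be enough to tip the balance.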

11
Folding Pathways

Early in folding processes, many different
combinations of SSEs have very similar
stabilities
In the end, it is the tertiary interactions which
drive towards the native topology
Early in folding, SSEs flicker, are eventually
stabilized by tertiary interactions, and converge
to the native state
Suggests that multiple folding pathways exist,
which can all lead to the same end result once
stabilized

12
Structure Prediction

Recently, a split has been seen
Protein prediction problem
Trying to predict the end result of folding,
using a large amount of comparison between known
and unknown structures
Protein folding problem
Trying to understand the folding path which leads
to the end result of folding, typically by MD
simulations or energy calculation
The author's contention is that both areas will
need to be used together to fully understand
protein folding

13
PrISM

Yang and Honig, 1999
Software suite which integrates prediction based
on simulations and known information about
structures
Sequence analysis
Structure based sequence alignment
Fast structure-structure superposition using a
structural domain database
Multiple Structure alignment
Fold recognition and homology model building
Used to make predictions for all 43 targets of
CASP3 conference (more on CASP later)

14
Conclusions

Much of the current understanding of protein
folding was theorized long ago
Vague and speculative ideas have been replaced by
carefully defined theoretical concepts and
rigorous experimental observations

15
Conclusions

Polypeptide backbone is the most important
determinant of structure
SSEs are metastable; the statement that sequence
determines structure is not wholly accurate
More accurate statement is that sequence chooses
from a limited set of available SSEs and
determines how they are ordered in space

16
Conclusions

Free-energy differences between alternate
conformations are not large; this may provide a
basis for rapid evolutionary change

17
CASP

A decade of CASP: progress, bottlenecks and
prognosis in protein structure prediction, John
Moult
CASP: Critical Assessment of Structure
Prediction
First held in 1994, every 2 years afterwards
Teams make structure predictions from sequences
alone

18
CASP

Two categories of predictors
Automated
Automatic Servers, must complete analysis within
48 hours
Shows what is possible through computer analysis
alone
Non-automated
Groups spend considerable time and effort on each
target
Utilize computer techniques and human analysis
techniques

19
CASP

CASP6, 2004
200 prediction teams from 24 countries
Over 30,000 predictions for 64 protein targets
collected and evaluated
Conference held after to discuss results, with
many teams presenting individual results and
methodologies
Helps to steer future work

20
Modeling classes

Comparative modeling based on a clear sequence
relationship
Modeling based on more distant evolutionary
relationships
Modeling based on non-homologous fold
relationships
Template free modeling

21
Comparative modeling based on a clear sequence
relationship

Easily detectable sequence relationship between
the target protein and one or more known protein
structures, typically through BLAST
Copy from template; however:
Must align target and template sequences
In general, reliably building regions not present
in the template is still a challenge
Sidechain accuracy is poor
Refinement remains a challenge
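
The alignment step between target and template sequences can be sketched with a basic global alignment. This is a minimal Needleman-Wunsch illustration, not the procedure used by any CASP group: the match, mismatch and linear gap scores are assumed toy values, whereas real pipelines typically use substitution matrices and affine gap penalties.

```python
def global_align(target, template, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_target, aligned_template).
    """
    rows, cols = len(target) + 1, len(template) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if target[i - 1] == template[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_target, aligned_template = [], []
    i, j = len(target), len(template)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if target[i - 1] == template[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                aligned_target.append(target[i - 1])
                aligned_template.append(template[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_target.append(target[i - 1])
            aligned_template.append("-")
            i -= 1
        else:
            aligned_target.append("-")
            aligned_template.append(template[j - 1])
            j -= 1

    return (score[-1][-1],
            "".join(reversed(aligned_target)),
            "".join(reversed(aligned_template)))


# Hypothetical toy sequences: the template lacks one residue of the target.
best_score, target_row, template_row = global_align("ACGT", "ACT")
print(best_score)
print(target_row)
print(template_row)
```

Gap columns in the template row mark target residues with no template equivalent, which are the regions the slide describes as hard to build reliably.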

22
Comparative modeling based on a clear sequence
relationship

Progress in MD needed for refinement
Models useful for identifying which members of a
protein family have similar functionalities, and
which are different

23
Modeling based on more distant evolutionary
relationships

Makes use of PSI-BLAST and hidden Markov models
Compile a profile for the sequence, compare this
profile to other known profiles
Allows for prediction of structures, even when
sequence is not close
Between CASP4 and CASP5, use of metaservers to
find consensus structures led to improved
accuracy
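
The idea of compiling a profile and comparing it to other profiles can be sketched in a few lines. This is a deliberately simplified illustration and not how PSI-BLAST or hidden Markov models work internally: it assumes pre-aligned, equal-length families, a flat pseudocount, and an ungapped column-by-column dot-product score. All sequences are made up.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_sequences, pseudocount=0.5):
    """Per-column amino-acid frequencies for a set of aligned sequences."""
    profile = []
    for column in zip(*aligned_sequences):
        # Gap characters and unknown symbols are ignored.
        counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total
                        for aa in AMINO_ACIDS})
    return profile


def profile_similarity(profile_a, profile_b):
    """Mean dot product of matching columns (ungapped, equal length)."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    column_scores = [
        sum(col_a[aa] * col_b[aa] for aa in AMINO_ACIDS)
        for col_a, col_b in zip(profile_a, profile_b)
    ]
    return sum(column_scores) / len(column_scores)


# Hypothetical toy families.
query = build_profile(["ACDE", "ACDE", "ACDF"])
related = build_profile(["ACDE", "GCDE"])
unrelated = build_profile(["WWWW", "WYWW"])

print(profile_similarity(query, related) > profile_similarity(query, unrelated))
```

Because each column summarizes what a whole family tolerates at that position, two profiles can score as similar even when no single pair of sequences is close.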

24
Modeling based on more distant evolutionary
relationships

Limitations
Correct template may not be identified
Alignment of target sequence to template is not
trivial
Significant fraction of residues will have no
structural equivalent in the template; modeling
of these regions is hit or miss
Although regions are similar, they are not
identical, and the greater the difference, the
higher the error
Details are thus not accurate, but overall
structure can be useful
For improvements, must work together with
template-free methodologies

25
Modeling based on more distant evolutionary
relationships
26
Modeling based on non-homologous fold
relationships

Protein threading
In recent CASP experiments, these methods have
not been competitive with template-free methods

27
Template-free Modeling

For sequences where no template is available
Historically, physics-based approaches were used
Newer methods focus on substructures
While we have not seen all folds, we have
probably seen nearly all substructures
Make use of substructure relationships
From a few residues through SSEs to
super-secondary structures

28
Template-free Modeling

A range of possible conformations is considered
Most successful package has been ROSETTA
For proteins of fewer than 100 residues, produces
one or several approximately correct structures
(4-6 Å RMSD for C-alpha atoms)
Selecting the most accurate structures from all
candidates is still an unsolved problem; current
methods typically rely on clustering
Development of atomic models is crucial to
further progress
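
The C-alpha RMSD measure and the clustering-based selection mentioned above can be sketched as follows. This is a minimal illustration and not ROSETTA's implementation: it assumes the candidate structures are already superimposed in a common frame, and the coordinates and cutoff are made-up toy values.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha positions.

    Assumes the two structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))


def largest_cluster_center(decoys, cutoff):
    """Pick the decoy with the most neighbours within `cutoff` RMSD.

    Returns (index_of_center, indices_of_cluster_members).
    """
    best_index, best_members = -1, []
    for i, center in enumerate(decoys):
        members = [j for j, other in enumerate(decoys)
                   if ca_rmsd(center, other) <= cutoff]
        if len(members) > len(best_members):
            best_index, best_members = i, members
    return best_index, best_members


# Hypothetical three-residue decoys: three similar models and one outlier.
decoys = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)],
    [(0.0, -0.5, 0.0), (3.8, -0.5, 0.0), (7.6, -0.5, 0.0)],
    [(0.0, 10.0, 0.0), (3.8, 10.0, 0.0), (7.6, 10.0, 0.0)],
]
center, members = largest_cluster_center(decoys, cutoff=0.75)
print(center, members)
```

The underlying assumption is that conformations sampled many times are more likely to lie near the native structure, so the center of the most populated cluster is reported as the prediction.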

29
Template-free Modeling
30
CASP Progress
