Title: To be folded or to be unfolded journal club 120604
To be folded or to be unfolded (journal club, 12/06/04)
S. O. Garbuzynskiy, M. Y. Lobanov and O. V.
Galzitskaya in Protein Science (2004), 13,
2871-2877
Apostol Gramada San Diego Supercomputer Center,
2School of Pharmacology, University of California
at San Diego, La Jolla, California, USA
Unfolded proteins (natively unfolded, intrinsically unstructured)
- Natively unfolded proteins lack ordered structure under conditions of neutral pH in vitro.
- In vivo, they are likely to be stabilized by binding of specific targets and ligands (small molecules, substrates, cofactors, other proteins, nucleic acids, membranes, etc.).
- It has been suggested that the lack of globular structure is a functional advantage, allowing them to interact more efficiently with several targets. Moreover, a disorder-order transition induced during binding represents a simple mechanism for regulation of numerous cellular processes.
- A considerable number of proteins (more than 100) are unfolded or partially unfolded under normal physiological conditions.
- It has been predicted that more than 15000 proteins contain long (40 residues or more) disordered segments.
- They are implicated in a wide range of functional roles, e.g. as entropic machines (bristles, spacers, linkers, clocks), but also in protein transport (metal ion binding), signal transduction and molecular recognition.
- Some of these proteins are associated with neurodegenerative disorders (BSE, CJD, Alzheimer's and Parkinson's diseases).
Unfolded proteins
- A paradigm shift away from the common dogma "a 3D structure is imperative for function".
- The structure-function paradigm is being reconsidered.
- Protein Trinity Paradigm: three thermodynamic states (ordered, molten globule and random coil), with function arising from any of the three conformations and from transitions between them (Dunker et al. (2002), Biochemistry 41, 6573).
- Protein Quartet Model: a premolten globule state is added to the picture (Uversky (2002), Protein Science 11, 739).
- If the amino acid sequence determines the 3D structure, then it should also determine the lack of 3D structure (not an obvious assumption).
- It appears that natively unfolded proteins have low hydrophobicity and high net charge (previous work by Uversky).
To be folded or to be unfolded: the paper
- Are there intrinsic properties of amino acid residues that are responsible for the absence of a fixed structure at physiological conditions?
- Two sets of indicators are proposed and studied in comparison with other indicators from Uversky's work: the expected average number of contacts per residue, and artificial parameters obtained by Monte-Carlo optimization.
- This is a weaker version of the assumption that sequence determines lack of structure.
- The goal is to separate as well as possible two classes of proteins: natively unfolded and globular.
- Do the results support the claim that these indicators are effective?
Average number of contacts - motivation
- The globular structure is the result of the competition between the entropic and energetic contributions to the free energy.
- A contact reduces the energy of interaction.
- The more compact the structure, the lower the conformational entropy.
- The larger the potential number of contacts in the globular state, the lower the energy of interaction and therefore the larger the loss of conformational entropy that the contacts can compensate for.
Databases
- Globular proteins (extracted from SCOP 1.63):
- one chain only
- nonhomologous, single-domain, without modified residues
- no serious errors in connectivity
- no disulfide bonds and ligands
- all heavy atoms resolved
- from the four general classes of SCOP
- Protein length: 54-500 residues.
- 80 proteins overall.
- Natively unfolded proteins (from Uversky et al. (2000) and SWISS-PROT):
- nuclear magnetic resonance chemical shifts typical of a random coil
- lack of significant secondary structure
- hydrodynamic dimensions close to those typical of an unfolded polypeptide chain
- Protein length: 50-1827 residues.
- 90 proteins overall.
- Database for the number of contacts (extracted from SCOP 1.61):
- 6626 domains from seven general classes (a-g) with less than 80% sequence identity: 1122 (all-α, class a), 1644 (all-β, class b), 1617 (α/β, class c), 1435 (α+β, class d), 142 (multidomain, class e), 127 (membrane, class f) and 528 (small, class g).
- The average number of contacts was calculated for each of the 20 residue types.
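The per-residue averaging step can be sketched as follows. This is a minimal illustration, assuming the database has already been reduced to (residue type, observed number of contacts) pairs; the function name `mean_contacts_by_residue` and the toy observations are hypothetical and are not taken from the paper or from SCOP.

```python
from collections import defaultdict


def mean_contacts_by_residue(observations):
    """Average the observed contact counts separately for each residue type.

    `observations` is an iterable of (one_letter_code, n_contacts) pairs,
    one pair per residue position found in the structural database.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for residue, n_contacts in observations:
        totals[residue] += n_contacts
        counts[residue] += 1
    return {residue: totals[residue] / counts[residue] for residue in totals}


# Toy observations, made up for illustration only (not SCOP data).
toy_observations = [("A", 20), ("A", 22), ("G", 16), ("W", 28), ("G", 18)]
contact_scale = mean_contacts_by_residue(toy_observations)
```

How a "contact" is defined (distance cutoff, which atoms count) is not stated on the slide, so the sketch starts from precomputed counts.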
Properties of amino acid residues
- The expected average number of contacts of a protein is the sum, over all residues in the sequence, of the residue-specific number of contacts, divided by the length of the chain.
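The definition above amounts to a one-line average. A minimal sketch, assuming a dictionary that maps one-letter residue codes to residue-specific contact numbers; the values in `toy_scale` are placeholders chosen for illustration, not the scale from the paper.

```python
def expected_contacts(sequence, contact_scale):
    """Sum the residue-specific contact numbers and divide by chain length."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(contact_scale[residue] for residue in sequence) / len(sequence)


# Placeholder scale (hypothetical values, for illustration only).
toy_scale = {"A": 21.0, "G": 17.0, "W": 28.0}
value = expected_contacts("AGWA", toy_scale)  # (21 + 17 + 28 + 21) / 4
```

Because the score is a plain average, it depends only on the amino acid composition and not on the order of residues in the chain.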
Results for the expected number of contacts
- With a border at 20.73 contacts per residue, the prediction accuracy is 89%.
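A border of this kind turns the expected number of contacts into a two-class predictor. The sketch below makes two assumptions that are not stated on the slide: proteins scoring below the border are predicted natively unfolded, and accuracy is the fraction of correctly classified proteins over both sets. The toy values are invented for illustration.

```python
BORDER = 20.73  # contacts per residue, the border quoted on the slide


def predict_unfolded(expected_contacts_value, border=BORDER):
    """Assumption: a value below the border means 'natively unfolded'."""
    return expected_contacts_value < border


def accuracy(folded_values, unfolded_values, border=BORDER):
    """Fraction of proteins from both sets that land on the correct side."""
    correct = sum(not predict_unfolded(v, border) for v in folded_values)
    correct += sum(predict_unfolded(v, border) for v in unfolded_values)
    return correct / (len(folded_values) + len(unfolded_values))


# Toy expected-contact values (hypothetical, not the paper's data).
toy_folded = [21.5, 22.0, 20.5, 21.1]
toy_unfolded = [19.0, 20.0, 21.0, 18.5]
toy_accuracy = accuracy(toy_folded, toy_unfolded)
```

In this toy example one folded and one unfolded protein fall on the wrong side of the border, so six of eight are classified correctly.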
Other indicators considered
- Only the artificial parameters and the expected number of contacts (the latter only marginally) are statistically different for the two classes. The next closest seems to be the hydrophobicity.
Hydrophobicity
- The prediction accuracy is 83%.
Charge (per residue)
- The prediction accuracy is 76%.
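For reference, a charge-per-residue indicator can be computed directly from the sequence. This is a rough sketch under stated assumptions: K and R count as +1, D and E as -1, histidine and the chain termini are treated as neutral, and the absolute value of the net charge is used. The slides do not specify the exact convention used in the paper.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def mean_net_charge(sequence):
    """Absolute net charge divided by chain length (assumed convention)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    positives = sum(residue in POSITIVE for residue in sequence)
    negatives = sum(residue in NEGATIVE for residue in sequence)
    return abs(positives - negatives) / len(sequence)


charge_example = mean_net_charge("KKRAD")  # three positive, one negative
```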
Two parameters simultaneously
- Using more than two parameters does not help the separation. However, number of contacts and hydrophobicity lead to an 8 accuracy.
Artificial parameters
- The parameters are optimized using a Monte-Carlo algorithm.
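The slide does not give the details of the optimization, so the following is only a sketch of one simple variant: a greedy Monte Carlo search that perturbs one residue parameter at a time and keeps the move if the best achievable accuracy does not decrease. The objective, the acceptance rule, the step size and the toy sequences are all assumptions made for illustration, not the authors' procedure.

```python
import random


def score(sequence, params):
    """Average of the per-residue parameters over the chain."""
    return sum(params[residue] for residue in sequence) / len(sequence)


def best_accuracy(folded, unfolded, params):
    """Best accuracy over candidate borders; scores >= border mean 'folded'."""
    scored = [(score(s, params), True) for s in folded]
    scored += [(score(s, params), False) for s in unfolded]
    best = 0.0
    for border, _ in scored:
        correct = sum((value >= border) == is_folded for value, is_folded in scored)
        best = max(best, correct / len(scored))
    return best


def optimize(folded, unfolded, alphabet, steps=2000, step_size=0.5, seed=0):
    """Greedy Monte Carlo: random single-parameter moves, reject if worse."""
    rng = random.Random(seed)
    params = {residue: 0.0 for residue in alphabet}
    current = best_accuracy(folded, unfolded, params)
    for _ in range(steps):
        residue = rng.choice(alphabet)
        old_value = params[residue]
        params[residue] = old_value + rng.uniform(-step_size, step_size)
        trial = best_accuracy(folded, unfolded, params)
        if trial >= current:
            current = trial              # accept the move
        else:
            params[residue] = old_value  # reject and restore
    return params, current


# Toy sequences (invented): hydrophobic "folded" vs. charged "unfolded".
toy_folded = ["LLIVA", "VILLG", "AILVV"]
toy_unfolded = ["KKEES", "EEKKG", "SKEKE"]
initial_accuracy = best_accuracy(toy_folded, toy_unfolded, dict.fromkeys("AEGIKLSV", 0.0))
final_params, final_accuracy = optimize(toy_folded, toy_unfolded, "AEGIKLSV")
```

Since moves that lower the accuracy are always rejected, the final accuracy can never fall below the starting value. A Metropolis-style rule that occasionally accepts worse moves would be an alternative.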
Artificial parameters - results
- The accuracy reaches 5, but the two classes are still not fully separated.
Parameter correlation
- The best correlation is between the artificial parameters and the expected number of contacts.
- The next best does not (% of C and S atoms have a poor separation).
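A correlation between two per-residue scales can be computed as a Pearson coefficient over the residue types they share. The sketch below assumes this is the kind of correlation meant on the slide; the two toy scales are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def scale_correlation(scale_a, scale_b):
    """Correlate two residue scales over the residue types present in both."""
    residues = sorted(set(scale_a) & set(scale_b))
    return pearson([scale_a[r] for r in residues], [scale_b[r] for r in residues])


# Toy scales (hypothetical values): scale_b is exactly twice scale_a.
toy_scale_a = {"A": 1.0, "G": 2.0, "W": 3.0}
toy_scale_b = {"A": 2.0, "G": 4.0, "W": 6.0}
r = scale_correlation(toy_scale_a, toy_scale_b)
```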
Conclusions
- None of the parameters is able to fully separate the classes.
- The idea of a unique separation of natively unfolded proteins in the charge-hydrophobicity space is somewhat challenged here.
- The set of artificial parameters is better correlated with the number of contacts than with the other parameters that give a good separation of the two classes.
- The use of several structural properties may possibly separate the databases.
- The scores are sensitive only to composition, not to the order of residues in the sequence; according to the authors, this might be one of the reasons for the above conclusion. Contact profiles may also be able to predict unfolded regions in partially structured proteins.