Title: theoretical methods to study protein folding: empirical force fields
1theoretical methods to study protein folding
empirical force fields
2Averaging over less important degrees of freedom
[Diagram: description levels, from individual components to averaging over individual components]
- Fully-detailed: QM, QM/MM
- Atomistically-detailed: All-atom, United-atom
- Coarse-grained: Residue level, Molecule/domain level
- System level (Networks): PDEs to describe reaction/diffusion, Network graphs
4Anfinsen's thermodynamic hypothesis.
"The studies on the renaturation of fully denatured
ribonuclease required many supporting
investigations to establish, finally, the
generality which we have occasionally called the
'thermodynamic hypothesis'. This hypothesis
states that the three-dimensional structure of a
native protein in its normal physiological milieu
(solvent, pH, ionic strength, presence of other
components such as metal ions or prosthetic
groups, temperature, and other) is the one in which
the Gibbs free energy of the whole system is
lowest; that is, the native conformation is
determined by the totality of interatomic
interactions and hence by the amino acid sequence,
in a given environment." C.B. Anfinsen, Science,
181, 223-230, 1973.
To facilitate the implementation of this hypothesis in
protein-structure prediction, free energy was
replaced with potential energy.
5Potential energy or free energy?
Nature (and a canonical simulation) finds the
basin with the lowest free energy at a given
temperature, which might happen to, but does not
have to, contain the conformation with the lowest
potential energy. Global-optimization
methods are designed to find structures with the
lowest potential energy, thus ignoring
conformational entropy. Technically, this
corresponds to canonical simulations at 0 K.
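The difference between the two criteria can be illustrated with a toy two-basin model. This is only a sketch: the basin energies and the numbers of conformations are hypothetical, and each basin is assumed to consist of equal-energy conformations, so its entropy is S = k_B ln(n).

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def basin_free_energy(energy, n_states, temperature):
    """F = E - T*S for a basin of n_states equal-energy conformations."""
    return energy - K_B * temperature * math.log(n_states)

def populations(basins, temperature):
    """Boltzmann populations of the basins at the given temperature."""
    free = [basin_free_energy(e, n, temperature) for e, n in basins]
    f_min = min(free)
    weights = [math.exp(-(f - f_min) / (K_B * temperature)) for f in free]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical basins: (potential energy in kcal/mol, number of conformations)
NARROW = (-10.0, 1)     # lowest potential energy, a single conformation
BROAD = (-8.0, 1000)    # higher potential energy, many conformations

for t in (10.0, 300.0):
    p_narrow, p_broad = populations([NARROW, BROAD], t)
    print(f"T = {t:5.1f} K  narrow: {p_narrow:.3f}  broad: {p_broad:.3f}")
```

Near 0 K the narrow basin with the lowest potential energy dominates; at 300 K the broad basin wins because of its conformational entropy.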
6The stability of the structures of biological
macromolecules results from the special structure of
their energy landscapes, which can be termed
minimal frustration or a funnel-like structure.
A good analogy is the pit dug by an antlion larva.
7Theoretical studies of protein structure and
protein folding
- Need to express the energy of a system as a function of
coordinates
- Need an algorithm to explore the conformational
space
8From Schrödinger equation to analytical all-atom
potentials
9The Born-Oppenheimer approximation
[Figure 3b]
10What is a force field?
A set of formulas (usually explicit) and
parameters to express the conformational energy
of a given class of molecules as a function of
coordinates (Cartesian, internal, etc.) that
define the geometry of a molecule or a molecular
system.
Features
- Cheap
- Fast
- Easy to program
- Restricted to conformational analysis
- Non-transferable
- Results sometimes unreliable
11All-atom empirical force fields: a very
simplified representation of the potential energy
surfaces
Class I force fields
12Multiplication of atom types in empirical force
fields
13Force fields commonly used for protein simulations
Name              Potential type                      References
AMBER/OPLS        all-atom, united-atom               Weiner et al., 1984, 1986; Cornell et al., 1995; Jorgensen et al., 1996; http://ambermd.org/
CHARMM            all-atom                            Brooks et al., 1983; MacKerell et al., 1998, 2001; http://www.charmm.org/
GROMOS            all-atom                            van Gunsteren & Berendsen, 1987; Scott et al., 1999; http://www.gromos.net/
ECEPP/3           all-atom, rigid valence geometry    Nemethy et al., 1995; Ripoll et al., 1995; http://cbsu.tc.cornell.edu/software/eceppak/; http://www.icm.edu.pl/kdm/ECEPPAK
DISCOVER (CVFF)   all-atom                            Dauber-Osguthorpe, 1988; Maple et al., 1998
14Bond distortion energy
[Figure: bond-stretching energy Es(d) as a function of the bond length d; d0 is the equilibrium bond length]
15Typical values of d0 and kd
Bond           d0 [Å]    kd [kcal/(mol Å²)]
Csp3-Csp3      1.523     317
Csp3-Csp2      1.497     317
Csp2=Csp2      1.337     690
Csp2=O         1.208     777
Csp2-Nsp3      1.438     367
C-N (amide)    1.345     719
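A minimal sketch of how such parameters enter a harmonic bond-stretching term. It assumes the convention Es(d) = kd (d - d0)^2, with the factor 1/2 absorbed into kd; if the tabulated constants follow the 1/2 kd (d - d0)^2 convention instead, the energies below should be halved. The dictionary and function names are illustrative only.

```python
# (d0 in Å, kd in kcal/(mol Å^2)), taken from the table above
BOND_PARAMS = {
    "Csp3-Csp3": (1.523, 317.0),
    "Csp3-Csp2": (1.497, 317.0),
    "Csp2=Csp2": (1.337, 690.0),
    "Csp2=O": (1.208, 777.0),
    "Csp2-Nsp3": (1.438, 367.0),
    "C-N (amide)": (1.345, 719.0),
}

def bond_energy(bond, d):
    """Harmonic bond-stretching energy in kcal/mol (assumed form kd*(d - d0)^2)."""
    d0, kd = BOND_PARAMS[bond]
    return kd * (d - d0) ** 2

# Stretching a Csp3-Csp3 bond by 0.1 Å
print(round(bond_energy("Csp3-Csp3", 1.623), 2))
```

Under the assumed convention, a 0.1 Å distortion of a single C-C bond already costs about 3 kcal/mol, which is why bond lengths stay close to d0 in simulations.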
16Comparison of the actual bond-energy curve with
that of the harmonic approximation
17Potentials that take into account the asymmetry
of the bond-energy curve
Anharmonic potential
Morse potential (CVFF force field)
[Plot: harmonic, anharmonic, and Morse potentials; E [kcal/mol] vs. d [Å]]
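The asymmetry can be seen by comparing a harmonic term with a Morse term that has the same curvature at d0. This is a sketch under assumptions: the harmonic form is taken as k (d - d0)^2, the Morse form as D [1 - exp(-a (d - d0))]^2 with a = sqrt(k / D), and the well depth D = 88 kcal/mol is an illustrative value, not a CVFF parameter.

```python
import math

def harmonic(d, d0, k):
    """Harmonic bond energy, assumed form k*(d - d0)^2."""
    return k * (d - d0) ** 2

def morse(d, d0, k, depth):
    """Morse bond energy with the same curvature at d0 as the harmonic term."""
    a = math.sqrt(k / depth)  # chosen so that depth*a^2 == k
    return depth * (1.0 - math.exp(-a * (d - d0))) ** 2

D0, K, DEPTH = 1.523, 317.0, 88.0  # DEPTH is a hypothetical dissociation energy

for d in (1.323, 1.523, 1.723, 3.0):
    print(f"d = {d:.3f} Å  harmonic: {harmonic(d, D0, K):8.2f}  "
          f"Morse: {morse(d, D0, K, DEPTH):8.2f}")
```

Compression is penalized more strongly than stretching in the Morse curve, and the energy levels off at the well depth for large d, whereas the harmonic energy grows without bound.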
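18Energy of bond-angle distortion
[Figure: bond-angle bending energy Eb(θ) as a function of the bond angle θ; θ0 is the equilibrium angle and kθ the force constant]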
19Typical values of θ0 and kθ
Angle             θ0 [degrees]   kθ [kcal/(mol degree²)]
Csp3-Csp3-Csp3    109.47         0.0099
Csp3-Csp3-H       109.47         0.0079
H-Csp3-H          109.47         0.0070
Csp3-Csp2-Csp3    117.2          0.0099
Csp3-Csp2=Csp2    121.4          0.0121
Csp3-Csp2=O       122.5          0.0101
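The same harmonic treatment applies to bond angles. The sketch below assumes Eb(θ) = kθ (θ - θ0)^2 with θ in degrees (factor 1/2 absorbed into kθ); the names are illustrative.

```python
# (theta0 in degrees, k_theta in kcal/(mol degree^2)), from the table above
ANGLE_PARAMS = {
    "Csp3-Csp3-Csp3": (109.47, 0.0099),
    "Csp3-Csp3-H": (109.47, 0.0079),
    "H-Csp3-H": (109.47, 0.0070),
    "Csp3-Csp2-Csp3": (117.2, 0.0099),
    "Csp3-Csp2=Csp2": (121.4, 0.0121),
    "Csp3-Csp2=O": (122.5, 0.0101),
}

def angle_energy(angle, theta):
    """Harmonic angle-bending energy in kcal/mol (assumed k*(theta - theta0)^2)."""
    theta0, k_theta = ANGLE_PARAMS[angle]
    return k_theta * (theta - theta0) ** 2

# Opening a C-C-C angle by 10 degrees
print(round(angle_energy("Csp3-Csp3-Csp3", 119.47), 2))
```

A 10-degree distortion costs about 1 kcal/mol under this convention, so bond angles are considerably softer than bond lengths.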
20Basic types of torsional potentials
- Single bond between sp3 carbons or between an sp3
carbon and nitrogen. Example: C-C-C-C quadruplet.
- Double or partially double bonds. Example:
C-C(carboxyl)-C(amide)-C quadruplet.
- Single bond between electronegative atoms
(oxygens, sulfurs, etc.). Example: C-S-S-C
quadruplet.
[Plots: Etor [kcal/mol] vs. dihedral angle [deg]]
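The three types differ mainly in the multiplicity and phase of a cosine term. A minimal sketch, assuming the common single-term form Etor = (V/2) [1 + cos(n φ - γ)]; the barrier heights V are hypothetical round numbers chosen only to show the shapes.

```python
import math

def torsion_energy(phi_deg, barrier, n, gamma_deg):
    """Single-term torsional energy (V/2)*(1 + cos(n*phi - gamma)), kcal/mol."""
    return 0.5 * barrier * (1.0 + math.cos(math.radians(n * phi_deg - gamma_deg)))

# (barrier V in kcal/mol, multiplicity n, phase gamma in degrees); V values are hypothetical
TORSION_TYPES = {
    "sp3-sp3 single bond (C-C-C-C)": (3.0, 3, 0.0),    # minima at 60, 180, 300 deg
    "double/partially double bond": (20.0, 2, 180.0),  # minima at 0 and 180 deg
    "disulfide-like bond (C-S-S-C)": (7.0, 2, 0.0),    # minima at 90 and 270 deg
}

for name, (barrier, n, gamma) in TORSION_TYPES.items():
    profile = [round(torsion_energy(phi, barrier, n, gamma), 2)
               for phi in range(0, 181, 30)]
    print(name, profile)
```

The threefold term reproduces staggered minima, the twofold term with a 180-degree phase keeps double bonds planar, and the twofold term with zero phase favors the roughly perpendicular arrangement typical of disulfides.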
21Potentials imposed on improper torsional angles
[Figure: definition of the improper torsional angle t; atom labels A, B, X, X]
22Nonbonded Lennard-Jones (6-12) potential
Lorentz-Berthelot combining rules
[Plot: Enb [kcal/mol] vs. r [Å]; the minimum of depth -ε lies at r = r0, and Enb = 0 at r = σ]
23Sample values of εi and r0i
Atom type           r0 [Å]   ε [kcal/mol]
C(carbonyl)         1.85     0.12
C(sp3)              1.80     0.06
N(sp3)              1.85     0.12
O(carbonyl)         1.60     0.20
H(bonded with C)    1.00     0.02
S                   2.00     0.20
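A sketch of how per-atom values are combined into a pair potential. It assumes that the tabulated r0i are van der Waals radii, so that the Lorentz-Berthelot rules read r0ij = r0i + r0j and εij = sqrt(εi εj), and that the energy has the form εij [(r0ij/r)^12 - 2 (r0ij/r)^6]. Names are illustrative.

```python
import math

# (r0 in Å, epsilon in kcal/mol), from the table above
LJ_PARAMS = {
    "C(carbonyl)": (1.85, 0.12),
    "C(sp3)": (1.80, 0.06),
    "N(sp3)": (1.85, 0.12),
    "O(carbonyl)": (1.60, 0.20),
    "H(bonded with C)": (1.00, 0.02),
    "S": (2.00, 0.20),
}

def combine(type_a, type_b):
    """Lorentz-Berthelot rules: sum of radii, geometric mean of well depths."""
    r0_a, eps_a = LJ_PARAMS[type_a]
    r0_b, eps_b = LJ_PARAMS[type_b]
    return r0_a + r0_b, math.sqrt(eps_a * eps_b)

def lj_energy(type_a, type_b, r):
    """6-12 energy in kcal/mol; the minimum -eps_ij lies at r = r0_ij."""
    r0, eps = combine(type_a, type_b)
    x6 = (r0 / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)

r0_co, eps_co = combine("C(sp3)", "O(carbonyl)")
print(round(r0_co, 2), round(eps_co, 4), round(lj_energy("C(sp3)", "O(carbonyl)", r0_co), 4))
```

In this form the zero crossing is at σ = r0ij / 2^(1/6), which links the two distances marked on the plot.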
24Other nonbonded potentials
Buckingham potential
10-12 potential used in some force fields (e.g.,
ECEPP) for proton-proton donor pairs
25Coulombic (electrostatic) potential
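A minimal sketch of the pairwise Coulomb term in the units used throughout (kcal/mol, Å, elementary charges). The conversion factor 332.06 kcal Å/(mol e²) is the standard value; the uniform dielectric constant and the sample charges are assumptions for illustration.

```python
COULOMB_CONSTANT = 332.06  # kcal Å / (mol e^2)

def coulomb_energy(q_i, q_j, r, dielectric=1.0):
    """Electrostatic energy of two point charges (kcal/mol), r in Å."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)

# Hypothetical partial charges of a carbonyl O and an amide H at 2.0 Å
print(round(coulomb_energy(-0.5, 0.3, 2.0), 2))
# The same pair screened by a water-like dielectric constant
print(round(coulomb_energy(-0.5, 0.3, 2.0, dielectric=80.0), 2))
```

The slow 1/r decay is what makes electrostatics the most expensive and the most environment-sensitive part of a class I force field.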
27Charge determination
- Mulliken population charges (ECEPP/3, other early
force fields).
- Fitting to molecular electrostatic potentials;
subsequent adjustment to reproduce
potential-energy surfaces or experimental
association energies, etc.
- Based on atomic electronegativities with
corrections for topology and geometry (No and
coworkers, J. Phys. Chem. B, 105, 3624-3634,
2001; Koca and coworkers, J. Chem. Inf. Model.,
53, 2548-2558, 2013).
28Charge determination fitting to molecular
electrostatic potential (MEP) maps
29Charge determination fitting to molecular
electrostatic potential (MEP) maps
Ab initio calculations
Fitted by using CHELP-SV
Francl et al., J. Comput. Chem., 17, 367-383
(1996)
30Polarizable force fields
31Sources of parameters
Energy contribution               Source of parameters
Bond and bond angle distortion    Crystal and neutron-diffraction data, IR spectroscopy
Torsional                         NMR and FTIR spectroscopy
Nonbonded interactions            Polarizabilities, crystal and neutron-diffraction data
Electrostatic energy              Molecular electrostatic potentials
All                               Energy surfaces of model systems calculated with molecular quantum mechanics
33Class II force fields (MM3, MMFF, UFF, CFF)
Maple et al., J. Comput. Chem., 15, 162-182 (1994)
34Maple et al., J. Comput. Chem., 15, 162-182 (1994)
36Parameterization of class II force fields
37Solvent in simulations
- Explicit water
  - TIP3P
  - TIP4P
  - TIP5P
  - SPC
- Implicit water
  - Solvent accessible surface area (SASA) models
  - Molecular surface area models
  - Poisson-Boltzmann approach
  - Generalized Born surface area (GBSA) model
  - Polarizable continuum model (PCM)
38TIP3P model: σO = 3.1507 Å, εO = 0.1521 kcal/mol
TIP4P model: σO = 3.1535 Å, εO = 0.1550 kcal/mol
39Solvent accessible surface area (SASA) models
σi: free energy of solvation of atom i per unit
area; Ai: solvent accessible surface area of atom i
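With these definitions, the solvation free energy in a SASA model is a sum of per-atom products of the solvation parameter and the accessible area. A minimal sketch; the solvation parameters and areas below are hypothetical numbers, not values from any published parameter set, and computing the areas themselves is outside the scope of this example.

```python
def sasa_solvation_energy(atoms):
    """Sum of sigma_i * A_i over atoms.

    atoms: iterable of (sigma_i in kcal/(mol Å^2), A_i in Å^2) pairs.
    """
    return sum(sigma * area for sigma, area in atoms)

# Hypothetical (sigma_i, A_i) pairs for three exposed atoms
EXAMPLE_ATOMS = [
    (0.012, 25.0),    # nonpolar carbon: exposure is unfavorable
    (-0.060, 15.0),   # polar oxygen: exposure is favorable
    (-0.050, 10.0),   # polar nitrogen: exposure is favorable
]

print(round(sasa_solvation_energy(EXAMPLE_ATOMS), 3))
```

Buried atoms have Ai = 0 and contribute nothing, so the term rewards conformations that expose polar atoms and bury nonpolar ones.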
40Vila et al., Proteins: Structure, Function, and
Genetics, 1991, 10, 199-218.
41Comparison of the lowest-energy conformations of
[Met5]enkephalin (H-Tyr-Gly-Gly-Phe-Met-OH)
obtained with the ECEPP/3 force field in vacuo
and with the SRFOPT model
[Figure panels: vacuum; SRFOPT]
42Comparison of the molecular surfaces of the
lowest-energy conformations of [Met5]enkephalin
obtained without and with the SRFOPT model
[Figure panels: vacuum; SRFOPT]
43Molecular surface area model
σ: surface tension; A: molecular surface area
44Generalized Born surface area (GBSA) model
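The slide does not state which variant of the model is meant. One widely used form of the polarization term is the pairwise expression of Still and coworkers, sketched below under these assumptions: an interior dielectric constant of 1, effective Born radii supplied as inputs (their calculation is the hard part and is not shown), and a hypothetical function name.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal Å / (mol e^2)

def gb_polarization_energy(charges, coords, born_radii, eps_solvent=78.5):
    """Still-type generalized Born polarization energy in kcal/mol.

    G_pol = -1/2 * C * (1 - 1/eps) * sum_ij q_i q_j / f_GB(r_ij), with
    f_GB = sqrt(r^2 + a_i a_j exp(-r^2 / (4 a_i a_j))); i == j terms included.
    """
    prefactor = -0.5 * COULOMB_CONSTANT * (1.0 - 1.0 / eps_solvent)
    total = 0.0
    for q_i, x_i, a_i in zip(charges, coords, born_radii):
        for q_j, x_j, a_j in zip(charges, coords, born_radii):
            r2 = sum((u - v) ** 2 for u, v in zip(x_i, x_j))
            aa = a_i * a_j
            f_gb = math.sqrt(r2 + aa * math.exp(-r2 / (4.0 * aa)))
            total += q_i * q_j / f_gb
    return prefactor * total

# A single unit charge with a 2 Å Born radius reduces to the Born formula
print(round(gb_polarization_energy([1.0], [(0.0, 0.0, 0.0)], [2.0], 80.0), 2))
```

For well-separated charges f_GB approaches r, so the expression goes over to the Coulomb energy difference between vacuum and solvent; the surface-area part of GBSA is added separately as a σA term.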
45Protein structure calculation/prediction and
folding simulations
- Single energy minimization (wishful thinking at
the early stage of force-field development).
- Global optimization of the PES (ignores
conformational entropy).
- Molecular dynamics/Monte Carlo (take entropy into
account, but slow and liable to non-convergence).
- Generalized ensemble sampling (MREMD).
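The sampling approaches in this list rest on two simple acceptance rules. The sketch below shows the Metropolis criterion used in canonical Monte Carlo and the standard temperature replica-exchange criterion that underlies replica-exchange schemes such as MREMD; the function names are illustrative and the example energies are hypothetical.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def metropolis_probability(delta_e, temperature):
    """Probability of accepting a move that changes the energy by delta_e."""
    if delta_e <= 0.0:
        return 1.0
    return math.exp(-delta_e / (K_B * temperature))

def exchange_probability(e_i, t_i, e_j, t_j):
    """Probability of swapping replicas i and j (energies e, temperatures t).

    P = min(1, exp[(beta_i - beta_j) * (e_i - e_j)]), beta = 1 / (k_B T).
    """
    delta = (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j)) * (e_i - e_j)
    if delta >= 0.0:
        return 1.0
    return math.exp(delta)

# An uphill move of 1 kcal/mol at 300 K
print(round(metropolis_probability(1.0, 300.0), 3))
# Cold replica (300 K) at -100 kcal/mol, hot replica (320 K) at -98 kcal/mol
print(round(exchange_probability(-100.0, 300.0, -98.0, 320.0), 3))
```

Exchanges let conformations trapped at low temperature escape through the high-temperature replicas, which addresses the slow convergence noted for plain molecular dynamics and Monte Carlo.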
46Force field validation
47Structure of gramicidin S predicted by using
the build-up procedure with energy minimization
with the ECEPP/3 force field (M. Dygert, N. Go,
H.A. Scheraga, Macromolecules, 8, 750-761 (1975)).
The structure turned out to be effectively
identical with the NMR structure determined later.
48Global optimization of the energy surface of the
N-terminal portion of the B-domain of
staphylococcal protein A with the all-atom ECEPP/3
force field + SRFOPT mean-field solvation model
(Vila et al., PNAS, 2003, 100, 14812-14816)
Superposition of the native fold (cyan) and the
conformation (red) with the lowest Cα RMSD (2.85
Å) from the native fold
Energy-RMSD diagram
49First successful folding simulation of a globular
protein by molecular dynamics
Duan and Kollman, Science, 282, 5389, 740-744
(1998)
50Folding proteins at x-ray resolution using a
specially designed ANTON machine. X-ray structure
(blue) and last frame of the MD simulation (red):
villin headpiece (left), 88 ns of simulation; WW
domain (right), 58 ms of simulation. Good
symplectic algorithm: up to 20 fs time step. D.E.
Shaw et al., Science, 2010, 330, 341-346