Title: Computing Protein Structures from Electron Density Maps: The Missing Loop Problem
1Computing Protein Structures from Electron Density Maps: The Missing Loop Problem
- I. Lotan, H. van den Bedem, A. Deacon and J.C. Latombe
2Protein Structure: Experimental Techniques
- Nuclear Magnetic Resonance (NMR) spectroscopy: limited to short sequences.
- X-ray crystallography
3X-ray Crystallography
- Crystallize protein samples.
- Collect X-ray diffraction images.
- Calculate the electronic charge: a 3-D Electron Density Map (EDM).
4Electron Density Map
- 3-D image of atomic structure
- High value (electron density) at atom centers
- Density falls off exponentially away from center
- Limited resolution, sampled on 3D grid
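To make the description above concrete, here is a minimal sketch of a synthetic density map. It assumes an isotropic Gaussian falloff around each atom center; the function name `synthetic_density`, the width `sigma`, and the grid spacing are illustrative choices, not values from the slides.

```python
import numpy as np

def synthetic_density(atom_centers, grid_shape, spacing=1.0, sigma=0.7):
    """Sum one isotropic Gaussian blob per atom on a regular 3-D grid.

    sigma is a hypothetical width standing in for the combined effect of
    limited resolution and atomic vibration.
    """
    axes = [np.arange(n) * spacing for n in grid_shape]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1)            # shape (nx, ny, nz, 3)
    density = np.zeros(grid_shape)
    for center in np.asarray(atom_centers, dtype=float):
        sq_dist = np.sum((grid - center) ** 2, axis=-1)
        density += np.exp(-sq_dist / (2.0 * sigma ** 2))
    return density

# Two toy atoms on an 8 x 8 x 8 grid: the map peaks at the atom centers.
edm = synthetic_density([[2.0, 2.0, 2.0], [5.0, 5.0, 5.0]], (8, 8, 8))
```

The values are highest at the grid points that coincide with atom centers and decay quickly with distance, which is the property the later fitting steps rely on.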
5The End Goal: Build Protein Model from EDM
- Completeness of automatically generated models varies with experimental data quality.
- High resolution: 90% completeness.
- Low resolution: 2/3 completeness.
- Completing the missing fragments manually is time consuming.
6Experimental Data Quality Varies
- Recovering the phase of the diffracted beam introduces error.
- Resolution at which the data were collected (high-resolution images cannot be obtained for all proteins).
- Not all replicas of the protein in the protein crystal are identical.
- Mobility of molecule fragments.
- Temperature-dependent atomic vibration.
7Existing Techniques
- Existing software relies on:
  - Pattern recognition techniques
  - Unambiguous density
  - Elementary stereochemical constraints
8Model Refinement
- Standard Maximum Likelihood (ML) algorithms exploit experimental and model phase information to build new refined models.
- Iterating model building and refinement steps improves the completeness and quality of models.
- The problem: missing fragments (usually loops).
- The solution: fill the gaps at an early stage.
9Goal: Propose Candidates for Missing Fragments
- Input:
  - EDM
  - Known structure
  - Anchor residues
  - The amino acid sequence
- Output: a proposed structure that falls within the radius of convergence of existing refinement tools (1-1.5Å)
10Model
- Standard Phi-Psi model.
- Compute backbone, ignore side chains except Cβ and O atoms.
- Loop closure:
  - Mobile anchor vs. stationary anchor.
  - Closure is measured as the RMSD of the mobile anchor atoms from the stationary anchor atoms.
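The closure measure can be sketched in a few lines. This assumes both anchors are already expressed in the same coordinate frame, so no superposition is performed; the function name `closure_rmsd` and the toy coordinates are hypothetical.

```python
import numpy as np

def closure_rmsd(mobile_atoms, stationary_atoms):
    """Root-mean-square deviation between matching anchor atoms (same frame)."""
    mobile = np.asarray(mobile_atoms, dtype=float)
    stationary = np.asarray(stationary_atoms, dtype=float)
    if mobile.shape != stationary.shape:
        raise ValueError("anchor atom arrays must have the same shape")
    sq_dist = np.sum((mobile - stationary) ** 2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Toy anchor of three atoms; the mobile copy is displaced by 0.3 along x.
stationary = np.array([[0.0, 0.0, 0.0], [1.46, 0.0, 0.0], [2.0, 1.4, 0.0]])
mobile = stationary + np.array([0.3, 0.0, 0.0])
gap = closure_rmsd(mobile, stationary)
```

A loop counts as closed when this value drops below the chosen tolerance.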
11IK + EDM → Loop Structure
- Two-stage algorithm:
  - Guided by the EDM, sample closing conformations.
  - Refine top-ranking conformations using local optimization while maintaining loop closure.
- Conformation ranking: density fit and conformational likelihood.
12Stage 1: Generating Loop Candidates
- Employ the cyclic coordinate descent (CCD) method to obtain closing conformations, up to a tolerance distance d_close.
- Starting conformations are obtained by a random procedure, biased by PDB-derived distributions.
- The best-scoring conformations (95th percentile) are submitted to stage 2.
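The two bookkeeping steps of stage 1 (biased random starts and percentile filtering) might look roughly as follows. This is a sketch under assumptions: the PDB-derived distribution is represented by a hypothetical square 2-D histogram over (phi, psi) in degrees, and a higher score is taken to mean a better conformation.

```python
import numpy as np

def sample_phi_psi(hist, n_residues, rng):
    """Draw one (phi, psi) pair per residue from a square 2-D histogram
    covering [-180, 180) degrees on both axes."""
    hist = np.asarray(hist, dtype=float)
    n_bins = hist.shape[0]
    probs = (hist / hist.sum()).ravel()
    flat = rng.choice(probs.size, size=n_residues, p=probs)
    phi_bin, psi_bin = np.unravel_index(flat, hist.shape)
    width = 360.0 / n_bins
    # Uniform jitter inside the chosen bin.
    phi = -180.0 + (phi_bin + rng.random(n_residues)) * width
    psi = -180.0 + (psi_bin + rng.random(n_residues)) * width
    return np.column_stack([phi, psi])

def select_top_candidates(scores, percentile=95.0):
    """Indices of conformations scoring at or above the given percentile."""
    scores = np.asarray(scores, dtype=float)
    return np.flatnonzero(scores >= np.percentile(scores, percentile))

# Toy histogram with all weight in a single 10-degree bin.
toy_hist = np.zeros((36, 36))
toy_hist[2, 5] = 1.0
start_angles = sample_phi_psi(toy_hist, 12, np.random.default_rng(0))
kept = select_top_candidates(np.arange(100.0))
```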
13Cyclic Coordinate Descent (CCD)
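As an illustration of the idea behind CCD, here is a simplified planar sketch: one joint angle is adjusted at a time so that the chain end moves as close to the target as that joint allows. A protein loop uses dihedral angles in 3-D and several anchor atoms, so this 2-D version with a single end point is only an analogy; the names `joint_positions` and `ccd_close` are hypothetical.

```python
import numpy as np

def joint_positions(angles, lengths):
    """Forward kinematics of a planar chain; angles are relative joint angles."""
    headings = np.cumsum(angles)
    lengths = np.asarray(lengths, dtype=float)
    steps = np.column_stack([lengths * np.cos(headings),
                             lengths * np.sin(headings)])
    return np.vstack([[0.0, 0.0], np.cumsum(steps, axis=0)])

def ccd_close(angles, lengths, target, d_close=1e-3, max_sweeps=500):
    """Sweep over the joints until the chain end is within d_close of target."""
    angles = np.array(angles, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(max_sweeps):
        for i in range(len(angles)):
            pts = joint_positions(angles, lengths)
            to_end = pts[-1] - pts[i]
            to_target = target - pts[i]
            # Rotate joint i so the end lies on the ray from joint i to target.
            angles[i] += (np.arctan2(to_target[1], to_target[0])
                          - np.arctan2(to_end[1], to_end[0]))
        distance = np.linalg.norm(joint_positions(angles, lengths)[-1] - target)
        if distance < d_close:
            break
    return angles, float(distance)

closed_angles, final_distance = ccd_close(
    [0.4, -0.3, 0.5, 0.2], [1.0, 1.0, 1.0, 1.0], [2.0, 1.5])
```

Each single-joint update cannot increase the distance to the target, which is why the sweeps make steady progress toward closure.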
14Adding the Electron Density Constraints
- We would like to guide the loop closing to fit the EDM.
- For residue i, CCD proposes a distance-minimizing pair of dihedral angles (φ, ψ)_i^p.
- Find a pair (φ, ψ)_i in a square neighborhood of (φ, ψ)_i^p that maximizes the local fit to the EDM. The neighborhood's size is reduced linearly with CCD iterations to allow closure.
[Equation labels: atoms that are changed by angle pair i and not by pair i+1; center of atom A_j.]
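The neighborhood search can be sketched as a small grid search around the CCD proposal. The function names, the grid resolution `steps`, and the toy fit function are assumptions for illustration; in the method described here the local fit would be evaluated against the EDM.

```python
import numpy as np

def neighborhood_half_width(iteration, max_iterations, initial_half_width):
    """Shrink the search square linearly to zero over the CCD iterations."""
    return initial_half_width * max(0.0, 1.0 - iteration / max_iterations)

def best_pair_in_neighborhood(proposed, half_width, local_fit, steps=5):
    """Grid-search a square around the CCD proposal for the best local fit."""
    phi_p, psi_p = proposed
    offsets = np.linspace(-half_width, half_width, steps)
    best, best_fit = (phi_p, psi_p), local_fit(phi_p, psi_p)
    for d_phi in offsets:
        for d_psi in offsets:
            fit = local_fit(phi_p + d_phi, psi_p + d_psi)
            if fit > best_fit:
                best, best_fit = (phi_p + d_phi, psi_p + d_psi), fit
    return best

def toy_fit(phi, psi):
    """Hypothetical stand-in for the density fit, peaked at (-60, -45)."""
    return -((phi + 60.0) ** 2 + (psi + 45.0) ** 2)

width = neighborhood_half_width(iteration=5, max_iterations=10,
                                initial_half_width=20.0)
chosen = best_pair_in_neighborhood((-70.0, -40.0), width, toy_fit)
```

Once the half-width reaches zero, the search returns the CCD proposal unchanged, so the final iterations are pure closure steps.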
15Stage 2: Refining Loop Candidates
- Improve the model's fit to the experimental data (this time the model as a whole, as opposed to the local fit in stage 1).
- Maintain the loop-closure constraint during the optimization process.
16Target Function
- For conformation q, the target function T(q) is
the sum of the squared differences between the
observed density and the calculated density at
each grid point in some volume V around the loop.
Terms of T(q):
- Scaling factors
- Grid points g_i in the volume V
- Calculated density (sum of contributions of atoms within a cutoff distance from g_i)
- Observed density
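A minimal sketch of such a target function follows. Several things are assumed: the two scaling factors enter as a linear scale and an offset on the calculated density, each atom contributes a Gaussian blob, and the `cutoff` and `sigma` values are illustrative. None of these specifics come from the slides.

```python
import numpy as np

def calculated_density(atom_centers, grid_points, cutoff=3.0, sigma=0.7):
    """Density at each grid point: Gaussian contributions of atoms within cutoff."""
    atoms = np.asarray(atom_centers, dtype=float)
    points = np.asarray(grid_points, dtype=float)
    dist = np.linalg.norm(points[:, None, :] - atoms[None, :, :], axis=-1)
    contrib = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    contrib[dist > cutoff] = 0.0
    return contrib.sum(axis=1)

def target_function(atom_centers, grid_points, observed, scale=1.0, offset=0.0):
    """Sum over grid points of (scale * calculated + offset - observed)^2."""
    rho_calc = calculated_density(atom_centers, grid_points)
    residual = scale * rho_calc + offset - np.asarray(observed, dtype=float)
    return float(np.sum(residual ** 2))

# Toy volume V: a 6 x 6 x 6 block of grid points with unit spacing.
axes = [np.arange(6.0)] * 3
grid_points = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
true_atoms = np.array([[2.0, 2.0, 2.0], [3.5, 2.0, 2.0]])
observed = calculated_density(true_atoms, grid_points)

t_true = target_function(true_atoms, grid_points, observed)
t_shifted = target_function(true_atoms + 0.8, grid_points, observed)
```

The correct placement scores zero against its own density, and a displaced copy scores higher, so minimizing T(q) pulls the model into the observed density.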
17Optimization with Closure Constraints
- Generic approach: optimize an objective function (T(q)) while performing a given task (loop closure) by taking advantage of manipulator redundancy (DoFs).
- f(q): forward kinematics equation.
- J(q): 6-by-n Jacobian, relating a change in q to the change at the end of the chain.
- J†(q): an approximation of J⁻¹(q).
- N(q): orthonormal basis for the null space (n−6 dimensions).
- y = ∂T(q)/∂q: gradient vector of the objective function T(q).
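The quantities listed above can be combined as in the sketch below. A common form of such an update in redundant-manipulator control is dq = J†·dx + N·Nᵀ·y; whether the slide used exactly this form is an assumption, since the equation is not in the text. Here the pseudo-inverse plays the role of the approximate inverse, the null-space basis is taken from an SVD, and a random matrix stands in for a real Jacobian.

```python
import numpy as np

def null_space_basis(jacobian, rcond=1e-10):
    """Orthonormal basis N, shape (n, n - rank), of the null space of J via SVD."""
    _, singular_values, vt = np.linalg.svd(jacobian)
    rank = int(np.sum(singular_values > rcond * singular_values.max()))
    return vt[rank:].T

def redundant_update(jacobian, end_correction, gradient, step_size=0.1):
    """Joint-space step: fix the chain end, then descend T in the null space."""
    basis = null_space_basis(jacobian)
    closure_part = np.linalg.pinv(jacobian) @ end_correction
    # Project the gradient onto the null space and step downhill.
    objective_part = -step_size * basis @ (basis.T @ gradient)
    return closure_part + objective_part

rng = np.random.default_rng(0)
toy_jacobian = rng.normal(size=(6, 9))    # hypothetical 6-by-n Jacobian, n = 9
toy_gradient = rng.normal(size=9)         # hypothetical gradient y of T(q)
dq = redundant_update(toy_jacobian, np.zeros(6), toy_gradient)
```

With a zero end correction, the step changes the joint angles without moving the chain end to first order, which is how the fit can be improved while the loop stays closed.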
18Minimization Procedure: Monte Carlo and Simulated Annealing
- Choose a random sub-chain with at least 8 DoFs.
- Propose a random move with magnitude proportional to the current temperature:
  - High temperature: use an exact IK solver (Dill).
  - Low temperature: pick a random direction in the null space.
- Minimize the resulting conformation (gradient descent).
- Accept using the Metropolis criterion: P(accept q_new) = exp((T(q_prev) - T(q_new)) / temp).
- Use simulated annealing: at each step, decrease the pseudo-temperature.
- At each step, verify that the closure constraint is satisfied within tolerance.
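The acceptance rule and the cooling loop can be sketched on a toy problem. This is not the full procedure: sub-chain selection, the IK and null-space moves, the local minimization, and the closure check are all omitted, and the geometric cooling factor is an assumption. The names `metropolis_accept` and `anneal` are hypothetical.

```python
import math
import random

def metropolis_accept(t_prev, t_new, temperature, rng=random):
    """Always accept improvements; accept a worse T with
    probability exp((T_prev - T_new) / temperature)."""
    if t_new <= t_prev:
        return True
    return rng.random() < math.exp((t_prev - t_new) / temperature)

def anneal(q0, target, propose, t_start=1.0, cooling=0.95, n_steps=200, seed=0):
    """Monte Carlo search with a pseudo-temperature lowered at every step."""
    rng = random.Random(seed)
    q, t_q, temperature = q0, target(q0), t_start
    best_q, best_t = q, t_q
    for _ in range(n_steps):
        q_new = propose(q, temperature, rng)   # move size scales with temperature
        t_new = target(q_new)
        if metropolis_accept(t_q, t_new, temperature, rng):
            q, t_q = q_new, t_new
            if t_q < best_t:
                best_q, best_t = q, t_q
        temperature *= cooling                  # assumed geometric cooling schedule
    return best_q, best_t

def toy_target(q):
    """One-dimensional stand-in for T(q), minimized at q = 3."""
    return (q - 3.0) ** 2

def toy_propose(q, temperature, rng):
    """Gaussian move whose magnitude is proportional to the temperature."""
    return q + rng.gauss(0.0, temperature)

best_q, best_t = anneal(0.0, toy_target, toy_propose)
```

Early on, large moves and a permissive acceptance rule let the search escape poor regions; as the temperature falls, moves shrink and only improvements are likely to be accepted.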
19Results: High Resolution Data
- Applying RESOLVE to the data (high resolution) yielded an initial model that was 88% complete.
- Applying the algorithm to a gap of 12 residues.
- Magenta: the structure from the PDB.
- Cyan: best-scoring structure, RMSD 0.25Å.
- The lowest RMSD for a 7-residue gap at the end of stage 1 is 0.35Å.
20Results: Low Resolution Data
- Applying RESOLVE yielded a model with 61% completeness.
- Applying the algorithm to a gap of 12 residues.
- Magenta: the highest-scoring structure, RMSD 0.6Å.
- Yellow: starting conformation (end of stage 1), RMSD 2.1Å (the lowest).