A MULTIBODY ATOMIC STATISTICAL POTENTIAL FOR PREDICTING
ENZYME-INHIBITOR BINDING ENERGY

Majid Masso (mmasso_at_gmu.edu)
Laboratory for Structural Bioinformatics, School of Systems Biology,
George Mason University, 10900 University Blvd. MS 5B3,
Manassas, Virginia 20110, USA

I. Abstract
Accurate prediction of enzyme-inhibitor binding energy has the
capacity to speed drug design and chemical genomics efforts by
helping to narrow the focus of experiments. Here, a non-redundant
set of three hundred high-resolution crystallographic
enzyme-inhibitor structures was compiled for analysis; these are
complexes with known binding energies (ΔG), based on the
availability of experimentally determined inhibition constants
(K_i). Additionally, a separate set of over 1400 diverse
high-resolution macromolecular crystal structures was collected for
the purpose of creating an all-atom knowledge-based statistical
potential via application of the Delaunay tessellation
computational geometry technique. Next, two hundred of the
enzyme-inhibitor complexes were randomly selected to develop a
model for predicting binding energy, first by tessellating
structures of the complexes as well as the enzymes without their
bound inhibitors, then by using the statistical potential to
calculate a topological score for each structure tessellation. We
derived as a predictor of binding energy an empirical linear
function of the difference between topological scores for a complex
and its isolated enzyme. A correlation coefficient of r = 0.79 was
obtained between the experimental and calculated ΔG values, with a
standard error of 2.34 kcal/mol. Lastly, the model was evaluated
with the held-out set of one hundred complexes, for which structure
tessellations were performed in order to calculate topological
score differences, and binding energy predictions were generated
from the derived linear function. Calculated binding energies for
the test data also compared well with their experimental
counterparts, displaying a correlation coefficient of r = 0.77 with
a standard error of 2.50 kcal/mol.

II. Protein Data Bank (http://www.rcsb.org/pdb)
  • PDB: repository of solved (X-ray, NMR, ...) structures
  • Each structure file contains atomic 3D coordinate data
  [Table: example atomic coordinate records with columns Atom, X, Y, Z]

III. Macromolecular Modeling
  • Native structure is the conformation having the lowest energy
  • Physics-based energy calculations using quantum mechanics are
    computationally impractical
  • The same holds for molecular mechanics-based potential energy
    functions (i.e., force fields):
    E(total) = E(bond) + E(angle) + E(dihedral)
               + E(electrostatic) + E(van der Waals)
  • Alternative (our approach): knowledge-based potentials of mean
    force (i.e., generated from known protein structures)
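
Section II notes that each structure file contains atomic 3D coordinate data. As a minimal sketch of how such data might be read, assuming the fixed-column legacy PDB text format (x, y, z in columns 31-54 and the element symbol in columns 77-78), the snippet below extracts one (element, x, y, z) tuple per ATOM/HETATM record. The function names and the demo coordinates are hypothetical and are not taken from the poster.

```python
def format_atom_record(serial, name, resname, chain, resseq, x, y, z, element,
                       record="ATOM"):
    """Hypothetical helper: build one fixed-column record for the demo below.

    Atom-name alignment is simplified; only the column positions of the
    coordinates and the element symbol matter for this sketch.
    """
    return (f"{record:<6}{serial:>5} {name:^4} {resname:>3} {chain}"
            f"{resseq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
            f"{'':10}{element:>2}")


def parse_pdb_atoms(pdb_text):
    """Return a list of (element, x, y, z) tuples from ATOM/HETATM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            # 1-based columns: x 31-38, y 39-46, z 47-54, element 77-78.
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            element = line[76:78].strip().upper()
            atoms.append((element, x, y, z))
    return atoms


# Made-up coordinates, used only to exercise the parser.
demo = "\n".join([
    format_atom_record(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614, "N"),
    format_atom_record(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842, "C"),
    format_atom_record(3, "ZN", "ZN", "A", 201, 10.5, -3.25, 7.0, "ZN",
                       record="HETATM"),
    "END",
])
atoms = parse_pdb_atoms(demo)
```

HETATM records are kept here because ligands and metal ions are typically stored in them; whether the original pipeline filtered any records is not stated in the poster.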


IV. Knowledge-Based Potentials of Mean Force
  • Assumptions:
      • At equilibrium, the native state has the global free energy
        minimum
      • Microscopic states (i.e., features) follow a Boltzmann
        distribution
  • Examples:
      • Well documented in the literature: distance-dependent
        pairwise interactions at the atomic or amino acid level
      • This study: inclusion of higher-order contributions by
        developing all-atom four-body statistical potentials
  • Motivation (our prior work):
      • Four-body protein potential at the amino acid level

V. Motivational Example: Pairwise Amino Acid Potential
  • A 20-letter protein alphabet yields 210 residue pairs
  • Obtain a large, diverse PDB dataset of single protein chains
  • For each residue pair (i, j), calculate the relative frequency
    f_ij with which the two residues appear within a given distance
    (e.g., 12 angstroms) of each other across all the protein
    structures
  • Calculate a rate p_ij expected by chance alone from a background
    or reference distribution (more later)
  • Apply the inverted Boltzmann principle: s_ij = log(f_ij / p_ij)
    quantifies interaction propensity and is proportional to the
    energy of interaction (by a factor of RT)

VI. All-Atom Four-Body Statistical Potential
  • Obtain a diverse PDB dataset of 1417 single-chain and multimeric
    proteins, many complexed to ligands (see XV. References)
  • Six-letter atomic alphabet: C, N, O, S, M (metals), X (other)
  • Apply Delaunay tessellation to the atomic point coordinates of
    each PDB file; this objectively identifies all nearest-neighbor
    quadruplets of atoms in the structure (8 angstrom cutoff)
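
The tessellation step of section VI can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it uses `scipy.spatial.Delaunay` (which wraps Qhull, the library listed in the references), interprets the 8 angstrom cutoff as discarding any tetrahedron with an edge longer than the cutoff, and uses a short, incomplete `METALS` set invented here to map elements onto the six-letter alphabet.

```python
from collections import Counter
from itertools import combinations

import numpy as np
from scipy.spatial import Delaunay

# Assumption: a small illustrative set of metal symbols, not the poster's list.
METALS = {"NA", "MG", "K", "CA", "MN", "FE", "CO", "NI", "CU", "ZN"}


def atom_letter(element):
    """Map an element symbol onto the alphabet C, N, O, S, M (metals), X (other)."""
    symbol = element.strip().upper()
    if symbol in {"C", "N", "O", "S"}:
        return symbol
    return "M" if symbol in METALS else "X"


def tessellate_quads(coords, elements, cutoff=8.0):
    """Count atomic quadruplets at the vertices of Delaunay tetrahedra.

    Assumption: a tetrahedron is kept only if all six of its edges are
    no longer than `cutoff` (in angstroms).
    """
    points = np.asarray(coords, dtype=float)
    tessellation = Delaunay(points)
    counts = Counter()
    for simplex in tessellation.simplices:
        edges_ok = all(
            np.linalg.norm(points[a] - points[b]) <= cutoff
            for a, b in combinations(simplex, 2)
        )
        if edges_ok:
            # Order within a quad is irrelevant, so store it sorted.
            quad = tuple(sorted(atom_letter(elements[i]) for i in simplex))
            counts[quad] += 1
    return counts


# Toy input: four corner atoms and one metal atom inside the tetrahedron.
coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (0.2, 0.2, 0.2)]
elements = ["C", "N", "O", "S", "FE"]
quad_counts = tessellate_quads(coords, elements)
```

Storing each quad as a sorted tuple treats quadruplets as unordered, which matches the count of 126 distinct quadruplets for a six-letter alphabet.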

VII. All-Atom Four-Body Statistical Potential
VIII. Summary Data for the 1417 Structure Files and their Delaunay
Tessellations
IX. All-Atom Four-Body Statistical Potential
  • A six-letter atomic alphabet yields 126 distinct quadruplets
  • For each quad (i, j, k, l), calculate the observed rate of
    occurrence f_ijkl among all tetrahedra from the 1417 structure
    tessellations
  • Compute the rate p_ijkl expected by chance from a multinomial
    reference distribution,
    p_ijkl = (4! / Π_n t_n!) × Π_n a_n^(t_n), where
      • a_n = proportion of atoms from all structures that are of
        type n
      • t_n = number of occurrences of atom type n in the quad
  • Apply the inverted Boltzmann principle:
    s_ijkl = log(f_ijkl / p_ijkl) quantifies the interaction
    propensity and is proportional to the energy of atomic
    quadruplet interaction
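
The counts quoted in sections V and IX (210 residue pairs, 126 atomic quadruplets) and the multinomial reference rate can be checked with a short sketch. The atom-type proportions below are made-up placeholders, not values from the 1417-structure dataset, and the base-10 logarithm is an assumption, since the poster writes only "log".

```python
import math
from itertools import combinations_with_replacement

ALPHABET = ("C", "N", "O", "S", "M", "X")


def expected_rate(quad, proportions):
    """Multinomial probability of drawing the unordered quad by chance.

    p = (4! / prod_n t_n!) * prod_n a_n ** t_n, where t_n is the number
    of times atom type n occurs in the quad and a_n is its proportion.
    """
    coefficient = math.factorial(len(quad))
    probability = 1.0
    for letter in set(quad):
        t = quad.count(letter)
        coefficient //= math.factorial(t)
        probability *= proportions[letter] ** t
    return coefficient * probability


def log_odds(observed, expected):
    """Inverted Boltzmann score s = log(f / p); base 10 is an assumption."""
    return math.log10(observed / expected)


# Unordered selections with repetition: C(6+4-1, 4) quads, C(20+2-1, 2) pairs.
quads = list(combinations_with_replacement(ALPHABET, 4))
residue_pairs = list(combinations_with_replacement(range(20), 2))

# Placeholder proportions (sum to 1); illustrative only.
proportions = {"C": 0.62, "N": 0.17, "O": 0.19, "S": 0.01,
               "M": 0.005, "X": 0.005}
total_probability = sum(expected_rate(q, proportions) for q in quads)
```

Because the multinomial terms over all 126 quads expand (a_C + ... + a_X)^4, `total_probability` comes out as 1 up to rounding, which is a useful sanity check on any reference distribution built this way.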

X. Topological Score (TS)
  • Delaunay tessellation of any macromolecular structure yields an
    aggregate of tetrahedral simplices
  • Each simplex can be scored using the all-atom four-body
    potential, based on the quad present at its four vertices
  • Topological score (or total potential) of the structure: the sum
    of the scores of all constituent simplices in the tessellation,
    TS = Σ s_ijkl

XI. Topological Score Difference (ΔTS)

XII. Application of ΔTS: Predicting Enzyme-Inhibitor Binding Energy
  • MOAD: repository of experimental inhibition constants (K_i) for
    protein-ligand complexes whose structures are in the PDB
  • Collected K_i values for 300 complexes reflecting diverse
    protein structures
  • Obtained the experimental binding energy from K_i via
    ΔG_exp = RT ln(K_i)
  • Calculated ΔTS for the complexes
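
Sections X to XII can be tied together in a brief sketch: summing simplex scores into TS, differencing the complex and the isolated enzyme, and converting K_i into an experimental binding energy. Several details are assumptions made here for illustration: the order of the difference (complex minus enzyme), the temperature of 298.15 K, K_i expressed in mol/L, and the toy potential values.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def topological_score(quads, potential):
    """TS: sum of four-body scores over all simplices (one quad each)."""
    return sum(potential[quad] for quad in quads)


def delta_ts(complex_quads, enzyme_quads, potential):
    """Score difference; complex minus isolated enzyme is an assumed order."""
    return (topological_score(complex_quads, potential)
            - topological_score(enzyme_quads, potential))


def dg_exp_from_ki(ki_molar, temperature_k=298.15):
    """Experimental binding energy in kcal/mol via dG = R * T * ln(K_i).

    The temperature default is an assumption; the poster does not state it.
    """
    return R_KCAL * temperature_k * math.log(ki_molar)


# Toy potential and tessellations, invented for illustration.
potential = {("C", "C", "N", "O"): 0.5, ("C", "N", "O", "S"): -0.25}
enzyme_quads = [("C", "C", "N", "O"), ("C", "N", "O", "S")]
complex_quads = enzyme_quads + [("C", "C", "N", "O")]

score_difference = delta_ts(complex_quads, enzyme_quads, potential)
dg_nanomolar = dg_exp_from_ki(1e-9)  # a 1 nM inhibitor
```

Since K_i values for inhibitors are well below 1 mol/L, the logarithm is negative and tighter binders give more negative ΔG_exp.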
XIII. Predicting Enzyme-Inhibitor Binding Energy
  • Randomly selected 200 complexes to train a model
  • Correlation coefficient r = 0.79 between ΔTS and ΔG_exp
  • Empirical linear transform of ΔTS to reflect energy values:
    ΔG_calc = (1 / 0.0003) × ΔTS, with a constant offset of
    magnitude 6.24
  • Linear transform, hence the same r = 0.79 between ΔG_calc and
    ΔG_exp
  • Also, standard error SE = 2.34 kcal/mol and a fitted regression
    line with slope 0.98 and intercept of magnitude 0.41
    (y = ΔG_calc and x = ΔG_exp)

XIV. Predicting Enzyme-Inhibitor Binding Energy
  • For the test set of 100 remaining complexes:
      • r = 0.77 between ΔG_calc and ΔG_exp
      • SE = 2.50 kcal/mol
      • Fitted regression line has slope 1.07 and intercept of
        magnitude 0.46
  • All training/test data available online as a text file (see XV.
    References)

XV. References and Acknowledgments
  • PDB dataset:
    http://proteins.gmu.edu/automute/tessellatable1417.txt
  • Train/test dataset:
    http://proteins.gmu.edu/automute/MOAD300ki.txt
  • PDB (structure DB): http://www.rcsb.org/pdb
  • MOAD (ligand binding DB): http://bindingmoad.org/
  • Qhull (Delaunay tessellation): http://www.qhull.org/
  • UCSF Chimera (ribbon/ball-stick structure visualization):
    http://www.cgl.ucsf.edu/chimera/
  • Matlab (tessellation visualization):
    http://www.mathworks.com/products/matlab/