An Atomic Four-Body Potential for the Prediction of Protein-Ligand Binding Affinity

1
An Atomic Four-Body Potential for the Prediction
of Protein-Ligand Binding Affinity
  • Majid Masso
  • School of Systems Biology, George Mason
    University
  • Manassas, Virginia 20110, USA
  • CSBW BIBM 2012, Philadelphia, Pennsylvania

2
Knowledge-Based Potentials of Mean Force
  • Generated via statistical analysis of observed
    features in a diverse training set of structures
    selected from the PDB
  • Alternative to physics or molecular mechanics
    energy functions
  • Assumption: observed features follow a Boltzmann
    distribution
  • Examples:
  • Well-documented in the literature:
    distance-dependent pairwise interactions at the
    atomic or amino acid level
  • This study: inclusion of higher-order
    contributions by developing an all-atom four-body
    statistical potential
  • Motivation (our prior work):
  • Four-body protein potential at the amino acid
    level

3
Motivational Example: Pairwise Amino Acid
Potential
  • The 20-letter protein alphabet yields 210 residue
    pairs
  • Obtain a diverse PDB training set of single
    protein chains; represent each protein as a set
    of amino acid points in 3D
  • For each residue pair (i, j), calculate the
    relative frequency $f_{ij}$ with which they appear
    within a given distance (e.g., 12 angstroms) of
    each other in all the protein structures
  • Calculate a rate $p_{ij}$ expected by chance alone by
    using a background or reference distribution
    (more later)
  • Apply the inverted Boltzmann principle:
    $s_{ij} = \log(f_{ij} / p_{ij})$ quantifies interaction
    propensity and is proportional to the energy of
    interaction (by a factor of RT) for the pair
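
The steps on this slide can be sketched in Python. This is a minimal illustration rather than the author's code: the function name `pairwise_log_odds` and the toy counts are hypothetical, the logarithm base (10) is an assumption because the slide only says "log", and the multinomial form of $p_{ij}$ is borrowed from the four-body reference distribution described later in the talk.

```python
import math
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20-letter protein alphabet

# Unordered residue pairs, identical pairs such as (A, A) included: 210 in total.
PAIRS = list(combinations_with_replacement(AMINO_ACIDS, 2))


def pairwise_log_odds(pair_counts, residue_counts):
    """Return s_ij = log10(f_ij / p_ij) for every observed residue pair.

    pair_counts:    {(i, j): times the pair is seen within the distance cutoff}
    residue_counts: {residue: occurrences in the whole training set}
    """
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    scores = {}
    for (i, j), count in pair_counts.items():
        if count == 0:
            continue  # log of zero is undefined, so unobserved pairs stay unscored
        f_ij = count / total_pairs
        a_i = residue_counts[i] / total_residues
        a_j = residue_counts[j] / total_residues
        # Assumed multinomial reference: a_i**2 if i == j, else 2 * a_i * a_j.
        p_ij = a_i * a_j * (1 if i == j else 2)
        scores[(i, j)] = math.log10(f_ij / p_ij)
    return scores


# Toy two-letter example (made-up counts, not data from the study).
toy_scores = pairwise_log_odds(
    {("A", "A"): 50, ("A", "C"): 25, ("C", "C"): 25},
    {"A": 50, "C": 50},
)
```

A positive score marks a pair seen more often than chance predicts, and a negative score marks one seen less often.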

4
All-Atom Four-Body Statistical Potential
  • Diverse PDB training set of 1417 single chain and
    multimeric proteins, many complexed to ligands
    (see paper for text file)
  • Six-letter atomic alphabet: C, N, O, S, M
    (metals), X (other)
  • Apply Delaunay tessellation to the atomic point
    coordinates of each PDB file; this objectively
    identifies all nearest-neighbor quadruplets of
    atoms in the structure (8 angstrom cutoff)
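
A tessellation step of this kind might look as follows in Python. The sketch uses `scipy.spatial.Delaunay`, which is built on the Qhull library credited at the end of the talk. The function name `tessellate` is hypothetical, and applying the 8 angstrom cutoff to every edge of a tetrahedron is an assumption, since the slide gives only the cutoff value.

```python
import itertools

import numpy as np
from scipy.spatial import Delaunay


def tessellate(coords, cutoff=8.0):
    """Return the Delaunay tetrahedra whose six edges are all <= cutoff.

    coords: (n_atoms, 3) array-like of atomic coordinates in angstroms.
    Each tetrahedron is a 4-tuple of atom indices into coords.
    """
    coords = np.asarray(coords, dtype=float)
    simplices = Delaunay(coords).simplices  # shape (n_tetrahedra, 4)
    kept = []
    for tet in simplices:
        # Assumption: a tetrahedron is discarded if any edge exceeds the cutoff.
        edges_ok = all(
            np.linalg.norm(coords[a] - coords[b]) <= cutoff
            for a, b in itertools.combinations(tet, 2)
        )
        if edges_ok:
            kept.append(tuple(int(v) for v in tet))
    return kept


# Random points stand in for atomic coordinates parsed from a PDB file.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 20.0, size=(50, 3))
tetrahedra = tessellate(points)
```

Each kept 4-tuple is one nearest-neighbor quadruplet; looking up the atom type at each index gives the quad that the potential scores.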

5
All-Atom Four-Body Statistical Potential
  • The six-letter atomic alphabet yields 126
    distinct quadruplets
  • Calculate observed rate $f_{ijkl}$ of quad (i, j, k,
    l) occurrence among all tetrahedra from the 1417
    structure tessellations
  • Compute rate $p_{ijkl}$ expected by chance from a
    multinomial reference distribution:
    $p_{ijkl} = \frac{4!}{\prod_n t_n!} \prod_n a_n^{t_n}$
  • $a_n$ = proportion of atoms from all structures that
    are of type n
  • $t_n$ = number of occurrences of atom type n in the
    quad
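
The quadruplet count and the multinomial reference rate can be checked with a short Python sketch. The names `expected_rate` and `four_body_potential` are hypothetical, the base-10 logarithm is an assumption, and the score is taken to be the four-body analogue of the pairwise log-odds score from the motivational example.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement

ALPHABET = "CNOSMX"  # C, N, O, S, M (metals), X (other)

# Unordered quadruplets with repetition allowed: C(6 + 4 - 1, 4) = 126.
QUADS = list(combinations_with_replacement(ALPHABET, 4))


def expected_rate(quad, atom_fraction):
    """Multinomial chance of drawing the quad's composition in four draws.

    atom_fraction[n] plays the role of a_n; the Counter supplies each t_n.
    """
    coeff = math.factorial(4)
    prob = 1.0
    for atom_type, t_n in Counter(quad).items():
        coeff //= math.factorial(t_n)
        prob *= atom_fraction[atom_type] ** t_n
    return coeff * prob


def four_body_potential(quad_counts, atom_fraction):
    """Return log10(f_ijkl / p_ijkl) for every quad with a nonzero count.

    quad_counts keys are sorted 4-tuples of atom-type letters.
    """
    total = sum(quad_counts.values())
    return {
        quad: math.log10((count / total) / expected_rate(quad, atom_fraction))
        for quad, count in quad_counts.items()
        if count > 0
    }


# Made-up atom-type proportions, for illustration only.
fractions = {"C": 0.60, "N": 0.17, "O": 0.20, "S": 0.01, "M": 0.01, "X": 0.01}
total_probability = sum(expected_rate(q, fractions) for q in QUADS)
```

Because the 126 quads cover every possible composition of four atoms, the expected rates sum to one, which is a quick sanity check on the reference distribution.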

6
Summary Data for the 1417 Structure Files and
their Delaunay Tessellations
7
All-Atom Four-Body Statistical Potential
8
Topological Score (TS)
  • Delaunay tessellation of any macromolecular
    structure yields an aggregate of tetrahedral
    simplices
  • Each simplex can be scored using the all-atom
    four-body potential based on the quad present at
    the four vertices
  • Topological score (or total potential) of the
    structure: sum the scores of all constituent
    simplices in the tessellation

$TS = \sum s_{ijkl}$
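
As a minimal sketch, the topological score is a single pass over the tetrahedra. The function name `topological_score`, the toy potential values, and the choice to score quads missing from the potential as zero are all assumptions made for illustration.

```python
def topological_score(tetrahedra, atom_types, potential):
    """Sum the four-body scores of all tetrahedra in a tessellation.

    tetrahedra: iterable of 4-tuples of atom indices
    atom_types: sequence giving the atom-type letter for each atom index
    potential:  {sorted 4-tuple of letters: score s_ijkl}
    """
    total = 0.0
    for tet in tetrahedra:
        # Sorting makes the lookup independent of vertex order.
        quad = tuple(sorted(atom_types[v] for v in tet))
        # Assumption: a quad absent from the potential contributes nothing.
        total += potential.get(quad, 0.0)
    return total


# Toy structure: five atoms, two tetrahedra, made-up scores.
atom_types = ["C", "N", "C", "O", "C"]
tetrahedra = [(0, 1, 2, 3), (0, 2, 3, 4)]
potential = {("C", "C", "N", "O"): 0.5, ("C", "C", "C", "O"): -0.25}
ts = topological_score(tetrahedra, atom_types, potential)
```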
9
Topological Score Difference ($\Delta TS$)
10
Application of $\Delta TS$: Predicting Protein-Ligand
Binding Energy
  • MOAD: repository of experimental dissociation
    constants ($k_d$) for protein-ligand complexes whose
    structures are in the PDB
  • Collected $k_d$ values for 300 complexes reflecting
    diverse protein structures
  • Obtained experimental binding energy from $k_d$ via
    $\Delta G_{exp} = RT \ln(k_d)$
  • Calculated $\Delta TS$ for complexes
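
The conversion from a dissociation constant to a binding energy is a one-line formula. In the sketch below the function name `binding_energy` is hypothetical and the temperature of 298.15 K is an assumption, since the slide does not state one; $k_d$ is taken in molar units.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_energy(kd_molar, temperature_k=298.15):
    """Return the binding energy RT * ln(k_d) in kcal/mol.

    kd_molar: dissociation constant in mol/L; must be positive.
    temperature_k: absolute temperature (assumed, not given in the talk).
    """
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


nanomolar = binding_energy(1e-9)   # tight binder
micromolar = binding_energy(1e-6)  # weaker binder
```

Smaller dissociation constants give more negative energies, so a nanomolar binder scores lower than a micromolar one.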

11
Predicting Protein-Ligand Binding Energy
  • Randomly selected 200 complexes to train a model
  • Correlation coefficient r = 0.79 between $\Delta TS$ and
    $\Delta G_{exp}$
  • Empirical linear transformation of $\Delta TS$ to reflect
    energy values
  • $\Delta G_{calc} = L(\Delta TS)$
  • L is linear, hence the same r = 0.79 value between
    $\Delta G_{calc}$ and $\Delta G_{exp}$
  • Also, standard error of SE = 1.98 kcal/mol and
    fitted regression line of y = 1.18x (y = $\Delta G_{calc}$
    and x = $\Delta G_{exp}$)
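
The point that a linear transformation leaves the correlation unchanged can be checked numerically. The sketch below is illustrative only: the data are made up, the coefficients of the transformation are hypothetical (the talk does not give them), and the standard error is assumed to be the standard error of the estimate for the fitted regression line.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for y as a function of x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def standard_error(x, y):
    """Standard error of the estimate, sqrt(SS_res / (n - 2)) (assumed definition)."""
    slope, intercept = fit_line(x, y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    return math.sqrt(ss_res / (len(x) - 2))


# Made-up values standing in for delta-TS and experimental binding energies.
delta_ts = [-3.0, -1.0, -2.0, -4.5, -0.5]
dg_exp = [-9.1, -5.0, -7.2, -11.8, -4.1]

# Hypothetical linear map with positive slope; not the coefficients from the study.
dg_calc = [2.5 * v - 1.0 for v in delta_ts]

r_before = pearson_r(delta_ts, dg_exp)
r_after = pearson_r(dg_calc, dg_exp)
slope, intercept = fit_line(dg_exp, dg_calc)  # x = dg_exp, y = dg_calc
se = standard_error(dg_exp, dg_calc)
```

Any linear map with a positive slope gives `r_after` equal to `r_before`; a negative slope would only flip the sign of r.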

12
Predicting Protein-Ligand Binding Energy
  • For the test set of 100 remaining complexes:
  • r = 0.79 between $\Delta G_{calc}$ and $\Delta G_{exp}$
  • SE = 1.93 kcal/mol
  • Fitted regression line has slope 1.11 and a constant
    term of magnitude 0.63 (its sign is missing from
    this transcript)
  • All training/test data is available online as a
    text file (see paper)

13
References and Acknowledgments
  • PDB (structure DB): http://www.rcsb.org/pdb
  • MOAD (ligand binding DB): http://bindingmoad.org/
  • Qhull (Delaunay tessellation): http://www.qhull.org/
  • UCSF Chimera (ribbon/ball-stick structure
    visualization): http://www.cgl.ucsf.edu/chimera/
  • Matlab (tessellation visualization):
    http://www.mathworks.com/products/matlab/