A New Approach to Identifying Protein Binding Sites - PowerPoint PPT Presentation

About This Presentation
Title:

A New Approach to Identifying Protein Binding Sites

Description:

Calix[4]arene, a small crown ether with ion binding site. ... (a) Calix[4]arene: Main cluster (parameters: C interaction form, mu=1) (b) Main supercluster ... – PowerPoint PPT presentation

Slides: 22
Provided by: randyz

Transcript and Presenter's Notes



1
A New Approach to Identifying Protein Binding
Sites
  • R. Zauhar & M. Bruist
  • Department of Chemistry & Biochemistry
  • University of the Sciences in Philadelphia
  • 600 S. 43rd Street
  • Philadelphia, PA 19104
  • r.zauhar_at_usip.edu, m.bruist_at_usip.edu

2
Identifying Binding Sites: a Challenging Problem
  • How to gauge surface geometry?
  • Curvature?
  • More flexible heuristic measures?
  • Focus on sequence or on surface representation?
  • Surface geometry is more fundamental: different
    sequences may produce similar geometries
  • However, sequence provides easier route to
    comparing/searching for binding sites
  • How to delineate interesting regions?
  • Distance cutoff?
  • Clustering approach?

3
Outline of our approach
  • Use triangulated surface representation
  • Compute interactions between surface elements,
    using a line-of-sight intersection test (occluded
    elements have no interaction); flexible form for
    the interaction term
  • Derive atom-atom interactions from associated
    (neighboring) elements; atoms become nodes in an
    edge-weighted graph
  • Cluster atoms to identify surface features

4
Triangulation using SMART
  • SMART: A Solvent-Accessible Triangulated Surface
    Generator for Molecular Graphics and Boundary
    Element Applications, R.J. Zauhar, J. Comp.-Aided
    Mol. Design, 9, 149-159 (1995).
  • Grid-accelerated intersection tests borrowed from
    the Shape Signatures ray-tracing algorithm
  • Shape Signatures: A New Approach to Ligand- and
    Receptor-Based Molecular Design, R.J. Zauhar,
    L.-F. Tian, Z.-J. Li & W.J. Welsh, J. Med. Chem.,
    46, 5674-5690 (2003).

5
We compared two forms for computing surface
interactions
[Equations for Cij and Dij not captured in this transcript]
Where
  • Cij is a normal-weighted interaction (symmetric)
  • Dij is the unweighted interaction form
  • Ai = area of surface element i
  • ni = unit normal of element i
  • r = length of the vector r connecting elements i
    and j
  • u = unit vector along r
  • μ (mu) = adjustable exponent
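
The two formulas on this slide are not reproduced in the transcript, so their exact form is unknown. The following is a minimal sketch of one pair of expressions that is consistent with the symbol list and with the later remark that the exponent sits in the denominator. The specific forms, Cij = Ai Aj (ni·u)(−nj·u) / r^mu and Dij = Ai Aj / r^mu, the clamping of back-facing pairs to zero, and the function name `pair_interaction` are all assumptions made for illustration, not the authors' definitions.

```python
import math

def pair_interaction(p_i, n_i, a_i, p_j, n_j, a_j, mu=1.0, weighted=True):
    """Interaction between surface elements i and j (ASSUMED form).

    p_*: element centroid (x, y, z); n_*: unit normal; a_*: element area.
    weighted=True  -> hypothetical normal-weighted term (like Cij)
    weighted=False -> hypothetical unweighted term (like Dij)
    """
    r_vec = [q - p for p, q in zip(p_i, p_j)]      # vector from i to j
    r = math.sqrt(sum(c * c for c in r_vec))
    if r == 0.0:
        return 0.0                                 # coincident elements: no interaction
    u = [c / r for c in r_vec]                     # unit vector along r
    base = a_i * a_j / r ** mu                     # mu = 0 removes distance dependence
    if not weighted:
        return base
    cos_i = sum(a * b for a, b in zip(n_i, u))     # how directly i faces j
    cos_j = -sum(a * b for a, b in zip(n_j, u))    # how directly j faces i
    # Assumption: pairs that face away from each other contribute nothing.
    return base * max(cos_i, 0.0) * max(cos_j, 0.0)
```

Under these assumptions the weighted term is symmetric in i and j and is largest when the two elements face each other directly, which matches the slide's description of Cij as symmetric and normal-weighted.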
6
Features of surface interaction terms
  • Normal-weighted term (Cij) maximizes
    contributions of element pairs with high mutual
    visibility.
  • Adjustable exponent (mu) in the denominator can be
    used to inversely weight interactions by distance;
    setting it to zero removes the distance criterion.
  • Elements with no line-of-sight visibility are
    eliminated from consideration, no matter the form
    of the interaction term.
  • An 8 Å cutoff is applied to all interactions to
    reduce computational burden.
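
The visibility and cutoff rules above can be sketched as a segment-versus-triangle test. The slides describe a grid-accelerated intersection test; the sketch below swaps in a brute-force loop over all triangles using the standard Möller-Trumbore ray-triangle intersection, purely for illustration. The function names and the tolerance `eps` are hypothetical; only the 8 Å cutoff comes from the slide.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def segment_hits_triangle(p, q, tri, eps=1e-9):
    """True if the open segment p->q crosses triangle tri (Moller-Trumbore)."""
    v0, v1, v2 = tri
    d = sub(q, p)                 # not normalised, so t in (0, 1) spans the segment
    e1, e2 = sub(v1, v0), sub(v2, v0)
    h = cross(d, e2)
    det = dot(e1, h)
    if abs(det) < eps:            # segment parallel to the triangle's plane
        return False
    inv = 1.0 / det
    s = sub(p, v0)
    b1 = dot(s, h) * inv          # first barycentric coordinate
    if b1 < 0.0 or b1 > 1.0:
        return False
    qv = cross(s, e1)
    b2 = dot(d, qv) * inv         # second barycentric coordinate
    if b2 < 0.0 or b1 + b2 > 1.0:
        return False
    t = dot(e2, qv) * inv         # position of the hit along the segment
    return eps < t < 1.0 - eps    # exclude the endpoints themselves

def visible(p, q, triangles, cutoff=8.0):
    """Two element centroids interact only if within the cutoff and unoccluded."""
    r = sub(q, p)
    if dot(r, r) > cutoff * cutoff:
        return False
    return not any(segment_hits_triangle(p, q, tri) for tri in triangles)
```

In practice the two elements' own triangles would be left out of `triangles`, and a spatial grid would limit the loop to triangles near the segment, as the slides indicate.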

7
Interactions between elements are combined to
define interactions between atoms
[Equation for Mst not captured in this transcript]
Here s and t are indices over atoms, and E(s) is
the set of all surface elements associated with
atom s. The matrix Mst is symmetric, and
expresses the mutual visibility of the atoms s
and t, and perhaps also their distance (depending
on the specific form of the interaction
expression used to compute the Cij). We note
that Mss = 0 (by definition), even though it is
possible for elements associated with the same
atom to interact.
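
The slide's formula for Mst is not reproduced in the transcript. A natural reading of the text is a plain double sum of element interactions over the elements of the two atoms, Mst = sum over i in E(s) and j in E(t) of Cij; that reading is an assumption, as are the function name and the data layout in this sketch.

```python
def atom_interactions(element_atom, C, n_atoms):
    """Accumulate element-element interactions into an atom-atom matrix.

    element_atom[i] is the index of the atom that owns surface element i,
    C is a symmetric element-element interaction matrix (list of lists).
    ASSUMPTION: Mst is the plain sum of Cij over i in E(s), j in E(t).
    """
    M = [[0.0] * n_atoms for _ in range(n_atoms)]
    n_elements = len(element_atom)
    for i in range(n_elements):
        s = element_atom[i]
        for j in range(i + 1, n_elements):
            t = element_atom[j]
            if s == t:
                continue          # Mss = 0 by definition, as stated on the slide
            M[s][t] += C[i][j]
            M[t][s] += C[i][j]    # keep M symmetric
    return M
```

Because each unordered element pair is visited once and added to both Mst and Mts, the result is symmetric whenever C is.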
8
Clustering
Once solvent-accessible atoms are linked via the
matrix Mst, they form the nodes of a graph with
weighted edges (the weights being given by the
coefficients of the matrix). We adopt a
clustering method developed by Pavan and Pelillo
(A New Graph-Theoretic Approach to Clustering and
Segmentation, Proceedings of the 2003 IEEE
Computer Society Conference on Computer Vision
and Pattern Recognition), originally applied to
video analysis problems, for the task of grouping
atoms on the basis of mutual interaction. Pavan
and Pelillo derive a recursive algorithm for
finding dominant subsets of an edge-weighted
graph, these being collections of nodes with
significantly greater similarity to one another
than to nodes outside the set. Dominant sets
provide a natural definition of clusters.
9
Clustering (cont)
Pavan and Pelillo furthermore demonstrate the
equivalence of determining the dominant sets
(clusters) in a graph with an apparently
unrelated problem: finding a fixed point of a
simple dynamical system. We adapt their approach
as follows. Let m be the number of active
(solvent-accessible) atoms in the molecule. Let x
be a vector of m real numbers, with initial value
chosen in the interior of the standard simplex
(we use x(0) = (1/m, 1/m, ..., 1/m)). Then the
following system is iterated until a fixed-point
solution is located
[Equation not captured in this transcript]
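
The iteration itself is missing from the transcript. Pavan and Pelillo use the discrete-time replicator dynamics x_s(t+1) = x_s(t) (M x(t))_s / (x(t)ᵀ M x(t)); the sketch below assumes the slide's system is that same update applied to the atom matrix M. The convergence tolerance, iteration cap, and function name are illustrative choices, not values from the slides.

```python
def replicator_fixed_point(M, tol=1e-10, max_iter=10000):
    """Iterate discrete-time replicator dynamics from the simplex barycenter.

    M: symmetric, non-negative matrix with zero diagonal (list of lists).
    Returns the vector x at (approximate) convergence.
    ASSUMPTION: this is the standard Pavan-Pelillo update, not a verified
    copy of the slide's equation.
    """
    m = len(M)
    x = [1.0 / m] * m                              # x(0) = (1/m, ..., 1/m)
    for _ in range(max_iter):
        Mx = [sum(M[s][t] * x[t] for t in range(m)) for s in range(m)]
        xMx = sum(x[s] * Mx[s] for s in range(m))  # average "payoff"
        if xMx <= 0.0:
            break                                  # no interactions: nothing to update
        new_x = [x[s] * Mx[s] / xMx for s in range(m)]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x
```

Each update keeps x on the standard simplex (components non-negative and summing to one), so weakly connected atoms lose weight to strongly connected ones until only a mutually coherent group retains non-zero components.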
10
Clustering (cont)
The preceding dynamical system is called a
replicator equation, and its behavior is
well understood. Fixed points will be found on
the standard simplex, and for a given solution x
only a subset of the components of the solution
vector will be non-zero; these form the support
of the solution. The nodes (atoms) in the graph
that correspond to the support are in fact a
dominant set, and comprise the set with greatest
weight (informally, the strongest cluster). The
first cluster thus found can be removed from the
system, thus reducing the dimension of both the
vector x and the coefficient matrix M. The
reduced system is iterated again to find the next
cluster. The entire process can be repeated until
all the atoms are assigned to clusters.
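
The peel-off procedure described above can be sketched as follows. The replicator update is the standard Pavan-Pelillo form and is assumed to match the slides; the support threshold `support_eps`, the handling of atoms with no remaining interactions (each becomes its own cluster), and the function names are illustrative assumptions.

```python
def replicator(M, tol=1e-10, max_iter=10000):
    """Discrete-time replicator dynamics started at the simplex barycenter."""
    m = len(M)
    x = [1.0 / m] * m
    for _ in range(max_iter):
        Mx = [sum(M[s][t] * x[t] for t in range(m)) for s in range(m)]
        xMx = sum(x[s] * Mx[s] for s in range(m))
        if xMx <= 0.0:
            break
        new_x = [x[s] * Mx[s] / xMx for s in range(m)]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x

def peel_clusters(M, support_eps=1e-5):
    """Repeatedly extract the support of the replicator fixed point.

    Returns a list of clusters, each a list of atom indices into M,
    in the order in which they were found (strongest first).
    """
    remaining = list(range(len(M)))
    clusters = []
    while remaining:
        reduced = [[M[s][t] for t in remaining] for s in remaining]
        if all(v == 0.0 for row in reduced for v in row):
            # Assumption: atoms with no interactions left form singletons.
            clusters.extend([s] for s in remaining)
            break
        x = replicator(reduced)
        support = [remaining[k] for k, v in enumerate(x) if v > support_eps]
        clusters.append(support)
        chosen = set(support)
        remaining = [s for s in remaining if s not in chosen]
    return clusters
```

For a toy matrix with one tightly linked triple and one weaker pair, the triple is found first and the pair second, mirroring the "main cluster, then next cluster" order in the text.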
11
Superclustering
Our initial calculations indicated that clusters
defined by the process just described tend to be
small, typically consisting of fewer than ten
atoms. In order to identify larger clusters, we
have extended the original algorithm to define
superclusters. The method is straightforward: we
compute the interaction between two clusters by
using the original matrix of interactions between
atoms
[Equation for K not captured in this transcript]
Here A(a) is the set of atoms in cluster a, and K
is a new coefficient matrix that expresses
interactions between clusters.
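
The formula for K is not reproduced in the transcript. By analogy with the atom matrix, a plausible reading is Kab = sum over s in A(a) and t in A(b) of Mst, with Kaa = 0; both the plain sum and the zero diagonal are assumptions in this sketch, as is the function name.

```python
def cluster_interactions(M, clusters):
    """Build a cluster-cluster matrix K from the atom-atom matrix M.

    clusters[a] is the list of atom indices in cluster a, i.e. A(a).
    ASSUMPTION: Kab is the plain sum of Mst over s in A(a), t in A(b),
    and Kaa = 0 by analogy with Mss = 0.
    """
    k = len(clusters)
    K = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(a + 1, k):
            total = sum(M[s][t] for s in clusters[a] for t in clusters[b])
            K[a][b] = total
            K[b][a] = total       # K inherits the symmetry of M
    return K
```

The resulting K has the same properties as M (symmetric, non-negative, zero diagonal), so the same replicator iteration can be run on it without modification.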
12
Superclustering (cont)
A replicator equation is constructed using the K
matrix, and superclusters are found which relate
the atom-based clusters found by the first
clustering procedure. Superclusters can be
re-expressed as atom sets simply by enumerating
the atoms contained in their component clusters.
The process can be easily repeated to generate
clusters of even higher order; however, in this
work we have attempted only one round of the
superclustering procedure.
13
Implementation
  • Computation of surface interactions using
    xGrid/OS X (combination of C command-line tool
    and Perl script)
  • Clustering carried out using C command-line tool
  • Visualization using MolMon (OS X modelling tool)
    and SYBYL (Tripos, Inc.)
  • Performance: surface interactions for H-Ras
    (2,524 atoms) require 4,922 sec CPU time (< 10
    min when distributed over 10 G5 processors), plus
    180 sec to merge results using the Perl script.
    Generation of clusters/superclusters requires 713
    sec CPU (single G5).

14
Initial Application
  • Calix[4]arene, a small crown ether with an ion
    binding site. Our method identifies the apolar
    cavity via both the main cluster and the main
    supercluster.
  • Human Ras (small GTPase), PDB entry 1CLU. Our
    method identifies the nucleotide binding site via
    the main supercluster.

Figures follow
15
Fig. 1
(a) Calix[4]arene, with the apolar binding site
labeled
(b) Triangulated surface, color-coded by surface
interaction
16
Fig. 2
(a) Calix[4]arene: main cluster (parameters: C
interaction form, mu=1)
(b) Main supercluster
17
Fig. 3
(a) h-Ras (GTP analogue highlighted), with the
binding site labeled
(b) Triangulated color-coded surface
18
Fig. 3(c) Detail of binding site (with color
coding)
19
Fig. 4
(a) h-Ras nucleotide binding site: second cluster
(b) h-Ras main supercluster
Labels in figure: main cluster, second cluster,
main supercluster, ligand
20
Discussion
  • Although we have only begun to apply this
    technique, initial results are clearly
    encouraging.
  • The approach is easy to apply, and involves only
    the surface geometry of the molecule considered.
  • This is currently an ab initio method: the
    results shown involve no training or pre-existing
    knowledge of binding site location.
  • While computationally intensive, the method
    parallelizes well, and the demonstration
    calculations were easily carried out on a modest
    cluster.

21
Future work
  • Apply to many more proteins and classes of
    binding site.
  • Develop compact descriptors of clusters/binding
    sites that can be easily compared across
    proteins.
  • Optimize selection of surface interaction term
    and parameters to produce reliable and
    well-delineated site identification.
  • Explore ways of including electrostatic potential
    in site interaction term.