Complementary methods for virtual screening and the elucidation of binding patterns: MOLPRINT 2D3D - PowerPoint PPT Presentation

1
Complementary methods for virtual screening and
the elucidation of binding patterns: MOLPRINT
2D/3D
  • Andreas Bender
  • ab454_at_cam.ac.uk
  • Unilever Centre for Molecular Science Informatics
    Cambridge University, UK

2
Outline
  • Objective: more efficient similarity searching of
    chemical databases
  • New methods developed to detect molecules with
    similar biology: one is based on connectivity
    (2D), the other on surface points (3D)
  • Results: lead discovery (finding new drugs,
    finding new chemotypes)
  • Feature: discovering binding patterns

3
Substructure vs. Similarity Searching
  • Substructure searching aims to detect molecules
    containing a particular subgraph / substructure;
    exact matching is desired (e.g. to detect toxic
    groups)
  • Similarity searching aims to detect molecules
    with similar properties; the molecules may
    (sometimes should!) be structurally different
  • Here employed for activity detection: virtual
    screening
  • Literature: Bender, A. and Glen, R.C. Molecular
    similarity: a key technique in molecular
    informatics. Org. Biomol. Chem. 2004, 2,
    3204-3218.

4
Descriptor Choice
5
2D Environment around an atom (MOLPRINT 2D)
  • E.g. 6-aminoquinoline

  • Assign Sybyl mol2 atom types
  • Find connections
  • Find connections to connections
  • Create a tree down to n levels
  • Create a fingerprint for this atom

[Figure: atom environment tree. Level 0: N2; Level 1: Car, Car; Level 2: Car, Car, Car]

These features are created for every (heavy) atom
in the molecule (J. Chem. Inf. Comput. Sci. 2004,
44, 170-178; 2004, 44, 1708-1718)
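
The tree-building steps on this slide can be sketched in Python as below. This is a minimal illustration, not the MOLPRINT 2D implementation: the function names, the feature string format, and the hand-assigned atom types for 6-aminoquinoline (in particular N.pl3 for the amino nitrogen) are assumptions made for the example.

```python
from collections import Counter

def atom_environment(atom_types, bonds, center, depth=2):
    """Count atom types at each bond distance 0..depth from `center`."""
    # Build an adjacency list from the (undirected) bond pairs.
    neighbours = {i: set() for i in range(len(atom_types))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    visited = {center}
    frontier = [center]
    layers = []
    for _ in range(depth + 1):
        # Record the atom types found at the current level.
        layers.append(Counter(atom_types[i] for i in frontier))
        # Move one bond outwards, skipping atoms already seen.
        next_frontier = []
        for i in frontier:
            for j in sorted(neighbours[i]):
                if j not in visited:
                    visited.add(j)
                    next_frontier.append(j)
        frontier = next_frontier
    return layers

def molecule_features(atom_types, bonds, depth=2):
    """One feature string per heavy atom (hypothetical string format)."""
    features = set()
    for center in range(len(atom_types)):
        parts = []
        for level, counts in enumerate(atom_environment(atom_types, bonds, center, depth)):
            for atom_type, n in sorted(counts.items()):
                parts.append(f"{level}-{n}-{atom_type}")
        features.add(";".join(parts))
    return features

# 6-aminoquinoline heavy atoms, typed by hand (assumption):
# 0 = ring N, 1..9 = ring carbons (C2, C3, C4, C4a, C5, C6, C7, C8, C8a), 10 = amino N
types = ["N.ar"] + ["C.ar"] * 9 + ["N.pl3"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7),
         (7, 8), (8, 9), (9, 0), (4, 9), (6, 10)]
ring_n_env = atom_environment(types, bonds, center=0)
```

For the ring nitrogen this gives one nitrogen at level 0, two aromatic carbons at level 1 and three at level 2, which is the shape of the tree shown on the slide.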
6
Feature Selection
  • E.g. comparing faces first requires the
    identification of key features.
  • How do we identify these?
  • The same applies to molecules.

7
B) Information-Gain Feature Selection
  • We wish to select the important features.
  • To do this we calculate the entropy of the data
    as a whole and for each class.
  • This is used to select those features with the
    highest discrimination, e.g. active and inactive
    molecules.

[Equation labels: information gain (to be maximized); entropy of the whole set; entropy of subsets]
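
As a sketch of the calculation named on this slide, the standard information gain for a binary feature is the entropy of the whole set minus the size-weighted entropies of the subsets with and without the feature. The Python below assumes two classes (active / inactive) and molecules represented as sets of feature strings; the function names are hypothetical and this is not taken from the MOLPRINT code.

```python
import math

def entropy(n_active, n_inactive):
    """Shannon entropy (in bits) of a two-class split."""
    total = n_active + n_inactive
    h = 0.0
    for n in (n_active, n_inactive):
        if n > 0:
            p = n / total
            h -= p * math.log2(p)
    return h

def information_gain(molecules, labels, feature):
    """molecules: list of feature sets; labels: list of bool (True = active)."""
    def class_counts(indices):
        active = sum(1 for i in indices if labels[i])
        return active, len(indices) - active

    everything = list(range(len(molecules)))
    with_f = [i for i in everything if feature in molecules[i]]
    without_f = [i for i in everything if feature not in molecules[i]]

    # Entropy of the whole set ...
    gain = entropy(*class_counts(everything))
    # ... minus the weighted entropy of each subset.
    for subset in (with_f, without_f):
        if subset:
            weight = len(subset) / len(molecules)
            gain -= weight * entropy(*class_counts(subset))
    return gain

example_molecules = [{"a"}, {"a"}, {"b"}, {"b"}]
example_labels = [True, True, False, False]
gain_a = information_gain(example_molecules, example_labels, "a")
```

A feature that separates the classes perfectly, like "a" in the toy data, gets the maximum gain; a feature spread evenly over both classes gets a gain of zero.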
8
Classification
  • The next step is to identify which molecules
    belong to which class.
  • To do this we use a Naïve Bayesian Classifier
    using the features (atom environments) we have
    identified as being important.

9
C) Naïve Bayesian Classifier (classification by
presumptive evidence)
  • Include all selected features fi in the
    calculation of the ratio of class probabilities
  • Ratio > 1: class membership 1
  • Ratio < 1: class membership 2
  • F: feature vector; fi: feature elements

[Figure: feature counts in the active and inactive datasets]
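
A minimal sketch of such a ratio, assuming the usual naïve Bayes form: the prior ratio P(class 1) / P(class 2) multiplied by P(fi | class 1) / P(fi | class 2) for each selected feature present in the molecule. The add-one (Laplace) smoothing, the use of present features only, and all names are assumptions for illustration. The ratio is computed in log space, so a log ratio above 0 corresponds to a ratio above 1.

```python
import math

def train(molecules, labels):
    """Count, per class, in how many molecules each feature occurs."""
    counts = {True: {}, False: {}}
    sizes = {True: 0, False: 0}
    for features, label in zip(molecules, labels):
        sizes[label] += 1
        for f in features:
            counts[label][f] = counts[label].get(f, 0) + 1
    return counts, sizes

def log_ratio(features, selected, counts, sizes):
    """log[P(active | F) / P(inactive | F)] over the selected features."""
    # Prior ratio; both classes must be non-empty.
    score = math.log(sizes[True] / sizes[False])
    for f in selected:
        if f in features:
            # Add-one smoothing keeps unseen features from giving 0 or infinity.
            p_active = (counts[True].get(f, 0) + 1) / (sizes[True] + 2)
            p_inactive = (counts[False].get(f, 0) + 1) / (sizes[False] + 2)
            score += math.log(p_active / p_inactive)
    return score

training = [{"a", "x"}, {"a"}, {"b", "x"}, {"b"}]
is_active = [True, True, False, False]
counts, sizes = train(training, is_active)
score_a = log_ratio({"a"}, ["a", "b"], counts, sizes)   # > 0: class 1 (active)
score_b = log_ratio({"b"}, ["a", "b"], counts, sizes)   # < 0: class 2 (inactive)
```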
10
Application: lead discovery
  • Database: MDL Drug Data Report (MDDR)
  • 957 ligands selected from MDDR:
  • 49 5HT3 receptor antagonists,
  • 40 angiotensin converting enzyme inhibitors (ACE),
  • 111 HMG-CoA reductase inhibitors (HMG),
  • 134 PAF antagonists and
  • 49 thromboxane A2 antagonists (TXA2)
  • 574 inactives
  • Briem and Lessel, Perspect. Drug Discov. Des.
    2000, 20, 245-264.
  • Calculated: hit rate among ten nearest neighbours
    for each molecule
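
The hit-rate measure can be sketched as follows, assuming a precomputed pairwise similarity matrix and one class label per molecule. Averaging over every molecule as a query is an assumption here; the exact evaluation protocol is the one in the cited benchmark paper. The function name is hypothetical.

```python
def hit_rate(similarity, classes, k=10):
    """Mean fraction of each query's k nearest neighbours sharing its class."""
    n = len(classes)
    rates = []
    for query in range(n):
        # Rank all other molecules by decreasing similarity to the query.
        others = [j for j in range(n) if j != query]
        others.sort(key=lambda j: similarity[query][j], reverse=True)
        top = others[:k]
        hits = sum(1 for j in top if classes[j] == classes[query])
        rates.append(hits / len(top))
    return sum(rates) / n

toy_classes = ["A", "A", "B", "B"]
toy_similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
rate_top1 = hit_rate(toy_similarity, toy_classes, k=1)
```

In the toy data every molecule's single nearest neighbour belongs to its own class, so the top-1 hit rate is at its maximum.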

11
Comparison
Using Tanimoto Coefficient
Using Bayesian
  • Bender, A., et al., Similarity searching of
    chemical databases using atom environment
    descriptors: evaluation of performance (MOLPRINT
    2D). J. Chem. Inf. Comput. Sci. 2004, 44,
    1708-1718.
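
For reference, the Tanimoto coefficient used in the comparison is the number of shared features divided by the number of distinct features in either molecule. A minimal Python sketch, assuming each molecule is represented as a set of feature strings (returning 0.0 for two empty sets is a convention chosen here):

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two feature collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention for two empty fingerprints (assumption)
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

similarity = tanimoto({"x", "y", "z"}, {"y", "z", "w"})  # 2 shared of 4 distinct
```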

12
Combining Information in Molecules
  • In this method, we can extend the approach by
    extracting from a set of molecules those features
    having the best information gain
  • This can describe patterns in molecules much
    better than individual cases

13
Combining Information of 5 Actives
Bender, A., et al., Molecular Similarity
Searching using Atom Environments,
Information-Based Feature Selection and a Naïve
Bayesian Classifier. J. Chem. Inf. Comput. Sci.
2004, 44, 170-178.
14
Transformation to 3D: MOLPRINT 3D
  • Idea: to develop an analogous translationally and
    rotationally invariant (TRI) descriptor based on
    surface points
  • Advantage: switching from element atom types to
    interaction energies gives a more general model
    than the 2D (graph) approach
  • In addition: local description, hopefully less
    conformationally dependent
  • Approach to fingerprint surfaces: Tanimoto and
    other methods become applicable (until now mainly
    used for 2D fingerprints)
  • Reference: Bender, A. et al., J. Med. Chem.,
    2004, 47, 6569-6583 and IEEE SMC 2004
    proceedings

15
The Conformational Problem

16
3D Environment around a surface point: solvent
accessible surface
  • Central Point (Layer 0)
  • Points in Layer 1
  • Points in Layer 2
  • Etc.
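
The layer idea can be sketched in Python as below. This is an illustration under stated assumptions, not the MOLPRINT 3D algorithm: layers are assigned by straight-line distance from the central point with a fixed width, and each layer is summarised by the mean of a per-point property value (for example an interaction energy). The function and parameter names are hypothetical. Because only distances enter, the result does not change when the point set is translated or rotated.

```python
import math

def surface_environment(points, values, center, layer_width=1.0, n_layers=3):
    """Mean property value per layer around points[center].

    Layer 0 is the central point itself; layer k (k >= 1) holds points whose
    distance d from the centre satisfies (k - 1) * layer_width < d <= k * layer_width.
    """
    sums = [0.0] * n_layers
    counts = [0] * n_layers
    for index, (point, value) in enumerate(zip(points, values)):
        if index == center:
            layer = 0
        else:
            d = math.dist(points[center], point)
            layer = max(1, math.ceil(d / layer_width))
        if layer < n_layers:
            sums[layer] += value
            counts[layer] += 1
    # Layers without any points are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]

surface_points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 5)]
point_values = [1.0, 2.0, 4.0, 8.0]
environment = surface_environment(surface_points, point_values, center=0)

# Rotate 90 degrees about z, then translate: the descriptor is unchanged.
moved = [(-y + 3, x - 1, z + 2) for x, y, z in surface_points]
environment_moved = surface_environment(moved, point_values, center=0)
```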
17
Overall performance: comparable to 2D methods, and
in addition
18
TXA2, Graph-based Descriptors
[Figure: structures numbered 1 to 7]
Very little diversity in heterocyclic systems:
no patents, no money!
19
TXA2, 7 Hits among Top 10
[Figure: structures numbered 1 to 7]
20
Which features are selected for classification?
  • Even if your classifier works, do the selected
    features make sense?
  • Set of active vs. inactive molecules
  • Information gain is calculated for each feature;
    those which are much more frequent among actives
    are suspicious and might constitute the
    pharmacophore
  • Look at features from HMG and TXA2

21
Selected Features - HMG
  • Binding site, HMG: rigid lipophilic ring

22
HMG-15
23
TXA2
Yellow: lipophilic side chains
  • Yamamoto et al., J. Med. Chem. 1993, 36, 820

24
TXA2-7
25
Identification of Features
  • (a) Not in binding conformation
  • (b) In different conformations

26
Summary
  • 2D method:
  • Performs about as well as other 2D methods for
    single molecule searches
  • Outperforms them by a large margin when combining
    information from multiple molecules (J. Chem.
    Inf. Comput. Sci., 2004, 44, 170-178; J. Chem.
    Inf. Comput. Sci., 2004, 44, 1708-1718)
  • 3D method: translationally and rotationally
    stable (invariant); combines high enrichment
    factors with scaffold hopping; discovery of new
    chemotypes possible
  • (J. Med. Chem., 2004, 47, 6569-6583)
  • Features shown to correlate with binding patterns

27
Future Work
  • Use more solid description of surface
    properties (COSMO descriptors instead of force
    field properties)
  • Shape encoding
  • Different machine learning approaches

28
Acknowledgements
  • Robert C Glen (Unilever Centre, Cambridge, UK)
  • Hamse Y. Mussa (Unilever Centre, Cambridge, UK)
  • Stephan Reiling (Aventis, Bridgewater, USA)
  • David Patterson (Tripos)
  • Software
  • GRID, CACTVS, gOpenMol and many, many others
  • Funding
  • The Gates Cambridge Trust, Unilever, Tripos