Consensus Scoring Criteria for Improving Enrichment in Virtual Screening

1
Consensus Scoring Criteria for Improving
Enrichment in Virtual Screening
  • D. Frank Hsu
  • Fordham University
  • hsu_at_cis.fordham.edu
  • Talk at National Central University
  • Aug 9, 2005
  • Joint work with Jinn-Moon Yang, Yen-Fu Chen,
    Tsai-Wei Shen, and Bruce S. Kristal.

2
Introduction
  • The average cost and time of bringing a new drug
    to market has been estimated to be US$802
    million (in 2000 US dollars) and 12 years
  • In computer-aided drug design, virtual screening
    (VS) is an emerging and promising step for the
    discovery of novel lead compounds
  • Given a target protein active site and a database
    of potential small ligands, VS predicts the
    binding mode and binding affinity of each
    ligand and ranks the candidate ligands
  • The VS computational method involves two
    critical elements: efficient molecular docking
    and a reliable scoring method

3
Introduction
  • Scoring functions that calculate the binding
    free energy are mainly knowledge-based,
    physics-based, or empirical
  • The major weakness of VS - the inability to
    consistently identify true positives (leads) - is
    likely due to our incomplete understanding of the
    chemistry involved in ligand binding and the
    consequently imprecise scoring algorithms
  • Combining multiple scoring functions (consensus
    scoring) can improve enrichment of true
    positives in the VS process
  • Previous efforts at consensus scoring have
    largely focused on empirical results and have
    yet to provide a theoretical analysis that
    gives insight into the real features of
    combinations and data fusion for VS

4
Materials and Methods
  • Ligand data set from the comparative studies of
    Bissantz et al.
  • For each target protein, the ligand database
    included 10 known active compounds and 990 random
    compounds
  • Four target-protein complexes were selected
    from the PDB for virtual screening:
  • TK complex (PDB code 1kim)
  • DHFR (PDB code 1hfr)
  • ER-antagonist complex (PDB code 3ert)
  • ER-agonist complex (PDB code 1gwr)

5
Biological Function of Screening Targets
  • Thymidine kinase (TK)
  • Catalyzes the following reaction:
    (deoxy)thymidine + ATP <-> dTMP + ADP
  • Dihydrofolate reductase (DHFR)
  • Catalyzes the reduction of 5,6-dihydrofolate to
    5,6,7,8-tetrahydrofolate
  • Estrogen receptor (ER)
  • Estrogens such as 17β-estradiol are steroid
    hormones that act as key mediators in the female
    reproductive glands
  • Agonist of estrogen
  • Antagonist of estrogen

6
Materials and Methods
  • Docking methods and scoring functions
  • GEMDOCK
  • Genetic algorithm
  • Scoring methods
  • Empirical-based scoring function
  • Pharmacophore-based scoring function
  • GOLD
  • Genetic algorithm
  • Scoring methods
  • GoldScore
  • ChemScore

7
Objective Criteria for Performance Evaluation
  • Some common factors
  • False positive (FP) rate, yield (the percentage
    of active ligands in the hit list), enrichment,
    and goodness-of-hit (GH score)
  • Hit rate = Ah/Th (%)
  • FP rate = (Th - Ah)/(T - A) (%)
  • GH score
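
The criteria above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code: the transcript does not define the symbols, so Ah is read as the number of actives in the hit list, Th as the size of the hit list, A as the number of actives in the database, and T as the size of the database. The enrichment definition and the GH formula (the commonly used Güner-Henry form) are also assumptions, since the slide's equations did not survive.

```python
def screening_metrics(ah, th, a, t):
    """Return (hit rate, FP rate, enrichment, GH score) for one hit list.

    Assumed meanings: ah = actives in the hit list, th = hit-list size,
    a = actives in the database, t = database size.
    """
    hit_rate = ah / th                      # Ah/Th, as on the slide
    fp_rate = (th - ah) / (t - a)           # (Th - Ah)/(T - A), as on the slide
    enrichment = hit_rate / (a / t)         # assumed: hit rate relative to random picking
    # Assumed Guner-Henry form of the goodness-of-hit score
    gh = (ah * (3 * a + th)) / (4 * th * a) * (1 - fp_rate)
    return hit_rate, fp_rate, enrichment, gh


# Hypothetical numbers: 1000 compounds with 10 actives; the top 20 are kept
# as the hit list and 8 of them are active.
hit_rate, fp_rate, enrichment, gh = screening_metrics(ah=8, th=20, a=10, t=1000)
print(f"hit rate={hit_rate:.3f} FP rate={fp_rate:.4f} "
      f"enrichment={enrichment:.1f} GH={gh:.3f}")
```

Multiplying the hit rate and FP rate by 100 gives the percentages quoted on the slide.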

8
Methods of Data Fusion
  • Combination
  • Given a list of m scoring functions Ak, k = 1, 2,
    ..., m, each scoring function A has a ranking
    function RA(x) and a normalized scoring function
    SA(x)
  • Rank-based consensus scoring
  • Score-based consensus scoring
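
The two fusion schemes can be illustrated as follows. The slide's formulas are not preserved in the transcript, so this sketch assumes the simplest reading: the rank-based consensus is the average of the ranks RA(x) and the score-based consensus is the average of the normalized scores SA(x). The min-max normalization, the "lower score is better" convention, and all function names are assumptions made for illustration.

```python
def ranks(scores):
    """Rank compounds 1..n; a lower score is treated as better (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def normalize(scores):
    """Min-max scale scores to [0, 1] (assumed normalization; needs max > min)."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def rank_combination(score_lists):
    """Average rank of each compound over the m scoring functions."""
    rank_lists = [ranks(s) for s in score_lists]
    m = len(rank_lists)
    return [sum(column) / m for column in zip(*rank_lists)]


def score_combination(score_lists):
    """Average normalized score of each compound over the m scoring functions."""
    norm_lists = [normalize(s) for s in score_lists]
    m = len(norm_lists)
    return [sum(column) / m for column in zip(*norm_lists)]


# Hypothetical docking energies for four compounds under two scoring functions
scores_a = [-9.0, -7.0, -8.0, -5.0]
scores_b = [-6.0, -8.0, -7.0, -4.0]
rcs = rank_combination([scores_a, scores_b])
scs = score_combination([scores_a, scores_b])
print(rcs, scs)
```

Sorting the compounds by either combined value (ascending, under the convention assumed here) yields the consensus hit list.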

9
The Flowchart of RCS/SCS Algorithm

10
Summary of Screening Accuracies among Individual
Scoring Methods
GEMDOCK with the pharmacophore-based scoring function
always has the best accuracy

11
Screening Accuracies of Different Rank
Combinations
The lowest FP rate and highest GH score (best
performance) occur in combinations of two methods,
which perform better than the individual methods

12
Screening Accuracies of Different Score
Combinations
The lowest FP rate and highest GH score (best
performance) occur in combinations of multiple
methods, which perform better than the individual methods

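
To make the search space behind the rank and score combinations concrete, the sketch below enumerates every non-empty combination of four scoring methods. The labels A to D follow the letters used on the slides; which label corresponds to which scoring function is not stated in the transcript, so they are treated here as opaque names.

```python
from itertools import combinations

# Four scoring methods; the letter-to-method mapping is not given in the transcript
methods = ["A", "B", "C", "D"]

# All non-empty combinations: singles, pairs, triples and the full set
all_combos = [
    combo
    for size in range(1, len(methods) + 1)
    for combo in combinations(methods, size)
]
pairs = [combo for combo in all_combos if len(combo) == 2]

print(len(all_combos), "combinations in total,", len(pairs), "of them pairs")
```

With four methods there are only 15 combinations per fusion scheme, so an exhaustive comparison is cheap.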
13
The Performance of TK
  • Combinations of different methods improve
    performance
  • The combination of B and D works best on TK

14
The Performance of DHFR
  • Combinations of different methods improve
    performance
  • The combination of B and D works best on DHFR

15
The Performance of ER
  • Combinations of different methods improve
    performance
  • The combination of B and D works best on ER

16
The Performance of ERA
  • Combinations of different methods improve
    performance
  • The combination of B and D works best on ERA

17
Rank/Score Graph
  • We explore the scoring characteristics of scoring
    method A by calculating the rank/score function
    fA as follows
  • fA(j) = SA(RA^-1(j)), where j is the rank of the
    compound x that has the score fA(j), i.e., j is
    in N = {1, 2, 3, ..., n}
  • Variation (R/Svar) of a rank/score graph
  • Relative performance measurement (Pl/Ph)
18
Consensus Score Index
  • An indicative criterion for combining two scoring
    functions A and B from m scoring methods was
    developed to guide the combinations in VS
  • g(.) is a normalization function, and CSindex
    ranges between 0 and 2
  • Pm is the mean performance of the m primary
    scoring functions

19
The relationships between the GH-score
improvement and R/Svar and Pl/Ph

20
The GH-score improvements with parameters
R/Svar and Pl/Ph
The positive and negative GH-score improvements
are denoted with different symbols

21
Conclusion
  • Consensus scoring improves VS and has become a
    robust scoring method because it balances the
    strengths and weaknesses of different scoring
    functions
  • Consensus scoring performs better than the
    average performance of the individual scoring
    methods, but does not perform better than the
    best of the individual scoring functions
  • Our consensus procedure is computationally
    efficient, able to adapt to different situations,
    and scalable to a large number of compounds and
    to a greater number of combinations

22
  • Consensus scoring that combines multiple
    scoring functions should only be used when (a)
    the scoring functions involved have high
    performance and (b) the scoring characteristics
    of the individual scoring functions are
    quite different
  • Under the two CS criteria, rank combination
    performs as well as or better than score
    combination
  • Our work has provided a framework to study
    consensus scoring criteria and a procedure (the
    Algorithm) for both rank-based and score-based
    consensus scoring to improve the hit rates, FP
    rates, the enrichment, and the GH score
  • We have shown the power of two-combinations
    (pairing combinations) and used the rank/score
    graph to assess the bi-diversity between two
    scoring methods used

23
Acknowledgements
  • BioXGEM Lab.
  • Dr. Jinn-Moon Yang
  • Yen-Fu Chen
  • Tsai-Wei Shen
  • Dr. Bruce S. Kristal
  • Dementia Research Services, Burke Medical
    Research Institute
  • Dept. of Neuroscience, Weill Medical College,
    Cornell University

24
  • Thank you for your attention