Title: Consensus Scoring Criteria for Improving Enrichment in Virtual Screening
1Consensus Scoring Criteria for Improving
Enrichment in Virtual Screening
- D. Frank Hsu
- Fordham University
- hsu_at_cis.fordham.edu
- Talk at National Central University
- Aug 9, 2005
- Joint work with Jinn-Moon Yang, Yen-Fu Chen,
Tsai-Wei Shen, and Bruce S. Kristal.
2Introduction
- The average cost and time of bringing a new drug
to market have been estimated at US$802 million
(in 2000 US dollars) and 12 years
- In computer-aided drug design, virtual screening
(VS) is an emerging and promising step for the
discovery of novel lead compounds
- Given a target protein active site and a database
of potential small ligands, VS predicts the
binding mode and the binding affinity of each
ligand and ranks the candidate ligands
- The VS computational method involves two
critical elements: efficient molecular docking
and a reliable scoring method
3Introduction
- Scoring functions that calculate the binding
free energy fall mainly into three classes:
knowledge-based, physics-based, and
empirical scoring functions
- The major weakness of VS, the inability to
consistently identify true positives (leads), is
likely due to our incomplete understanding of the
chemistry involved in ligand binding and the
consequently imprecise scoring algorithms
- Combining multiple scoring functions (consensus
scoring) improves the enrichment of true
positives in the VS process
- Previous efforts at consensus scoring have
largely focused on empirical results and have
yet to provide a theoretical analysis that
gives insight into the real features of combinations
and data fusion for VS
4Materials and Methods
- Ligand data set from the comparative studies of
Bissantz et al.
- For each target protein, the ligand database
included 10 known active compounds and 990 random
compounds
- Four complexes of the target proteins were
selected from the PDB for virtual screening
- TK complex (PDB code 1kim)
- DHFR (PDB code 1hfr)
- ER-antagonist complex (PDB code 3ert)
- ER-agonist complex (PDB code 1gwr)
5Biological Function of Screening Targets
- Thymidine kinase (TK)
- Catalyzes the reaction
(deoxy)thymidine + ATP <=> dTMP + ADP
- Dihydrofolate reductase (DHFR)
- Catalyzes the reduction of 5,6-dihydrofolate to
5,6,7,8-tetrahydrofolate
- Estrogen receptor (ER)
- Estrogens such as 17β-estradiol are steroid
hormones that act as key mediators in female
reproductive glands
- Agonist of estrogen
- Antagonist of estrogen
6Materials and Methods
- Docking methods and scoring functions
- GEMDOCK
- Genetic algorithm
- Scoring methods
- Empirical-based scoring function
- Pharmacophore-based scoring function
- GOLD
- Genetic algorithm
- Scoring methods
- GoldScore
- ChemScore
7Objective Criteria for Performance Evaluation
- Some common factors
- False positive (FP) rate, yield (the percentage
of active ligands in the hit list), enrichment,
and goodness-of-hit (GH score)
- Hit rate = Ah/Th (%)
- FP rate = (Th - Ah)/(T - A) (%)
- GH score
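The slide lists these measures without defining the symbols or giving the GH formula. The sketch below is one plausible reading, not the authors' code: it assumes A is the number of actives in the database, T the database size, Ah the number of actives in the hit list, and Th the hit-list size, and it uses the commonly cited Guner-Henry form of the GH score. The function name `screening_metrics` is hypothetical.

```python
def screening_metrics(A, T, Ah, Th):
    """Return (hit rate %, FP rate %, GH score) for one hit list.

    Assumed meanings (not stated on the slide):
      A  = active ligands in the whole database
      T  = total compounds in the database
      Ah = active ligands in the hit list
      Th = total compounds in the hit list
    """
    if not (0 < A < T and 0 < Th <= T and 0 <= Ah <= min(A, Th)):
        raise ValueError("inconsistent counts")
    if Th - Ah > T - A:
        raise ValueError("more false positives than inactive compounds")

    hit_rate = 100.0 * Ah / Th             # formula as written on the slide
    fp_rate = 100.0 * (Th - Ah) / (T - A)  # formula as written on the slide

    # Assumption: commonly cited Guner-Henry goodness-of-hit score.
    gh = (Ah * (3 * A + Th)) / (4.0 * Th * A) * (1.0 - (Th - Ah) / (T - A))
    return hit_rate, fp_rate, gh


# Toy example matching the data-set shape (10 actives among 1000 compounds):
# a hit list of 20 compounds that contains 8 of the actives.
hit_rate, fp_rate, gh = screening_metrics(A=10, T=1000, Ah=8, Th=20)
```

Under this form, a hit list that contains exactly the 10 actives and nothing else gives a GH score of 1.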
8Methods of Data Fusion
- Combination
- Given a list of m scoring functions Ak, k = 1, 2,
..., m, each scoring function A has a ranking
function RA(x) and a normalized scoring function
SA(x)
- Rank-based consensus scoring
- Score-based consensus scoring
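The combination formulas themselves did not survive on this slide. A minimal sketch, assuming both consensus schemes are plain averages over the m scoring functions (mean rank for rank-based, mean normalized score for score-based), min-max normalization, and "higher raw score is better" (docking energies would need their sign flipped first). All function names are hypothetical.

```python
def normalize(scores):
    """Min-max scale raw scores to [0, 1]; assumes higher raw score = better."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {x: ((s - lo) / span if span else 0.0) for x, s in scores.items()}


def ranks(norm_scores):
    """Rank 1 = best normalized score; ties broken by compound id."""
    ordered = sorted(norm_scores, key=lambda x: (-norm_scores[x], x))
    return {x: j for j, x in enumerate(ordered, start=1)}


def score_combination(score_maps):
    """Assumed score-based consensus: mean normalized score (higher = better)."""
    m = len(score_maps)
    return {x: sum(s[x] for s in score_maps) / m for x in score_maps[0]}


def rank_combination(rank_maps):
    """Assumed rank-based consensus: mean rank (lower = better)."""
    m = len(rank_maps)
    return {x: sum(r[x] for r in rank_maps) / m for x in rank_maps[0]}


# Toy data: two scoring functions over three compounds.
s_a = normalize({"c1": 9.0, "c2": 5.0, "c3": 1.0})
s_b = normalize({"c1": 2.0, "c2": 8.0, "c3": 4.0})
scs = score_combination([s_a, s_b])
rcs = rank_combination([ranks(s_a), ranks(s_b)])
```

In this toy case both schemes place c2 first, because it is ranked well by both functions while c1 and c3 are each ranked last by one of them.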
9The Flowchart of RCS/SCS Algorithm
10Summary of Screening Accuracies among Individual
Scoring Methods
GEMDOCK with the pharmacophore-based scoring function
always has the best accuracy
11Screening Accuracies of Different Rank
Combinations
The lowest FP rate and highest GH score (best
performance) occur for combinations of two methods,
which outperform the individual methods
12Screening Accuracies of Different Score
Combinations
The lowest FP rate and highest GH score (best
performance) occur for combinations of multiple
methods, which outperform the individual methods
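Slides 11 and 12 compare every combination of the primary scoring methods against the individual methods. As a small illustration of how many combinations that involves, the sketch below enumerates them for four methods. The labels A to D are placeholders (the deck refers to methods by letter but does not say which letter is which scoring function), and `enumerate_combinations` is a hypothetical helper.

```python
from itertools import combinations


def enumerate_combinations(methods, min_size=2):
    """List every subset of `methods` containing at least `min_size` members."""
    return [
        combo
        for size in range(min_size, len(methods) + 1)
        for combo in combinations(methods, size)
    ]


methods = ["A", "B", "C", "D"]          # placeholder labels
combos = enumerate_combinations(methods)
pairs = [c for c in combos if len(c) == 2]
```

With four methods there are 6 pairs, 4 triples, and 1 four-way combination, so each of the rank-based and score-based schemes yields 11 combined rankings to evaluate per target.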
13The Performance of TK
- Combinations of different methods improve
performance
- The combination of B and D works best on TK
14The Performance of DHFR
- Combinations of different methods improve
performance
- The combination of B and D works best on DHFR
15The Performance of ER
- Combinations of different methods improve
performance
- The combination of B and D works best on ER
16The Performance of ERA
- Combinations of different methods improve
performance
- The combination of B and D works best on ERA
17- Rank/Score Graph
- We explore the scoring characteristics of scoring
method A by calculating the rank/score function
fA as follows
fA(j) = SA(RA^-1(j))
where j is the rank of the compound x that has the
score fA(j), i.e., j is in N = {1, 2, 3, ..., n}
- Variation (R/Svar) of a rank/score graph
- Relative performance measurement (Pl/Ph)
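The rank/score function maps each rank j to the normalized score of the compound holding that rank, so it can be read off by sorting the normalized scores. The slide does not give the formulas for R/Svar or Pl/Ph, so the two measures below are assumptions made only for illustration: the variation is taken as the mean squared gap between two rank/score graphs, and Pl/Ph as the lower of two performance values divided by the higher. All function names are hypothetical.

```python
def rank_score_function(norm_scores):
    """Return [fA(1), fA(2), ..., fA(n)]: the normalized score at each rank.

    Index 0 of the returned list corresponds to rank j = 1 (best score).
    """
    return sorted(norm_scores.values(), reverse=True)


def rs_variation(f_a, f_b):
    """Assumed diversity measure: mean squared gap between two graphs."""
    if len(f_a) != len(f_b):
        raise ValueError("rank/score graphs must cover the same n ranks")
    return sum((u - v) ** 2 for u, v in zip(f_a, f_b)) / len(f_a)


def relative_performance(p_a, p_b):
    """Assumed Pl/Ph: lower performance divided by higher performance."""
    low, high = sorted((p_a, p_b))
    if high <= 0:
        raise ValueError("performance values must be positive")
    return low / high


# Toy normalized scores for two scoring methods over three compounds.
f_a = rank_score_function({"c1": 1.0, "c2": 0.5, "c3": 0.0})
f_b = rank_score_function({"c1": 0.0, "c2": 1.0, "c3": 0.25})
variation = rs_variation(f_a, f_b)
ratio = relative_performance(0.2, 0.4)
```

Two methods with identical score distributions give a variation of 0 even if they rank the compounds differently, since the graph depends only on the sorted scores; a ratio near 1 means the two methods perform about equally well.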
18Consensus Score Index
- An indicative criterion, the consensus score index
(CSindex), for combining two scoring functions A and B
from m scoring methods was developed to guide the
combinations in VS
g(.) is a normalization function, and CSindex ranges
between 0 and 2. Pm is the mean performance of the m
primary scoring functions.
19The relationship of the GH-score
improvement to R/Svar and Pl/Ph
20The GH-score improvements with parameters of
R/Svar and Pl/Ph
The positive and negative GH-score improvements
are denoted with ? and ?, respectively
21Conclusion
- Consensus scoring improves VS and has become a
robust scoring method because it balances the
strengths and weaknesses of different scoring
functions
- Consensus scoring performs better than the
average performance of the individual scoring
methods, but does not perform better than the
best of the individual scoring functions
- Our consensus procedure is computationally
efficient, able to adapt to different situations,
and scalable to a large number of compounds and
to a greater number of combinations
22- Consensus scoring, which combines multiple
scoring functions, should be used only when (a)
the scoring functions involved have high
performance and (b) the scoring characteristics of
the individual scoring functions are
quite different
- Under the two CS criteria, rank combination
performs better than or as well as score combination
- Our work has provided a framework for studying
consensus scoring criteria and a procedure (the
Algorithm) for both rank-based and score-based
consensus scoring to improve the hit rates, FP
rates, the enrichment, and the GH score
- We have shown the power of two-combinations
(pairing combinations) and used the rank/score
graph to assess the bi-diversity between the two
scoring methods used
23Acknowledgements
- BioXGEM Lab.
- Dr. Jinn-Moon Yang
- Yen-Fu Chen
- Tsai-Wei Shen
- Dr. Bruce S. Kristal
- Dementia Research Services, Burke Medical
Research Institute
- Dept. of Neuroscience, Weill Medical College,
Cornell University
24- Thank you for your attention