Title: In silico ADME modelling 2: Computational models to predict human serum albumin binding affinity usi
1 In silico ADME modelling 2: Computational models to predict human serum albumin binding affinity using ant colony systems
- Sitarama B. Gunturi, Ramamurthi Narayanan and Akash Khandelwal
- Bioorganic & Medicinal Chemistry, June 2006, 14(12):4118-4129
- presented by Martin Kircher
2 Outline
- ADMETox and human serum albumin (HSA)
- QSPR/QSAR
- Ant Colony Systems and the model derivation for HSA
- Model validation and conclusions
- Discussion
3 Introduction: ADMETox properties
- Pharmacokinetic processes a drug undergoes
- A: Absorption
- D: Distribution
- M: Metabolism
- E: Elimination
- ADME properties of a compound define its bioavailability
- ADMETox: fail early, fail cheap
D: Distribution
4 Introduction: Human serum albumin (HSA)
- Most abundant protein in blood plasma
- Transport of non-water-soluble compounds such as hormones, fatty acids and drugs
- Maintenance of oncotic pressure
- Binds calcium ions (Ca2+) and buffers the pH of blood
[Figure: HSA with six palmitic acid molecules (PDB: 1E7H); labelled sites: I Warfarin site, II Indole-benzodiazepine site]
5 Introduction: QSPR/QSAR
- Quantitative Structure-Property/Activity Relationship
- Aim: prediction of properties from molecular structure without the need to perform experiments
- Assumption: similar structures cause similar properties
- Hansch equation
- $P = c + \sum_{i=1}^{n} k_i D_i$ (P: predicted property or activity)
- Intercept c
- Descriptors D: any molecular/numerical property
- Coefficients k: contribution of the descriptor
6 Introduction: Set of investigated compounds
- 94 drugs and drug-like compounds (Colmenarejo et al. 2001) with HSA affinity values (high-performance affinity chromatography)
- 84 training set, 10 test set
- Diverse set with molecular weights from 129.0935 to 764.9488 g/mol and log(KHSA) values from -1.39 to 1.34
[Figure: example structures Acetylsalicylic acid, Clotrimazole, 5-Fluorocytosine, Digitoxin]
7 Introduction: Available descriptors
- Set of 396 descriptors
- 392 molecular descriptors created with in-house software (BioSuite), belonging to the classes structural, physico-chemical and topological
- 4 physico-chemical descriptors created with 'PreADME' (web-based)
- 327 kept after pre-filtering, which removed descriptors with a constant value in more than 95% of all compounds
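The pre-filtering rule on this slide can be sketched in a few lines. This is only an illustration, assuming the descriptors are held in a NumPy matrix with one row per compound; the function name `prefilter_descriptors` and the toy matrix are made up here and are not taken from the paper.

```python
import numpy as np

def prefilter_descriptors(X, max_constant_fraction=0.95):
    """Return indices of descriptor columns to keep.

    A column is dropped when its most frequent value occurs in more than
    max_constant_fraction of all compounds (rows).
    """
    n_compounds = X.shape[0]
    kept = []
    for j in range(X.shape[1]):
        # Count how often each distinct value occurs in descriptor column j.
        _, counts = np.unique(X[:, j], return_counts=True)
        if counts.max() / n_compounds <= max_constant_fraction:
            kept.append(j)
    return kept

# Toy matrix: 20 compounds x 3 descriptors (made-up numbers).
X = np.column_stack([
    np.arange(20, dtype=float),        # different for every compound
    np.zeros(20),                      # constant for all compounds
    np.r_[np.zeros(18), [1.0, 2.0]],   # constant in 90% of the compounds
])
kept = prefilter_descriptors(X)
```

In the toy matrix only the all-zero column exceeds the 95% threshold, so the other two columns remain.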
8 Aim of this study
- Build optimal QSPR models for HSA binding affinity prediction
- Train models with 1 to 8 out of < 300 available, computable descriptors
- Use a computationally fast algorithm for descriptor selection (ACS)
- Present a generally applicable strategy for QSPR/QSAR model creation
- Evaluate the quality of the obtained models
- Interpret the physical meaning of selected descriptors
9 Derivation of HSA binding models: Ant Colony Systems (ACS)
- Ants are social insects: behavior is directed to survival of the colony
- Deposit pheromones to mark trails (nest/food)
- Pheromones evaporate over time
- Individual ants select paths at random, but favor paths with high pheromone concentrations
- Colony finds (almost) optimal paths
10 Derivation of HSA binding models: Ant Colony Systems (ACS)
Shortest path → fastest path → path with the highest concentration of pheromones
Evaporation and random choice restrict a rapid drift towards the same (suboptimal) part of the search space → stochastic optimization process
11 Derivation of HSA binding models: Ant Colony Systems
- Ant colony algorithm first used for QSAR descriptor selection by Izrailev and Agrafiotis 2001
- Build models with 1-8 descriptors, judge fitness of a model by its R² value (coefficient of determination)
- Inter-correlated descriptors are rejected if r² > 0.75
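A minimal sketch of the inter-correlation check, assuming r is the Pearson correlation between two descriptor columns and that constant columns have already been filtered out. The helper name `is_intercorrelated` and the toy data are illustrative only.

```python
import numpy as np

def is_intercorrelated(X, candidate, selected, r2_max=0.75):
    """True if descriptor column `candidate` has a squared Pearson
    correlation above r2_max with any already selected column."""
    for j in selected:
        r = np.corrcoef(X[:, candidate], X[:, j])[0, 1]
        if r * r > r2_max:
            return True
    return False

# Toy data (made-up): column 1 is almost a multiple of column 0,
# column 2 is an alternating pattern that is nearly uncorrelated with it.
a = np.arange(10, dtype=float)
b = np.array([1.0, -1.0] * 5)
X = np.column_stack([a, 2.0 * a + 0.1 * b, b])

reject_col1 = is_intercorrelated(X, 1, [0])
reject_col2 = is_intercorrelated(X, 2, [0])
```

With column 0 already in the model, column 1 would be rejected and column 2 would be accepted.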
12 Derivation of HSA binding models: Ant Colony Systems
[Flowchart]
- Set weights Wj = 0.01, IT = 0
- Calc. probabilities Pj
- Select k descriptors (select descriptors by weighted random choice, reject inter-correlated descriptors)
- Train model
- Calc. new weights
- Update IT = IT + 1
- IT < 20000: next iteration, otherwise return best model
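The flowchart can be turned into a compact sketch. The slides do not give the formulas for Pj or for the new weights, so the code below makes explicit assumptions: Pj is the normalised weight, and the update evaporates all weights by a fixed fraction and then adds the model's R² to the selected descriptors. The names `acs_select`, `evaporation` and `w_min` are hypothetical, and the data are random toy values; this is not the authors' implementation.

```python
import numpy as np

def r_squared(D, y):
    """R2 of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), D])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    return 1.0 - float(residuals @ residuals) / float(((y - y.mean()) ** 2).sum())

def acs_select(X, y, k, iterations=20000, evaporation=0.1,
               w_min=1e-6, r2_max=0.75, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    corr2 = np.corrcoef(X, rowvar=False) ** 2   # pairwise r2 between descriptors
    weights = np.full(n, 0.01)                  # set weights Wj = 0.01
    best_subset, best_r2 = None, -np.inf
    for _ in range(iterations):                 # IT runs from 0 to iterations - 1
        selected = []
        available = np.ones(n, dtype=bool)
        while len(selected) < k and available.any():
            p = np.where(available, weights, 0.0)
            p = p / p.sum()                     # Pj (assumption: normalised weights)
            j = int(rng.choice(n, p=p))         # weighted random selection
            selected.append(j)
            available[j] = False
            available &= corr2[j] <= r2_max     # reject descriptors inter-correlated with j
        if len(selected) == k:
            r2 = r_squared(X[:, selected], y)   # train model, judge fitness by R2
            if r2 > best_r2:
                best_subset, best_r2 = sorted(selected), r2
            # Assumed update rule: evaporate everywhere (with a small floor),
            # then deposit R2 on the descriptors of the current model.
            weights = np.maximum(weights * (1.0 - evaporation), w_min)
            weights[selected] += r2
    return best_subset, best_r2

# Toy data (made-up): 40 compounds, 6 random descriptors, y depends on two of them.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 6))
y = 1.0 + 2.0 * X[:, 1] - 3.0 * X[:, 4] + 0.1 * rng.normal(size=40)
subset, r2 = acs_select(X, y, k=2, iterations=200)
```

Being a stochastic search, the sketch returns the best model it happened to visit; with this simple update rule it may settle early on a suboptimal subset, which is the drift that evaporation and random choice are meant to limit.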
13 Derivation of HSA binding models: Ant Colony Systems
- Roulette wheel selection
- Generate a uniformly distributed random number in [0,1]
- Rotate the wheel by 2π times the random number
- Return the segment at the stop position
- Allows weighted random selection
14 Derivation of HSA binding models: Ant Colony Systems
- Transform wheel to array
- Generate a uniformly distributed random number in [0,1]
- Sum the probabilities of the descriptors until the sum exceeds the selected random number
- Return the last descriptor added to the sum
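The array form of the roulette wheel is short enough to write out. The sketch assumes the probabilities sum to 1; `roulette_select` is an illustrative name, and the random number is passed in as an argument so that the behaviour can be checked for fixed values.

```python
import random

def roulette_select(probabilities, u):
    """Return the index of the segment that contains u, where u is a
    uniformly distributed random number in [0, 1)."""
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if cumulative > u:
            return index
    # Guard against floating-point round-off when u is very close to 1.
    return len(probabilities) - 1

probabilities = [0.1, 0.2, 0.3, 0.4]   # made-up selection probabilities Pj
chosen = roulette_select(probabilities, random.random())
```

For example, u = 0.25 falls into the second segment (cumulative sums 0.1 and 0.3), so index 1 is returned; descriptors with a larger Pj own a larger segment and are chosen more often.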
15 Derivation of HSA binding models: Ant Colony Systems
Train linear regression models on training set
Calculate R² value
Update weights
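Training one linear model and computing its R² might look as follows. This is a sketch using NumPy's least-squares solver; the name `fit_linear_model` and the toy numbers are assumptions for illustration, with c and k named after the Hansch equation in the introduction.

```python
import numpy as np

def fit_linear_model(D, y):
    """Least-squares fit of y = c + sum_i k_i * D_i.

    Returns the intercept c, the coefficient vector k and R2 on the data.
    """
    A = np.column_stack([np.ones(len(y)), D])   # column of ones gives the intercept c
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    ss_res = float(residuals @ residuals)        # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())  # total sum of squares
    return beta[0], beta[1:], 1.0 - ss_res / ss_tot

# Toy data (made-up): y is an exact linear function of two descriptors.
D = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]])
y = 0.5 + 1.0 * D[:, 0] - 2.0 * D[:, 1]
c, k, r2 = fit_linear_model(D, y)
```

Because the toy y is noise-free, the fit recovers the generating intercept and coefficients and R² is 1 up to rounding; real affinity data leave a residual, and R² drops accordingly.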
16 Derivation of HSA binding models: Ant Colony Systems
[Flowchart]
- Set weights Wj = 0.01, IT = 0
- Calc. probabilities Pj
- Select k descriptors
- Train model
- Calc. new weights
- Update IT = IT + 1
- IT < 20000: next iteration, otherwise return best model
17 Derivation of HSA binding models
- Steep increase of R² up to six descriptors (ΔR² > 0.015)
- Danger of over-parameterization; further evaluate 5/6-descriptor models
18 Derivation of HSA binding models: Quality measures
- Leave-One-Out Cross-Validation (LOOCV): error estimate obtained from the training set
- Standard error: standard deviation of the residual error
- F statistic: tests the utility of the model
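The three measures can be computed as sketched below. The slides do not state the exact formulas used in the paper, so the common textbook definitions are assumed here: cross-validated q² from the leave-one-out prediction errors (PRESS), standard error sqrt(SS_res / (n - p - 1)), and the overall regression F statistic. The function name and the toy data are illustrative.

```python
import numpy as np

def quality_measures(D, y):
    """Return R2, LOOCV q2, standard error and F statistic of a linear model
    with intercept (n compounds, p descriptors)."""
    n, p = D.shape
    A = np.column_stack([np.ones(n), D])

    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())

    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        mask = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        press += float((y[i] - A[i] @ beta_i) ** 2)

    dof = n - p - 1                        # residual degrees of freedom
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "q2": 1.0 - press / ss_tot,        # cross-validated R2
        "std_error": float(np.sqrt(ss_res / dof)),
        "f_stat": ((ss_tot - ss_res) / p) / (ss_res / dof),
    }

# Toy data (made-up): 12 compounds, 2 descriptors, small deterministic noise.
x1 = np.arange(12, dtype=float)
x2 = np.array([0, 1, 0, 2, 1, 3, 0, 1, 2, 0, 3, 1], dtype=float)
noise = 0.1 * np.array([1.0, -1.0] * 6)
y = 1.0 + 2.0 * x1 - x2 + noise
measures = quality_measures(np.column_stack([x1, x2]), y)
```

Under these definitions q² can never exceed R², because each left-out prediction error is at least as large as the corresponding fitted residual; a large gap between the two hints at over-parameterization.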
19 Derivation of HSA binding models
- Retrieve the best three models with 5 and 6 parameters based on the training set
[Table: best 5-descriptor model, best 6-descriptor model]
20 Derivation of HSA binding models
[Plots: r² = 0.7778; r² = 0.7322]
21 Model validation
- Test set results and comparison to other models
22 Conclusions
- Impact of descriptors on HSA binding affinity analyzed by their frequencies in models with R² > 0.88
- Frequently occurring descriptors are probably the ones with most importance
- Evaluation to better understand HSA binding
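Counting descriptor frequencies over the good models is a small bookkeeping step. The sketch assumes each model is stored as a pair of descriptor ids and its R²; the ids and R² values below are invented placeholders, not results from the paper.

```python
from collections import Counter

def descriptor_frequencies(models, r2_min=0.88):
    """Fraction of models with R2 > r2_min in which each descriptor occurs.

    models: iterable of (descriptor_ids, r2) pairs.
    """
    good = [ids for ids, r2 in models if r2 > r2_min]
    if not good:
        return {}
    counts = Counter(d for ids in good for d in set(ids))
    return {d: count / len(good) for d, count in counts.items()}

# Hypothetical models: (descriptor ids, R2); all numbers are made up.
models = [
    ((1, 2, 3), 0.90),
    ((1, 4, 5), 0.89),
    ((2, 6, 7), 0.80),   # below the threshold, ignored
    ((1, 2, 8), 0.91),
]
frequencies = descriptor_frequencies(models)
```

Here descriptor 1 occurs in all three models above the threshold and descriptor 2 in two of them, so these two would be read as the most important ones.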
23 Conclusions
- 311: SklogS, predicted solubility; correlation of 0.5471 with log(KHSA)
- 307: AlogP98, predicted octanol-water partition coefficient; highest correlation (0.7867) with log(KHSA)
- 263: E-state S-hydrophobic; atom-type electro-topological state index describing hydrophobicity
- 159: Order 4 auto-correlation (Broto-Moreau) weighted by masses; topological descriptor encoding molecular structure and atomic mass
- 166: Order 5 auto-correlation (Broto-Moreau) weighted by polarizability; topological descriptor encoding structure and atomic polarizability
24 Conclusions
- Hydrophobic interactions have high impact
- 307/267 in more than 90% of all models
- Colmenarejo et al. and Xue et al. identified hydrophobic descriptors as contributing to higher log(KHSA)
- Hall et al. observed a positive contribution of electron accessibility of aromatic and aliphatic groups
- Sites I and II consist mainly of hydrophobic amino acids
[Figure labels: I Warfarin site, II Indole-benzodiazepine site]
25 Conclusions
- High solubility (311) of a drug reduces the binding affinity
- log(KHSA) is also influenced by branching, flexibility and shape descriptors (55, 79, 88, 92, 108) as well as descriptors of polarizability (163, 166) or electron accessibility (262)
- Meaning of descriptors 155, 157, 159 and 175 remains unclear
- Overall: similar observations for different descriptors by Colmenarejo et al. 2001, Hall et al. 2003, and Xue et al. 2004
26 Summary
- Application of the ant colony algorithm to the available descriptors resulted in the best linear model with 6 descriptors known so far
- Obtained descriptor types are consistent with previous results
- Interpretation of selected descriptors can help to understand HSA binding and to design optimized drugs
- Linear model quality seems to be limited
- Selectivity of the two sites is probably difficult to capture
- Support Vector Machine with half of the prediction error (Xue et al. 2004)
- Set of 94 compounds is rather small
27 Thank you for your attention!
- References
- Colmenarejo G, Alvarez-Pedraglio A, Lavandera JL. Cheminformatic models to predict binding affinities to human serum albumin. J Med Chem. 2001 Dec 6;44(25):4370-8
- Dorigo M, Di Caro G, Gambardella LM. Ant algorithms for discrete optimization. Artif Life. 1999 Spring;5(2):137-72
- Gunturi SB, Narayanan R, Khandelwal A. In silico ADME modelling 2: computational models to predict human serum albumin binding affinity using ant colony systems. Bioorg Med Chem. 2006 Jun 15;14(12):4118-29. Epub 2006 Feb 28
- Hall LM, Hall LH, Kier LB. Modeling drug albumin binding affinity with e-state topological structure representation. J Chem Inf Comput Sci. 2003 Nov-Dec;43(6):2120-8
- Izrailev S, Agrafiotis D. A novel method for building regression tree models for QSAR based on artificial ant colony systems. J Chem Inf Comput Sci. 2001 Jan-Feb;41(1):176-80
- Izrailev S, Agrafiotis DK. A method for quantifying and visualizing the diversity of QSAR models. J Mol Graph Model. 2004 Mar;22(4):275-84
- Xue CX, Zhang RS, Liu HX, Yao XJ, Liu MC, Hu ZD, Fan BT. QSAR models for the prediction of binding affinities to human serum albumin using the heuristic method and a support vector machine. J Chem Inf Comput Sci. 2004 Sep-Oct;44(5):1693-700
28 Discussion
29 (Multiple) Linear regression
- Formulate the Hansch equation as a regression problem
- Find β that minimizes a loss function
- Quadratic function with a global minimum
[Equations not preserved; labels: element of training set, complete training set]
30 Comparison: Confidence scores
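The equations of this slide did not survive in the transcript. A standard least-squares formulation that fits the bullets is given below as a reconstruction under assumptions: the symbols y_i, x_i, X, n and p are introduced here, with β_0 playing the role of the intercept c and β_j that of the coefficient k_j of the Hansch equation.

```latex
% Element of the training set: compound i with descriptor values D_{i1}, ..., D_{ip}
\hat{y}_i = \beta_0 + \sum_{j=1}^{p} \beta_j D_{ij} = \mathbf{x}_i^{\top} \boldsymbol{\beta},
\qquad \mathbf{x}_i = (1, D_{i1}, \dots, D_{ip})^{\top}

% Complete training set: n compounds stacked into the matrix X and the vector y
L(\boldsymbol{\beta}) = \sum_{i=1}^{n} \bigl( y_i - \mathbf{x}_i^{\top} \boldsymbol{\beta} \bigr)^2
                      = \lVert \mathbf{y} - X \boldsymbol{\beta} \rVert^2

% L is quadratic in beta; if X^T X is invertible, the global minimum is
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```

Setting the gradient -2 X^T (y - Xβ) to zero gives the normal equations X^T X β = X^T y, from which the closed-form minimum follows.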