Paola Gramatica, Elena Bonfanti, Manuela Pavan and Federica Consolaro - PowerPoint PPT Presentation


PPT – Paola Gramatica, Elena Bonfanti, Manuela Pavan and Federica Consolaro PowerPoint presentation | free to download - id: 1ee68d-ZDc1Z


The Adobe Flash plugin is needed to view this content

Get the plugin now

View by Category
About This Presentation

Paola Gramatica, Elena Bonfanti, Manuela Pavan and Federica Consolaro


... distance metrics (Euclidean and Manhattan) and different linkages (Complete, ... compounds in the cells of this map shows the similarity level of the structure ... – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 2
Provided by: profssapao
Learn more at:


Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Paola Gramatica, Elena Bonfanti, Manuela Pavan and Federica Consolaro

Paola Gramatica, Elena Bonfanti, Manuela Pavan
and Federica Consolaro QSAR Research Unit,
Department of Structural and Functional Biology,
University of Insubria, Varese, Italy. E-mail Web http//fisio.varbio
INTRODUCTION Phenols are chemicals widespread in
the environment and widely used as precursors for
many products. It is well known that phenols
exert effects on human health at concentrations
commonly encountered in the environment. For this
reason, the toxicity of these compounds has
been extensively studied on different end points,
but obviously data are not available for all
phenols and organisms. Thus, reliable estimation
methods are required. QSAR studies are useful for
a simple and fast prediction of such data
DATA SET The compounds used in this work are the
109 phenols described by Schultz 2 . Toxicity
data, available only for 103 chemicals, are
expressed in mM/l and in logarithmic scale as log
of the inverse of the IGC50 (percent inhibitory
growth concentration) on Tetrahymena pyriformis
strain. Three phenols (2-aminophenol, cathecol
and 4-nitrophenol) that have been shown as
outliers by several models, have been excluded
from the data set. 2 T.W.Schultz et all.
Quantitative structure-activity relationships for
the Tetrahymena piryformis population growth
end-point a mechanism of action approach.
Practical Applications of Quantitative
Structure-Activity Relationships (QSAR) in
Environmental chemistry and toxicology, 241-262
MOLECULAR DESCRIPTORS The molecular structures of
the studied compounds have been described by
using several molecular descriptors, calculated
by a software developed by R.Todeschini
( http//
chm) Sum of atomic properties descriptors
(6) Count descriptors (45)
Empirical descriptors (2) Information
indices (16) 1 R.Todeschini and P.Gramatica,
3D-modelling and prediction by WHIM descriptors.
Part 5. Theory development and chemical meaning
of the WHIM descriptors, Quant.Struct.-Act.Relat.,
16 (1997) 113-119.
Autocorrelation descriptors (252) Directional
WHIM descriptors (66) 1 Non directional WHIM
descriptors (33) 1
Topological descriptors (58) Topographic
descriptors (7) Geometric descriptors
(170) Quanto-chemicals descriptors (6)
CHEMOMETRIC METHODS Several chemometric analyses
were applied to the compounds (represented by
molecular descriptors) for the selection of an
optimal training set for the QSAR models. The
analyses performed are ? Principal Component
Analysis (PCA) this analysis was used to
calculate just a few components from a large
number of variables. These
components allow the highlighting of the
distribution of the compounds according to their
structure only the significant components were
used in Cluster Analysis and Kohonen Maps to
avoid the redundancy of the information. ?
Hierarchical Cluster Analysis hierarchical
clustering was performed using the significant
components of the molecular descriptors as
variables. Different distance metrics (Euclidean
and Manhattan) and different linkages (Complete,
average, etc.) were used and compared to find the
best way to cluster these compounds. ? Kohonen
Maps this is an additional way that allows the
mapping of similar compounds by using the
so-called self-organised topological feature
maps, which are maps that preserve the topology
of a multidimensional representation within the
new two-dimensional representation. The position
of the compounds in the cells of this map shows
the similarity level of the structure of the
studied phenols. The centroids of each cell have
been selected as the most representative
compounds in order to create a training set
constituted of the more different phenols.
Selection of training set
training set
test set
REGRESSION MODELS The selection of the best
subset variables for modelling toxicity was done
by a Genetic Algorithm (GA-VSS) approach, where
the response is obtained by ordinary least square
regression (OLS). All the calculations have been
performed by using the leave-one-out (LOO) and
leave-more-out (LMO) procedures and the
scrambling of the responses for the validation of
the models.
CONCLUSION The present investigation confirms
that the toxic response of phenols in the
Tetrahymena system can be modelled by a logKow-
dependent QSAR. The models developed starting
from a wide set of various molecular descriptors
identify the hydrophobicity as the single most
important variable, as the logKow alone gives a
good enough prediction model with a Q2(LOO)
72.14 other structural parameters, such as
electronic and connectivity ones play a role of
secondary but useful relevance, at least for this
set of compounds. Moreover this study
demonstrates that theoretical molecular
descriptors are an effective and useful
alternative of LogKow. The internal and external
validation procedures have confirmed the high
predictive capability of the models developed.