Combinatorial Library Design Using a Multiobjective Genetic Algorithm

About This Presentation
Title:

Combinatorial Library Design Using a Multiobjective Genetic Algorithm

Description:

Combinatorial Library Design Using a Multiobjective Genetic Algorithm, by Valerie J. Gillet, Wael Khatib, Peter Willett, Peter J. Fleming, and Darren V. S. Green.

Transcript and Presenter's Notes

Title: Combinatorial Library Design Using a Multiobjective Genetic Algorithm


1
Combinatorial Library Design Using a
Multiobjective Genetic Algorithm
  • Valerie J. Gillet, Wael Khatib, Peter Willett,
    Peter J. Fleming, and Darren V. S. Green. J.
    Chem. Inf. Comput. Sci. 2002, 42, 375-385

Krebs Institute for Biomolecular Research and
Department of Information Studies, University of
Sheffield, Western Bank, Sheffield S10 2TN,
United Kingdom, Department of Automatic Control
and Systems Engineering, University of Sheffield,
Western Bank, Sheffield S10 2TN, United Kingdom,
and GlaxoSmithKline, Gunnels Wood Road,
Stevenage, SG1 2NY, United Kingdom
Presented by Greg Goldgof
2
Problem: How to optimize library design?
Solution: A multiobjective genetic algorithm
that determines Pareto frontiers
3
Traditionally
  • Library design algorithms focused on diversity.
  • Failed to deliver sufficiently improved hit
    rates.
  • Generated chemicals that make undesirable lead
    compounds.

4
Single-Objective SELECT
5
SELECT Extended for Multiple Objectives
6
Limitations of MO Search
  • The definition of the fitness function can be
    difficult, especially with noncommensurable
    objectives; for example, in library design it is
    not obvious how diversity should be combined with
    cost.
  • The setting of the weights is nonintuitive; for
    example, in the SELECT program several
    trial-and-error experiments may be required to
    choose appropriate weights [22].
  • The fitness function determines the regions of
    the search space that are explored, and combining
    objectives via weights can result in some regions
    being obscured.
  • The progress of the search or optimization
    process is not easy to follow since there are
    many objectives to monitor simultaneously.
  • The objectives may be coupled, thus implying
    conflict and competition, which can make it more
    difficult for the optimization process to achieve
    reasonable or acceptable results.

7
Limitations of MO Search cont.
  • A single solution is found which is typically one
    among a family of solutions that are all
    equivalent in terms of the overall fitness,
    although they may have different values of the
    individual objectives.
  • For example, consider a two-objective problem
    where the fitness function is defined as
    $f(n) = w_1 x + w_2 y$, where x and y are
    hypothetical objectives and $w_1$ and $w_2$ are
    both set to unity. The solution x = 0.4, y = 0.5
    has the same fitness (0.9) as the potential
    solution x = 0.5, y = 0.4, and thus both
    solutions can be considered as equivalent;
    typically, however, only one of them will be
    found.
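
The two-objective example above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `weighted_sum_fitness` and the tuple layout are assumptions made here, not part of SELECT.

```python
import math


def weighted_sum_fitness(objectives, weights):
    """Collapse several objective values into one scalar via a weighted sum."""
    return sum(w * v for w, v in zip(weights, objectives))


weights = (1.0, 1.0)       # w1 = w2 = 1, as in the slide's example
solution_a = (0.4, 0.5)    # x = 0.4, y = 0.5
solution_b = (0.5, 0.4)    # x = 0.5, y = 0.4

fitness_a = weighted_sum_fitness(solution_a, weights)
fitness_b = weighted_sum_fitness(solution_b, weights)

# Both solutions score 0.9, so a single-objective search cannot tell them
# apart even though they trade the two objectives off differently.
same_fitness = math.isclose(fitness_a, fitness_b)
```

Because the scalar fitness hides the individual objective values, a search driven by it has no reason to keep both solutions.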

8
Pareto Frontier and Dominated versus Nondominated
9
MoSELECT
Evolutionary algorithms, however, operate with a
population of individuals and are thus
well suited to search for multiple solutions in
parallel; hence they can be readily adapted to
deal with multiobjective search and optimization.
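
A population-based search can compare individuals by Pareto dominance rather than by a single weighted score. The sketch below is a minimal illustration, assuming every objective is to be minimized and ranking each individual by the number of population members that dominate it. The slide does not state which ranking scheme MoSELECT uses, so the function names and the ranking rule here are assumptions.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (assumes all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(population):
    """Return the individuals that no other individual dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]


def pareto_ranks(population):
    """Rank each individual by how many others dominate it (0 = nondominated)."""
    return [sum(dominates(q, p) for q in population) for p in population]


# Hypothetical two-objective values, e.g. (cost, 1 - diversity).
population = [(1, 5), (2, 2), (3, 4), (5, 1), (4, 4)]

front = nondominated(population)   # [(1, 5), (2, 2), (5, 1)]
ranks = pareto_ranks(population)   # [0, 0, 1, 0, 2]
```

All rank-0 individuals are treated as equally good, which is what lets the search return a family of trade-off solutions instead of one.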
10
Charting the results of MoSELECT with two
parameters
12
SELECT Versus MoSELECT
13
Convergence Criteria
The second convergence criterion that was
investigated involves calculating the percentage
of nondominated solutions in the Pareto set as
the search progresses. This method, however,
did not prove to be effective since there was
no clear trend to indicate what a valid threshold
should be.
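
As a rough sketch of how such a percentage could be tracked, the code below computes the share of nondominated individuals in one generation. It assumes all objectives are minimized and that the percentage is taken over the current population; the slide does not spell out either detail, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better
    somewhere (assumes all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def percent_nondominated(population):
    """Percentage of individuals that no other individual dominates."""
    if not population:
        return 0.0
    count = sum(
        1 for p in population
        if not any(dominates(q, p) for q in population)
    )
    return 100.0 * count / len(population)


# Hypothetical objective values for one generation.
generation = [(1, 5), (2, 2), (3, 4), (5, 1), (4, 4)]
share = percent_nondominated(generation)   # 60.0
```

Logging this value once per generation gives the curve that, according to the slide, showed no clear trend from which to pick a stopping threshold.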
14
Niche Induction
  • Genetic Drift or Speciation
  • If the (absolute) difference in the objectives of
    the next solution and the objectives of any
    solution that already forms the center of a niche
    is within a given threshold, for all objectives,
    the fitness (or dominance) of the current
    solution is penalized; otherwise it forms the
    center of a new niche. The threshold is also
    known as the niche radius.
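
The niching rule described above can be sketched as a single pass over the solutions. This is a minimal illustration under stated assumptions: one shared niche radius is used for every objective, solutions are visited in the order given, and the penalty is only recorded as a flag, because the slide does not say how the fitness is actually reduced. The name `assign_niches` is hypothetical.

```python
def assign_niches(solutions, niche_radius):
    """Walk through the solutions in order. A solution whose objectives all
    lie within niche_radius of an existing niche center is flagged for a
    penalty; otherwise it becomes the center of a new niche."""
    centers = []
    penalized = []
    for sol in solutions:
        inside_existing_niche = any(
            all(abs(s - c) <= niche_radius for s, c in zip(sol, center))
            for center in centers
        )
        if inside_existing_niche:
            penalized.append(True)
        else:
            centers.append(sol)
            penalized.append(False)
    return centers, penalized


# Hypothetical two-objective values, already sorted best-first.
solutions = [(0.10, 0.90), (0.12, 0.88), (0.50, 0.50),
             (0.52, 0.70), (0.90, 0.10)]

centers, penalized = assign_niches(solutions, niche_radius=0.05)
# Only (0.12, 0.88) falls inside an existing niche. (0.52, 0.70) is close
# to (0.50, 0.50) in the first objective but not in the second, so it
# starts its own niche.
```

Requiring closeness in all objectives at once means two solutions are treated as neighbors only when they represent nearly the same trade-off.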

15
Increasing the Number of Objectives
  • Diversity
  • Cost
  • Molecular weight (MW)
  • Occurrence of rotatable bonds (RB)
  • Occurrence of hydrogen bond donors (HBD)
  • Occurrence of hydrogen bond acceptors (HBA)
  • etc.

16
No Niching
17
With Niching
21
Future Work
  • Future work will investigate the possibility of
    interacting with the search process so that the
    relationships between objectives are explored
    during the search. This will allow the user to
    observe which objectives are relatively hard to
    improve, which are more easily optimized, and
    which objectives are in competition. The search
    process itself could then be altered to take
    account of these characteristics.

22
Discussion Questions
  • MoSELECT allows the user to determine the
    importance of each parameter. Is this reasonable?
  • How does the Future Work present solutions to
    this problem? Does it solve it?
  • Why is there no formal analysis of runtime?